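Before using Matcher.group(int group), you should have a clear understanding of what a group means in a Pattern.
1. First, here is how the official JDK 1.7 documentation explains the concept of a group: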
<strong><span style="font-size:14px;">Groups and capturing</span></strong>
Group number
Capturing groups are numbered by counting their opening parentheses from left to right. In the expression ((A)(B(C))), for example, there are four such groups:
1 ((A)(B(C)))
2 (A)
3 (B(C))
4 (C)
Group zero always stands for the entire expression.
Capturing groups are so named because, during a match, each subsequence of the input sequence that matches such a group is saved. The captured subsequence may be used later in the expression, via a back reference, and may also be retrieved from the matcher once the match operation is complete.
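To make the numbering rule concrete, here is a minimal sketch that runs the documentation's own expression ((A)(B(C))) against the input "ABC" and lists every group. The class name GroupNumberingDemo and the helper groupsOf are made up for this illustration; only Pattern, Matcher, find(), groupCount() and group(int) come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class GroupNumberingDemo {

    // Returns all groups of the first match; index 0 is the whole match.
    // Returns an empty array when the pattern does not match at all.
    static String[] groupsOf(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        // groupCount() does not include group 0, hence the + 1.
        String[] groups = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] groups = groupsOf("((A)(B(C)))", "ABC");
        for (int i = 0; i < groups.length; i++) {
            System.out.println(i + " -> " + groups[i]);
        }
        // Prints:
        // 0 -> ABC
        // 1 -> ABC
        // 2 -> A
        // 3 -> BC
        // 4 -> C
    }
}
```

Group 1 equals group 0 here only because the outermost parentheses wrap the entire expression; in general group 0 is the whole match and group 1 is whatever the first opening parenthesis encloses.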
2. Next, here is the documentation for the group method:
<pre class="html" name="code"><strong><span style="font-size:14px;">Matcher.group()</span></strong>
public String group(int group)
Returns the input subsequence captured by the given group during the previous match operation.
For a matcher m, input sequence s, and group index g, the expressions m.group(g) and s.substring(m.start(g), m.end(g)) are equivalent.
Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.group(0) is equivalent to m.group().
If the match was successful but the group specified failed to match any part of the input sequence, then null is returned. Note that some groups, for example (a*), match the empty string. This method will return the empty string when such a group successfully matches the empty string in the input.
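The three behaviours described above (the substring equivalence, null for a group that did not participate, and the empty string for a group that matched nothing) can be checked with a small sketch. The class GroupSemanticsDemo, its two helpers, and the sample patterns are assumptions chosen for illustration; they do not come from the JDK documentation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class GroupSemanticsDemo {

    // Returns group g of the first match of regex in input.
    // Throws IllegalStateException if the pattern does not match.
    static String groupOfFirstMatch(String regex, String input, int g) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            throw new IllegalStateException("no match");
        }
        return m.group(g);
    }

    // Checks that m.group(g) equals s.substring(m.start(g), m.end(g))
    // for a group that took part in the first match.
    static boolean substringEquivalent(String regex, String input, int g) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find() || m.group(g) == null) {
            return false;
        }
        return m.group(g).equals(input.substring(m.start(g), m.end(g)));
    }

    public static void main(String[] args) {
        // group(g) is the same text as the substring between start(g) and end(g).
        System.out.println(substringEquivalent("href=\"(.*?)\"", "<a href=\"index.html\">", 1)); // true

        // The match succeeds through the second alternative, so group 1 never participates.
        System.out.println(groupOfFirstMatch("(a)|(b)", "b", 1)); // null
        System.out.println(groupOfFirstMatch("(a)|(b)", "b", 2)); // b

        // (a*) participates but matches zero characters: empty string, not null.
        System.out.println(groupOfFirstMatch("(a*)b", "b", 1).isEmpty()); // true
    }
}
```

The practical consequence is that code reading optional groups should check for null before calling methods on the result, while an empty string simply means the group matched zero characters.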
Here is an example:
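import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo {
    // For every match of reg in htmlTag, prints the whole match (group 0)
    // followed by capturing groups 1 and 2. The pattern must define at least
    // two capturing groups, otherwise group(2) throws IndexOutOfBoundsException.
    public static void extractFromHtmlTag(String htmlTag, String reg) {
        Pattern pattern = Pattern.compile(reg);
        Matcher matcher = pattern.matcher(htmlTag);
        while (matcher.find()) {
            System.out.println(matcher.group(0));
            System.out.println(matcher.group(1));
            System.out.println(matcher.group(2));
        }
    }
}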
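// JUnit test method; it belongs in a test class that imports org.junit.Test.
@Test
public void extractHtmlTagTest() {
    // The literal "***" between the two file names must be matched explicitly
    // (\\*{3}); with the two groups directly adjacent, find() never succeeds
    // and nothing is printed. The dots are escaped so they match only a literal '.'.
    String reg = "href=\"(index1\\.html)\\*{3}(index2\\.html)\"";
    String string = "<a href=\"index1.html***index2.html\">Home</a>";
    RegexDemo.extractFromHtmlTag(string, reg);
    // Prints:
    // href="index1.html***index2.html"
    // index1.html
    // index2.html
}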