I am trying to parse a specific area of the HTML from this webpage:

http://en.wikipedia.org/w/api.php?action=parse&page=Ringo_Starr&prop=text&section=0&format=txtfm&disablepp&redirects

[Please note that I am not interested in the page as rendered, which displays the HTML tags as text, but in the actual source of this page (Ctrl+U).]

Specifically, I am looking to put all of the lines that begin with:

<span style="color:blue;">&lt;p&gt;</span>

into a String.

Here's how I'm trying to solve it, but I seem to be way off:

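      // Needs org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element, org.jsoup.select.Elements
      Document doc = Jsoup.connect("http://en.wikipedia.org/w/api.php?action=parse&page=Ringo_Starr&prop=text&section=0&format=txtfm&disablepp&redirects").get();
      Elements elements = doc.select("span");
      for (Element e : elements) {
          // text() returns the decoded text, so the span's content reads "<p>", not "&lt;p&gt;"
          if (e.text().equals("<p>")) {
              System.out.println("now get that whole line");
          }
      }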

Note: I am using jsoup here, but would a straight regex be more effective?
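To make the goal concrete, here is a minimal sketch of the plain-string alternative: it treats the raw page source as text, keeps only the lines that start with the marker span, and joins them into one String. The class name `ParagraphLineFilter`, the method `collectParagraphLines`, and the sample input are all hypothetical and exist only for illustration. It assumes the raw source is already in a String, for example obtained with jsoup's `Jsoup.connect(url).execute().body()`, and that the marker appears in the source exactly as quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of jsoup.
final class ParagraphLineFilter {

    // The marker as it appears in the raw source (Ctrl+U), entities still escaped.
    static final String MARKER = "<span style=\"color:blue;\">&lt;p&gt;</span>";

    // Returns every line of rawSource that begins with MARKER, joined by '\n'.
    static String collectParagraphLines(String rawSource) {
        List<String> kept = new ArrayList<>();
        // \R matches any line break (\n, \r\n, ...); requires Java 8 or later.
        for (String line : rawSource.split("\\R")) {
            if (line.startsWith(MARKER)) {
                kept.add(line);
            }
        }
        return String.join("\n", kept);
    }

    public static void main(String[] args) {
        // Made-up sample standing in for the downloaded page source.
        String sample = String.join("\n",
            "<span style=\"color:blue;\">&lt;div&gt;</span>header",
            MARKER + "first paragraph",
            "plain line",
            MARKER + "second paragraph");
        System.out.println(collectParagraphLines(sample));
    }
}
```

This only does a prefix check per line, so no regex is needed. A line where whitespace precedes the marker would be skipped, which may or may not be what is wanted.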
