ruby - Select the element that follows a selected element using Ruby/Mechanize

Question

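I couldn't find this specific question anywhere, so hopefully I'm right that it's a new variation on an old one.

I'd like to select the table that follows a p.red element whose text() is inconsistent from page to page. Specifically, I want the p whose text contains "OVERALL" but does not contain "Alphabetical".

The DOM looks like this: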

<p class=red>Some Text</p>
  <table class="newclass">
  <tr></tr>
  <tr></tr>
</table>

<p class=red>Some Text</p>
<table class="newclass">
  <tr></tr>
  <tr></tr>
</table>

<p class=red>OVERALL</p>
<table class="newclass">
  <tr></tr>
  <tr></tr>
</table>

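The number of tables varies from page to page.

I want to grab not only that p tag's text() but also the table immediately after it. Again, the text() contains "OVERALL" but not "ALPHABETICAL". Do I need to build an array and .reject() the elements that don't match? I'm not sure at this point; I'm fairly new to Ruby and Mechanize. Thanks in advance for the help!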

score 2 · Accepted Answer

Using Nokogiri's CSS selectors is nice and clean:

require 'nokogiri'

doc = Nokogiri::HTML(<<EOT)
<p class=red>Some Text</p>
  <table class="newclass">
  <tr></tr>
  <tr></tr>
</table>

<p class=red>Some Text</p>
<table class="newclass">
  <tr></tr>
  <tr></tr>
</table>

<p class=red>OVERALL</p>
<table class="newclass">
  <tr></tr>
  <tr></tr>
</table>
EOT

puts doc.at('p:contains("OVERALL")').to_html
# >> <p class="red">OVERALL</p>

puts doc.at('p:contains("OVERALL") ~ table').to_html
# >> <table class="newclass">
# >> <tr></tr>
# >> <tr></tr>
# >> </table>

score 1 · Accepted Answer

The p tag:

agent.parser.xpath('//p[.="OVERALL"]')[0]

The table after it:

agent.parser.xpath('//p[.="OVERALL"]')[0].next.next

Or:

agent.parser.xpath('//p[.="OVERALL"]/following-sibling::table[1]')[0]
