r - XPath to extract text after br tags in R

Question

How do I extract the text that comes after the br tags in the following?

<div id='population'>
    The Snow Leopard Survival Strategy (McCarthy <em>et al.</em> 2003, Table
    II) compiled national snow leopard population estimates, updating the work
    of Fox (1994). Many of the estimates are acknowledged to be rough and out
    of date, but the total estimated population is 4,080-6,590, as follows:<br>
    <br>
    Afghanistan: 100-200?<br>
    Bhutan: 100-200?<br>
    China: 2,000-2,500<br>
    India: 200-600<br>
    Kazakhstan: 180-200<br>
    Kyrgyzstan: 150-500<br>
    Mongolia: 500-1,000<br>
    Nepal: 300-500<br>
    Pakistan: 200-420<br>
    Russia: 150-200<br>
    Tajikistan: 180-220<br>
    Uzbekistan: 20-50
</div>

I got as far as:

xpathSApply(h, '//div[@id="population"]', xmlValue)

But now I'm stuck...

score 33 · Accepted Answer

It helps to realize that text is a node, too. All the text in the div that follows a <br/> can be retrieved with:

//div[@id="population"]/text()[preceding-sibling::br]
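As a minimal sketch of how that expression might be used from R, assuming the XML package (the one that provides xpathSApply in the question) is installed. The shortened html string and the variable names are illustrative only and stand in for the real page:

```r
library(XML)

# Shortened stand-in for the <div> in the question (hypothetical sample data)
html <- "<div id='population'>Intro text, as follows:<br><br>Afghanistan: 100-200?<br>Bhutan: 100-200?<br>Uzbekistan: 20-50</div>"

# Parse the string as HTML; asText = TRUE says the input is markup, not a file name
h <- htmlParse(html, asText = TRUE)

# Select every text node in the div that has at least one <br> sibling before it
txt <- xpathSApply(h, '//div[@id="population"]/text()[preceding-sibling::br]', xmlValue)

# On the real page the text nodes carry newlines and indentation,
# so trim them and drop any that turn out to be empty
txt <- trimws(txt)
txt <- txt[nzchar(txt)]
print(txt)
```

The introductory sentence is left out because no <br> precedes it. The trimming step does nothing on this one-line sample, but it is likely to matter for the indented HTML in the question.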

Technically, text *between* <br/> tags would be:

//div[@id="population"]/text()[preceding-sibling::br and following-sibling::br]

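...but I guess that's not what you want at this point, since it would skip the final entry (Uzbekistan: 20-50), which has no <br/> after it.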
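To illustrate the difference, here is a small sketch of the stricter expression, again assuming the XML package and a shortened, made-up sample string rather than the real page:

```r
library(XML)

# Shortened stand-in for the <div> in the question (hypothetical sample data)
html <- "<div id='population'>Intro text, as follows:<br><br>Afghanistan: 100-200?<br>Bhutan: 100-200?<br>Uzbekistan: 20-50</div>"
h <- htmlParse(html, asText = TRUE)

# Keep only text nodes that have a <br> sibling both before and after them
between <- xpathSApply(
  h,
  '//div[@id="population"]/text()[preceding-sibling::br and following-sibling::br]',
  xmlValue
)

# Trim whitespace and drop empty strings, as the real page is indented
between <- trimws(between)
between <- between[nzchar(between)]
print(between)
```

With this sample, the last country is not selected because nothing but the closing </div> follows it, which is why the simpler preceding-sibling test is probably the better fit here.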
