python - How do I select elements in nested HTML with Playwright?

Translated from: https://stackoverflow.com/questions/70181226 2021-12-01T08:42:36.567

I want to extract text from the HTML below. I have tried several approaches, but they all fail. The values of page_id and article_id are random. I want to get the texts as a list.

html:

<div id=ufi_{page_id}>
  <div>
    <div></div>
    <div></div>
    <div></div>
    <div></div>    
    <div>
      <div id={article_id}>
          <div></div>
          <div>I want to get the text here</div>
          <div></div>
      </div>
      <div id={article_id2}>
          <div></div>
          <div>I want to get the text here</div>
          <div></div>
      </div>
      <div id={article_id3}>
          <div></div>
          <div>I want to get the text here</div>
          <div></div>
      </div>
    </div>
  </div>
</div>

Code:

comments = page2.query_selector(f'xpath=//div[@id="ufi_{page_id}"]>>div>>//div[5]')
comments_ls = comments.query_selector_all("div>>//div[1]")
if comments:
    for com in comments_ls:
        print(com.text_content())
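
One possible approach, sketched under the assumption that the page always has the structure shown in the HTML above: walk from the `ufi_{page_id}` container to the text `<div>` with a single XPath instead of chaining engines with `>>`. XPath positions are 1-based, so the text sits at `div[2]` inside each article `<div>`, while `div[1]` selects the empty first child. The helper names `build_comment_xpath` and `extract_comments` are hypothetical and only for illustration; `page` is assumed to be a Playwright sync-API page such as `page2`.

```python
from typing import List


def build_comment_xpath(page_id: str) -> str:
    """Build one XPath that walks from the ufi container to each text <div>."""
    return (
        f'xpath=//div[@id="ufi_{page_id}"]'  # container with the random id
        "/div"       # its single wrapper <div>
        "/div[5]"    # fifth child <div>, which holds the articles
        "/div[@id]"  # every article <div> (each one carries some id)
        "/div[2]"    # second child <div>: the one with the text
    )


def extract_comments(page, page_id: str) -> List[str]:
    """Return the stripped text of every matching <div> as a list."""
    elements = page.query_selector_all(build_comment_xpath(page_id))
    # text_content() may return None, so fall back to an empty string.
    return [(el.text_content() or "").strip() for el in elements]
```

With the page from the question, `extract_comments(page2, page_id)` would return the texts as a list. Matching the article `<div>` elements with `div[@id]` avoids needing to know the random article_id values, at the cost of relying on the layout staying fixed.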

1 Answer
