python - Parsing tags that keep changing

Question

If the tags keep changing like this:

<tr id="CN13FUT">
<tr id="CU13FUT">
<tr id="CZ13FUT">
<tr id="CH14FUT">
[...]

How can I read these with BeautifulSoup? This is the part I need help with:

table = BeautifulSoup(page)
for tr in table.findAll('tr', attrs={'id': 'something_here'}):
    print(tr)

I don't want to simply use table.findAll('tr'), because there may be other tr tags that I don't want to pick up.

score 0 · Accepted Answer

You can use a regular expression pattern to specify which <tr> tags you want.

import bs4 as bs
import re

doc = '''<tr id="CN13FUT">
    <tr id="CU13FUT">
    <tr id="CZ13FUT">
    <tr id="CH14FUT">
    <tr id="ButNotThis">
   '''
table = bs.BeautifulSoup(doc)
for tr in table.findAll(id=re.compile(r'CN13|CU13|CZ13|CH14')):
    print(tr)

which yields

<tr id="CN13FUT">
</tr>
<tr id="CU13FUT">
</tr>
<tr id="CZ13FUT">
</tr>
<tr id="CH14FUT">
</tr>
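
Note that a compiled pattern given to findAll is applied as a search, so CN13|CU13|CZ13|CH14 also accepts any id that merely contains one of those fragments. If that is too loose, an anchored pattern may be safer. The sketch below only uses the re module so the pattern can be checked on plain strings first. The id shape it assumes ('C', one uppercase letter, two digits, 'FUT') is guessed from the four sample ids in the question, and the name FUT_ID and the extra sample 'XCN13FUT2' are made up for illustration.

```python
import re

# Assumed id shape, inferred from the sample ids only:
# 'C' + one uppercase letter + two digits + 'FUT', with nothing before or after.
FUT_ID = re.compile(r'^C[A-Z]\d{2}FUT$')

ids = ['CN13FUT', 'CU13FUT', 'CZ13FUT', 'CH14FUT', 'ButNotThis', 'XCN13FUT2']

# Use .search() here to mimic how the pattern is applied as a filter.
matched = [i for i in ids if FUT_ID.search(i)]
print(matched)
```

If the pattern behaves as expected on your real ids, it can be passed the same way as above, e.g. table.findAll('tr', id=FUT_ID).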

score 0 · Accepted Answer

If all of the id attributes end with "FUT",

for tr in table.findAll(id=re.compile('FUT$')):
    print(tr)
    print(tr['id']) # to print the id attributes

If all of the id attributes have the same length (7),

for tr in table.findAll('tr', id=lambda x: x and len(x)==7):
    print(tr['id']) # to print the id attributes
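
The "x and" part of the lambda is there because the filter function can be called with None for tags that have no id attribute. The two conditions can also be combined into one named function, which is easier to test on its own. This is only a sketch: the name is_candidate_id and the sample values are hypothetical, and it assumes both rules (length 7 and the "FUT" suffix) hold for your data.

```python
def is_candidate_id(value):
    """Return True for a 7-character id ending in 'FUT'."""
    # Assumption: value is None (or empty) when the tag has no usable id,
    # so check truthiness before calling len() or endswith().
    return bool(value) and len(value) == 7 and value.endswith('FUT')

samples = ['CN13FUT', 'CH14FUT', 'ButNotThis', 'ABCDEFG', '', None]
kept = [s for s in samples if is_candidate_id(s)]
print(kept)
```

It would then be used as table.findAll('tr', id=is_candidate_id).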
