Here is my code:
from pyquery import PyQuery

content = '''<td field="exceptions"><div style="white-space:normal;height:auto;" \
class="datagrid-cell datagrid-cell-c2-exceptions">Traceback (most recent call last):<br>\
File "./crawler.py", line 381, in &lt;module&gt;<br> \
crawler.start()<br> File "./crawler.py", line 153, in start<br> \
raise RemoteTransportException(e)<br>RemoteTransportException: \
This socket is already used by another greenlet: &lt;bound method Waiter.\
switch of &lt;gevent.hub.Waiter object at 0x7f64d499d6e0&gt;&gt;<br></div></td>'''
pq = PyQuery(content)

# First attempt: .text on the div itself
for div in pq('td div'):
    print(div.text)

# Second attempt: .text on each child of the div (the <br> elements)
for div in pq('td div'):
    for sub in div:
        print(sub.text)

# Output:
# Traceback (most recent call last):
# None
# None
# None
# None
# None
# None
As you can see, I want to get the full content of the td div element, which should be:
Traceback (most recent call last):
File "./crawler.py", line 381, in <module>
crawler.start()
File "./crawler.py", line 153, in start
raise RemoteTransportException(e)
RemoteTransportException: This socket is already used by another greenlet: <bound method Waiter.switch of <gevent.hub.Waiter object at 0x7f64d499d6e0>>
But I only get Traceback (most recent call last):. How can I get all of the text in td div, including the text that follows each child tag?
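For context on why only the first line comes back: the elements PyQuery yields when iterated are lxml elements, which follow the ElementTree text/tail model. An element's .text holds only the text before its first child, and the text after each <br> is stored on that <br>'s .tail. The sketch below illustrates this with the standard library's xml.etree.ElementTree rather than PyQuery itself, and the shortened snippet with self-closing <br/> tags is an assumption made so the XML parser accepts it. With lxml the same idea should be available through itertext() or text_content(), and PyQuery also offers a .text() method on the selection, though the exact whitespace handling may vary by version.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the markup in the question
# (hypothetical snippet: <br> is written as <br/> so the XML parser accepts it).
snippet = (
    '<td><div>Traceback (most recent call last):<br/>'
    'File "./crawler.py", line 381, in &lt;module&gt;<br/>'
    'crawler.start()<br/></div></td>'
)

div = ET.fromstring(snippet).find('div')

# .text holds only the text before the first child element.
first_only = div.text

# Text that follows each <br/> is stored on that child's .tail, not its .text.
tails = [br.tail for br in div]

# itertext() walks .text and every .tail in document order.
lines = [piece.strip() for piece in div.itertext() if piece.strip()]
full_text = '\n'.join(lines)
print(full_text)
```

Joining the pieces with a newline puts a line break wherever a <br/> sat between two pieces of text, which matches the layout wanted above.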