I'm trying to get the links to the individual search results on a website (National Gallery of Art), but requesting the search URL does not load the search results. This is how I'm trying to do it:

import requests
from bs4 import BeautifulSoup

url = 'https://www.nga.gov/collection-search-result.html?artist=C%C3%A9zanne%2C%20Paul'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

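I expected the links to the individual results to show up in soup.findAll('a'), but they do not appear. Instead, the last item in the output is a link to an empty search result: https://www.nga.gov/content/ngaweb/collection-search-result.html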

How can I get the links to the first search result ( https://www.nga.gov/collection/art-object-page.52389.html ), the second search result ( https://www.nga.gov/collection/art-object-page.52085.html ), and so on?

2 Answers

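The results are actually generated from the JSON response of an API call, which is why they are missing from the HTML you parsed. Requesting that JSON directly gives the desired list of links.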

Code:

import requests

# JSON endpoint that supplies the search results (page 1, 30 results per page)
url = 'https://www.nga.gov/collection-search-result/jcr:content/parmain/facetcomponent/parList/collectionsearchresu.pageSize__30.pageNumber__1.json?artist=C%C3%A9zanne%2C%20Paul&_=1634762134895'
r = requests.get(url)

# Each result carries a site-relative URL; prefix the host to make it absolute
for item in r.json()['results']:
    abs_url = f"https://www.nga.gov{item['url']}"
    print(abs_url)

Output:

https://www.nga.gov/content/ngaweb/collection/art-object-page.52389.html
https://www.nga.gov/content/ngaweb/collection/art-object-page.52085.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.46577.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.46580.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.46578.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.136014.html
https://www.nga.gov/content/ngaweb/collection/art-object-page.46576.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.53120.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.54129.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.52165.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.46575.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.53122.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.93044.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.66405.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.53119.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.53121.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.46579.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.66406.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.45866.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.53123.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.45867.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.45986.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.45877.html 
https://www.nga.gov/content/ngaweb/collection/art-object-page.136025.html
https://www.nga.gov/content/ngaweb/collection/art-object-page.74193.html
https://www.nga.gov/content/ngaweb/collection/art-object-page.74192.html
https://www.nga.gov/content/ngaweb/collection/art-object-page.66486.html
https://www.nga.gov/content/ngaweb/collection/art-object-page.76288.html
https://www.nga.gov/content/ngaweb/collection/art-object-page.76223.html
https://www.nga.gov/content/ngaweb/collection/art-object-page.76268.html
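
The request above returns only the first page of 30 results. As a rough sketch of how it might be generalized, the helpers below build the endpoint URL for any artist and page, and turn a parsed JSON response into absolute links. The function names (build_search_url, extract_links, fetch_all_links) are made up for illustration. The sketch assumes that the pageSize__ and pageNumber__ parts of the path can be changed freely, that the trailing _= parameter is an optional cache-busting timestamp, and that a page past the last one returns an empty results list. None of this is confirmed by the site, so check it against real responses.

```python
from urllib.parse import quote

BASE = "https://www.nga.gov"
# Endpoint path copied from the request above.
ENDPOINT = ("/collection-search-result/jcr:content/parmain/facetcomponent/"
            "parList/collectionsearchresu")


def build_search_url(artist, page_number=1, page_size=30):
    """Build the JSON search URL for one page (the selector format is an assumption)."""
    selectors = f"pageSize__{page_size}.pageNumber__{page_number}"
    # quote() percent-encodes the artist name, e.g. ", " becomes "%2C%20"
    return f"{BASE}{ENDPOINT}.{selectors}.json?artist={quote(artist)}"


def extract_links(payload):
    """Turn the parsed JSON response into a list of absolute result URLs."""
    return [f"{BASE}{item['url']}" for item in payload.get("results", [])]


def fetch_all_links(artist, max_pages=5):
    """Collect links page by page, stopping at the first page with no results."""
    import requests  # imported here so the helpers above work without it

    links = []
    for page in range(1, max_pages + 1):
        response = requests.get(build_search_url(artist, page), timeout=30)
        response.raise_for_status()
        page_links = extract_links(response.json())
        if not page_links:  # assumed: an empty page means there are no more results
            break
        links.extend(page_links)
    return links
```

Calling fetch_all_links('Cézanne, Paul') would then request pages 1, 2, 3 and so on, up to max_pages. The cap is there so that a wrong assumption about the empty last page cannot turn into an endless loop.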
Answered on 2021-10-20 at 20:53:06