我正在尝试从这个网站中web上搜索第二个表:https://fbref.com/en/comps/9/stats/premier-leage-stats.然而,我只有在试图通过查找表标记来访问信息时,才能从第一个表中提取信息。 有谁能向我解释为什么我不能访问第二个表或者告诉我怎么做。
import requests
from bs4 import BeautifulSoup
url = "https://fbref.com/en/comps/9/stats/Premier-League-Stats"
res = requests.get(url)
soup = BeautifulSoup(res.text, 'lxml')
pl_table = soup.find_all("table")
player_table = tables[0]
这方面的事情应该可以做到
tables = soup.find_all("table") # returns a list of tables
second_table = tables[1]
该表位于HTML注释<!--...-->
内。
要从注释中获取表,可以使用以下示例:
import requests
from bs4 import BeautifulSoup, Comment
url = 'https://fbref.com/en/comps/9/stats/Premier-League-Stats'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
table = BeautifulSoup(soup.select_one('#all_stats_standard').find_next(text=lambda x: isinstance(x, Comment)), 'html.parser')
#print some information from the table to screen:
for tr in table.select('tr:has(td)'):
tds = [td.get_text(strip=True) for td in tr.select('td')]
print('{:<30}{:<20}{:<10}'.format(tds[0], tds[3], tds[5]))
打印:
Patrick van Aanholt Crystal Palace 1990
Max Aarons Norwich City 2000
Tammy Abraham Chelsea 1997
Che Adams Southampton 1996
Adrián Liverpool 1987
Sergio Agüero Manchester City 1988
Albian Ajeti West Ham 1997
Nathan Aké Bournemouth 1995
Marc Albrighton Leicester City 1989
Toby Alderweireld Tottenham 1989
...and so on.