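I want to iterate over this table and get each team's name, wins, and losses, then write them to a CSV file, a list, or a JSON file. With the code below, even when I try a for loop, I only get the HTML of the first element in the table: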
from bs4 import BeautifulSoup as bs
import requests
from requests import get
import pandas as pd
import json
import time
from time import sleep

url = 'https://www.basketball-reference.com/international/euroleague/2020.html'
time.sleep(2)
source = requests.get(url).text
time.sleep(4)
soup = bs(source, 'lxml')
time.sleep(2)
for item in soup.find_all('div', class_='table_outer_container'):
    # prints only first item
    team = item.div.table.tbody.tr
    print(team)
Structure of the table element:
<div class="table_outer_container">
<div class="overthrow table_container" id="div_elg_standings">
<table class="sortable stats_table now_sortable" id="elg_standings" data-cols-to-freeze="1"><caption>EuroLeague Standings Table</caption>
<colgroup><col><col><col></colgroup>
<thead>
<tr class="over_header"><th></th>
<th aria-label="" data-stat="Regular Season" colspan="2" class=" over_header center">Regular Season</th>
</tr>
<tr>
<th aria-label=" " data-stat="team" scope="col" class=" poptip center"> </th>
<th aria-label="Wins" data-stat="wins|Regular Season" scope="col" class=" poptip right" data-tip="Wins" data-over-header="Regular Season">W</th>
<th aria-label="Losses" data-stat="losses|Regular Season" scope="col" class=" poptip right" data-tip="Losses" data-over-header="Regular Season">L</th>
</tr>
</thead>
<tbody>
<tr data-row="0"><th scope="row" class="left " data-stat="team"><a href="/international/teams/anadolu-efes/2020.html">Anadolu Efes</a></th><td class="right " data-stat="wins|Regular Season">24</td><td class="right " data-stat="losses|Regular Season">4</td></tr>
<tr data-row="1"><th scope="row" class="left " data-stat="team"><a href="/international/teams/real-madrid/2020.html">Real Madrid</a></th><td class="right " data-stat="wins|Regular Season">22</td><td class="right " data-stat="losses|Regular Season">6</td></tr>
<tr data-row="2"><th scope="row" class="left " data-stat="team"><a href="/international/teams/barcelona/2020.html">FC Barcelona</a></th><td class="right " data-stat="wins|Regular Season">22</td><td class="right " data-stat="losses|Regular Season">6</td></tr>
<tr data-row="3"><th scope="row" class="left " data-stat="team"><a href="/international/teams/cska-moscow/2020.html">CSKA Moscow</a></th><td class="right " data-stat="wins|Regular Season">19</td><td class="right " data-stat="losses|Regular Season">9</td></tr>
<tr data-row="4"><th scope="row" class="left " data-stat="team"><a href="/international/teams/maccabi-tel-aviv/2020.html">Maccabi FOX Tel Aviv</a></th><td class="right " data-stat="wins|Regular Season">19</td><td class="right " data-stat="losses|Regular Season">9</td></tr>
<tr data-row="5"><th scope="row" class="left " data-stat="team"><a href="/international/teams/panathinaikos/2020.html">Panathinaikos OPAP</a></th><td class="right " data-stat="wins|Regular Season">14</td><td class="right " data-stat="losses|Regular Season">14</td></tr>
<tr data-row="6"><th scope="row" class="left " data-stat="team"><a href="/international/teams/ulker-fenerbahce/2020.html">Fenerbahçe Beko</a></th><td class="right " data-stat="wins|Regular Season">13</td><td class="right " data-stat="losses|Regular Season">15</td></tr>
<tr data-row="7"><th scope="row" class="left " data-stat="team"><a href="/international/teams/khimki/2020.html">Khimki</a></th><td class="right " data-stat="wins|Regular Season">13</td><td class="right " data-stat="losses|Regular Season">15</td></tr>
<tr data-row="8"><th scope="row" class="left " data-stat="team"><a href="/international/teams/vitoria/2020.html">Kirolbet Baskonia</a></th><td class="right " data-stat="wins|Regular Season">12</td><td class="right " data-stat="losses|Regular Season">16</td></tr>
<tr data-row="9"><th scope="row" class="left " data-stat="team"><a href="/international/teams/olympiakos/2020.html">Olympiacos</a></th><td class="right " data-stat="wins|Regular Season">12</td><td class="right " data-stat="losses|Regular Season">16</td></tr>
<tr data-row="10"><th scope="row" class="left " data-stat="team"><a href="/international/teams/zalgiris/2020.html">Žalgiris</a></th><td class="right " data-stat="wins|Regular Season">12</td><td class="right " data-stat="losses|Regular Season">16</td></tr>
<tr data-row="11"><th scope="row" class="left " data-stat="team"><a href="/international/teams/valencia/2020.html">Valencia Basket</a></th><td class="right " data-stat="wins|Regular Season">12</td><td class="right " data-stat="losses|Regular Season">16</td></tr>
<tr data-row="12"><th scope="row" class="left " data-stat="team"><a href="/international/teams/milano/2020.html">AX Armani Exchange Olimpia</a></th><td class="right " data-stat="wins|Regular Season">12</td><td class="right " data-stat="losses|Regular Season">16</td></tr>
<tr data-row="13"><th scope="row" class="left " data-stat="team"><a href="/international/teams/red-star/2020.html">Crvena zvezda mts</a></th><td class="right " data-stat="wins|Regular Season">11</td><td class="right " data-stat="losses|Regular Season">17</td></tr>
<tr data-row="14"><th scope="row" class="left " data-stat="team"><a href="/international/teams/villeurbanne/2020.html">LDLC ASVEL</a></th><td class="right " data-stat="wins|Regular Season">10</td><td class="right " data-stat="losses|Regular Season">18</td></tr>
<tr data-row="15"><th scope="row" class="left " data-stat="team"><a href="/international/teams/alba-berlin/2020.html">Alba Berlin</a></th><td class="right " data-stat="wins|Regular Season">9</td><td class="right " data-stat="losses|Regular Season">19</td></tr>
<tr data-row="16"><th scope="row" class="left " data-stat="team"><a href="/international/teams/triumph-moscow/2020.html">Zenit Saint Petersburg</a></th><td class="right " data-stat="wins|Regular Season">8</td><td class="right " data-stat="losses|Regular Season">20</td></tr>
<tr data-row="17"><th scope="row" class="left " data-stat="team"><a href="/international/teams/bayern-muenchen/2020.html">Bayern Munich</a></th><td class="right " data-stat="wins|Regular Season">8</td><td class="right " data-stat="losses|Regular Season">20</td></tr>
</tbody></table>
</div>
</div>
I would really appreciate help iterating over this element correctly to get each team's name, wins, and losses. Thank you in advance.
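Try the following. Attribute access such as item.div.table.tbody.tr returns only the first matching tag at each step, which is why your loop prints a single row. Use find_all to visit every team link, then read the wins and losses from the row that contains it.
Code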
import requests
from bs4 import BeautifulSoup

url = 'https://www.basketball-reference.com/international/euroleague/2020.html'
soup = BeautifulSoup(requests.get(url).text, 'html.parser')
# The first table_outer_container on the page holds the standings table
teams = soup.find('div', class_='table_outer_container')
for team in teams.find_all('a'):
    # Each team link sits in a <th>; its parent's parent is the <tr> for that team
    team_name = team.text
    row = team.parent.parent
    wins = row.find('td', {'data-stat': 'wins|Regular Season'}).text
    losses = row.find('td', {'data-stat': 'losses|Regular Season'}).text
    print(team_name, wins, losses)
Output
Anadolu Efes 24 4
Real Madrid 22 6
FC Barcelona 22 6
CSKA Moscow 19 9
Maccabi FOX Tel Aviv 19 9
Panathinaikos OPAP 14 14
Fenerbahçe Beko 13 15
Khimki 13 15
Kirolbet Baskonia 12 16
Olympiacos 12 16
Žalgiris 12 16
Valencia Basket 12 16
AX Armani Exchange Olimpia 12 16
Crvena zvezda mts 11 17
LDLC ASVEL 10 18
Alba Berlin 9 19
Zenit Saint Petersburg 8 20
Bayern Munich 8 20
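Since the question also asks for a list, CSV, or JSON, here is a minimal sketch of one way to collect the rows instead of printing them. It runs on a two-row excerpt of the table HTML shown in the question so that it works without a network request; with the live page you would pass requests.get(url).text instead. The names SAMPLE_HTML, parse_standings, to_csv and to_json are made up for this example, and it assumes the table keeps the id elg_standings and the data-stat attributes shown above.

```python
import csv
import io
import json

from bs4 import BeautifulSoup

# Two-row excerpt of the table from the question (hypothetical stand-in for the live page)
SAMPLE_HTML = """
<div class="table_outer_container">
<table id="elg_standings">
<thead><tr><th data-stat="team"> </th><th data-stat="wins|Regular Season">W</th><th data-stat="losses|Regular Season">L</th></tr></thead>
<tbody>
<tr data-row="0"><th scope="row" data-stat="team"><a href="/international/teams/anadolu-efes/2020.html">Anadolu Efes</a></th><td data-stat="wins|Regular Season">24</td><td data-stat="losses|Regular Season">4</td></tr>
<tr data-row="1"><th scope="row" data-stat="team"><a href="/international/teams/real-madrid/2020.html">Real Madrid</a></th><td data-stat="wins|Regular Season">22</td><td data-stat="losses|Regular Season">6</td></tr>
</tbody></table>
</div>
"""


def parse_standings(html):
    """Return a list of dicts, one per team row in the standings table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="elg_standings")
    rows = []
    for tr in table.tbody.find_all("tr"):
        team = tr.find("th", {"data-stat": "team"})
        wins = tr.find("td", {"data-stat": "wins|Regular Season"})
        losses = tr.find("td", {"data-stat": "losses|Regular Season"})
        if not (team and wins and losses):
            continue  # skip any row that lacks one of the three cells
        rows.append({
            "team": team.get_text(strip=True),
            "wins": int(wins.get_text(strip=True)),
            "losses": int(losses.get_text(strip=True)),
        })
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["team", "wins", "losses"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_json(rows):
    """Serialize the rows as JSON, keeping non-ASCII team names readable."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


standings = parse_standings(SAMPLE_HTML)
print(to_csv(standings))
print(to_json(standings))
```

To save to disk, write the returned strings to files opened with encoding='utf-8', which keeps names such as Žalgiris and Fenerbahçe Beko intact.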