Scraping data using span title and span class

Asked by: 小点点

I'm using Python (Anaconda) to scrape data into an Excel sheet. I'm running into trouble with two websites.

Site 1

<div id="ember3815" class="ember-view">
<p class="org-top-card-module__company-descriptions Sans-15px-black-55%">
<span class="company-industries org-top-card-module__dot-separated-list">
  Industry
</span>
<span class="org-top-card-module__location org-top-card-module__dot-separated-list">
  City, State
</span>
<span title="62,346 followers" class="org-top-card-module__followers-count org-top-card-module__dot-separated-list">
  62,346 followers
</span>

I'm trying to pull the span title. Things I've tried (I also tried each of them with find_all):

text = soup.find('span',{'class':"company-industries org-top-card-module__dot-separated-list"})

text = soup.find('p',{'class':"org-top-card-module__company-descriptions Sans-15px-black-55%"})

text = soup.body.find('span', attrs={'class': 'org-top-card-module__location org-top-card-module__dot-separated-list'})

text = soup.find('span',{'class': 'org-top-card-module__location org-top-card-module__dot-separated-list'})

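I'm sure there are other things I tried that aren't listed here, because I don't remember them all. I'm not a programmer; I'm just trying to figure this out so I can pull data for analysis. Help?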

Site 2

I need to extract the value 8,052 from the HTML below.

<section class="zwlfE">
<div class="nZSzR">...</div>
<ul class="k9GMp ">
<li class="Y8-fY ">...</li>
<li class="Y8-fY ">
<a class="g47SY " title="8,052">8,052</span>" followers"
</a>
</li>
<li class="Y8-fY ">...</li>
</ul>
<div class="-vDIg">...</div>
</section>

I tried:

text = soup.find('span', {'class': "g47sy"})
and similar variations of the above with div and li tags.

Everything I've tried returns [].

Please help?

1 answer

Anonymous user

To get the span title:

from bs4 import BeautifulSoup
html ="""<div id="ember3815" class="ember-view">
<p class="org-top-card-module__company-descriptions Sans-15px-black-55%">
<span class="company-industries org-top-card-module__dot-separated-list">
  Industry
</span>
<span class="org-top-card-module__location org-top-card-module__dot-separated-list">
  City, State
</span>
<span title="62,346 followers" class="org-top-card-module__followers-count org-top-card-module__dot-separated-list">
  62,346 followers
</span>"""

soup = BeautifulSoup(html, "html.parser")
print( soup.find("span", class_="org-top-card-module__followers-count org-top-card-module__dot-separated-list")["title"])

Output:

62,346 followers
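
A slightly more defensive variant of the lookup, offered only as a sketch: it matches on one class name instead of the full class string and checks whether the tag was found before reading its attribute. The markup is a trimmed copy of the Site 1 snippet, and the names `followers_tag` and `title` are made up for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the Site 1 markup from the question
html = """<div id="ember3815" class="ember-view">
<p class="org-top-card-module__company-descriptions Sans-15px-black-55%">
<span title="62,346 followers" class="org-top-card-module__followers-count org-top-card-module__dot-separated-list">
  62,346 followers
</span>
</p>
</div>"""

soup = BeautifulSoup(html, "html.parser")

# A single class name matches any tag whose class list contains it,
# so the second class does not have to be spelled out.
followers_tag = soup.find("span", class_="org-top-card-module__followers-count")

# find() returns None when nothing matches; guard before reading the attribute.
title = followers_tag.get("title") if followers_tag is not None else None
print(title)
```

Using `.get("title")` rather than `["title"]` returns None instead of raising KeyError if the attribute happens to be missing.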

And for Site 2:

print( soup.find("a", class_="g47SY")["title"])
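
For completeness, here is a self-contained sketch of the Site 2 lookup. The markup is a simplified, tidied-up version of the snippet in the question (an assumption on my part, since the original is malformed), and `link`, `raw`, and `followers` are illustrative names. It also shows two likely reasons the attempt in the question came back empty: the title attribute sits on an a tag rather than a span, and class names are matched case-sensitively, so "g47sy" does not match "g47SY".

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the Site 2 markup (assumed structure)
html = """<section class="zwlfE">
<ul class="k9GMp ">
<li class="Y8-fY ">
<a class="g47SY " title="8,052">8,052 followers</a>
</li>
</ul>
</section>"""

soup = BeautifulSoup(html, "html.parser")

# The attempt from the question: wrong tag name and wrong letter case.
missed = soup.find("span", {"class": "g47sy"})

# Look up the <a> tag with the exact class name instead.
link = soup.find("a", class_="g47SY")
raw = link["title"] if link is not None else None

# Strip the thousands separator to get a number suitable for a spreadsheet.
followers = int(raw.replace(",", "")) if raw else None
print(missed, raw, followers)
```

Converting to an int is optional; keep `raw` if you want the value exactly as it appears on the page.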
