I have tried many approaches to this problem but could not find an answer.
I have the following HTML:
<section id="content4" class="tab-content">
<p>
<div class="Text_Title">Product 1</div>
<div style="display: inline-block;">Red Ball</div></p>
<p>
<div class="Text_Title">Product 2</div>
<div style="display: inline-block;">Green Ball</div></p>
<p>
<div class="Text_Title">Product 3</div>
<div style="display: inline-block;">Yellow Ball</div></p>
I am trying to extract the text from each div with class="Text_Title" and from each div with style="display: inline-block;".
The output I am trying to get:
Product 1 - Red Ball
Product 2 - Green Ball
Product 3 - Yellow Ball
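Use findAll to get a list of the tag objects that match each condition, then use zip to iterate over the two lists in parallel. (find_all is the current name for the same method; findAll is the older spelling.)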
from bs4 import BeautifulSoup
input_ = """<section id="content4" class="tab-content">
<p>
<div class="Text_Title">Product 1</div>
<div style="display: inline-block;">Red Ball</div></p>
<p>
<div class="Text_Title">Product 2</div>
<div style="display: inline-block;">Green Ball</div></p>
<p>
<div class="Text_Title">Product 3</div>
<div style="display: inline-block;">Yellow Ball</div></p>"""
soup = BeautifulSoup(input_, "html.parser")
# zip pairs the i-th title div with the i-th description div.
for x, y in zip(soup.findAll("div", attrs={"class": "Text_Title"}),
                soup.findAll("div", attrs={"style": "display: inline-block;"})):
    print(x.text, "-", y.text)
Output:
Product 1 - Red Ball
Product 2 - Green Ball
Product 3 - Yellow Ball
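One caveat with zip is that it pairs items purely by position, so if one product were missing its description div, every later pair would shift by one. A possible alternative, sketched below, is to start from each title and look up the div that follows it with find_next_sibling. The names extract_pairs and html are made up for this illustration, and the sample markup is a tidied copy of the HTML in the question; treat it as a sketch rather than the only way to do it.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the question (hypothetical, tidied copy).
html = """<section id="content4" class="tab-content">
<p>
<div class="Text_Title">Product 1</div>
<div style="display: inline-block;">Red Ball</div></p>
<p>
<div class="Text_Title">Product 2</div>
<div style="display: inline-block;">Green Ball</div></p>
<p>
<div class="Text_Title">Product 3</div>
<div style="display: inline-block;">Yellow Ball</div></p>
</section>"""


def extract_pairs(markup):
    """Return (title, description) tuples, one per Text_Title div."""
    soup = BeautifulSoup(markup, "html.parser")
    pairs = []
    for title in soup.find_all("div", class_="Text_Title"):
        # The description is assumed to be the next <div> sibling of the title.
        detail = title.find_next_sibling("div")
        if detail is not None:
            pairs.append((title.get_text(strip=True), detail.get_text(strip=True)))
    return pairs


for name, description in extract_pairs(html):
    print(name, "-", description)
```

Titles that have no following div are skipped here instead of being matched with another product's description; whether skipping is the right behaviour depends on the real page.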