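I'm having a hard time figuring out how to get multiple headings together with the first paragraph under each one. In this case, I only need the h3 headings and the paragraph that follows each of them.
Sample code: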
function everything_in_tags($string, $tagname)
{
    $pattern = "#<\s*?$tagname\b[^>]*>(.*?)</$tagname\b[^>]*>#s";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}
$tagname = "h3";
$string = "<h1>This is my title</h1>
<p>This is a text right under my h1 title.</p>
<p>This is some more text under my h1 title</p>
<h2>This is my level 2 heading</h2>
<p>This is text right under my level 2 heading</p>
<h3>First h3</h3>
<p>First paragraph for the first h3</p>
<h3>Second h3</h3>
<p>First paragraph for the second h3</p>
<h3>Third h3</h3>
<p>First paragraph for the third h3</p>
<p>Second paragraph for the third h3</p>
<h2>This is my level 2 heading</h2>
<p>This is text right under my level 2 heading</p>";
//OUTPUT: First h3
echo everything_in_tags($string, $tagname);
I'd like to implement a foreach loop, but that requires the code above to work as intended first.
// Intended usage, assuming $headings and $paragraphs are parallel arrays
foreach ($headings as $i => $heading) {
    echo "<h3>".$heading."</h3>";
    echo "<p>".$paragraphs[$i]."</p>";
}
//Expected output:
//<h3>First h3</h3>
//<p>First paragraph for the first h3</p>
//<h3>Second h3</h3>
//<p>First paragraph for the second h3</p>
//<h3>Third h3</h3>
//<p>First paragraph for the third h3</p>
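So with the example above I can get the first h3. But after a lot of reading, I can't seem to find out how to get all of the h3 headings and the first paragraph of each.
If someone could point me in the right direction and explain how to do this, I would really appreciate it. Thank you very much.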
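There's the obligatory, de facto answer, which is: don't use regular expressions on HTML. There are exceptions for controlled HTML, or for cases where errors and bugs don't matter, but in general I agree with that advice. Instead, I'd point you to something DOM-aware, where you can express HTML tags and the concept of "next".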
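For the controlled-HTML exception mentioned above, it may help to see why the function in the question returns only the first h3: `preg_match` stops at the first match, whereas `preg_match_all` collects every match. The following is only a sketch under that assumption. The helper name `headings_with_first_paragraph` is made up for illustration, and it relies on the `<p>` appearing directly after the closing heading tag, separated by nothing but whitespace.

```php
<?php
// Hypothetical helper (not from the original post): returns every heading of
// the given tag together with the <p> that immediately follows it, if any.
function headings_with_first_paragraph(string $html, string $tagname): array
{
    $tag = preg_quote($tagname, '#');
    // Group 1: heading contents. Group 2 (optional): contents of a <p>
    // that follows the heading with only whitespace in between.
    $pattern = "#<{$tag}\b[^>]*>(.*?)</{$tag}\s*>\s*(?:<p\b[^>]*>(.*?)</p\s*>)?#s";

    // preg_match_all() keeps going after the first match.
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER | PREG_UNMATCHED_AS_NULL);

    $pairs = [];
    foreach ($matches as $match) {
        $pairs[] = [
            'heading'   => $match[1],
            'paragraph' => $match[2] ?? null, // null when no <p> follows
        ];
    }
    return $pairs;
}

// Shortened sample input, assumed to be trusted and regular.
$sample = "<h3>First h3</h3>\n<p>First paragraph</p>\n<h3>Second h3</h3>\n<h2>Other</h2>";
foreach (headings_with_first_paragraph($sample, 'h3') as $pair) {
    echo "<h3>" . $pair['heading'] . "</h3>\n";
    if ($pair['paragraph'] !== null) {
        echo "<p>" . $pair['paragraph'] . "</p>\n";
    }
}
```

This breaks as soon as the markup contains nested headings, comments, or anything else the pattern does not anticipate, which is why a DOM-aware approach is generally the safer choice.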
Here is a working DOM-based example, although you may need to adjust where I'm dumping the results.
<?php
$html = <<<TAG
<h1>This is my title</h1>
<p>This is a text right under my h1 title.</p>
<p>This is some more text under my h1 title</p>
<h2>This is my level 2 heading</h2>
<p>This is text right under my level 2 heading</p>
<h3>First h3</h3>
<p>First paragraph for the first h3</p>
<h3>Second h3</h3>
<p>First paragraph for the second h3</p>
<h3>Third h3</h3>
<p>First paragraph for the third h3</p>
<p>Second paragraph for the third h3</p>
<h2>This is my level 2 heading</h2>
<p>This is text right under my level 2 heading</p>
TAG;
$dom = new DOMDocument();
// Load the HTML, don't worry about it being a fragment
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
// Grab all H3 tags. This might need to be adjusted if there's more to the depth
$results = $xpath->query("//h3");
foreach ($results as $result) {
    var_dump(sprintf('<h3>%1$s</h3>', $result->textContent));
    // See if the next element is a P tag (nextElementSibling requires PHP 8.0+)
    $next = $result->nextElementSibling;
    if ($next && 'p' === $next->nodeName) {
        var_dump(sprintf('<p>%1$s</p>', $next->textContent));
    }
}
Output:
string(17) "<h3>First h3</h3>"
string(39) "<p>First paragraph for the first h3</p>"
string(18) "<h3>Second h3</h3>"
string(40) "<p>First paragraph for the second h3</p>"
string(17) "<h3>Third h3</h3>"
string(39) "<p>First paragraph for the third h3</p>"
Demo here: https://3v4l.org/gvBrv
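If you want the heading/paragraph pairs in an array, so you can run the kind of foreach loop described in the question, something along these lines might work. It is a sketch rather than a definitive implementation: the function name `h3_pairs` is made up for illustration, and it swaps `nextElementSibling` for the XPath expression `following-sibling::*[1][self::p]`, which selects the first following element sibling only if it is a `<p>` and also works on PHP versions before 8.0. It calls `loadHTML()` without flags, so libxml wraps the fragment in `html` and `body` elements, which does not affect the `//h3` query.

```php
<?php
// Hypothetical helper (not from the original answer): returns each <h3>
// with the text of the <p> that directly follows it, or null if none does.
function h3_pairs(string $html): array
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    $pairs = [];
    foreach ($xpath->query('//h3') as $h3) {
        // First following element sibling, kept only if it is a <p>.
        $p = $xpath->query('following-sibling::*[1][self::p]', $h3)->item(0);
        $pairs[] = [
            'heading'   => $h3->textContent,
            'paragraph' => $p ? $p->textContent : null,
        ];
    }
    return $pairs;
}

// Shortened sample input for illustration.
$html = "<h3>First h3</h3>\n<p>First paragraph</p>\n<p>Second paragraph</p>\n"
      . "<h3>Second h3</h3>\n<h2>Another heading</h2>";

foreach (h3_pairs($html) as $pair) {
    // textContent is decoded text, so escape it before echoing it as HTML.
    echo '<h3>' . htmlspecialchars($pair['heading']) . "</h3>\n";
    if ($pair['paragraph'] !== null) {
        echo '<p>' . htmlspecialchars($pair['paragraph']) . "</p>\n";
    }
}
```

Keeping a heading and its paragraph in one array entry avoids having to keep two parallel arrays in sync, and a heading with no paragraph directly after it simply gets `null`.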