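I am trying to extract the contents of a .doc file (along with its styles) and then upload it to WordPress to create a new post.
I am using the PHPWord library, but I can only get the content as plain text. Is it possible to extract the data together with its styles?
At the moment, the part of the code that extracts information from the .doc looks like this: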
// Read contents
$source = 'c0000001.doc';
echo date('H:i:s'), " Reading contents from `{$source}`";
$phpWord = \PhpOffice\PhpWord\IOFactory::load($source, 'MsDoc');
$text = '';
$sections = $phpWord->getSections();
foreach ($sections as $s) {
    $els = $s->getElements();
    foreach ($els as $e) {
        if (get_class($e) === 'PhpOffice\PhpWord\Element\Text') {
            $text .= $e->getText();
        }
    }
}
Thank you very much for your help.
Yes, you can, like this:
// Read contents
$source = 'c0000001.doc';
echo date('H:i:s'), " Reading contents from `{$source}`";
$phpWord = \PhpOffice\PhpWord\IOFactory::load($source, 'MsDoc');
$text = '';
$sections = $phpWord->getSections();
foreach ($sections as $s) {
    $els = $s->getElements();
    foreach ($els as $e) {
        // Skip anything that is not a Text element
        if (!$e instanceof \PhpOffice\PhpWord\Element\Text) {
            continue;
        }
        $text .= $e->getText();
        $paragraphStyle = $e->getParagraphStyle(); // do something with the paragraph style
        $fontStyle = $e->getFontStyle(); // do something with the font style
    }
}
Source of the methods: https://github.com/PHPOffice/PHPWord/blob/develop/src/PhpWord/Element/Text.php
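As a rough illustration of what "do something with the style" could mean for a WordPress post, here is a minimal sketch that turns a text fragment and its font style into HTML. The helper name `styledTextToHtml` is hypothetical and not part of PHPWord. It only assumes that the style object exposes `isBold()`, `isItalic()`, `getSize()` and `getColor()`, as `\PhpOffice\PhpWord\Style\Font` does, and that the color is a hex value without the leading `#`. Since `getFontStyle()` can also return a style name or null, that case falls back to plain escaped text.

```php
<?php
// Hypothetical helper (not part of PHPWord): wraps text in HTML according to
// a font style object exposing isBold(), isItalic(), getSize(), getColor().
function styledTextToHtml(string $text, $fontStyle): string
{
    // Escape the raw text so it is safe to embed in post HTML.
    $html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // No style object (null or a named style string): return plain text.
    if (!is_object($fontStyle)) {
        return $html;
    }

    // Collect inline CSS for the properties that are actually set.
    $css = [];
    if ($fontStyle->getSize() !== null) {
        $css[] = 'font-size:' . $fontStyle->getSize() . 'pt';
    }
    if ($fontStyle->getColor() !== null) {
        // Assumption: the color is a hex string such as "FF0000".
        $css[] = 'color:#' . $fontStyle->getColor();
    }
    if ($css !== []) {
        $html = '<span style="' . implode(';', $css) . '">' . $html . '</span>';
    }

    // Wrap in semantic tags for italic and bold.
    if ($fontStyle->isItalic()) {
        $html = '<em>' . $html . '</em>';
    }
    if ($fontStyle->isBold()) {
        $html = '<strong>' . $html . '</strong>';
    }

    return $html;
}
```

Inside the loop above, you could then build the post body with something like `$html .= styledTextToHtml($e->getText(), $e->getFontStyle());` instead of appending to `$text`, and pass the result as `post_content` to `wp_insert_post()`. Paragraph-level properties such as alignment would need similar handling based on `getParagraphStyle()`.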