pdx:embedHTML

Inserts HTML content into the Word document.

Description

Element definition

<pdx:content>
    <pdx:embedHTML pdx:isFile="" pdx:baseURL="" pdx:customListStyles="" pdx:downloadImages="" pdx:parseAnchors="" pdx:parseDivsAsPs="" pdx:parseFloats="" pdx:filter="" pdx:wordFragmentName="" pdx:addDefaultStyles="" pdx:removeLineBreaks="" pdx:useHTMLExtended="">
        <pdx:data pdx:dataId="" pdx:type="HTML"><![CDATA[html code]]></pdx:data>
        <pdx:wordStyles pdx:strict="">
            <pdx:wordStyle pdx:styleType="class" pdx:name="style">class value</pdx:wordStyle>
            <pdx:wordStyle pdx:styleType="id" pdx:name="style">id value</pdx:wordStyle>
            <pdx:wordStyle pdx:styleType="tag" pdx:name="style"><![CDATA[html code]]></pdx:wordStyle>
        </pdx:wordStyles>
    </pdx:embedHTML>
</pdx:content>

This element allows the insertion of HTML into the current Word document.

Practically all HTML tags and CSS styles are supported.

This element converts HTML directly into WordML, and the output is compatible with OpenOffice and PDF conversion.

A more detailed explanation of this element is available in the HTML to Word section of the API documentation.

Attributes and sub-elements

html

The HTML code to be translated into WordML, given either as a string or as the path to a file.

options

Key	Description
addDefaultStyles	Defaults to true. If false, no default styles are added when strictWordStyles is false.
baseURL	The base URL used to resolve the relative paths of links and images.
customListStyles	If true, checks whether a custom list style with that name exists and, if so, uses it.
downloadImages	If true, embeds the images in the docx document; otherwise they are only linked as external sources.
filter	Renders only the filtered contents. It can be a string denoting an id ('#myId'), a class ('.myClass'), an HTML tag or a valid XPath expression ('//expression').
isFile	True if the HTML is given as a file path, false if it is given as a string.
parseAnchors	If true, parses the anchors included in the HTML code.
parseDivsAsPs	If true, parses div elements as paragraphs.
parseFloats	If true, preserves the float properties of images and tables.
strictWordStyles	If true, ignores all CSS styles and uses the styles set via the wordStyles option (see wordStyles below).
useHTMLExtended	If true, allows the use of HTML Extended tags. Disabled by default. Only available with Premium licenses.
wordStyles	Associates Word styles with HTML classes, ids or tags.
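The options above appear as pdx-prefixed attributes in the element definition, so a fragment can also be generated rather than typed by hand. The following is a minimal sketch and not part of XmlDocx: the helper names `cdata` and `embed_html_fragment` are hypothetical, and writing boolean options as the strings "true" and "false" is an assumption that should be checked against the API documentation.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

PDX_NS = "http://www.phpdocx.com/main"  # namespace URI used in the samples on this page


def cdata(text):
    # "]]>" would end a CDATA section early, so split it across two sections.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def embed_html_fragment(html, **options):
    """Hypothetical helper: serialize one pdx:embedHTML element as a string."""
    # Each keyword argument becomes a pdx-prefixed attribute.
    # Assumption: boolean options are written as "true" / "false".
    attrs = "".join(
        " pdx:{}={}".format(name, quoteattr(str(value)))
        for name, value in sorted(options.items())
    )
    data = '<pdx:data pdx:dataId="" pdx:type="HTML">{}</pdx:data>'.format(cdata(html))
    return "<pdx:embedHTML{}>{}</pdx:embedHTML>".format(attrs, data)


fragment = embed_html_fragment(
    '<div id="myId"><p>Inside the filtered block.</p></div><p>Outside it.</p>',
    filter="#myId",
    parseDivsAsPs="true",
)
print(fragment)

# Sanity check: the fragment parses once the pdx prefix is declared.
wrapper = '<pdx:content xmlns:pdx="{}">{}</pdx:content>'.format(PDX_NS, fragment)
element = ET.fromstring(wrapper)[0]
print(element.attrib)
```

The generated fragment is meant to be placed inside the pdx:content element of content.xml. Only the attributes passed as keyword arguments are emitted, so unset options are left to whatever defaults XmlDocx applies.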

Code samples

Example 1

config.xml

<?xml version="1.0" encoding="UTF-8"?>
<pdx:document xmlns:pdx="http://www.phpdocx.com/main">
    <pdx:config>
        <pdx:output pdx:name="output" pdx:type="docx" />
    </pdx:config>
</pdx:document>

content.xml

<?xml version="1.0" encoding="UTF-8"?>
<pdx:document xmlns:pdx="http://www.phpdocx.com/main">
    <pdx:content>
        <pdx:embedHTML>
            <pdx:data pdx:dataId="" pdx:type="HTML"><![CDATA[<h1 style="color: #b70000">An embedHTML() example</h1><p>We draw a table with borders, rowspans and colspans:</p><table border="1" style="border-collapse: collapse"><tbody><tr><td style="background-color: yellow">1_1</td><td rowspan="3" colspan="2">1_2</td></tr><tr><td>Some random text.</td></tr><tr><td><ul><li>One</li><li>Two <b>and a half</b></li></ul></td></tr><tr><td>3_2</td><td>3_3</td><td>3_3</td></tr></tbody></table>]]></pdx:data>
        </pdx:embedHTML>
    </pdx:content>
</pdx:document>
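Because the HTML travels inside a CDATA section, a hand-edited content file is easy to break. As a rough pre-flight check before rendering, the file can be parsed with Python's standard library and the embedded HTML listed. This is only a sketch: `embedded_html` is a hypothetical helper, not an XmlDocx function, and the sample document is a shortened stand-in for a real content.xml.

```python
import xml.etree.ElementTree as ET

PDX = "{http://www.phpdocx.com/main}"  # namespace URI used in the samples on this page


def embedded_html(content_xml):
    """Hypothetical helper: return the HTML payload of every pdx:embedHTML/pdx:data."""
    root = ET.fromstring(content_xml)  # raises ET.ParseError if the XML is not well-formed
    return [
        data.text or ""
        for embed in root.iter(PDX + "embedHTML")
        for data in embed.findall(PDX + "data")
    ]


# Shortened stand-in for content.xml; a real file could be read with open(path, "rb").read().
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<pdx:document xmlns:pdx="http://www.phpdocx.com/main">
    <pdx:content>
        <pdx:embedHTML>
            <pdx:data pdx:dataId="" pdx:type="HTML"><![CDATA[<p>Hello <b>world</b></p>]]></pdx:data>
        </pdx:embedHTML>
    </pdx:content>
</pdx:document>"""

for html in embedded_html(sample):
    print(html)
```

The check only confirms that the XML is well-formed and that the payload is where it is expected. It does not validate the HTML itself or any XmlDocx-specific rules.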

settings.xml

<?xml version="1.0" encoding="UTF-8"?>
<pdx:document xmlns:pdx="http://www.phpdocx.com/main">
    <pdx:settings>
    </pdx:settings>
</pdx:document>

import os
import sys

# Make the XmlDocx Python wrapper importable.
sys.path.append(os.path.abspath("wrappers/python/XmlDocx"))
import XmlDocx

# Load the output configuration, the document settings and the content.
document = XmlDocx.XmlDocx("config.xml")
document.setDocumentProperties("settings.xml")
document.addContent("content.xml")
# Replace the placeholder with the path to XmlDocx.
document.setXmlDocxPath("xmldocx path")
document.render()

The resulting Word document looks like: