The xmlhtml package

Contains renderers and parsers for both XML and HTML 5 document fragments, which share data structures so that it's easy to work with both. Document fragments are pieces of documents that are not constrained by some of the high-level structure rules (in particular, they may contain more than one root element).
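
To make the idea of a fragment concrete, here is a minimal sketch in Haskell using only the base library. The types and functions below (Node, Fragment, rootTags, renderFragment) are hypothetical names chosen for illustration; they are not the package's actual definitions, and the renderer deliberately skips escaping. The sketch only shows how one tree type can hold several root elements and be shared by different consumers.

```haskell
-- Hypothetical model for illustration only; these are NOT the package's
-- actual type definitions or API.
module Main where

-- A node is either character data or an element with attributes and children.
data Node
  = TextNode String
  | Element String [(String, String)] [Node]
  deriving (Show, Eq)

-- A fragment is a list of top-level nodes, so it may hold any number of
-- root elements (a complete document would require exactly one).
type Fragment = [Node]

-- Names of the top-level elements in a fragment; text nodes are skipped.
rootTags :: Fragment -> [String]
rootTags nodes = [tag | Element tag _ _ <- nodes]

-- Naive serializer over the tree. It performs no escaping of text or
-- attribute values, so it is only suitable for this demonstration.
renderFragment :: Fragment -> String
renderFragment = concatMap renderNode

renderNode :: Node -> String
renderNode (TextNode t) = t
renderNode (Element tag attrs kids) =
  "<" ++ tag ++ concatMap renderAttr attrs ++ ">"
    ++ concatMap renderNode kids
    ++ "</" ++ tag ++ ">"
  where
    renderAttr (k, v) = " " ++ k ++ "=\"" ++ v ++ "\""

-- A fragment with two root elements.
example :: Fragment
example =
  [ Element "p" [("class", "intro")] [TextNode "first root"]
  , Element "p" [] [TextNode "second root"]
  ]

main :: IO ()
main = do
  print (rootTags example)
  putStrLn (renderFragment example)
```

Because the fragment is just a list of nodes, code that walks or builds the tree does not need to know whether the content came from XML or HTML input; only parsing and rendering would differ.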

Note that this is not a compliant HTML 5 parser. Rather, it is a parser for HTML 5 compliant documents. It does not implement the HTML 5 parsing algorithm, and should generally be expected to perform correctly only on documents that you trust to conform to HTML 5. This is not a suitable library for implementing web crawlers or other software that will be exposed to documents from outside sources. The result is also not the HTML 5 node structure, but rather something closer to the physical structure. For example, omitted start tags are not inserted (and so, their corresponding end tags must also be omitted).


Author: Chris Smith
Upload date: Sat Feb 5 15:47:03 UTC 2011
