XHTML is rendered directly by PDFreactor, using a default CSS style sheet for HTML in addition to the document's own styles.
PDFreactor automatically renders HTML and XHTML form controls such as buttons, input fields and text areas. It can also generate interactive PDF forms (sometimes referred to as AcroForms) from HTML / XHTML forms. For details, please see the chapter "PDF specific features".
HTML must be converted to XHTML first. This is done automatically by one of PDFreactor's built-in cleanup processes.
XHTML code is also cleaned automatically when a parse error occurs, e.g. if the document is not well-formed.
The following cleanup processes are available:
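To illustrate what "not well-formed" means here, the following minimal sketch checks markup with the XML parser that ships with the JDK. It is not PDFreactor code and says nothing about how PDFreactor detects parse errors internally; the class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of PDFreactor: uses the JDK XML parser
// to tell well-formed XHTML apart from HTML that needs a cleanup first.
class WellFormedCheck {

    /** Returns true if the markup parses as well-formed XML, false otherwise. */
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of letting the default handler print them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // A well-formedness violation, e.g. an element that is never closed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<body><p>one<br/>two</p></body></html>";
        // Valid HTML, but <br> is never closed, so it is not well-formed XML.
        String html = "<html><body><p>one<br>two</p></body></html>";

        System.out.println(isWellFormed(xhtml)); // prints "true"
        System.out.println(isWellFormed(html));  // prints "false"
    }
}
```

Input of the second kind is what a cleanup process has to repair before the document can be treated as XHTML.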
CyberNeko. CyberNeko is the default cleanup used by PDFreactor. This HTML parser fixes the following XHTML incompatibilities:
adds missing parent elements
automatically closes elements
handles mismatched end tags
It is generally recommended to use the CyberNeko cleanup process.
jTidy. If the cleanup performed by CyberNeko is not sufficient, use jTidy. This cleanup handles content a bit more aggressively than CyberNeko and may drop elements if it cannot clean them. jTidy is a Java port of HTML Tidy, an HTML syntax checker and pretty printer. jTidy provides the following features (among others):
Missing or mismatched end tags are detected and corrected
End tags in the wrong order are corrected
Mixed-up tags are recovered from
The missing "/" in end tags for anchors is added
Lists are corrected by inserting missing tags
Missing quotes around attribute values are added
Unknown/Proprietary attributes are reported
Proprietary elements are recognized and reported as such
Tags lacking a terminating '>' are spotted
TagSoup. The TagSoup cleanup is able to fix most namespace issues that may occur when importing content from non-standard sources such as Office applications. It has the following cleanup features:
It always returns a cleaned document, i.e. it does not throw an exception
Unbound namespace prefixes are fixed
This cleanup process is recommended for documents exported from Office applications.
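As an illustration of the namespace issue mentioned above, the following minimal sketch shows how a namespace-aware XML parser (here the one bundled with the JDK) rejects an element whose prefix is never declared, as can occur in Office exports. It is not PDFreactor or TagSoup code; the class name UnboundPrefixCheck and the method parsesWithNamespaces are hypothetical, and the "o" prefix with its namespace URI is only an example of Office-style markup.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of PDFreactor or TagSoup: shows that an
// unbound namespace prefix makes a document unparseable as namespaced XML.
class UnboundPrefixCheck {

    /** Returns true if the markup parses with namespace processing enabled. */
    static boolean parsesWithNamespaces(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // prefixes must now be bound to a URI
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Rethrow parse errors instead of letting the default handler print them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The "o" prefix is used but never declared.
        String unbound = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<body><o:p>text</o:p></body></html>";
        // Same content with the prefix bound to a namespace URI.
        String bound = "<html xmlns=\"http://www.w3.org/1999/xhtml\""
                + " xmlns:o=\"urn:schemas-microsoft-com:office:office\">"
                + "<body><o:p>text</o:p></body></html>";

        System.out.println(parsesWithNamespaces(unbound)); // prints "false"
        System.out.println(parsesWithNamespaces(bound));   // prints "true"
    }
}
```

The first document is balanced tag for tag and still fails, which is why namespace problems need a cleanup that addresses prefixes specifically.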
To use a cleanup on a document fragment (or any other document with no "<html>" root element), you must manually set the doctype to XHTML.