2. XHTML/HTML + CSS

XHTML is rendered directly by PDFreactor, which applies a default CSS style sheet for HTML in addition to the document's own styles.

PDFreactor automatically renders HTML and XHTML form controls such as buttons, input fields and text areas. It can also generate interactive PDF forms (sometimes referred to as AcroForms) from HTML/XHTML forms. For details, please see the chapter "PDF specific features".

HTML must be converted to XHTML first. This is done automatically by one of PDFreactor's built-in cleanup processes.

XHTML code is also cleaned automatically when a parse error occurs, e.g. if the document is not well-formed.
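To illustrate what "not well-formed" means here, the following minimal sketch checks markup with the XML parser that ships with the JDK. It does not use the PDFreactor API and is not how PDFreactor detects parse errors internally; the class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for this example. The sample inputs carry no DOCTYPE, so the parser does not try to load an external DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of PDFreactor.
final class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    public static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser reports any well-formedness violation as a SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser could not be set up or input could not be read", e);
        }
    }

    public static void main(String[] args) {
        // Every element is closed: well-formed.
        System.out.println(isWellFormed("<p>Line one<br/>Line two</p>"));  // prints "true"
        // <br> is never closed, which is legal HTML but not well-formed XML.
        System.out.println(isWellFormed("<p>Line one<br>Line two</p>"));   // prints "false"
    }
}
```

Input of the second kind is what would trigger a cleanup, because an XML parser rejects it.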

The following cleanup processes are available:

CyberNeko. CyberNeko is the default cleanup used by PDFreactor. This HTML parser fixes the following XHTML incompatibilities:

jTidy. If the cleanup performed by CyberNeko is not sufficient, use jTidy. This cleanup handles content somewhat more aggressively than CyberNeko and may drop elements that it cannot clean. jTidy is a Java port of HTML Tidy, an HTML syntax checker and pretty printer. jTidy provides the following features (among others):

TagSoup. The TagSoup cleanup is able to fix most namespace issues that may occur when importing content from non-standard sources such as Office applications. It has the following cleanup features:

Note:

To use a cleanup on a document fragment (or any other document with no "<html>" root element) you must manually set the doctype to XHTML.