5. Tagged PDF

Tagged PDF files contain information about the structure of the document. This structural information is carried by so-called "PDF tags". Tagging a PDF usually makes it more accessible to screen readers, handhelds and similar devices.

Using the setAddTags API method, you can add PDF tags to the PDF documents generated with PDFreactor. If you are generating a PDF from HTML or XHTML, the HTML and XHTML elements are automatically mapped to the corresponding PDF tags, so all you have to do is set this property to enable the feature:

pdfReactor.setAddTags(true);

If you are generating a PDF from another XML language (e.g. DocBook), the elements of this XML language are not mapped to PDF tag types. Instead, they are translated to PDF tags using the XML element name. Thus, the element "para" would be mapped to the PDF tag "para", although the PDF tag type "P" may be more appropriate for this element.

You can, however, manually map XML elements to PDF tag types using two style properties: "-ro-pdf-tag-type" and "-ro-alt-text". "-ro-pdf-tag-type" maps an element of the XML language you are using to a PDF tag type, for example:

para { -ro-pdf-tag-type: "P"; }

If you were using DocBook, this would map the DocBook element "para" to the PDF tag "P".
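
The fallback behavior described above can be sketched in a few lines of Java. This is only an illustrative model and not PDFreactor's implementation: the class TagMappingSketch and its methods map and resolve are hypothetical names. An element with an explicit "-ro-pdf-tag-type" mapping resolves to that tag type, while any other element resolves to its own element name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the mapping rules; not part of the PDFreactor API.
public class TagMappingSketch {
    // Stand-in for the mappings declared via -ro-pdf-tag-type.
    private final Map<String, String> explicitTagTypes = new HashMap<>();

    // Records an explicit mapping, e.g. para { -ro-pdf-tag-type: "P"; }
    public void map(String elementName, String pdfTagType) {
        explicitTagTypes.put(elementName, pdfTagType);
    }

    // An explicit mapping wins; otherwise the element name itself is used.
    public String resolve(String elementName) {
        return explicitTagTypes.getOrDefault(elementName, elementName);
    }

    public static void main(String[] args) {
        TagMappingSketch sketch = new TagMappingSketch();
        sketch.map("para", "P");
        System.out.println(sketch.resolve("para"));    // prints "P"
        System.out.println(sketch.resolve("section")); // prints "section"
    }
}
```

In this model, every DocBook element without a style rule keeps its element name as tag, which is why a stylesheet for a non-HTML language would typically map each relevant element explicitly.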

The property "-ro-alt-text" is used to specify an alternative description for an XML element. Example:

img {
    -ro-pdf-tag-type: "Figure";
}
img[alt] {
    -ro-alt-text: attr(alt);
}

The example above maps the HTML and XHTML element "img" to the PDF tag "Figure", and uses the content of its "alt" attribute as the alternative description for this tag.
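
Note that the selector "img[alt]" only matches images that actually carry an "alt" attribute, so images without one receive no alternative description from this rule. The following Java sketch models that behavior. It is an assumption-laden illustration, not PDFreactor code: the class AltTextSketch and the method altText are hypothetical, and an element's attributes are represented as a plain map.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical model of img[alt] { -ro-alt-text: attr(alt); }
public class AltTextSketch {
    // Returns the alt attribute's value if the attribute is present,
    // and an empty Optional if the element has no alt attribute.
    public static Optional<String> altText(Map<String, String> attributes) {
        return Optional.ofNullable(attributes.get("alt"));
    }

    public static void main(String[] args) {
        Map<String, String> withAlt = Map.of("src", "logo.png", "alt", "Company logo");
        Map<String, String> withoutAlt = Map.of("src", "spacer.png");

        System.out.println(altText(withAlt).orElse("(none)"));    // prints "Company logo"
        System.out.println(altText(withoutAlt).orElse("(none)")); // prints "(none)"
    }
}
```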