PDF-Indexing
This extension converts the textual content of PDF documents into a nodeset, so that the content can, for example, be output on the website. The most common use is probably indexing PDF content for a website's regular full-text search.
To make the text extraction methods for .pdf files available, the module must be registered in the “web.config” of the Render Engine as follows:
<module type="Onion.RenderEngine.CommonModules.PDFIndexing.Module, Onion.RenderEngine.CommonModules.PDFIndexing" />
Namespace: http://www.getit.de/2008/indexing/pdf
Name | Parameter | Return type | Description |
---|---|---|---|
parsePdf | xlink:xlink [fromDataSource:boolean] | nodeset | Returns the textual content of a PDF file as a nodeset. |
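
As a rough illustration of how the function in the table might be called, the following stylesheet is a minimal sketch. It assumes the extension functions are reachable from XSLT once the namespace above is bound to a prefix. The prefix `pdf`, the parameter `pdfLink`, and the wrapper element `content` are hypothetical names chosen for this example; only the namespace URI and the function name `parsePdf` come from this page. The structure of the returned nodeset is not described here, so the sketch simply copies it into the output.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:pdf="http://www.getit.de/2008/indexing/pdf"
    exclude-result-prefixes="pdf">

  <!-- Hypothetical parameter: the xlink of the PDF document to read.
       How this value is obtained depends on the surrounding project. -->
  <xsl:param name="pdfLink" />

  <xsl:template match="/">
    <!-- "content" is an illustrative wrapper element, not part of the extension. -->
    <content>
      <!-- parsePdf returns a nodeset; copy it unchanged into the result. -->
      <xsl:copy-of select="pdf:parsePdf($pdfLink)" />
    </content>
  </xsl:template>

</xsl:stylesheet>
```

If the optional second parameter is needed, the call would presumably take the form `pdf:parsePdf($pdfLink, true())`. The exact effect of `fromDataSource` is not documented on this page, so verify it against the installed module before relying on it.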