PDF Extraction

PDF Extraction 0.9

Free

The PDF Extraction Toolkit (formerly PDF Analyser) is a Java framework built on the PDFBox library for analyzing the structure of PDF files and for writing custom converters to HTML and other formats. It includes GraphWrap, a system for graph-based wrapping (semi-automatic data extraction) of PDF files. A GUI built on the XMIllum library visualizes the results of the document analysis. An interactive graph view displays the graph structures the system creates and lets you build and test graph-based wrappers on PDF documents.
