PDF Extraction 0.9
Free
Latest version: 0.9
The PDF Extraction Toolkit (formerly PDF Analyser) is a Java framework built on the PDFBox library for performing document analysis of PDF files and for creating custom conversion methods to HTML and other formats. It includes GraphWrap, a system for graph-based wrapping (semi-automatic data extraction) from PDF files. It also includes a GUI, built on the XMIllum library, that visualizes the results of the document analysis process, and an interactive graph visualization for viewing the graph structures the system creates and for interactively creating and testing graph-based wrappers on PDF documents.