PDF to TXT COM For Table Analyzer


4:33 am
October 27, 2011


VeryDOC

Admin

posts 31

Hello,
I am interested in automatically extracting tables from PDFs into a format such as CSV or XML.

I downloaded the “PDF2TXT COM For Table Analyzer” package to evaluate it and ran test.exe against several of my PDFs. They convert to plain text that opens in Notepad, but I am unsure how to get the table data out of the PDF.

4:34 am
October 27, 2011


VeryDOC

Admin

posts 31

Thanks for your message. We suggest downloading the VeryDOC PDF Parser SDK from the following web page and trying it:

http://www.verydoc.com/pdfparsersdk.html

You can use the VeryDOC PDF Parser SDK to convert your PDF file to an HTML file first. The HTML file contains the X, Y, Width, and Height information for each word. You can analyze this information to get the position of each word easily, and you can also lay the words out by position to get more detailed information.

Please refer to a sample converted HTML file on the following web page:

http://www.verydoc.com/pdfpars…..g_0001.htm
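
As a rough illustration of the analysis step described above, here is a minimal Python sketch that turns per-word positions into CSV rows. It does not use the SDK and does not parse its HTML output, because the exact markup is not shown in this thread. It assumes you have already read each word's text, X, Y, Width, and Height into a list. The names Word, group_rows, split_cells, and words_to_csv, and the tolerances y_tol and gap, are hypothetical and would need tuning for real documents.

```python
import csv
import io
from typing import NamedTuple


class Word(NamedTuple):
    # Hypothetical container for one word and its bounding box.
    text: str
    x: float
    y: float
    width: float
    height: float


def group_rows(words, y_tol=3.0):
    """Cluster words into rows: a word joins the current row when its Y
    is within y_tol of the first word in that row."""
    rows = []
    for w in sorted(words, key=lambda w: (w.y, w.x)):
        if rows and abs(w.y - rows[-1][0].y) <= y_tol:
            rows[-1].append(w)
        else:
            rows.append([w])
    # Order each row left to right.
    return [sorted(r, key=lambda w: w.x) for r in rows]


def split_cells(row, gap=15.0):
    """Merge neighbouring words into one cell unless the horizontal
    space between them is wider than gap."""
    cells = []
    prev_right = None
    for w in row:
        if prev_right is not None and w.x - prev_right <= gap:
            cells[-1] += " " + w.text
        else:
            cells.append(w.text)
        prev_right = w.x + w.width
    return cells


def words_to_csv(words, y_tol=3.0, gap=15.0):
    """Return the words laid out as CSV text, one line per detected row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in group_rows(words, y_tol):
        writer.writerow(split_cells(row, gap))
    return buf.getvalue()
```

For example, passing the words "Item" and "Price" at Y = 100 and "Blue", "pen", and "1.50" at Y of about 120 gives two CSV rows, with "Blue pen" merged into one cell because the two words sit close together. Fixed tolerances like these are a simplification: tables with ragged columns or wrapped cell text would likely need column boundaries inferred from the whole page rather than from gaps within a single row.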

VeryDOC

