PostScript to Text Converter

VB.NET, C# examples for Postscript to Text Converter SDK DLL Library

VeryDOC PostScript to Text Converter SDK converts PostScript files to text files.

https://www.verydoc.com/ps-to-text.html

https://www.verydoc.com/ps2txt.zip

The following are VB, VB.NET and C# examples for the PostScript to Text Converter SDK product.

VB example,

----------------------------------------------
Private Declare Function VeryPDF_PSToText Lib "ps2txtsdk.dll" (ByVal strCommandLine As String) As Long

Private Sub ps2txt_Click()
Dim nRet As Long
Dim strCmd As String

strCmd = "ps2txt -$ XXXXXXXXXXXXXXXXXXXXXXXX"
strCmd = strCmd & " C:\test.ps"
strCmd = strCmd & " C:\test.txt"

nRet = VeryPDF_PSToText(strCmd)
MsgBox (Str(nRet))
End Sub
----------------------------------------------

This is the VB.NET example,
----------------------------------------------
Public Class Form1

Private Declare Function VeryPDF_PSToText Lib "ps2txtsdk.dll" (ByVal strCommandLine As String) As Integer

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
Dim nRet As Integer ' matches the Integer return type declared above
Dim strCmd As String
Dim strInPSFile As String
Dim strOutFile As String

strInPSFile = Application.StartupPath() & "\test.ps"
strOutFile = Application.StartupPath() & "\vb_net_test.txt"

strCmd = "ps2txt -$ XXXXXXXXXXXXXXXXXXXXXXXX"
' Quote both paths so that folders containing spaces are handled
strCmd = strCmd & " """ & strInPSFile & """"
strCmd = strCmd & " """ & strOutFile & """"

MsgBox(strCmd)
nRet = VeryPDF_PSToText(strCmd)
MsgBox(Str(nRet))
End Sub
End Class
----------------------------------------------

This is the C# source code to convert PS file to Text file,
----------------------------------------------
// Requires: using System; using System.IO;
//           using System.Runtime.InteropServices; using System.Windows.Forms;
[DllImport("ps2txtsdk.dll")]
internal static extern int VeryPDF_PSToText(string strCommandLine);

private void button1_Click(object sender, EventArgs e)
{
    string appPath = Path.GetDirectoryName(Application.ExecutablePath);
    string psFile = Path.Combine(appPath, "test.ps");
    string txtFile = Path.Combine(appPath, "_out_C#_test.txt");

    // Quote both paths so that folders containing spaces are handled
    string strCmd = "ps2txt -$ XXXXXXXXXXXXXXXXXXXXXXXX \"" + psFile + "\" \"" + txtFile + "\"";
    MessageBox.Show(strCmd);

    int nRet = VeryPDF_PSToText(strCmd);
    MessageBox.Show(nRet.ToString()); // show the return code, as the VB examples do
}
----------------------------------------------

If you need any other examples, please feel free to let us know.

VeryDOC

@VeryDOC SDK & COM & CLI

PDF Optimizer (PDF Linearizer) is slow on mapped drive

Hello, we are using your PDF Optimizer (PDF Linearizer) Command Line v2.0.

We've noticed it's very slow in a 64-bit environment. We've had to limit our users to files no bigger than 5 MB, because we've had so many server crashes.

Do you have any suggestions?

Thanks as always.
Customer
-----------------------------------------------
I think I have the problem figured out. The PDFs we were optimizing were on a mapped drive. If I change to working locally, performance is much better 🙂

Thanks!
Customer
-----------------------------------------------

Thanks for your great information.
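
The workaround described by the customer (work on a local copy instead of the mapped drive) could be sketched in C# roughly as follows. This is only a sketch under assumptions: the `LocalCopyHelper` class and the `optimize` callback are hypothetical names, and the actual PDF Optimizer command line is left to the caller because it is not shown in this thread.

```csharp
using System;
using System.IO;

static class LocalCopyHelper
{
    // Copies the input from a mapped/network path to a local temp folder,
    // runs the caller-supplied step on the local files, then copies the
    // result back. "optimize" is a hypothetical placeholder for whatever
    // code launches the optimizer; it receives (localInput, localOutput).
    public static void RunOnLocalCopy(string remoteInput, string remoteOutput,
                                      Action<string, string> optimize)
    {
        string workDir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(workDir);
        try
        {
            string localIn = Path.Combine(workDir, "in.pdf");
            string localOut = Path.Combine(workDir, "out.pdf");

            File.Copy(remoteInput, localIn, true);
            optimize(localIn, localOut);             // heavy I/O stays on the local disk
            File.Copy(localOut, remoteOutput, true);
        }
        finally
        {
            Directory.Delete(workDir, true);         // clean up even if a step fails
        }
    }
}
```

With this approach the mapped drive is touched only twice (one copy in, one copy out), rather than for every read and write the optimizer performs.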

VeryDOC

DOC to Any Converter

Questions for Free Trial version of DOC to Any Converter SDK/COM software

Hi,

We are evaluating your product for integration with one of our malware products. Originally we built a converter in-house, but we are becoming increasingly overwhelmed by the number of file conversions that we must support. Therefore, we have decided to try a third-party solution, and VeryDoc looks very promising.

We specifically need to support:
doc/docx -> txt, pdf, bmp, jpg, (html)
xls/xlsx -> csv, pdf, bmp, jpg, (html)
ppt/pptx -> pdf, bmp, jpg, (html)

pdf -> txt, bmp, jpg, (html)
html -> txt, pdf, bmp, jpg

any image -> pdf, bmp, jpg

(converting to html is not mandatory but a nice to have..)

Furthermore, we need to be sure of several items or need clarification (please respond inline):
1)Is there a single SDK which we can call that will do all of the above conversions? If not a single one, then how many different SDKs?

2)Is the sdk threadsafe?

3)When converting from doc, xls, ppt, I am assuming VeryDoc has built their own binary parsing algorithm which is based off the published information on MSDN. The only developers which I know that have successfully performed this are the fellows at Open Office. If VeryDOC has not taken this path, then please disclose in general terms how you tackled this issue. The method by which you extract data has direct effects on our Program since we deal with malware.

4)When converting from docx, xlsx, pptx, I am again assuming VeryDoc uses OpenXML SDK 2.5. Do you have any known limitations when converting? For example, no foreign characters, cannot handle graphs in xlsx, etc.

5)If any of the above conversions use an "intermediate" conversion, we will need to know. Again, since we deal with malware, these could pose major risks. For example, when we convert docx -> jpg, we first convert docx to html, and from html we convert to jpg.

6)Are there any size limitations on the conversions?

7)All conversions must complete within 10 seconds. (From our own conversions, we have noted that conversion to bmp for large files (50 MB+) can take a bit of time.)

8)What is your relationship with VeryPDF? I understand you are partners, however, when I click on your "License Agreement", I am sent to VeryPDF's. We just need one SDK, whether it comes from VeryDOC or VeryPDF, this does not concern us.

Best regards,
Customer

------------------------------------------------------

>>We specifically need to support:
>>doc/docx -> txt, pdf, bmp, jpg, (html)
>>xls/xlsx -> csv, pdf, bmp, jpg, (html)
>>ppt/pptx -> pdf, bmp, jpg, (html)
>>pdf -> txt, bmp, jpg, (html)
>>html -> txt, pdf, bmp, jpg
>>any image -> pdf, bmp, jpg
>>(converting to html is not mandatory but a nice to have..)

Yes, our "DOC to Any Converter Command Line" product supports the conversions above. You can download the trial version of "DOC to Any Converter Command Line" from our website to try it,

https://www.verydoc.com/doc-to-any.html

>>1)Is there a single SDK which we can call that will do all of the above conversions? If not a single one, then how many different SDKs?

Yes, we offer the "DOC to Any Converter SDK/COM" product on the following web page,

https://www.verydoc.com/doc-to-any.html

"DOC to Any Converter SDK/COM" is a single SDK product that performs all of the above conversions.

>>2)Is the sdk threadsafe?

Yes, "DOC to Any Converter SDK/COM" is threadsafe.

>>3)When converting from doc, xls, ppt, I am assuming VeryDoc has built their own binary parsing algorithm which is based off the published information on MSDN. The only developers which I know that have successfully performed this are the fellows at Open Office. If VeryDOC has not taken this path, then please disclose in general terms how you tackled this issue. The method by which you extract data has direct effects on our Program since we deal with malware.

Thanks for your message. Yes, we can parse the DOC format ourselves (without depending on MS Office or OpenOffice), but we can't parse the XLS and PPT formats unless MS Office or OpenOffice is installed.

Our "DOC to Any Converter SDK/COM" and "DOC to Any Converter Command Line" products do not require MS Office to be installed in order to convert Office documents to PDF files and image files.

"DOC to Any Converter SDK/COM" and "DOC to Any Converter Command Line" choose a conversion method as follows:

1. If your system has MS Office 2007 with the PDF&XPS add-on installed, doc2any uses the PDF&XPS add-on to save MS Office documents as PDF and XPS files,
2. If your system has MS Office installed but without the PDF&XPS add-on, doc2any uses MS Office to print documents to PDF and XPS files,
3. If your system doesn't have MS Office installed but has OpenOffice installed, doc2any uses OpenOffice to convert documents,
4. If your system has neither MS Office nor OpenOffice installed, doc2any uses our own DOC/RTF renderer to convert DOC and RTF formats to other formats, but doc2any does not support PPT and XLS formats at this time.

In general, you can do the following conversions if your system has neither MS Office nor OpenOffice installed,
1. RTF to HTML without MS Word or OpenOffice,
2. RTF to DOC without MS Word or OpenOffice,
3. RTF to PDF without MS Word or OpenOffice,
4. DOC to HTML without MS Word or OpenOffice,
5. DOC to RTF without MS Word or OpenOffice,
6. DOC to PDF without MS Word or OpenOffice,

If you need to convert DOCX, PPT, PPTX, XLS or XLSX documents to PDF or XPS format, you need to install MS Office or OpenOffice.
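
The order of the four fallback rules above can be illustrated with a small C# sketch. It is not doc2any's actual code: the enum, the class and the boolean inputs are hypothetical names, and how doc2any detects installed software is not described here.

```csharp
using System;

// Hypothetical names, used only to illustrate the fallback order above.
enum ConversionBackend
{
    OfficePdfXpsAddon,  // rule 1
    OfficePrint,        // rule 2
    OpenOffice,         // rule 3
    BuiltInDocRtf,      // rule 4, DOC and RTF only
    Unsupported         // rule 4, any other input format
}

static class BackendSelector
{
    public static ConversionBackend Choose(bool hasMsOffice, bool hasPdfXpsAddon,
                                           bool hasOpenOffice, string inputExtension)
    {
        if (hasMsOffice && hasPdfXpsAddon) return ConversionBackend.OfficePdfXpsAddon;
        if (hasMsOffice) return ConversionBackend.OfficePrint;
        if (hasOpenOffice) return ConversionBackend.OpenOffice;

        // Neither suite is installed: only the built-in DOC/RTF renderer remains.
        string ext = (inputExtension ?? string.Empty).TrimStart('.').ToLowerInvariant();
        if (ext == "doc" || ext == "rtf") return ConversionBackend.BuiltInDocRtf;
        return ConversionBackend.Unsupported;
    }
}
```

For example, `BackendSelector.Choose(false, false, false, ".xls")` yields `ConversionBackend.Unsupported`, which corresponds to the case where MS Office or OpenOffice must be installed first.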

>>4)When converting from docx, xlsx, pptx, I am again assuming VeryDoc uses OpenXML SDK 2.5. Do you have any known limitations when converting? For example, no foreign characters, cannot handle graphs in xlsx, etc.

Thanks for your message. We do not use the OpenXML SDK to convert DOCX, XLSX and PPTX formats; we use OpenOffice or MS Office to convert them.

>>5)If any of the above conversions use an "intermediate" conversion, we will need to know. Again, since we deal with malware, these could pose major risks. For example, when we convert docx -> jpg, we first convert docx to html, and from html we convert to jpg.

Thanks for your message. Converting DOCX => HTML => JPEG is not a good solution, because HTML may lose the layout or formatting of the original DOCX file. So doc2any converts DOCX to JPEG format directly; we don't use HTML or other formats as an "intermediate".

>>6)Are there any size limitations on the conversions?

Our doc2any has no maximum file size limit. There is also no limit on the number of document pages that can be processed at one time.

>>7)All conversions must complete within 10 seconds. (From our own conversions, we have noted that conversion to bmp for large files (50 MB+) can take a bit of time.)

The conversion time depends on the complexity of the input document. If the input document contains complex graphics, the conversion needs more time.

>>8)What is your relationship with VeryPDF? I understand you are partners, however, when I click on your "License Agreement", I am sent to VeryPDF's. We just need one SDK, whether it comes from VeryDOC or VeryPDF, this does not concern us.

VeryDOC is another website which belongs to VeryPDF. VeryPDF focuses on PDF products, while VeryDOC focuses on DOC, PPT, XLS, RTF, DOCX, PPTX, XLSX, PostScript, XPS, XML, etc. document formats and is not limited to the PDF format. Both belong to the same company and simply cover different types of software.

VeryDOC

DOC to Any Converter

Proper MIME media type for PDF files

What is MIME media type PDF?

The PDF MIME media type indicates that an email’s attached file falls under the application MIME media type category. The file’s subtype is PDF, which stands for Portable Document Format. The PDF format allows documents to be exchanged online and via email despite compatibility issues between computers. application/pdf was originally registered as a media type in 1993. A number of applications use the PDF format, with Adobe Acrobat being the most prominent. When an email arrives with an application/pdf part, it means that a PDF file is attached.

PDF files can contain text, graphics, metadata, annotations, hyperlinks, and bookmarks. To open a PDF file, a PDF reader must be installed. PDF has also been developed into an international standard intended as an electronic document file format for long-term preservation. Because application/pdf is widely understood and the PDF format is widely used, the MIME type is also used in other Internet Engineering Task Force (IETF) specifications.

What are the differences between the MIME types application/pdf and application/x-pdf?

The standard MIME type is application/pdf. The assignment is defined in RFC 3778, The application/pdf Media Type, referenced from the MIME Media Types registry. MIME types are controlled by a standards body, the Internet Assigned Numbers Authority (IANA). This is the same organization that manages the root name servers and the IP address space.

The use of x-pdf predates the standardization of the MIME type for PDF. MIME types in the x- namespace are considered experimental, just as those in the vnd. namespace are considered vendor-specific. x-pdf might be used for compatibility with old software.

This is a convention defined in RFC 2045 - Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies:

  1. Private [subtype] values (starting with "X-") may be defined bilaterally between two cooperating agents without outside registration or standardization. Such values cannot be registered or standardized.
  2. New standard values should be registered with IANA as described in RFC 2048.

A similar restriction applies to the top-level type. From the same source,

  If another top-level type is to be used for any reason, it must be given a name starting with "X-" to indicate its non-standard status and to avoid a potential conflict with a future official name.

(Note that per RFC 2045, "[m]atching of media type and subtype is ALWAYS case-insensitive", so there's no difference between the interpretation of 'X-' and 'x-'.)

So it's fair to guess that "application/x-foo" was used before the IANA defined "application/foo". And it still might be used by folks who aren't aware of the IANA token assignment.

As Chris Hanson said, MIME types are controlled by the IANA. This is detailed in RFC 2048 - Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures. According to RFC 3778, which is cited by the IANA as the definition for "application/pdf",

  The application/pdf media type was first registered in 1993 by Paul Lindner for use by the gopher protocol; the registration was subsequently updated in 1994 by Steve Zilles.

The type "application/pdf" has been around for well over a decade. So it seems to me that wherever "application/x-pdf" has been used in new apps, the decision may not have been deliberate.

We would be glad to discuss these two MIME types further. You can contact us at support@verydoc.com.
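
In practice, software that receives files may want to accept both spellings. The following C# sketch shows one way to do that, comparing case-insensitively as RFC 2045 requires. The class and method names are hypothetical and are not part of any VeryDOC product.

```csharp
using System;

static class PdfMimeType
{
    // The registered media type (RFC 3778).
    public const string Standard = "application/pdf";

    // Returns true for the registered type and for the legacy "x-" variant.
    public static bool IsPdf(string contentType)
    {
        if (string.IsNullOrWhiteSpace(contentType)) return false;

        // Ignore any parameters after the media type, e.g. "; name=report.pdf".
        string mediaType = contentType.Split(';')[0].Trim();

        return mediaType.Equals("application/pdf", StringComparison.OrdinalIgnoreCase)
            || mediaType.Equals("application/x-pdf", StringComparison.OrdinalIgnoreCase);
    }
}
```

When sending a PDF, use `PdfMimeType.Standard`; the legacy form only needs to be tolerated on input.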

DOC to Any Converter

Failed to call doc2any.exe from C# or VB.NET source code

Hi. I am trying to call doc2any.exe from a PHP script using your exeshell class on a 64-bit Windows Server 2008 R2 system. The exeshell-x64.dll was registered successfully, but I get the following error when I start the PHP script: PHP Fatal error: Uncaught exception 'com_exception' with message 'Failed to create COM object `exeshell.shell': Klasse nicht registriert (Class not registered).

What can I do to solve this problem?

Thanks in advance.
Regards
Customer
--------------------------------------
Hello.

I'm currently evaluating your product and downloaded the latest version from your homepage.

Build: Nov 10 2012

The registration of the exeshell-x64.dll was successful.

Furthermore I activated the php-extension "php_com_dotnet.dll" within the IIS-Manager.

But I still get this error.

Thanks in advance.
Customer
--------------------------------------
I suggest you call doc2any.exe directly from your C#, PHP, ASP.NET or other code on your Windows 2008 system; you can use CreateProcess() or Process.Start() to launch the doc2any.exe application.

You also need to configure MS Office DCOM to run under an interactive user account instead of the system user account. Please look at the following web pages for more information,

https://www.verydoc.com/doc-to-any-faq.html
https://www.verydoc.com/blog/aspnet-account-dcom-permisson-for-ms-word.html
https://www.verydoc.com/blog/microsoft-excel-application-entry-missing-in-dcomcnfg.html
https://www.verydoc.com/blog/how-to-make-iis7-play-nice-with-office-interop.html
https://www.verydoc.com/others/configure-word-and-excel.htm
https://www.verydoc.com/others/configure%20office%20applications%20to%20run%20under%20the%20interactive%20user%20account.htm
http://www.verypdf.com/wordpress/201201/how-to-call-doc2any-exe-or-htmltools-exe-from-a-service-20896.html

You can also find more answers in our Knowledge Base,

https://www.verydoc.com/blog/category/doc-to-any-converter

If you still cannot get it to work, please feel free to let us know and we will continue to assist you.

VeryDOC
--------------------------------------
Hello.

Thank you for the links to the different resources, but I get an error during the calling/initialization of the EXEShell COM Library, long before doc2any/Excel/Word is started ....

My actual problem is that I can't use your EXEShell class as mentioned on your page "Call DOC to Any Converter Command Line from ..."

https://www.verydoc.com/doc-to-any-shell.html

Special interest for me: calling it from a Windows 2008 Server with IIS 7 and PHP:

..."Example #6 (PHP example),

Customer
--------------------------------------
Windows 2008 Server is a 64-bit system, so you need to use the 64-bit EXEShell COM Library: use exeshell-x64.dll, not the exeshell.dll library.

VeryDOC
--------------------------------------
If you check the whole support request/thread you will find my second email and the attachments ...

Of course I'm using exeshell-x64.dll ...

Customer
--------------------------------------
Sorry for the confusion. It seems this problem is caused by permission settings on your server.

First, please check whether you can run doc2any.exe in a CMD window to convert your DOC file to a PDF file properly, e.g.,

doc2any.exe D:\test.doc D:\out.pdf

Does it work when you run it manually in a CMD window?

If you can run doc2any.exe in a CMD window to convert a DOC file to a PDF file correctly, but it fails when you call it from C#, this indicates that the problem is caused by the permissions of your IIS Server. You need to give enough permissions to the IIS Server, e.g.,

1. Allow IIS Server to launch EXE applications,
2. Give the "Everyone" user account "Full Control" of the folder where doc2any.exe is located,
3. Configure MS Word DCOM to run under the "Administrator" user account instead of the default system user account,

Once the above three permission issues are resolved, you should have no problem calling doc2any.exe or exeshell-x64.dll from C# to convert DOC files to PDF files.
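
A minimal C# sketch of calling doc2any.exe directly with Process.Start() is shown here for reference. It assumes only the two-argument syntax from the CMD example above (input file, then output file). The class name is hypothetical, the path to doc2any.exe must be supplied by the caller, and the meaning of the exit code is not documented in this thread, so treat the returned value as informational.

```csharp
using System;
using System.Diagnostics;

static class Doc2AnyRunner
{
    // exePath: full path to doc2any.exe on your server (supplied by the caller).
    // Returns the process exit code; its meaning is not documented here.
    public static int Convert(string exePath, string inputFile, string outputFile)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            // Quote both paths so that folders containing spaces are handled.
            Arguments = "\"" + inputFile + "\" \"" + outputFile + "\"",
            UseShellExecute = false,   // start the EXE directly, not through the shell
            CreateNoWindow = true      // no console window when run from a web application
        };

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();     // block until the conversion has finished
            return process.ExitCode;
        }
    }
}
```

A call such as `Doc2AnyRunner.Convert(exePath, @"D:\test.doc", @"D:\out.pdf")` mirrors the CMD test above; if it works in CMD but not from the web application, the permission settings listed earlier are the first thing to check.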

If you still cannot get it to work on your server, you may create a remote desktop account on your test machine and send us the user name and password. We will arrange for our engineer to log in to your test machine and investigate this problem for you as soon as possible.

We suggest sharing your PC or server via TeamViewer. TeamViewer can be downloaded from the following website,

http://www.teamviewer.com

After you have installed TeamViewer, please send us your TeamViewer ID and password, and please keep your test machine running 24 hours a day. After we have logged in to your test machine and solved the problem, we will send you an email, and then you can close the TeamViewer application.

By the way, we will release a VeryPDF Cloud API Service within a few weeks. With the VeryPDF Cloud API Service, you can convert office documents to PDF files easily over the internet without installing anything on your server system. You just send an HTTP POST to convert a DOC, DOCX, PPT, PPTX, XLS, XLSX or RTF file to a PDF file, for example,

http://online.verypdf.com/api/?apikey=XXXXXXX&app=doc2any&infile=https://dl.dropboxusercontent.com/u/5570462/SPAIN.docx

After you execute the above URL, you will get a file such as “http://online.verypdf.com/u/public/api/20130711-020606-rta622ohvn.pdf”, which you can download to your local disk.

You can pass the “infile” parameter to “http://online.verypdf.com/api/”, and our Cloud API Service will convert the file to PDF automatically.
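
As a rough illustration, the request URL could be assembled in C# as sketched here. The parameter names (apikey, app, infile) come from the example above; the class name is hypothetical, and percent-encoding the parameter values is an assumption on our side (the example above passes the file URL unencoded). The sketch only builds the URL and does not send the request, because the response format is not described here.

```csharp
using System;

static class CloudApiUrl
{
    // Builds a request URL in the shape shown in the example above.
    // Assumption: the service accepts percent-encoded parameter values.
    public static string Build(string apiKey, string inputFileUrl)
    {
        return "http://online.verypdf.com/api/"
             + "?apikey=" + Uri.EscapeDataString(apiKey)
             + "&app=doc2any"
             + "&infile=" + Uri.EscapeDataString(inputFileUrl);
    }
}
```

Encoding matters mainly when the input file URL itself contains characters such as `&` or `?`, which would otherwise be read as part of the API query string.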

VeryDOC