I have a data file with a PDF embedded in it. I want to extract the PDF data and create a separate file with that data.
I can do it manually but I cannot seem to stumble across the correct command to allow me to do it in Progress.
Thank you
Can you please be a bit more specific about what exactly you are trying to do? How do you define embedded? Are you referring to a PDF document that has 'data' in it (like customer info in the header, order data in some document section, and so on)?
The PDF file is simply opened as raw data and its bytes are written into the data file.
The data file layout is like this:
Header
Data
Header
PDF Data.
Thank you for taking an interest in this...
You say you can do it manually. What does that mean? From within Acrobat or Acrobat reader? Do you use copy & paste?
Yes, I open the data file in an editor that does not modify the data, strip out the non-PDF part, and save the result as a PDF.
Sounds a bit odd... But if it's basically just a plain-text operation (and you're on OpenEdge 10), define a LONGCHAR variable and use the COPY-LOB statement to load the file into it. Then you can try ABL functions such as INDEX and SUBSTRING to extract parts of the LONGCHAR.
Yes, but it is not just plain text. The PDF part of the data file contains control codes and the like; when I tried a CHARACTER variable, everything that was not a printable character was stripped out. I also tried LONGCHAR, but IMPORT/PUT does not accept a LONGCHAR.
Then COPY-LOB it into a MEMPTR and use the GET-BYTE kind of functions.
And there is my problem... I do not understand MEMPTRs. Could you give me a sample?
Progress Documentation: Programming Interfaces, the chapter "Introduction to External Program Interfaces", has some samples.
COPY-LOB is the easiest way to read a large binary file into a MEMPTR.
And don't forget to use SET-SIZE(mptr) = 0. when you're done, to free the memory (put it in the FINALLY block).
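Since a sample was asked for, here is a minimal sketch of the COPY-LOB / GET-BYTE / SET-SIZE approach. It rests on assumptions that are not from this thread: the file names "datafile.dat" and "extracted.pdf" are placeholders, the PDF part is assumed to start at the first occurrence of the standard "%PDF-" signature, and it is assumed to run to the end of the data file. If your second header can itself contain "%PDF-", or if anything follows the PDF, the search and the length need adjusting. FINALLY also requires a release that supports it.

```abl
/* Sketch only: file names and the "PDF runs to end of file" rule are assumptions. */
DEFINE VARIABLE mData  AS MEMPTR NO-UNDO.
DEFINE VARIABLE iSize  AS INT64  NO-UNDO.
DEFINE VARIABLE iPos   AS INT64  NO-UNDO.
DEFINE VARIABLE iStart AS INT64  NO-UNDO INITIAL 0.
DEFINE VARIABLE iLen   AS INT64  NO-UNDO.

/* Read the whole data file into memory as raw bytes. */
COPY-LOB FROM FILE "datafile.dat" TO mData.
iSize = GET-SIZE(mData).

/* Scan for the PDF signature "%PDF-" (bytes 37, 80, 68, 70, 45).
   MEMPTR byte positions are 1-based. */
ScanLoop:
DO iPos = 1 TO iSize - 4:
    IF  GET-BYTE(mData, iPos)     = 37
    AND GET-BYTE(mData, iPos + 1) = 80
    AND GET-BYTE(mData, iPos + 2) = 68
    AND GET-BYTE(mData, iPos + 3) = 70
    AND GET-BYTE(mData, iPos + 4) = 45
    THEN DO:
        iStart = iPos.
        LEAVE ScanLoop.
    END.
END.

IF iStart > 0 THEN DO:
    /* Write everything from the signature to the end of the buffer. */
    iLen = iSize - iStart + 1.
    COPY-LOB FROM mData STARTING AT iStart FOR iLen TO FILE "extracted.pdf".
END.
ELSE
    MESSAGE "No PDF signature found in the data file." VIEW-AS ALERT-BOX.

FINALLY:
    /* Release the memory held by the MEMPTR. */
    SET-SIZE(mData) = 0.
END FINALLY.
```

Checking one byte at a time is slow on large files, but it never treats the data as text, so the control codes in the PDF survive untouched.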
Forget about MEMPTR. Even if you understand it, reading data from a PDF document won't be as easy as using common string manipulation routines (INDEX, SUBSTRING, ...).
Your best bet is to find an 'external' tool to do it, either by escaping to an OS command or by using shared libraries (assemblies could be an option if you have the most recent OE version and are on Windows). For instance, see what you can get from Xpdf and other tools that use it, such as pdf2ipe (Ipe is plain XML), or something that translates it to Excel.
Good luck.
If you are using OEA, System.IO has plenty of stream-related classes that you can use to manipulate a file byte by byte.