Mark
asked on
how to save a pdf file
I have the code shown below. Seems simple enough. I'm opening a PDF document, then saving it elsewhere. However, when I do this I cannot open the new document. I get "A drawing error occurred" in Adobe Reader. The original file has 3 pages and the new drawing-error file has 3 blank pages.
Surely this is something people have programmed billions of times over the years. What's my problem?
String inputFile = "/www/tomcat/webapps/courtscan/OH/demo/documents/2009/1-60.pdf";
String outputFile = "/usr/local/apache/htdocs/new.pdf";
PDDocument doc = PDDocument.load( inputFile );
out.print(doc.getNumberOfPages() );
doc.save( outputFile );
doc.close();
Are you *certain* no exceptions occur when you're saving or closing?
ASKER
ozlevanon, I'm using the PDFbox package. Yes, I know how to simply copy a file. I am intending to muck with the pdf contents and write the results (see https://www.experts-exchange.com/questions/24303092/How-to-add-graphic-to-pdf-file.html), and I got this same error. My code shown is the result of stripping things out little by little to identify the fundamental error.
CEHJ, I get no tomcat exception and I've not seen anything in the mod_jk.log and nothing fishy in $CATALINA_HOME/logs. I had a try/catch around the save(), but it caught nothing, so I took it out for simplicity in my posting. It does create the new.pdf file with size 53122 which is a bit smaller than the original: 54353. The first few characters of the file are "%PDF-1.5", just like the original. Does code like I've shown work for you? If you want to examine the actual output it can be found in http://www.fluxrunner.com/new.pdf. Adobe can save it and everything. A copy of the original is there too as old.pdf.
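For anyone who wants to repeat the size-and-header comparison from Java rather than from the shell, here is a minimal sketch. The class and method names (PdfHeaderCheck, readHeader, describe) are made up for illustration and are not part of PDFBox. Keep in mind what this thread shows: a correct "%PDF-1.5" header and a plausible size only say the file looks like a PDF from the outside, not that the page content inside is intact.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not a PDFBox class): reports the size of a file and
// the 8-byte "%PDF-x.y" marker it starts with, so two copies can be compared.
final class PdfHeaderCheck {

    /** Returns up to the first 8 bytes of the file as text, e.g. "%PDF-1.5". */
    static String readHeader(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[8];
            int n = in.readNBytes(head, 0, head.length); // Java 9 or later
            return new String(head, 0, n, StandardCharsets.ISO_8859_1);
        }
    }

    /** One-line summary: file name, size in bytes, and header. */
    static String describe(Path file) throws IOException {
        return file.getFileName() + ": " + Files.size(file)
                + " bytes, header " + readHeader(file);
    }
}
```

Calling describe() on both old.pdf and new.pdf gives the same two facts quoted above (size and leading characters) side by side.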
>>Does code like I've shown work for you?
I'll try it
It works for me. Does it correctly print the number of pages?
Both old.pdf and new.pdf are valid files of 3 pages from what i can see ...
ASKER
It does correctly print the number of pages.
Are you able to open the output file with Acrobat Reader? I've tried opening http://www.fluxrunner.com/new.pdf on a couple of different computers and at best I get 3 blank pages.
>>Are you able to open the output file with Acrobat Reader?
I'm not using Acrobat at the moment, but i can read it fine
ASKER
> I'm not using Acrobat at the moment, but i can read it fine
1. What are you using to read it? Do you have access to Acrobat? If so, could you try it? This is going to have to be readable by Acrobat since that's what joe-average has.
2. Are you reading the file from my link or the file you created with your program? If the latter, could you post that pdf in the file section of your response and I'll download it and check it out.
3. What version of PDFbox are you using? Mine appears to be 0.7.3; leastwise, that's the name on the jarfile: PDFBox-0.7.3.jar
>>1. What are you using to read it?
xpdf
>>Do you have access to Acrobat?
Not right at the moment
>>2. Are you reading the file from my link or the file you created with your program?
Both. Both work (is there much point in attaching?)
>>3. What version of PDFbox are you using?
The same, oldish one. You might try another API perhaps
ASKER
I'm stumped. I'm open to suggestions. What do you mean by "another API"?
If you don't mind, yes, go ahead and attach the output of your program. I'd like to see if Acrobat can read it AND I'd like to compare it with mine.
code is fine, looks like pdfbox is creating a pdf that is incompatible with your version of acrobat. Perhaps try updating pdfbox to the latest version if you haven't already.
ASKER
This is getting frustrating. I went to http://www.pdfbox.org. Apparently 0.7.3 is the latest version of PDFBox from October 2006. I re-downloaded anyway and I did an md5sum on my jar and the new jar. They match. So, I downloaded the latest version of Acrobat (9.1) to my XP notebook and tried opening my new.pdf with that. With the 9.1 version I got the error message: "Insufficient data for an image"
I don't get it. Why can you guys open my pdf fine, but I can't? What else could be wrong? Some other library or jar? This is a fairly recent linux build, so I should have the latest of everything.
I've attached a dump of the first block of the new and original pdf files. As you can see they are different.
Could there be some PDF settings for encryption or drawing that are defaulted differently on my setup than yours?
I'm out of ideas and I have to get this program working for a trade show at the end of the month. Help!
root@webhost1:/usr/local/apache/htdocs# fd -p new.pdf
0: 25 50 44 46 2D 31 2E 35 0A 25 F6 E4 FC DF 0A 31 %PDF-1.5.%.....1
10: 20 30 20 6F 62 6A 0A 3C 3C 0A 2F 4C 61 6E 67 20 0 obj.<<./Lang
20: 28 78 2D 64 65 66 61 75 6C 74 29 0A 2F 50 61 67 (x-default)./Pag
30: 65 73 20 32 20 30 20 52 0A 2F 54 79 70 65 20 2F es 2 0 R./Type /
40: 43 61 74 61 6C 6F 67 0A 3E 3E 0A 65 6E 64 6F 62 Catalog.>>.endob
50: 6A 0A 33 20 30 20 6F 62 6A 0A 3C 3C 0A 2F 43 72 j.3 0 obj.<<./Cr
60: 65 61 74 69 6F 6E 44 61 74 65 20 28 44 3A 32 30 eationDate (D:20
70: 30 37 30 35 31 31 31 38 32 36 31 39 2B 30 30 27 070511182619+00'
80: 30 30 27 29 0A 2F 43 72 65 61 74 6F 72 20 28 50 00')./Creator (P
90: 61 70 65 72 50 6F 72 74 20 31 31 2E 30 29 0A 2F aperPort 11.0)./
A0: 4D 6F 64 44 61 74 65 20 28 44 3A 32 30 30 37 30 ModDate (D:20070
B0: 35 31 31 31 38 32 36 31 39 2B 30 30 27 30 30 27 511182619+00'00'
C0: 29 0A 2F 50 72 6F 64 75 63 65 72 20 28 50 61 70 )./Producer (Pap
D0: 65 72 50 6F 72 74 20 31 31 2E 30 29 0A 3E 3E 0A erPort 11.0).>>.
E0: 65 6E 64 6F 62 6A 0A 32 20 30 20 6F 62 6A 0A 3C endobj.2 0 obj.<
F0: 3C 0A 2F 43 6F 75 6E 74 20 33 0A 2F 4B 69 64 73 <./Count 3./Kids
? q
root@webhost1:/usr/local/apache/htdocs# fd -p old.pdf
0: 25 50 44 46 2D 31 2E 35 0D 0A 25 F1 F9 F7 F6 33 %PDF-1.5..%....3
10: 2E 33 0D 0A 34 20 30 20 6F 62 6A 0D 0A 3C 3C 0D .3..4 0 obj..<<.
20: 0A 2F 42 69 74 73 50 65 72 43 6F 6D 70 6F 6E 65 ./BitsPerCompone
30: 6E 74 20 31 20 0D 0A 2F 43 6F 6C 6F 72 53 70 61 nt 1 ../ColorSpa
40: 63 65 20 2F 44 65 76 69 63 65 47 72 61 79 20 0D ce /DeviceGray .
50: 0A 2F 46 69 6C 74 65 72 20 2F 4A 42 49 47 32 44 ./Filter /JBIG2D
60: 65 63 6F 64 65 20 0D 0A 2F 48 65 69 67 68 74 20 ecode ../Height
70: 32 31 39 39 20 0D 0A 2F 4C 65 6E 67 74 68 20 35 2199 ../Length 5
80: 20 30 20 52 20 0D 0A 2F 4E 61 6D 65 20 2F 69 6D 0 R ../Name /im
90: 61 67 65 30 20 0D 0A 2F 53 75 62 74 79 70 65 20 age0 ../Subtype
A0: 2F 49 6D 61 67 65 20 0D 0A 2F 54 79 70 65 20 2F /Image ../Type /
B0: 58 4F 62 6A 65 63 74 20 0D 0A 2F 57 69 64 74 68 XObject ../Width
C0: 20 31 37 30 30 20 0D 0A 3E 3E 0D 0A 73 74 72 65 1700 ..>>..stre
D0: 61 6D 0D 0A 00 00 00 00 30 00 01 00 00 00 13 00 am......0.......
E0: 00 06 A4 00 00 08 97 00 00 00 C8 00 00 00 C8 01 ................
F0: 00 00 00 00 00 01 00 01 01 00 00 37 AD 08 00 02 ...........7....
?
>>I've attached a dump of the first block of the new and original pdf files. As you can see they are different.
That could be down to version differences. Having said that, good reader software should be able to cope with different versions
If you're downloading from linux to windows over say ftp, make sure that you're doing that in binary format or it will corrupt the file. md5sum it at both ends - the sums should be identical
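If md5sum isn't available at one end (a Windows workstation, say), the same check can be done from Java. This is only a sketch of the comparison suggested above; the class and method names (Md5Check, md5Hex, sameContent) are invented for illustration. The hex string it returns should match what md5sum prints for the same file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: computes an MD5 checksum as lowercase hex so a file
// can be compared before and after a transfer.
final class Md5Check {

    /** Returns the MD5 digest of the file as a 32-character hex string. */
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
            byte[] buf = new byte[8192];
            // Reading through the DigestInputStream feeds every byte to md.
            while (in.read(buf, 0, buf.length) != -1) {
                // nothing else to do with the data
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** True when both files have the same MD5 checksum. */
    static boolean sameContent(Path a, Path b) throws IOException, NoSuchAlgorithmException {
        return md5Hex(a).equals(md5Hex(b));
    }
}
```

If the two sums differ after an FTP transfer, an ASCII-mode transfer rewriting line endings is the usual suspect; if they match, the transfer is not the problem.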
ASKER
One would think Acrobat could cope with version differences.
I'm trying to access the file in two ways. I create the file in the apache htdocs directory so I can get to it via the web using http://www.fluxrunner.com/new.pdf. IE will open that file in Acrobat. Also, I scp'd it to my local linux host with my Windows workstation samba mounting it AND to my windows workstation directly. md5sum confirms that the new.pdf created on fluxrunner is the same as the one I've downloaded locally.
I tried again using a different source pdf; only one page. Same thing. "Insufficient data for an image"
Are you *sure* you've actually opened MY pdf and it worked OK?
ASKER
Here are the imports I'm using. Am I missing a critical one for proper saving?
<%@ page import="java.lang.String,java.io.*,java.util.*" %>
<%@ page import="java.util.Date,java.text.SimpleDateFormat,
java.lang.StringBuffer,java.text.FieldPosition" %>
<%@ page import="java.lang.Object,
org.pdfbox.exceptions.COSVisitorException,
org.pdfbox.io.RandomAccessFile,
org.pdfbox.pdmodel.PDDocument,
org.pdfbox.pdmodel.PDPage,
org.pdfbox.pdmodel.edit.PDPageContentStream,
org.pdfbox.pdmodel.graphics.xobject.PDCcitt,
org.pdfbox.pdmodel.graphics.xobject.PDJpeg,
org.pdfbox.pdmodel.graphics.xobject.PDXObjectImage" %>
Well let's check again. It's this one is it not: http://www.fluxrunner.com/new.pdf ?
ASKER
I've also stripped out of the last import all but java.lang.Object and org.pdfbox.pdmodel.PDDocument, same error :(
>>Here are the imports I'm using
That looks fine. You don't need the lang ones. If you were missing any you'd get exceptions - and you tell me you haven't got any ..?
ASKER
> Well let's check again. It's this one is it not: http://www.fluxrunner.com/new.pdf ?
Yes, does it open OK for you?
Opens fine for me
ASKER
> Opens fine for me
Amazing! Are you opening it with Acrobat or something else? It needs to open with acrobat since most users will be accessing from their PC's. I also tried this URL from a completely different windows laptop, same problem.
> You don't need the lang ones. If you were missing any you'd get exceptions - and you tell me you haven't got any ..?
No, no exceptions.
>>Amazing! Are you opening it with Acrobat or something else?
xpdf still
You said you had more luck with the browser. Does it open with that?
> What else could be wrong? Some other library or jar?
As I mentioned earlier your code is fine. The pdf you are creating is fine.
The incompatibility seems to be with your version of acrobat and pdfbox, is strange though.
have you tried opening it on a different box?
ASKER
> You said you had more luck with the browser. Does it open with that?
No, because it launches Acrobat, so same thing. I don't have another windows based reader. If you can get a hold of a workstation running acrobat reader, I'd be curious what your results are. I suspect you won't be able to read it there either. I've tried Acrobat on a couple of different computers.
Meanwhile I'm going to investigate another api. If PDFBox generates pdf's that only work with xpdf or such-like then it's of little use to me. Adobe is the inventor of pdf and if an api doesn't work with Adobe's reader there's something wrong. Nobody's jumped into this topic saying they've had no problem reading PDFBox pdf's in Acrobat. I'm beginning to think PDFBox is a dead product anyway. The latest (and only) release appears to be 0.7.3 from October 2006. So no one's working on it. The http://incubator.apache.org/pdfbox/download.html site says "No releases of Apache PDFBox are yet available" and refers you back to the 0.7.3 release (besides, the "0" first digit says "beta" to me).
You mentioned iText. Does that have, or can I build a jar using it? Do you have another recommendation? I think I'll try something like that today. I don't know what else I can try with PDFBox.
ASKER
> have you tried opening it on a different box?
Yes, as mentioned in my previous message, and both version 6 and version 9 of Acrobat.
> I'm beginning to think PDFBox is a dead product anyway.
it is
what are your requirements?
ASKER
Requirements: I am creating a website for attorneys to file documents online. The document must be timestamped (so I have to add text and/or images) and the stamped document saved as a pdf file. These pdf's must be accessible by attorneys and the court, virtually all of whom will be accessing from their office computers running windows and acrobat.
ASKER
oh yeah, and it is being demo'd at a show in two weeks!
hang on while I boot a windows box and try it here
If you have too much difficulty, just treat it as an image, timestamp it and save it as a jpg and have done with it
same problem here, if you want to chase I'd be looking at any images in the pdf (eg. try on a pdf without images)
if you want to look at itext then the following are good resources
http://itextdocs.lowagie.com/tutorial/general/webapp/index.php
http://javaboutique.internet.com/tutorials/iText/
http://www.geek-tutorials.com/java/itext/itext_index.php
ASKER
Thanks for the feedback.
I'm getting ready to install itext. I'll be back when I have some info.
Those pdf's I've been testing with are simple court filings, no images. If iText works, no sense chasing. If not, then I've got problems.
they are actually all images
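A quick way to see this for yourself: the dump of old.pdf earlier in the thread starts with an image XObject ("/Subtype /Image") using "/Filter /JBIG2Decode". The sketch below scans the raw bytes of a PDF for those markers. It is a rough diagnostic, not a PDF parser, and the class and method names (PdfImageScan and so on) are made up for illustration. It will miss filters written as arrays and anything stored inside compressed object streams, and finding a filter name does not by itself prove that filter is what breaks the saved copy.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic: looks for image XObject markers and stream filter
// names in the raw text of a PDF file. Assumes uncompressed dictionaries.
final class PdfImageScan {

    private static final Pattern IMAGE = Pattern.compile("/Subtype\\s*/Image\\b");
    private static final Pattern FILTER = Pattern.compile("/Filter\\s*/([A-Za-z0-9]+)");

    /** Loads the file as ISO-8859-1 text so every byte maps to one char. */
    static String load(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
    }

    /** Counts occurrences of "/Subtype /Image" in the raw text. */
    static int countImageMarkers(String raw) {
        Matcher m = IMAGE.matcher(raw);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    /** Collects the distinct names that directly follow "/Filter". */
    static Set<String> filterNames(String raw) {
        Set<String> names = new TreeSet<>();
        Matcher m = FILTER.matcher(raw);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }
}
```

Running countImageMarkers(load(path)) and filterNames(load(path)) on both the original and the re-saved file shows whether the image objects and their filters survived the round trip in recognisable form.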
ASKER
Yeah!!!! Finally!!!! It works with iText. Now I can move on to trying to stamp the document (which is another posting!)
Thanks for your time and patience.
<%@ page import="java.io.FileOutputStream,
com.lowagie.text.pdf.PdfReader,
com.lowagie.text.pdf.PdfStamper" %>
<%
String inputFile = "/www/tomcat/webapps/courtscan/OH/demo/documents/2009/2-20.pdf";
String outputFile = "/usr/local/apache/htdocs/new.pdf";
// Read the source PDF, then let PdfStamper write a copy to outputFile.
PdfReader doc = new PdfReader(inputFile);
PdfStamper stamp = new PdfStamper(doc, new FileOutputStream(outputFile));
// Closing the stamper writes out the new file; close it before the reader.
stamp.close();
doc.close();
%>
ok - glad you're making progress
:-)