• Status: Solved

Extracting small section from PDF and saving as JPEG

I have 4,000 full-page PDFs, each with a signature box in the lower left corner. What I need is just the signature boxes as separate JPEGs. Clearly, that is too many to do by hand, so I am looking to automate it as much as possible. The PDFs are from scanned hard copies, so there are no images embedded. I would love to hear some ideas, thanks.
Asked by: K_Deutsch
 
Gary Davis (Dir Internet Svcs) commented:
Perhaps batch convert the 4,000 PDFs to images (http://www.medicalnerds.com/batch-converting-pdf-to-jpgjpeg-using-free-software/). Then, assuming the signatures are always in the same location (nn pixels down and to the left, with a standard height and width), you can batch trim the images so that only those clips remain. Snagit Convert is one tool that can do that.

Gary Davis
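
The convert-then-trim idea can be sketched in Python. This is only a rough illustration, assuming ImageMagick (with Ghostscript for PDF input) does the conversion rather than Snagit. The helper name `build_crop_commands`, the 150 DPI default, and the crop rectangle are placeholders, not values from this thread. The function only builds the command lines; it does not run them.

```python
from pathlib import Path

def build_crop_commands(pdf_dir, out_dir, x, y, width, height, dpi=150):
    """Return one ImageMagick argument list per PDF in pdf_dir (hypothetical helper)."""
    geometry = f"{width}x{height}+{x}+{y}"  # ImageMagick geometry syntax: WxH+X+Y
    commands = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        target = Path(out_dir) / (pdf.stem + ".jpg")
        commands.append([
            "convert",
            "-density", str(dpi),  # resolution used to rasterize the PDF page
            f"{pdf}[0]",           # first page only
            "-crop", geometry,
            "+repage",             # drop the canvas offset left behind by -crop
            str(target),
        ])
    return commands
```

Each returned list could be passed to `subprocess.run` once the rectangle has been checked by hand on a few sample pages.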
 
todd_beedy commented:
AutoHotkey can replicate keystrokes and mouse moves on a screen.

If the signatures are all in the same location on each document, you can "record" your mouse moves and keystrokes, fine-tune the recording, and then replay it.

I envision something such as segregating the documents into groups of 50, opening all 50, and running the replay, saving each result with a filename based on your save structure.

While this may not be the best solution, it is easy to implement and will be fairly fast once started.
 
K_Deutsch (Author) commented:
Error message

C:\Program Files\ImageMagick-6.7.6-Q16>convert test.pdf c:\test.jpg

convert.exe: `%s' (%d) "gswin32c.exe" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOP
ROMPT -dMaxBitmap=500000000 -dEPSCrop -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=
pngalpha" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72"  "-sOutputFile=C:/Us
ers/KYLEDE~1/AppData/Local/Temp/magick-ua7W5Bdo--0000001" "-fC:/Users/KYLEDE~1/A
ppData/Local/Temp/magick-w0TlEW0v" "-fC:/Users/KYLEDE~1/AppData/Local/Temp/magic
k-hYIDsMq_" @ error/utility.c/SystemCommand/1896.
convert.exe: Postscript delegate failed `test.pdf': No such file or directory @
error/pdf.c/ReadPDFImage/668.
convert.exe: missing an image filename `c:\test.jpg' @ error/convert.c/ConvertIm
ageCommand/3017.
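
The log above shows ImageMagick trying to run `gswin32c.exe`, Ghostscript's console executable, and then reporting that the PostScript delegate failed. One common cause is that Ghostscript is not installed or is not on the PATH, although the log alone does not prove that. A minimal Python sketch to check, where the function name and the list of candidate executable names are assumptions:

```python
import shutil

def find_ghostscript(candidates=("gswin32c", "gswin64c", "gs")):
    """Return the path of the first Ghostscript executable found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None

if __name__ == "__main__":
    found = find_ghostscript()
    print(found or "No Ghostscript executable found on PATH")
```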
 
Joe Winograd, Fellow & MVE (Developer) commented:
This can easily be achieved with a great freeware imaging program called IrfanView that I've been using for many years:
http://www.irfanview.com/

Click the Download link on the left to download IrfanView and click the PlugIns link on the left to download the PlugIns, which are needed to give you PDF capability. Install IrfanView first, then install the PlugIns.

Look in Help, click the Index tab, and then click <Batch conversion>. If you'd like to check it out before downloading and installing, I have attached that Help section as a PDF file.

The quick summary is that you'll be doing a Batch Crop, i.e., cropping each PDF in the lower left corner by specifying the pixels to crop. You'll need to experiment manually to get the cropping where you want it, and then you'll let it rip on all 4,000 files. Of course, I strongly suggest that you make a copy (or two!) of all 4,000 files for safekeeping elsewhere before you start this process.

Regards, Joe

Attachment: IrfanView-Batch-Conversion-Help.pdf
 
K_Deutsch (Author) commented:
Here is a sample of the PDF I am working with. Again, I want a JPEG of only the signature box in the bottom left corner.

https://docs.google.com/file/d/0B9Ga3bzjO-rUVUVOZnhyanpTU213WWdyQThXTmZRZw/edit

I am liking IrfanView, but I am inexperienced with the X-pos, Y-pos, etc. crop settings. The sheet is standard 8.5 x 11. Could you speak to the ballpark crop settings I would use to get only the desired area? I can fine-tune from there. Thanks!
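
For a ballpark figure, the arithmetic from inches to pixels can be sketched in Python. The box size, margins, and 100 DPI used here are made-up illustration values, not measurements from the sample form, and the right DPI depends on the resolution at which the page is actually rendered. The function name is hypothetical.

```python
def crop_box_pixels(dpi, left_in, bottom_in, width_in, height_in, page_height_in=11.0):
    """Convert a box measured in inches from the page's lower-left corner
    into (x, y, width, height) pixels measured from the top-left corner."""
    x = round(left_in * dpi)
    width = round(width_in * dpi)
    height = round(height_in * dpi)
    # Image rows count down from the top, so flip the vertical measurement.
    y = round((page_height_in - bottom_in - height_in) * dpi)
    return x, y, width, height

# A 3 x 1 inch box sitting 0.5 inch in from the left and bottom edges at 100 DPI.
print(crop_box_pixels(dpi=100, left_in=0.5, bottom_in=0.5, width_in=3.0, height_in=1.0))
# (50, 950, 300, 100)
```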
 
K_Deutsch (Author) commented:
It is simple, I see.
 
Joe Winograd, Fellow & MVE (Developer) commented:
Yes, very simple. I trust you figured it out. In case you didn't notice it, here's a great trick. Use your mouse to select an area: just left-click and drag to create a rectangle. When you release the button, the title bar shows the crop area in pixels. Here's an example that I did on the lower left section of a letter-size page:
[Image: IrfanView crop]
In this example it says 79x963;295x82. Those are the exact parameters you can feed to the Batch Crop screen, as shown here:
[Image: IrfanView batch crop setting]
Regards, Joe
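
If the selection string is going to be reused in a script, it can be split into numbers with a small Python helper. This sketch assumes the first pair (79x963) is the top-left corner of the selection and the second pair (295x82) is its width and height; the thread does not spell out that ordering, and the function name is hypothetical.

```python
import re

def parse_selection(text):
    """Parse a string like '79x963;295x82' into (x, y, width, height).

    Assumption: the first pair is the top-left corner, the second pair the size.
    """
    match = re.fullmatch(r"\s*(\d+)x(\d+)\s*;\s*(\d+)x(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized selection string: {text!r}")
    return tuple(int(group) for group in match.groups())

x, y, width, height = parse_selection("79x963;295x82")
print(x, y, width, height)  # 79 963 295 82
```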
 
K_Deutsch (Author) commented:
I never imagined the solution could be so simple and swiftly executed. Well done.
 
Joe Winograd, Fellow & MVE (Developer) commented:
Thanks! I'm glad it worked for you. Regards, Joe
 
K_Deutsch (Author) commented:
Just had a thought. When this project goes into mass production, I will have a 4,000-page PDF as a starting point: all the same form, just a unique signature per form. Any chance of IrfanView cropping the same area out of every PDF page and saving 4,000 JPEGs that way, or should I just split the PDF first?
 
Joe Winograd, Fellow & MVE (Developer) commented:
Sort of. It can't crop each page of the PDF directly, but it can easily create all 4,000 JPGs from the source PDF. Start the process by running:

Options>Multipage images>Extract all pages...

You'll get this screen:
[Image: IrfanView multipage extract]
Make sure you select JPG in the <Save as> box. IrfanView will create 4,000 separate JPGs in the folder of your choice. Then you're all set for the Batch Crop run.

Regards, Joe
 
K_Deutsch (Author) commented:
Great! At this point, I am branching this project of mine into a new and separate question so more points are put on the table. Hope to hear from you, Joe!