Solved

Remove punch hole marks from image with Java

Posted on 2012-08-10
Last Modified: 2013-01-08
Hello,

There are some expensive commercial tools out there that remove punch hole marks from scanned documents.

I need to automate that and do it in Java so that I have a clean scanned page.

I am trying to find algorithms or free libraries to do so. That is, image enhancing with Java.

At this point I have learned how to auto-deskew an image, but I also need to remove the punch holes.

The goal is to have a Java app that runs as a batch operation on scanned pages and removes punch holes if they exist.

Any ideas or suggestions would be extremely helpful.

Thanks.
Question by:CarlosScheidecker
12 Comments
 
LVL 12

Expert Comment

by:basav_com
ID: 38282983
Try JMagick, which is a Java wrapper for ImageMagick:
http://www.jmagick.org/lenya/jmagick/live/index.html
 
LVL 1

Author Comment

by:CarlosScheidecker
ID: 38282993
I've looked at it but it doesn't have what I need. JAI seems more complete than that, but even there I could not find an algorithm or a class that would remove or fill in the holes.
 
LVL 86

Expert Comment

by:CEHJ
ID: 38283426
Any chance of a sample image as attachment?
 
LVL 1

Author Comment

by:CarlosScheidecker
ID: 38284390
CEHJ,

Here is an image to illustrate the problem. I will also attach a sample one. It shows a page with the holes and then the treated version without them.
 
LVL 1

Author Comment

by:CarlosScheidecker
ID: 38284396
Here is a sample image. Note that aside from the holes, there are black margin marks that need to be removed automatically as well.

Sample image
 
LVL 86

Expert Comment

by:CEHJ
ID: 38285166
Two things spring to mind

a. copy the right margin to the left (or a white rectangle)
b. try OCR on the whole thing
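
Suggestion (a) could be sketched roughly as follows. This is only an illustration, not something posted in the thread: the class name MarginCleaner, both method names and the marginWidth parameter are made up for the example, and it assumes the holes sit entirely inside a left-hand strip of known width that contains no text.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper: the names here are illustrative, not from any library.
final class MarginCleaner {
    private static final int WHITE = 0xFFFFFFFF;

    /** Paints the leftmost marginWidth columns white, in place. */
    static void whiteOutLeftMargin(BufferedImage page, int marginWidth) {
        int w = Math.min(Math.max(marginWidth, 0), page.getWidth());
        for (int y = 0; y < page.getHeight(); y++) {
            for (int x = 0; x < w; x++) {
                page.setRGB(x, y, WHITE);
            }
        }
    }

    /** Copies the rightmost marginWidth columns over the leftmost ones, in place. */
    static void copyRightMarginToLeft(BufferedImage page, int marginWidth) {
        int width = page.getWidth();
        int w = Math.min(Math.max(marginWidth, 0), width);
        for (int y = 0; y < page.getHeight(); y++) {
            for (int x = 0; x < w; x++) {
                // Source column is the matching column of the right-hand strip.
                page.setRGB(x, y, page.getRGB(width - w + x, y));
            }
        }
    }
}
```

The page itself would typically be loaded and saved with javax.imageio.ImageIO. Copying the right margin keeps the paper texture of the scan, while the white rectangle is simpler but can leave a visible edge on off-white pages.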
 
LVL 1

Author Comment

by:CarlosScheidecker
ID: 38285872
Some OCR engines will not work if the page has holes; Tesseract will. But we want to clean up the image, not OCR it. There are some algorithms for this that I am exploring, but I was hoping to find some solutions here as well.
 
LVL 28

Expert Comment

by:dpearson
ID: 38290584
Most image enhancement is based around finding the right type of filter to apply to the image, so it removes the noise and leaves behind the original image.

There are lots of examples of this sort of thing in Java here:
http://java.sun.com/products/java-media/jai/forDevelopers/jai1_0_1guide-unc/Image-enhance.doc.html

However, in your case you're looking to remove a very specific mark rather than general noise, so I think you're going to want to write an algorithm that identifies that mark directly. That seems pretty simple - something like summing the values of the pixels in a 10x10 box around each pixel in the image. The peak values of those sums are then likely to be the centers of your holes - which you then target explicitly and erase.

You could further bias it to add in the distance of the pixel from the center of the page, so circles close to the edge score higher than ones in the center.

The size of the box (the 10x10 I gave) should be set to approximately the size of the holes you expect - the better that fit, the better the algorithm should work.

Doug
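
The box-sum idea above could be sketched along these lines. It is a minimal illustration rather than Doug's actual code: the class name HoleFinder and its methods are invented for the example, it assumes the holes scan as dark blobs on a light page, and it uses a summed-area (integral) image so that each box sum costs four lookups. It reports only the single strongest peak and leaves out the bias toward the page edge.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper: the names here are illustrative, not from any library.
final class HoleFinder {
    /** Darkness of a pixel in 0..255 (0 = white, 255 = black). */
    static int darkness(int rgb) {
        int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
        return 255 - (r + g + b) / 3;
    }

    /**
     * Returns {x, y} of the pixel whose surrounding box is darkest.
     * The box spans boxSize/2 pixels on each side of the pixel,
     * so boxSize = 10 gives an 11x11 window.
     */
    static int[] darkestBoxCenter(BufferedImage img, int boxSize) {
        int w = img.getWidth(), h = img.getHeight();
        // sum[y][x] = total darkness of all pixels in rows < y and columns < x.
        long[][] sum = new long[h + 1][w + 1];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                sum[y + 1][x + 1] = darkness(img.getRGB(x, y))
                        + sum[y][x + 1] + sum[y + 1][x] - sum[y][x];
            }
        }
        int half = boxSize / 2;
        long best = -1;
        int bestX = 0, bestY = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Clip the window to the image bounds.
                int x0 = Math.max(0, x - half), y0 = Math.max(0, y - half);
                int x1 = Math.min(w, x + half + 1), y1 = Math.min(h, y + half + 1);
                long s = sum[y1][x1] - sum[y0][x1] - sum[y1][x0] + sum[y0][x0];
                if (s > best) {
                    best = s;
                    bestX = x;
                    bestY = y;
                }
            }
        }
        return new int[] {bestX, bestY};
    }
}
```

To handle several holes, one option is to erase the area around each peak once it is found and then search again, stopping when the best box sum falls below a threshold chosen for the scans in question.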
 
LVL 1

Author Comment

by:CarlosScheidecker
ID: 38290600
Doug,

First of all, thanks for your comment. I am familiar with that page. The name of the algorithm is actually Houch Circle which searches for circles on a binary image. So I have been reading about it and working on its implementation.

Any other suggestions for removing borders, etc., are quite welcome.
 
LVL 1

Author Comment

by:CarlosScheidecker
ID: 38290601
Sorry for the typo. Hough circle.
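
For readers unfamiliar with it, a Hough circle search for a single known radius can be sketched as below. This is an illustrative outline only, not the implementation being worked on in this thread: the class name HoughCircle and its methods are invented for the example, the input is assumed to be an already binarized edge map (true = edge pixel), and the hole radius in pixels is assumed to be known in advance.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: the names here are illustrative, not from any library.
final class HoughCircle {
    /** Distinct integer offsets {dx, dy} lying on a circle of the given radius. */
    static int[][] circleOffsets(int radius) {
        Set<Long> seen = new LinkedHashSet<>();
        for (int deg = 0; deg < 360; deg++) {
            double t = Math.toRadians(deg);
            int dx = (int) Math.round(radius * Math.cos(t));
            int dy = (int) Math.round(radius * Math.sin(t));
            // Pack the pair into one long so duplicates collapse in the set.
            seen.add(((long) dx << 32) | (dy & 0xFFFFFFFFL));
        }
        int[][] out = new int[seen.size()][];
        int i = 0;
        for (long key : seen) {
            out[i++] = new int[] {(int) (key >> 32), (int) key};
        }
        return out;
    }

    /** Returns {cx, cy, votes} for the best-supported center; edges[y][x] is true on edge pixels. */
    static int[] findCenter(boolean[][] edges, int radius) {
        int h = edges.length, w = edges[0].length;
        int[][] offsets = circleOffsets(radius);
        int[][] acc = new int[h][w]; // vote accumulator, one cell per candidate center
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!edges[y][x]) continue;
                // Every point one radius away from this edge pixel is a candidate center.
                for (int[] d : offsets) {
                    int cx = x - d[0], cy = y - d[1];
                    if (cx >= 0 && cx < w && cy >= 0 && cy < h) acc[cy][cx]++;
                }
            }
        }
        int bestX = 0, bestY = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (acc[y][x] > acc[bestY][bestX]) {
                    bestX = x;
                    bestY = y;
                }
            }
        }
        return new int[] {bestX, bestY, acc[bestY][bestX]};
    }
}
```

On a real scan the edge map would come from thresholding or edge detection first, and a range of radii would normally be tried, since the hole size in pixels depends on the scan resolution.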
 
LVL 1

Accepted Solution

by:
mjdeale earned 1000 total points
ID: 38446214
Carlos,

Take a look at the Leptonica image processing library.  It is in C, but ...

-- Michael
 
LVL 1

Author Closing Comment

by:CarlosScheidecker
ID: 38755742
It does not fully fix the problem.
