Solved

Optical Matching / OCR

Posted on 2011-02-14
Last Modified: 2012-05-11
I am trying to find an API that compares two images and returns a percentage similarity, and that can also match the text appearing in the two images.
Question by:surajguptha
 
LVL 37

Accepted Solution

by:
TommySzalapski earned 500 total points
OpenCV is a powerful image processing library, and I would recommend it for most applications of this type. It's written in C/C++ (it has a C++ wrapper, but all the real code runs in C, so it's very fast).
http://opencv.willowgarage.com/wiki/

Here's an intro to doing OCR in OpenCV.
http://blog.damiles.com/?p=93

You would need to decide what makes two images similar, but code exists for many different options. I won't bother to post any since there are so many.
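
As one illustration of the choice described above, here is a minimal sketch of a single possible similarity measure: the mean absolute pixel difference, rescaled to a percentage. It is not taken from OpenCV or from this thread. The function name `percent_similarity` and the representation of an image as a list of rows of 0..255 grayscale values are assumptions made for the example; a real application would load and resize the images with an image library first.

```python
def percent_similarity(img_a, img_b):
    """Return a 0-100 similarity score for two equal-sized grayscale images.

    Assumption: each image is a list of rows, each row a list of ints in 0..255.
    """
    same_shape = len(img_a) == len(img_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(img_a, img_b)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")

    total_diff = 0
    pixel_count = 0
    for row_a, row_b in zip(img_a, img_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total_diff += abs(pixel_a - pixel_b)
            pixel_count += 1

    if pixel_count == 0:
        raise ValueError("images must not be empty")

    # 255 is the largest possible per-pixel difference, so the ratio is in 0..1.
    return 100.0 * (1.0 - total_diff / (255.0 * pixel_count))
```

This measure is sensitive to shifts, rotation, and lighting changes, which is exactly why the definition of "similar" has to be decided first; histogram or feature-based comparisons behave differently.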
 
LVL 21

Author Comment

by:surajguptha
Thanks. I would like to use this in my .Net application. Is there an image processing library that is better suited to .Net, perhaps even written in .Net?
 
LVL 37

Expert Comment

by:TommySzalapski
You don't want image processing written in .NET. It's too slow. What you really want is image processing code written in C with .NET wrappers around it so you can call the routines from .NET.

Fortunately for you, you are not the only one who wanted this. Emgu CV is exactly what you are looking for, I think. You can write all your code in .NET, but the behind-the-scenes processing will run in fast, efficient C code.
http://www.emgu.com/wiki/index.php/Main_Page
 
LVL 21

Author Comment

by:surajguptha
It is indeed an awesome library for image processing, but I would have to build a lot of intelligence to teach the system about the many languages I want my software to support. Is there some other software, just for OCR, that already knows a group of languages?

Thanks
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 500 total points
Tesseract-OCR is Google's open source OCR solution.
http://code.google.com/p/tesseract-ocr/
There are a lot of people working on it for many different languages and scripts.
Again, a .NET wrapper for it already exists, called tessnet2.
http://www.pixel-technology.com/freeware/tessnet2/
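
Once OCR has produced a text string for each image, the "match text on the two images" part of the question reduces to comparing two strings. The following is a minimal sketch of that step using Python's standard `difflib` module. It does not call Tesseract or tessnet2; the function names `normalize` and `text_similarity` are made up for this example, and the inputs are assumed to be OCR output you already have.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase the text and collapse runs of whitespace to single spaces."""
    return " ".join(text.lower().split())


def text_similarity(text_a, text_b):
    """Return a 0-100 similarity score for two OCR result strings."""
    a = normalize(text_a)
    b = normalize(text_b)
    if not a and not b:
        # Two empty results are treated as identical.
        return 100.0
    # ratio() is 2*M/T: M matched characters, T total characters in both strings.
    return 100.0 * SequenceMatcher(None, a, b, autojunk=False).ratio()
```

For example, `text_similarity("Hello  World", "hello world")` gives 100.0 because case and extra spacing are normalized away. OCR output is often noisy, so a threshold below 100 is usually a more practical test for "same text" than strict equality.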

 
LVL 21

Author Comment

by:surajguptha
Thanks Tommy! It looks very promising.
I downloaded it, but when I launched the demo application, it crashed. It was heartbreaking :P
 
LVL 37

Expert Comment

by:TommySzalapski
Hmm... Make sure all the DLLs (such as Tessnet2.dll) are in the right folders. They probably need to be in the same folder as the .exe or in a folder listed in your %PATH% system variable.
I assume you have the needed runtimes already installed.
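
The folder check suggested above can be scripted. This is a minimal sketch that looks only in the two places mentioned (the executable's folder, then each folder in a PATH-style string); it does not reproduce the full Windows or .NET library search rules. The function name `find_library` and its parameters are hypothetical, chosen for this example.

```python
import os


def find_library(file_name, exe_dir, path_value):
    """Return the first existing path to file_name, or None if it is not found.

    Searches exe_dir first, then each non-empty folder in path_value,
    which is split on the platform's PATH separator (os.pathsep).
    """
    search_dirs = [exe_dir] + [d for d in path_value.split(os.pathsep) if d]
    for directory in search_dirs:
        candidate = os.path.join(directory, file_name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A call such as `find_library("Tessnet2.dll", exe_dir, os.environ.get("PATH", ""))`, with `exe_dir` set to the folder holding the demo executable, returns `None` when the DLL is in neither location.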
 
LVL 21

Author Comment

by:surajguptha
Yes, I tried every combination of folder structure and put the files everywhere, just in case I was doing something wrong. The moment I click "OCR", the button that is supposed to run the conversion, it crashes! No events in Event Viewer, nothing. It just dies.
 
LVL 37

Expert Comment

by:TommySzalapski
I guess the next step would be to see if you can compile a quick test project. If that works, who needs the demo? Maybe they referenced some weird runtime functions or something that your computer doesn't have.
 
LVL 21

Author Comment

by:surajguptha
I actually did. I took the source of the demo application, compiled it, and used the newly generated exe. But I did reuse the existing .net wrapper, since I did not have VC++ on my machine to recompile it.
 
LVL 21

Author Comment

by:surajguptha
Works fine! I had to change some folder structures, that's all!