OCR

519

Solutions

1K

Contributors

Optical character recognition (OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text. It is widely used as a form of data entry from printed paper records, including passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static data, or any suitable documentation. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as machine translation, text-to-speech, key data extraction and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

Hello,

What factors determine the quality and accuracy of OCR (optical character recognition) and is there much variability among different OCR software?

If there is variability in software, what applications are best (both free and purchased)?

Thanks
0

Extract text from images? I have many images. I searched and found some online converters, but they don't work for me because I have 10,000 images, so I need a batch tool. Can someone help me with this? Thank you.
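For a batch of that size, one possible approach is a small script that walks a folder and runs an OCR function on each image. The sketch below is only an outline under stated assumptions: `ocr_folder`, `IMAGE_EXTENSIONS`, and the choice to write a `.txt` file next to each image are hypothetical, and the OCR step itself is passed in as `ocr_func` rather than tied to a particular engine.

```python
import os

# Assumed set of image extensions; adjust to match the actual files.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp")


def ocr_folder(folder, ocr_func):
    """Run ocr_func(path) on every image in folder.

    The recognized text is written to a .txt file beside each image.
    Returns the list of text files that were written.
    """
    written = []
    for name in sorted(os.listdir(folder)):
        base, ext = os.path.splitext(name)
        if ext.lower() not in IMAGE_EXTENSIONS:
            continue  # skip anything that is not an image
        text = ocr_func(os.path.join(folder, name))
        out_path = os.path.join(folder, base + ".txt")
        with open(out_path, "w") as handle:
            handle.write(text)
        written.append(out_path)
    return written
```

If Tesseract and the pytesseract wrapper happen to be installed, `ocr_func` could be something like `lambda path: pytesseract.image_to_string(Image.open(path))`, but any function that takes a file path and returns a string would fit.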
0
On a Mac, is there a way in Hazel to tell whether a file has been OCR'd?
0
Is it possible to program the Raspberry Pi camera to do OCR?
I need it to recognize a list of characters and, when they are seen, write the info to a file.
I need the following info with each entry:
Captured data
Date and time (down to the second)
Location (can be GPS or user entry)

And once a day (or at user request), upload the data to an online database.
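The logging part of this could look roughly like the sketch below. It assumes the OCR step has already produced a string, and it only covers writing one entry per capture to a CSV file. The function name `log_capture`, the CSV layout, and the timestamp format are hypothetical choices, and the daily upload is left out.

```python
import csv
from datetime import datetime


def log_capture(path, captured, location, when=None):
    """Append one entry (captured text, timestamp, location) to a CSV file."""
    when = when or datetime.now()
    # Timestamp with one-second resolution, as asked for in the question.
    row = [captured, when.strftime("%Y-%m-%d %H:%M:%S"), location]
    with open(path, "a", newline="") as handle:
        csv.writer(handle).writerow(row)
    return row
```

A call such as `log_capture("captures.csv", text, "user entry: front gate")` would add one line per recognized item. A separate scheduled job could then read the file and send it to the database.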
0
I included OpenCV and Tesseract OCR in Visual Studio:
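#define _CRT_SECURE_NO_WARNINGS  // must come before the includes to take effect
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <baseapi.h>
#include <allheaders.h>
#include <iostream>
#include <vector>
#include <fstream>
using namespace cv;
using namespace std;
tesseract::TessBaseAPI ocr;

int main()
{
    // Load the image as 3-channel BGR, then convert it to grayscale.
    Mat input = imread("C:\\eurotext.tif", 1);
    if (input.empty()) {
        cerr << "Could not read the input image" << endl;
        return 1;
    }
    cvtColor(input, input, CV_BGR2GRAY);

    // Init returns 0 on success.
    if (ocr.Init(NULL, "eng", tesseract::OEM_TESSERACT_ONLY) != 0) {
        cerr << "Could not initialize Tesseract" << endl;
        return 1;
    }

    ocr.SetPageSegMode(tesseract::PSM_SINGLE_BLOCK);
    // Grayscale data: 1 byte per pixel, input.step bytes per row.
    ocr.SetImage(input.data, input.cols, input.rows, 1, input.step);
    char* text = ocr.GetUTF8Text();
    cout << "Text:" << endl;
    cout << text << endl;
    cout << "Confidence: " << ocr.MeanTextConf() << endl << endl;

    delete[] text;  // the caller owns the buffer returned by GetUTF8Text
    ocr.End();
    return 0;
}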


The build succeeded, but when running I get:

erreur_run.PNG
and

erreur_run2.PNG
0
I tried to add Tesseract OCR to Visual Studio 2010.
The build succeeds, but when I run the program there is an error 0xc0150002:

0xc0150002.PNG
I tried to find the missing DLL with Dependency Walker. It shows:

dependency.PNG
and

Error: The Side-by-Side configuration information for "c:\users\eouerten\documents\visual studio 2010\projects\tess_open\debug\LIBTESSERACT302D.DLL" contains errors. The application has failed to start because its side-by-side configuration is incorrect. For more information, see the application event log or use the command-line tool sxstrace.exe (14001).
Error: The Side-by-Side configuration information for "c:\users\eouerten\documents\visual studio 2010\projects\tess_open\debug\LIBLEPT168D.DLL" contains errors. The application has failed to start because its side-by-side configuration is incorrect. For more information, see the application event log or use the command-line tool sxstrace.exe (14001).
Error: Modules with different CPU types were found.
Warning: At least one module has an unresolved import due to a missing export function in a delay-load dependent module.
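The "Modules with different CPU types" line may or may not be significant here, since Dependency Walker can report it for system DLLs as well. One way to check whether the executable and the two DLLs were built for the same architecture is to read the machine field from each file's PE header. The sketch below is a minimal reader, not a full PE parser. The name `pe_machine` is hypothetical, and only the two most common machine codes are mapped.

```python
import struct

# Machine codes from the PE/COFF file header (only two common values listed).
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def pe_machine(path):
    """Return the target architecture recorded in a .exe or .dll file."""
    with open(path, "rb") as handle:
        header = handle.read(64)
        if len(header) < 64 or header[:2] != b"MZ":
            raise ValueError("not a PE file: %s" % path)
        # Offset 0x3C of the DOS header holds the position of the PE signature.
        pe_offset = struct.unpack_from("<I", header, 0x3C)[0]
        handle.seek(pe_offset)
        if handle.read(4) != b"PE\0\0":
            raise ValueError("missing PE signature: %s" % path)
        # The 2-byte machine field follows the signature directly.
        machine = struct.unpack("<H", handle.read(2))[0]
    return MACHINE_NAMES.get(machine, hex(machine))
```

Running it on tess_open.exe, LIBTESSERACT302D.DLL, and LIBLEPT168D.DLL should give the same value for all three if they target the same architecture. The side-by-side errors are a separate matter and would still need to be looked at with sxstrace.exe or the event log, as the message suggests.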


0
I included OpenCV and Tesseract OCR in Visual Studio 2010:
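#define _CRT_SECURE_NO_WARNINGS  // must come before the includes to take effect
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <iostream>
#include <vector>
#include <fstream>
using namespace cv;
using namespace std;
tesseract::TessBaseAPI ocr;

int main()
{
    // Backslashes in a string literal must be escaped.
    // NOTE: this path names a directory, not an image file, so imread
    // is expected to return an empty Mat here.
    Mat input = imread("C:\\Program Files (x86)\\Tesseract-OCR");
    if (input.empty()) {
        cerr << "Could not read the input image" << endl;
        return 1;
    }
    cvtColor(input, input, CV_BGR2GRAY);

    // Init returns 0 on success.
    if (ocr.Init(NULL, "eng", tesseract::OEM_TESSERACT_ONLY) != 0) {
        cerr << "Could not initialize Tesseract" << endl;
        return 1;
    }

    ocr.SetPageSegMode(tesseract::PSM_SINGLE_BLOCK);
    // Grayscale data: 1 byte per pixel, input.step bytes per row.
    ocr.SetImage(input.data, input.cols, input.rows, 1, input.step);
    char* text = ocr.GetUTF8Text();
    cout << "Text:" << endl;
    cout << text << endl;
    cout << "Confidence: " << ocr.MeanTextConf() << endl << endl;

    delete[] text;  // the caller owns the buffer returned by GetUTF8Text
    ocr.End();
    return 0;
}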



When I built it:

 c:\program files (x86)\tesseract-ocr\include\leptonica\environ.h(277): warning C4005: 'snprintf' : macro redefinition
1>          c:\program files (x86)\tesseract-ocr\include\tesseract\platform.h(33) : see previous definition of 'snprintf'
1>c:\program files (x86)\tesseract-ocr\include\leptonica\pix.h(169): warning C4305: 'initializing' : truncation from 'double' to 'const l_float32'
1>c:\program files (x86)\tesseract-ocr\include\leptonica\pix.h(171): warning C4305: 'initializing' : truncation from 'double' to 'const l_float32'
1>tessopen.obj : warning LNK4075: ignoring '/EDITANDCONTINUE' due to '/INCREMENTAL:NO' specification
1>  tess_open.vcxproj -> C:\Users\eouerten\documents\visual studio 2010\Projects\tess_open\Debug\tess_open.exe
1>FinalizeBuildStatus:
1>  Deleting file "Debug\tess_open.unsuccessfulbuild".
1>
1>Build succeeded.
1>
1>Time Elapsed 00:00:02.93
========== Build: 1 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========


And when running:
0xc0150002.PNG
0
I have some Python code that takes in an image of an A4 printed letter, then draws bounding boxes around each character.

I want to know how to save each bounding box as an image, so essentially it takes every character it detects and saves it, preferably as a .png resized to 20x20.


Here is my code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from scipy.misc import imread,imresize
from skimage.segmentation import clear_border
from skimage.morphology import label
from skimage.measure import regionprops


image = imread('./ocr/testing/adobe.png',1)

#apply threshold in order to make the image binary
bw = image < 120

# remove artifacts connected to image border
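# pass a copy and keep the return value, so this works whether or not
# clear_border modifies its argument in place
cleared = clear_border(bw.copy())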

# label image regions
label_image = label(cleared,neighbors=8)
borders = np.logical_xor(bw, cleared)
label_image[borders] = -1

print(label_image.max())

fig, ax = plt.subplots(ncols=1, nrows=1, figsize=(6, 6))
ax.imshow(bw, cmap='jet')



for region in regionprops(label_image, ['Area', 'BoundingBox']):
    # skip small images
    if region['Area'] > 50:

        # draw rectangle around segmented characters
        minr, minc, maxr, maxc = region['BoundingBox']
        rect = mpatches.Rectangle((minc, minr), maxc - minc, maxr - minr,
                              fill=False, edgecolor='red', linewidth=2)
        ax.add_patch(rect)

plt.show()



I've tried a few solutions, such as adding the following in my for loop:

image_patch = img[minc:maxc, minr:maxr]  # get region of interest (slice)
plt.imsave("filename.png", image_patch)

But that doesn't obtain the right boundaries for some reason.
The hard part (drawing the boundaries around the characters) is already done. I just want to save each bounding box as an image now, but I have no idea how.
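One likely cause is the index order in the slice: NumPy arrays are indexed rows first, then columns, while the attempt above slices columns first (and uses `img` where the rest of the code uses `image`). The sketch below shows one way the crop and the 20x20 resize could be done. The names `crop_and_resize` and `save_patches` and the output file naming are hypothetical, and the resize is a simple nearest-neighbour sampling rather than a library call.

```python
import numpy as np


def crop_and_resize(image, bbox, size=20):
    """Cut one bounding box out of image and resize it to size x size."""
    minr, minc, maxr, maxc = bbox
    # Rows come first, then columns: [minr:maxr, minc:maxc].
    patch = image[minr:maxr, minc:maxc]
    # Nearest-neighbour resize: pick `size` evenly spaced rows and columns.
    rows = np.arange(size) * patch.shape[0] // size
    cols = np.arange(size) * patch.shape[1] // size
    return patch[np.ix_(rows, cols)]


def save_patches(image, bboxes, prefix="char"):
    """Save every bounding box as its own PNG (hypothetical file naming)."""
    # Imported here so that cropping alone does not require matplotlib.
    import matplotlib.pyplot as plt
    for index, bbox in enumerate(bboxes):
        patch = crop_and_resize(image, bbox)
        # A distinct name per region, so files do not overwrite each other.
        plt.imsave("%s_%04d.png" % (prefix, index), patch, cmap="gray")
```

In the loop from the question, the `region['BoundingBox']` tuples of the regions that pass the area check could be collected in a list and handed to `save_patches(image, boxes)`. Note that saving every patch as "filename.png" would leave only the last character on disk.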
0
Hi All,

I'm currently looking at automatic invoice processing systems and am looking for a comparison between ABBYY FlexiCapture and Kofax TotalAgility. Which one do you think is better, and why?

Regards,

JK
0
