OCR

Optical character recognition (OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text. It is widely used as a form of data entry from printed paper records, including passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static data, or any suitable documentation. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as machine translation, text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

Hello and Good Morning Everyone,

          From a previously closed post, I found out that I can scan a text document straight into MS Word by using either ABBYY FineReader or PaperPort. At this point, I am interested in knowing which program would work best for this. Any shared thoughts, suggestions, or tips will be greatly appreciated.

          Thank you.

           George

I'm trying to get OCR working using YAGF. I read this is what Google used to scan books. I tried scanning a cookbook page and it gave me nothing. Then I scanned a page from my mom's Harlequin romance book, which has no pictures. That didn't work either. Any guesses? This is on Ubuntu.

How do I upgrade from PaperPort 12.1 to 14.5?

Can a Fujitsu ScanSnap iX500 use OCR to NAME a file (PDF) after a value found when scanning? For example, if we have a stack of invoices, all formatted the same with the Invoice Number in the same location, can we scan those and ask ScanSnap to name each individual scanned page based on the value of the Invoice Number? (E.g. Inv123.pdf, Inv456.pdf, Inv789.pdf)

Alternatively, I am more confident that we can make a PDF SEARCHABLE, so that we could SEARCH within each document for a given invoice number (e.g. search for ALL PDFs that contain INV456). The only problem is that this would be slower than simply eyeing down a list of filenames...

Best way to scan and file similar documents?
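The renaming idea above can be sketched independently of any particular scanner. This is only a minimal illustration, not a ScanSnap feature: it assumes some OCR tool has already produced the text of each page, and the function name `invoice_filename` and the pattern for the invoice number are hypothetical choices based on the `Inv123` style of names in the question.

```python
import re
from typing import Optional

# Hypothetical pattern: "INV" (any case), an optional dash or space, then digits.
INVOICE_PATTERN = re.compile(r"\bINV[-\s]?(\d+)\b", re.IGNORECASE)


def invoice_filename(ocr_text: str) -> Optional[str]:
    """Return a name like 'Inv123.pdf', or None if no invoice number is found."""
    match = INVOICE_PATTERN.search(ocr_text)
    if match is None:
        return None
    return f"Inv{match.group(1)}.pdf"
```

A page whose text yields `None` would still need a fallback name and a manual check, since OCR can misread digits.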

Thank you!

- Numerous PDFs in a network folder, or to be pulled into the solution via a network scanner
- Need to read the bar code and extract the 5 pieces of data for indexing, or OCR the portions of the page that carry the same data as the bar code.
- Use this data to store the document for search and retrieval later - methods may vary. Would like documents placed into folders by the date in the bar code.
- Some sort of compression or loading into a database is preferred to keep file size down.
- Windows or Linux based
- Open source only: I want to get my hands dirty with it.
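The "folders by date" requirement in the list above can be sketched as follows. The bar code layout is not given in the post, so this assumes a hypothetical format of five `|`-separated fields with an ISO date (`YYYY-MM-DD`) in the second position; the function name `destination_for` and the `archive` root are also assumptions for illustration.

```python
from datetime import datetime
from pathlib import PurePosixPath


def destination_for(barcode_value: str, filename: str,
                    root: str = "archive") -> PurePosixPath:
    """Build root/YYYY/MM/DD/filename from the date field of a decoded bar code."""
    fields = barcode_value.split("|")
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    # Assumption: the second field holds the document date as YYYY-MM-DD.
    doc_date = datetime.strptime(fields[1], "%Y-%m-%d").date()
    return (PurePosixPath(root) / f"{doc_date:%Y}" / f"{doc_date:%m}"
            / f"{doc_date:%d}" / filename)
```

Decoding the bar code itself and moving the file are left out; this only shows how the extracted date could drive the folder layout.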

Any ideas?

I had to reinstall PaperPort 14, but when I try to open it I get an error stating that it has stopped working. How do I eliminate this problem?

Nuance is offering PaperPort Professional 14 for $59.99. The info still shows no support for Windows 10. Will your 14.5 upgrade work on this one?

I have tried a simple OCR program from a YouTube tutorial. I am getting a 'Type expected' error near Graphics. Please help me with this. I am using Visual Studio 2012.

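Imports Emgu.CV
Imports Emgu.Util
Imports Emgu.CV.OCR
Imports Emgu.CV.Structure
Public Class Form1
    ' Tesseract engine reading English data from the "tessdata" folder.
    Dim OCRz As Tesseract = New Tesseract("tessdata", "eng", Tesseract.OcrEngineMode.OEM_DEFAULT)
    ' The System.Drawing types are fully qualified here. An unqualified Graphics
    ' may be resolving to something other than the System.Drawing.Graphics class,
    ' which is one possible cause of the 'Type expected' error.
    Dim pic As System.Drawing.Bitmap = New System.Drawing.Bitmap(270, 100)
    Dim gfx As System.Drawing.Graphics = System.Drawing.Graphics.FromImage(pic)

    Private Sub Timer1_Tick(sender As Object, e As EventArgs) Handles Timer1.Tick
        ' Copy the screen area under PictureBox1 into pic.
        ' Me.Location.Y is added on the assumption that it was omitted by mistake,
        ' since the X coordinate already includes Me.Location.X.
        gfx.CopyFromScreen(New System.Drawing.Point(Me.Location.X + PictureBox1.Location.X + 4, Me.Location.Y + PictureBox1.Location.Y + 30), New System.Drawing.Point(0, 0), pic.Size)
        PictureBox1.Image = pic
        PictureBox1.Image = Nothing
    End Sub
    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        ' Run OCR on the captured bitmap and show the recognized text.
        OCRz.Recognize(New Image(Of Bgr, Byte)(pic))
        RichTextBox1.Text = OCRz.GetText
    End Sub
End Class

Hi Experts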

Could you suggest how to configure a "price reader device", such as those used in supermarkets?

I guess that before use it has to load a formatted file of barcodes and prices, perhaps in a file format defined by the device manufacturer. Once the device has read this file (over USB, for example), it would be ready for use.

Is that right?

Could you clarify?

Thanks in advance

Does anyone know of a way to increase the font size of a PDF so that a 5 pt document prints out in a 10-12 pt font? The attached page is 1 page of a 476-page document.

I tried converting it to Word and Excel, but those aren’t viable options when you think of the time it takes to OCR the doc. Also, the output is horrible and there are so many possible mistakes that it doesn’t bear thinking about as an option.

I don’t think there’s a way to do it; not even zooming with a copier works.
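The arithmetic behind the request can be sketched in a few lines. The page size is not stated in the post, so the US Letter dimensions used in the usage note are an assumption, and both function names are hypothetical.

```python
from typing import Tuple


def scale_for_font(current_pt: float, target_pt: float) -> float:
    """Uniform scale factor that turns current_pt text into target_pt text."""
    return target_pt / current_pt


def scaled_page(width_in: float, height_in: float,
                scale: float) -> Tuple[float, float]:
    """Page size in inches after scaling the whole page by the given factor."""
    return (width_in * scale, height_in * scale)
```

Going from 5 pt to 10 pt is a factor of 2, so an assumed 8.5 x 11 inch page would need 17 x 22 inches after uniform scaling. That may be why plain zooming does not fit on the same paper: each original page would have to be split across several sheets or re-flowed.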

Hi Experts,

I'm looking for an easy-to-use Windows OCR application that can convert small screen captures to text.

Here are a few sample screen captures.

[three sample screen captures]
Regards,
Leigh

How can I convert an unsearchable PDF file into a searchable one?

Hi

I'm looking for a solution that will OCR and index scanned documents so they can be retrieved later by searching for keywords or numbers.

Are there any industry standard programs for this sort of thing?

Thanks

Has anyone used this software? I'd appreciate any comments anyone can offer - I can't find a single review on the internet, which is odd.

Thanks!

Hello,
I am trying to convert a document from .ocr to Excel. I tried using Adobe Pro but it did not work very well.

I am trying to install the software but it keeps freezing. It's not doing anything when I press the Next button. Can anyone please help me out?
http://www.neat.com/helpcenter/download-neat-scanner-drivers/

I have a Google document with some embedded images that have text content in them. Is there a way to convert the Google doc so that all the images are converted to text and all I have is a text document?

I have a PDF file of several invoices for analysis. The PDF contains each invoice as a picture, rather than as text.
Is there a good way to convert this picture to editable data?

I cannot include my example, as it contains confidential information.

Many thanks
David Phelops

Hello, I was wondering if anyone knows of a program that converts PDF to MS Word.

Situation: we have a project where we need to scan 300+ pages (yes, on paper) to PDF and somehow get them into Word so we can modify some of the text in those pages without having to retype everything.

Any ideas?

Thanks

I've looked at most of the mainstream Mac OCR applications. They seem to capture only structured handwriting, but I can't be sure from the feature lists and sales information.

Googling around, I found that there are tools that measure the accuracy of OCR (converting images to characters).
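One common way to express OCR accuracy is the character error rate: the edit distance between the OCR output and a hand-checked reference text, divided by the length of the reference. The sketch below is a generic illustration of that measure and is not taken from any of the tools mentioned; the function names are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by reference length; 0.0 means a perfect match."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, if the reference is `invoice 456` and the OCR output is `invoice 4S6`, one substitution over 11 characters gives an error rate of about 9%.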

Has anyone measured Code Green's OCR using any of these tools, or does anyone have an indication of
Code Green's OCR accuracy?

I've been scouting around dozens of PDF-to-Word converters, but most of them convert only
2 pages in the trial and require sign-up.

A couple of them are not accurate when I convert, e.g.:
   http://www.convertmypdf.net/

OCR is not crucial, as the PDF document I want to convert already contains searchable text, but
when I convert it using the above URL, tables get lost and some of the text overlaps.

Attachment 1 is the PDF doc I'm trying to convert.
Attachment 2 is the resulting conversion, which is rather bad.

I'm not allowed to install software on this corporate laptop, not even open source.

I need to OCR a large number of PDF files. The PDFs are printed from Word documents that have had changes tracked, so inserted words are underlined and deleted words are struck out. I need the inserted and struck words to be flagged somehow, maybe with a * or ^ or whatever, but something that an AI system can recognize as a flag. Best if the OCR engine can run under Linux.

What are the ways to detect whether an image that was screen-captured from an original
source has been altered or doctored?

Please list any freeware and professional tools.

Hello,

I started using an Epson GT-1500 scanner to digitize technical journals. I have been using Epson Scan software (in Office Mode) on my Windows 7 laptop to create PDF files. Many of the articles were duplex, which required me to manually scan both sides of the paper and use Epson's software to rearrange the pages in order. I checked each file to make sure I could open it and see all the pages in Acrobat Reader before tossing the original journal.

I discovered a problem which did not show up until I decided to scan an entire 75 page journal.  Rather than use Epson's software to delete unwanted pages and reorder the pages, I just scanned all of the even pages in one scan and the odd pages in a separate file.  I then tried to clean up the file using Acrobat Pro on my PC.  The problem is when I try to delete unwanted pages I get this error:

There was a problem reading the document (14).


Also, when I open this file (and many of my other files), I get this error:

An error exists on this page.  Acrobat may not display the page correctly.  Please contact the person who created the PDF document to correct the problem.

After hitting OK I can see the file fine.  Eventually I want to make these files searchable on key words and suspect that all of them have some form of compatibility problem that will prevent this.  I also wanted to be able to shrink/compress these files and have problems doing that.   I also cannot clean up my 75 page journal…
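The page-ordering part of the workflow above (odd pages in one scan, even pages in another) can be sketched as a simple interleave. This does not address the Acrobat errors; it only illustrates the ordering logic, and the function name and the `even_reversed` option (for a stack that was flipped and scanned back to front) are assumptions.

```python
from itertools import zip_longest
from typing import List, Sequence, TypeVar

T = TypeVar("T")
_MISSING = object()  # sentinel for the shorter list running out


def interleave_pages(odd_pages: Sequence[T], even_pages: Sequence[T],
                     even_reversed: bool = False) -> List[T]:
    """Merge separately scanned odd and even pages into reading order."""
    if even_reversed:
        even_pages = list(reversed(even_pages))
    merged: List[T] = []
    for odd, even in zip_longest(odd_pages, even_pages, fillvalue=_MISSING):
        if odd is not _MISSING:
            merged.append(odd)
        if even is not _MISSING:
            merged.append(even)
    return merged
```

The resulting list could then drive whichever PDF tool is used to assemble the final file.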