• Status: Solved

How accurate is the OCR in Code Green (a DLP tool)?

From some Googling, I found that there are tools that measure the accuracy of OCR (converting images to characters).

Has anyone measured Code Green's OCR with any of these tools, or does anyone have
some indication of Code Green's OCR accuracy?
Asked by: sunhux
1 Solution
 
btan (Exec Consultant) commented:
I'm not sure about Code Green's OCR accuracy, but I understand Forcepoint has OCR support with some limits: its OCR engine does not support handwritten text, nor images containing text that is skewed by more than 10 degrees. To share further, this is how Forcepoint describes its PDF handling:
"All other PDF documents, including hybrid files containing both searchable text and scanned text, are sent to the default Data Security extractor, not the OCR server. Should the system fail to extract text from a PDF, it is forwarded to the OCR server."
https://www.websense.com/content/support/library/data/v78/help/ocr_main.aspx
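The routing described in that passage (try the default extractor first, fall back to OCR only if no text comes out) can be sketched roughly as below. This is only an illustration of the fallback idea, not Forcepoint's implementation; `extract_pdf_text`, `default_extractor` and `ocr_extractor` are hypothetical names made up for the example.

```python
from typing import Callable, Optional, Tuple

# Hypothetical extractor type: takes raw PDF bytes, returns text or None on failure.
Extractor = Callable[[bytes], Optional[str]]


def extract_pdf_text(pdf_bytes: bytes,
                     default_extractor: Extractor,
                     ocr_extractor: Extractor) -> Tuple[str, str]:
    """Return (engine_used, text), trying the default extractor before OCR."""
    text = default_extractor(pdf_bytes)
    if text and text.strip():
        # The default extractor found searchable text, so OCR is skipped.
        return "default", text
    # Extraction failed or produced nothing: forward the file to OCR.
    text = ocr_extractor(pdf_bytes)
    return "ocr", text or ""
```

One consequence worth noting: for a hybrid file, if the default extractor succeeds on the searchable part, this flow never sends the scanned part to OCR.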

Another candidate is Core DLP from GTB Tech, which is said to be strong on the OCR engine side. From their page:
Core Detection & Analysis Algorithms

Methods for describing sensitive content are abundant.  They can be divided into two categories: precise methods and imprecise methods.

Precise methods are, by definition, those that involve Content Registration and trigger almost zero false positive incidents.

All other methods are imprecise.  They include:  keywords, lexicons, regular expressions, extended regular expressions, meta data tags, Bayesian analysis, statistical analysis such as Machine Learning, etc.

Combined with the proprietary algorithms, GTB's AccuMatch™ detection algorithms have virtually zero false positives and a very high resilience to data modifications including:

Excerpting, inserting, file type conversion, formatting, ASCII -> UNICODE conversion, UNIX-Windows conversion, partial data match, and so on.
https://gttb.com/data-loss-prevention/core-dlp-technology/
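To make the precise vs. imprecise distinction in that quote more concrete, here is a small sketch. It is not GTB's AccuMatch algorithm, which is proprietary; it only contrasts a regular expression (imprecise) with a lookup against registered content (precise). The pattern, the registered sample value and all function names are assumptions made up for the example.

```python
import hashlib
import re

# Imprecise method: any 16-digit run (optionally separated by spaces or hyphens).
CARD_LIKE = re.compile(r"\b(?:\d[ -]?){15}\d\b")


def regex_match(text: str) -> bool:
    """Flag text that merely looks like it contains a card number."""
    return bool(CARD_LIKE.search(text))


def fingerprint(value: str) -> str:
    """Hash the digits only, so formatting changes do not alter the fingerprint."""
    digits = re.sub(r"\D", "", value)
    return hashlib.sha256(digits.encode("ascii")).hexdigest()


# Precise method: only values registered in advance can trigger an incident.
REGISTERED = {fingerprint("4111 1111 1111 1111")}  # made-up sample value


def registered_match(text: str) -> bool:
    """Flag text only if a candidate matches a registered fingerprint."""
    return any(fingerprint(m.group()) in REGISTERED
               for m in CARD_LIKE.finditer(text))
```

An order reference such as `1234-5678-9012-3456` trips the regex but not the registered lookup, which is the kind of false positive the quote is referring to.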

I suggest you ask Code Green to share their OCR accuracy figures and compare them against the two DLP engines above. If they do not even know these two providers, I would think they are quite far off from OCR leadership; likewise, if they do know them, they should have an accuracy matrix to share, including its limits.
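If you end up measuring it yourself, a common approach is character accuracy: compare the OCR output against a hand-typed ground truth and count the edits needed to turn one into the other. Below is a minimal sketch, assuming you can get the extracted text out of the product as a plain string; the function names are my own and this is not any particular tool's method.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def character_accuracy(ground_truth: str, ocr_output: str) -> float:
    """Return 1 - (edit distance / ground-truth length), floored at 0."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    errors = edit_distance(ground_truth, ocr_output)
    return max(0.0, 1.0 - errors / len(ground_truth))
```

Run this over a set of test images (clean scans, skewed scans, low resolution) and you get a per-condition accuracy figure you can put next to whatever the vendor claims.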