We have a requirement to search a large number of keywords against text extracted from large documents. The search also involves some fuzzy logic.
Fuzzy logic Ref: http://www.codeguru.com/vb/gen/vb_database/microsoftaccess/article.php/c13137/Fuzzy-Matching-Demo-in-Access.htm
Currently we OCR the scanned document, convert it to text, and then perform the search by looping through the list of keywords one by one against the scanned text/words. We have 4 sets of fuzzy logic to go through.
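To make the current approach concrete, here is a rough Python sketch of the loop. It is only an illustration: the function names and the 0.8 threshold are made up, and a single similarity ratio from the standard library's `difflib` stands in for our four fuzzy rules, which are not shown here.

```python
from difflib import SequenceMatcher


def fuzzy_match(keyword, word, threshold=0.8):
    # Stand-in for the real fuzzy rules (assumption): a single
    # case-insensitive similarity ratio compared against a threshold.
    ratio = SequenceMatcher(None, keyword.lower(), word.lower()).ratio()
    return ratio >= threshold


def search_keywords(keywords, text, threshold=0.8):
    # Naive approach: every keyword is compared with every word of the
    # OCR text, so the work grows with len(keywords) * len(words).
    words = text.split()
    hits = {}
    for keyword in keywords:
        matched = [w for w in words if fuzzy_match(keyword, w, threshold)]
        if matched:
            hits[keyword] = matched
    return hits


if __name__ == "__main__":
    ocr_text = "Invoce number 42 totl amount due"
    print(search_keywords(["invoice", "total"], ocr_text))
```

Because every keyword is compared against every word, going from 20 keywords to 1000 multiplies the number of comparisons by 50 for the same document, which is the cost we are worried about.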
The current keyword list is limited to 10 to 20 words, but in production the list can grow to 1000 or more. We need to perform the search as fast as possible. What would be the best approach? Should we use SQL Server Full-Text Search, or some search algorithm other than fuzzy logic?
We thought about multi-threading, but had a bad experience with it in the past: we always ran into database deadlocks. Is there any other efficient approach or architecture?
Any suggestions or approaches are welcome, whether they concern programming logic, application architecture, hardware setup, etc.