We help IT Professionals succeed at work.

Check out our new AWS podcast with Certified Expert, Phil Phillips! Listen to "How to Execute a Seamless AWS Migration" on EE or on your favorite podcast platform. Listen Now

x

Using gapped suffix arrays for approximate string matching

Medium Priority
561 Views
Last Modified: 2012-05-11
Hi there,

I have been tasked with implementingin an algorithm that builds an approximate index on a set of text (eg: a collection of websites) and perform some operations related to it (eg: Search, Find & Replace).

The suggested technique to implement this is to use gapped suffix arrays. I have searched online for more detail but with no luck. I understand what suffix arrays are about, but not the "gapped" piece.

Can anyone kindly assist me or point me to a web link with more information on this topic?

Thanks very much,
C
Comment
Watch Question

CERTIFIED EXPERT
Top Expert 2016
Commented:
Unlock this solution with a free trial preview.
(No credit card required)
Get Preview
CERTIFIED EXPERT
Awarded 2010
Top Expert 2013
Commented:
Unlock this solution with a free trial preview.
(No credit card required)
Get Preview

Author

Commented:
Thank you Sara and Tommy for your prompt responses.

I will have a read-through of the wiki links shortly.

Tommy,

Looking at the research paper, I agree that the gapped suffix arrays technique is probably not the path I should go down with, in the interest of time. Can you kindly mention the more standard methods that are definitely proven to work? Perhaps its worth my time doing my ground work on alternate "working" methods :)

Thanks again..

C
CERTIFIED EXPERT
Awarded 2010
Top Expert 2013

Commented:
Regular Suffix arrays would work. The original paper (Suffix arrays: A new method for on-line string searches) was published in 1989 and has over 1000 citations. There is a good Wikipedia page for suffix arrays (it is listed in Sara's first link under indexing methods too). I would start there.

Author

Commented:
Ok fantastic, thanks so much!

Will give that a go and let you know if I need further input.

Unlock the solution to this question.
Thanks for using Experts Exchange.

Please provide your email to receive a free trial preview!

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.