Here are some general hints:
Save a lot of memory and copying by making the changes in place. You can do it if
you first scan the text for your pattern and just record where the matches are (this can even be done right
in the text if the pattern is long enough). Then go back and make the changes. If the
replacement string is longer than the original, you'll have to grow the memory block and process the text backwards; otherwise process it forwards. This avoids most of the string moves, temporaries, and copies.
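To make the two-pass idea concrete, here's a rough sketch in C. The name replace_in_place and its interface are made up for illustration, and it assumes the text lives in a malloc'd, NUL-terminated buffer and that matches don't overlap. It records the match offsets in a side array rather than inside the text itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: replace every non-overlapping occurrence of `pat`
 * with `rep` inside the malloc'd, NUL-terminated buffer *bufp.
 * Returns the number of replacements, or -1 if an allocation fails
 * (in which case *bufp is left untouched). */
long replace_in_place(char **bufp, const char *pat, const char *rep)
{
    size_t plen = strlen(pat), rlen = strlen(rep);
    assert(plen > 0);

    char *buf = *bufp;
    size_t len = strlen(buf);

    /* Pass 1: only record where the matches are. */
    size_t n = 0, cap = 16;
    size_t *hits = malloc(cap * sizeof *hits);
    if (!hits) return -1;
    for (char *p = buf; (p = strstr(p, pat)) != NULL; p += plen) {
        if (n == cap) {
            size_t *tmp = realloc(hits, 2 * cap * sizeof *hits);
            if (!tmp) { free(hits); return -1; }
            hits = tmp;
            cap *= 2;
        }
        hits[n++] = (size_t)(p - buf);
    }

    if (rlen > plen) {
        /* Growing: enlarge the block once, then work backwards from the
         * end so no byte is overwritten before it has been moved. */
        size_t new_len = len + n * (rlen - plen);
        char *tmp = realloc(buf, new_len + 1);
        if (!tmp) { free(hits); return -1; }
        buf = tmp;
        size_t src = len, dst = new_len;
        buf[dst] = '\0';
        for (size_t i = n; i-- > 0; ) {
            size_t tail_start = hits[i] + plen;
            size_t tail = src - tail_start;   /* text after this match */
            dst -= tail;
            memmove(buf + dst, buf + tail_start, tail);
            dst -= rlen;
            memcpy(buf + dst, rep, rlen);
            src = hits[i];
        }
        /* Text before the first match is already where it belongs. */
    } else {
        /* Same size or shrinking: compact forwards. */
        size_t src = 0, dst = 0;
        for (size_t i = 0; i < n; i++) {
            size_t gap = hits[i] - src;       /* text before this match */
            memmove(buf + dst, buf + src, gap);
            dst += gap;
            memcpy(buf + dst, rep, rlen);
            dst += rlen;
            src = hits[i] + plen;
        }
        memmove(buf + dst, buf + src, len - src + 1); /* tail plus NUL */
    }
    free(hits);
    *bufp = buf;
    return (long)n;
}
```

Each byte of the text is moved at most once per call, and the only extra memory is the offset array. Call it as replace_in_place(&text, "cat", "horse"), and note that text may point to a different block afterwards.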
When scanning, look first for the least likely character in your search string, and only compare the rest of the pattern when you find it. This will give you the fewest false hits.
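A rough sketch of that idea in C follows. The name find_by_rarest and the freq table are assumptions of this example: the caller is expected to supply a 256-entry table of byte counts (from the text itself or from a representative sample), and the function uses it to decide which pattern byte to hunt for with memchr().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the first occurrence of `pat` in `text` by
 * scanning for the pattern byte that is rarest according to `freq`,
 * a caller-supplied table of byte counts (an assumption of this sketch). */
const char *find_by_rarest(const char *text, size_t tlen,
                           const char *pat, size_t plen,
                           const unsigned long freq[256])
{
    assert(plen > 0);
    if (plen > tlen) return NULL;

    /* Pick the offset k of the rarest byte in the pattern. */
    size_t k = 0;
    for (size_t i = 1; i < plen; i++)
        if (freq[(unsigned char)pat[i]] < freq[(unsigned char)pat[k]])
            k = i;

    /* In a match, that byte can only sit at text offsets
     * k .. tlen - plen + k, so only that window is scanned. */
    const char *p = text + k;
    const char *end = text + (tlen - plen) + k + 1;
    while (p < end && (p = memchr(p, pat[k], (size_t)(end - p))) != NULL) {
        if (memcmp(p - k, pat, plen) == 0)   /* verify the full pattern */
            return p - k;
        p++;                                  /* false hit, keep scanning */
    }
    return NULL;
}
```

How much this helps depends on how well the frequency table reflects the real text, so treat it as something to measure rather than a guaranteed win.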
Don't re-store the whole buffer after each replacement; just copy the minimum needed.
If the output text is going to some sequential consumer task, then you can optimize things even more
by passing the data on without ever actually making a complete copy of it.
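For the sequential-consumer case, something like the following might do. It's only a sketch: replace_to_stream is a made-up name, and it assumes the consumer can be reached through a stdio FILE * (a pipe or file, say). The unchanged stretches of text are written straight from the source buffer, so the full result never exists in memory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write `text` to `out` with every non-overlapping
 * occurrence of `pat` replaced by `rep`, without building the result.
 * Returns the number of replacements, or -1 on a write error. */
long replace_to_stream(const char *text, const char *pat,
                       const char *rep, FILE *out)
{
    size_t plen = strlen(pat), rlen = strlen(rep);
    assert(plen > 0);

    long count = 0;
    const char *src = text, *hit;
    while ((hit = strstr(src, pat)) != NULL) {
        size_t gap = (size_t)(hit - src);     /* unchanged text before hit */
        if (fwrite(src, 1, gap, out) != gap) return -1;
        if (fwrite(rep, 1, rlen, out) != rlen) return -1;
        src = hit + plen;
        count++;
    }
    size_t tail = strlen(src);                /* whatever follows the last hit */
    if (fwrite(src, 1, tail, out) != tail) return -1;
    return count;
}
```

A call such as replace_to_stream(text, "cat", "horse", stdout) lets stdio's own buffering do the batching.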
Hope this helps,
grg99

by: gj62 Posted on 2003-07-09 at 12:59:51 ID: 8888432
Hi sizak,
Most of your time is probably spent on the actual search (let me know if that's incorrect). If you are using strstr(), you can get a significant improvement by using the Boyer-Moore search algorithm. Here's the best link I know (which has some source)...
http://www-igm.univ-mlv.fr/~lecroq/string/node14.html
Obviously, I'm assuming you need to do this in C - there are existing tools out there that can do this (though generally they have not been optimized for exact string searches)...
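To give a flavor of the approach, here is a minimal sketch in C of Horspool's simplification of Boyer-Moore, which keeps only a bad-character shift table and drops the good-suffix rule, so it is not the full algorithm. The name horspool_find is made up for this example, and it takes explicit lengths rather than relying on NUL terminators.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: Boyer-Moore-Horspool search. Returns a pointer to
 * the first occurrence of `pat` in `text`, or NULL if there is none. */
const char *horspool_find(const char *text, size_t tlen,
                          const char *pat, size_t plen)
{
    assert(plen > 0);
    if (plen > tlen) return NULL;

    /* shift[c] = how far the window may slide when the text byte under
     * the pattern's last position is c. Bytes absent from the pattern
     * (ignoring its last byte) allow a full-length jump. */
    size_t shift[UCHAR_MAX + 1];
    for (size_t i = 0; i <= UCHAR_MAX; i++)
        shift[i] = plen;
    for (size_t i = 0; i + 1 < plen; i++)
        shift[(unsigned char)pat[i]] = plen - 1 - i;

    size_t pos = 0;
    while (pos <= tlen - plen) {
        if (memcmp(text + pos, pat, plen) == 0)
            return text + pos;
        pos += shift[(unsigned char)text[pos + plen - 1]];
    }
    return NULL;
}
```

The longer the pattern, the bigger the jumps, which is where the gain over a naive scan comes from. For very short patterns a tuned library strstr() may well be just as fast, so it is worth timing both on your data.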
Cheers!