MadIce
asked on
Search and replace text in documents using wildcard
I have hundreds of HTML pages with tags that I need to get rid of. I could go through each page and remove them by hand, but that would take a very long time. Is there a way to use a wildcard to search for and remove these tags? I've used the following when I know what the exact text is:
fileReader = My.Computer.FileSystem.ReadAllText("C:\Documents\Working\Content\" & strTag & "\" & strFile).Replace("<img src=""""*.jpg", strTag & "-" & strPhotoNum & ".jpg")
My.Computer.FileSystem.WriteAllText("C:\Documents\Working\Content\" & strTag & "\" & strFile, fileReader, False)
Can Regex be used to do this? If so, can you point me to an example?
Using VB.NET 2010
ASKER CERTIFIED SOLUTION
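' Group 1 = the tag up to the opening quote of src, group 2 = the .jpg path, group 3 = the rest of the tag.
' [^>]* keeps each match inside a single tag, and \. matches a literal dot before "jpg".
Dim jpgPattern = "(<img\s[^>]*?src\s*=\s*[""'])([^""']*?\.jpg)([""'][^>]*>)"
Dim testString = "some html text <img class=""someclass"" id='1' src='test.jpg' /> rest of html <img class=""someclass"" id='2' src = 'subfolder/test2.jpg' />"
Dim regex = New System.Text.RegularExpressions.Regex(jpgPattern)
' ${1} and ${3} put the surrounding tag text back; the braces keep the group number separate from the new file name.
MessageBox.Show(regex.Replace(testString, "${1}New.jpg${3}"))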
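To run the same kind of replacement over the hundreds of pages mentioned in the question, something along these lines might work. This is only a sketch: the module name TagCleaner and the function name ReplaceInHtmlFiles are made up for illustration, it assumes the pages are UTF-8 (or carry a byte-order mark) so that File.ReadAllText and File.WriteAllText round-trip them safely, and it should be tried on a backup copy of the folder first.

```vbnet
Imports System.IO
Imports System.Text.RegularExpressions

Module TagCleaner
    ' Hypothetical helper: applies one regex replacement to every .html file
    ' under rootFolder (including subfolders) and returns how many files changed.
    Function ReplaceInHtmlFiles(ByVal rootFolder As String,
                                ByVal pattern As String,
                                ByVal replacement As String) As Integer
        Dim tagRegex As New Regex(pattern, RegexOptions.IgnoreCase)
        Dim changed As Integer = 0
        For Each filePath As String In Directory.GetFiles(rootFolder, "*.html", SearchOption.AllDirectories)
            Dim original As String = File.ReadAllText(filePath)
            Dim updated As String = tagRegex.Replace(original, replacement)
            ' Write back only when something matched, so untouched files are left alone.
            If Not String.Equals(original, updated, StringComparison.Ordinal) Then
                File.WriteAllText(filePath, updated)
                changed += 1
            End If
        Next
        Return changed
    End Function
End Module

' Example call (the <font> tag is just an illustration): remove every opening
' and closing <font> tag but keep the text between them.
' Dim count = ReplaceInHtmlFiles("C:\Documents\Working\Content", "</?font\b[^>]*>", "")
```

Passing an empty replacement string removes the matched tags outright; passing something like "${1}New.jpg${3}" with the img pattern above rewrites them instead.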
Notepad++ allows you to find and replace by regex too... I second the idea that it's an appropriate tool for the job.
BTW, Total Commander, as well as some other console-like apps (including the famous FAR), allows regex too... :)
ASKER
btpringle,
Tried Notepad++ at home and was able to do what I needed using regex. I wasn't familiar with Notepad++ or regex. I have software that has the regex feature already. Thanks for the info. Thanks to everyone else as well.