I am trying to help a friend sort out a problem with some web sites that he administers. These are Joomla sites, and he has found that files on each of the sites have been hacked and iFrames have been injected into them. He wants to remove these dubious iFrames from all the sites and asked if I could provide some code that he could run as a command over SSH.
A search of the web showed that one way of removing these is to use grep and sed, but my knowledge of bash is limited and I'm much more knowledgeable with Perl, so I looked for a Perl solution. The offending code is of the form
I thought that there might be some other, legitimate iFrames on the sites, so I set about writing code that would remove only the <!-- . --> marker comments and the iFrames between them, and came up with the following Perl one-liner:
perl -pi.bak -e '$pattern="<!-- . -->"; s/$pattern.+?$pattern//gs' `find . -name "*" -type f`
The first problem I came across was that this didn't find the offending code, even though I had included the 's' option to treat everything as a single line. The regex worked perfectly well when I used it in a normal Perl script, and the problem seemed to be that the test file I had been given had CRLF line endings. I used another one-liner to change all line endings to LF and tried my code again.
The code now seemed to remove the offending code correctly. However, when I tried it on a file that contained two occurrences of the bad code, it didn't remove both of them despite the 'g' option in the regex, and in fact it didn't even remove the first one properly.
Can someone explain why this code didn't work with the CRLF line endings, and also why the global version didn't remove all the bad code? Is there any change I can make to the one-liner so that it works properly?