I have a regex problem: I am trying to search an HTML string for "href=.......".
What I want to do is wrap the contents of each href and insert the result back into the HTML (i.e. replace the link with my own).
There is one exception to the rule: if the href is exactly href="##~NOT_THIS~##", no replacement should take place.
I also need to use the original href contents within the replacement string (i.e. as a back-reference).
I have been using the following regex (the debugging version):
regexp_replace('VERY LONG HTML STRING','((href=")[^(##~NO
','0=\0 1=\1 2=\2 3=\3 4=\4 5=\5')
What I get is (assuming href within 'VERY LONG HTML STRING' is href="http://www.google.com/
" 2=href=" 3=ttp://www.google.com/
My problem is that the 3rd back-reference (the one I'm interested in) always drops the first character (in this case it drops the "h" in "http://www.google.com/").
I assume the problem is with the [^(##~NOT_THIS~##)] part of the expression. When I remove it, the first character is no longer dropped, but then the hrefs that I want the replacement to skip are no longer skipped.
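To make the suspected cause concrete, here is a minimal sketch using Python's `re` module. Python is an assumption, since the `regexp_replace` above belongs to a different engine. The sample HTML, both patterns and the wrapper URL `http://example.invalid/wrap?u=` are hypothetical stand-ins, not the original expression. The sketch shows that a bracket expression such as [^(##~NOT_THIS~##)] matches exactly one character that is not in the listed set, so it consumes the "h", whereas a negative lookahead rejects the special value without consuming anything.

```python
import re

# Hypothetical sample input: one normal link and one link that must be skipped.
html = '<a href="http://www.google.com/">x</a> <a href="##~NOT_THIS~##">y</a>'

# Character class: matches ONE character that is none of ( # ~ N O T _ H I S ).
# The lowercase "h" qualifies, so it is consumed before the capture group starts.
char_class = re.compile(r'(href=")[^(##~NOT_THIS~##)]([^"]*)"')
class_capture = char_class.search(html).group(2)

# Negative lookahead: asserts that the special value does not follow,
# without consuming any characters, so the capture keeps the leading "h".
lookahead = re.compile(r'(href=")(?!##~NOT_THIS~##")([^"]*)"')
look_capture = lookahead.search(html).group(2)

# Wrap the original href contents; the skipped link is left untouched.
result = lookahead.sub(r'\g<1>http://example.invalid/wrap?u=\g<2>"', html)

print(class_capture)
print(look_capture)
print(result)
```

Whether a lookahead like `(?!...)` is available depends on the regex engine behind `regexp_replace`. Engines limited to POSIX-style expressions generally do not support it, in which case the skipped value would have to be handled some other way.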
I'm a regex newbie, and I've spent hours and hours on this. Any help with this problem would be much appreciated.