huji
asked on
I want to create a regexp for it
Hi
Suppose that I have saved the content of an HTML file in a variable, named strContent. I want to find all locations in this HTML matching this pattern:
<img src="..................... ." ....>
where src DOES NOT start with "http://", and change them to this:
<img src="http://domain.com/folder/..................... ......." ....>
That is, I want to add http://domain.com/folder/ to the beginning of the SRC attributes that are missing it.
How would you design a regexp for it?
Huji
ASKER
Well, I coded what I needed in PHP, very simply. We should accept that PHP handles RegExp better than ASP (VBScript, really).
Also, I've designed a regexp that finds the "src"s in ASP, but the problem is with the replacing part. In PHP you can simply choose the second sub-pattern to be changed, but I couldn't find a way to do the same in ASP. I've coded some workarounds, but they are not neat and tidy, AT ALL. That's why I asked here.
EVERYBODY, please provide me with the full code that replaces the needed parts in the string.
Thanks
huji
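For what it's worth, the VBScript RegExp object (script engine 5.5 and later) does accept $1, $2, ... in the replacement string, so a single sub-pattern can be changed while the others are copied back untouched. A minimal sketch of that idea follows; the sample string is made up for illustration, the pattern only handles double-quoted src values, and it does not yet skip URLs that already start with http://.

```vbscript
<%
Option Explicit

Dim strContent, objRegExp

' Hypothetical sample input, not taken from a real page
strContent = "<img src=""logo.gif"">"

Set objRegExp = New RegExp
objRegExp.IgnoreCase = True
objRegExp.Global = True

' Group 1: the tag start up to and including the opening quote
' Group 2: the src value itself
' Group 3: the closing quote
objRegExp.Pattern = "(<img src="")([^""]*)("")"

' $1 and $3 are copied back unchanged; only the prefix before $2 is new
strContent = objRegExp.Replace(strContent, "$1http://domain.com/folder/$2$3")

' HTMLEncode so the tag shows up as text in the browser
Response.Write Server.HTMLEncode(strContent)

Set objRegExp = Nothing
%>
```

Splitting the match into "before", "value" and "after" groups means the replacement string never has to rebuild the tag by hand.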
This does the find/replace part well, but it does not exclude the ones with http. Some tuning is needed on the regex.
<%
Option Explicit
Dim str
str = "<img src=""logo.gif"" >sfgjh<img src=""banner.gif"" >"
Dim objRegExp
Set objRegExp = New RegExp
objRegExp.IgnoreCase = True
objRegExp.Global = True
'Prefix the src of every <img> tag with the base URL.
'Note: [^http:\/\/] is a character class that matches ONE character
'other than h, t, p, : or /. It does not test for a leading "http://".
objRegExp.Pattern = "(<img src=)(( *|\"")?)([^http:\/\/])([^\""]*)(( *|\"")?)( *)>"
str = objRegExp.Replace(str, "<img src=""http://gcek.net/img/$5"">")
Response.Write str
Set objRegExp = Nothing 'Clean up!
%>
ASKER
Well, I still need some RegExp references! I feel there are things for me to learn! The $4 was the key, but I hadn't seen it before! :o( (I would be glad if you could recommend a good online resource.)
Now some questions:
1) About the pattern you made:
"(<img src=)(( *|\"")?)(?!http:)([^\""]*)(( *|\"")?)( *)>"
-----------------^
What is the marked part supposed to do? Is it there to match things like: src= something.jpg ??? (I'm especially focusing on the SPACE before the asterisk.)
2) What if one IMG tag is like this:
<img alt="" src="a.jpg">
This pattern will miss that. I'm going to add a part like [^>]* after the (<img in your pattern. Can you help me with it?
Thanks a lot
Huji
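One possible way to cover both points at once (skip values that already start with http://, and allow other attributes such as alt before src) is sketched below. It is not the pattern from the accepted answer. The sample string and example.com are made up, only double-quoted src values are handled, https:// is not treated specially, and \bsrc would also match an attribute like data-src.

```vbscript
<%
Option Explicit

Dim strContent, objRegExp

' Hypothetical sample input: one relative src after an alt attribute,
' and one src that is already absolute
strContent = "<img alt="""" src=""a.jpg""><img src=""http://example.com/b.jpg"">"

Set objRegExp = New RegExp
objRegExp.IgnoreCase = True
objRegExp.Global = True

' Group 1: "<img", any other attributes (never crossing ">"),
'          then src= and the opening quote
' (?!http://): negative lookahead, fails if the value starts with http://
' Group 2: the src value up to the closing quote
objRegExp.Pattern = "(<img\b[^>]*?\bsrc\s*=\s*"")(?!http://)([^""]*)"

' Copy group 1 back as it was and put the base URL in front of the value
strContent = objRegExp.Replace(strContent, "$1http://domain.com/folder/$2")

' HTMLEncode so the tags show up as text in the browser
Response.Write Server.HTMLEncode(strContent)

Set objRegExp = Nothing
%>
```

The lookahead consumes no characters, so it does not need its own $n in the replacement; the rest of the tag after the src value is left alone because the pattern stops matching at the closing quote.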
i was looking at this page while doing that...
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnanchor/html/scriptinga.asp
well... microsoft's version of regexp is always a bit confusing! :)
ASKER
fozylet,
I fixed it myself. Thank you for sharing your knowledge.
Good luck
Huji
ASKER
Thanks Huji :)
Any time!
I used this in a PHP page, and it returns ALL the src attributes, including all the ones with http://, so you may need to change it around. It also returns JavaScript src's.