• Status: Solved

Unvalidated RSS feed uses relative src for images/scripts... how to inject the FQDN?

The RSS feed is out of my hands, so please do not ask me to get them to change it ...

I am running into an issue with the feed. All of its images and scripts use a relative source rather than FQDN/folder/file.ext as the source.

Is there any way I can inject the FQDN into the src="" for these?
Asked by: kevp75
 
kevp75 (Author) Commented:
Update: the following function gets me the required FQDN. I just need to figure out how to inject it into the image src and script src attributes.
    Private Function stripFQDN(strURL)
        Dim objRegExp, myMatches
        Set objRegExp = New RegExp
        objRegExp.IgnoreCase = True   'makes upper-case TLD alternatives unnecessary
        objRegExp.Global = False      'only the first match (the host) is needed
        objRegExp.Pattern = "[a-zA-Z0-9]+([a-zA-Z0-9\-\.]+)?\.(com|org|net|mil|edu)"
        Set myMatches = objRegExp.Execute(strURL)
        'Return the host alone; a trailing vbCrLf would corrupt a rewritten src
        If myMatches.Count > 0 Then stripFQDN = myMatches(0).Value
        Set objRegExp = Nothing
    End Function
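
A note on the pattern above: it only recognises five top-level domains, so a host such as www.bbc.co.uk would not match. The following is a minimal sketch of an alternative, assuming the feed URL always starts with http:// or https://. The name getBaseUrl is hypothetical and not part of the original code.

```vbscript
'Hypothetical helper: returns scheme plus host, e.g. "http://www.bbc.co.uk"
Function getBaseUrl(strURL)
    Dim objRegExp, myMatches
    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    'Scheme, then everything up to the first "/", "?" or "#"
    objRegExp.Pattern = "^https?://[^/?#]+"
    Set myMatches = objRegExp.Execute(strURL)
    If myMatches.Count > 0 Then
        getBaseUrl = myMatches(0).Value
    Else
        getBaseUrl = ""   'no recognisable scheme and host
    End If
    Set objRegExp = Nothing
End Function

'Usage inside an ASP page; writes http://www.bbc.co.uk
Response.Write getBaseUrl("http://www.bbc.co.uk/news/rss.xml")
```

Because the result already includes the scheme, it could be prepended to a relative path without hard-coding "http://".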

 
Martin-Smith Commented:
So, assuming you have a fully qualified domain of

http://www.bbc.co.uk

you want to rewrite everything like

src="xyz/page.asp"

so that it ends up like

src="http://www.bbc.co.uk/xyz/page.asp"?

If so, you can use another RegEx, as below:

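'VBScript variables are untyped, so no "As String" / "As RegExp" here
Dim ResultString, myRegExp
Set myRegExp = New RegExp
myRegExp.Global = True
'Match any src value that contains neither a double quote nor a colon
myRegExp.Pattern = "src=""([^"":]*)"""
'SubjectString is assumed to hold the markup to rewrite
ResultString = myRegExp.Replace(SubjectString, "src=""http://www.bbc.co.uk/$1""")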
 
kevp75 (Author) Commented:
Precisely.

I'll give that a shot and let you know in a few.
 
kevp75 (Author) Commented:
What is the $1 for?
 
kevp75 (Author) Commented:
OK, I think I'm off on something. When I use the code you posted, I get a blank page.

Here is my code and the way I am using it.
Include.asp:
<%
Response.Expires=-1
Response.ExpiresAbsolute = Now() - 1
Response.CacheControl="private"
Response.CacheControl="no-cache"
Response.CacheControl="no-store"
 
Class clsFeedPuller
    'Get The files path
    Public Function strGetFilePath()
        Dim lsPath, arPath
        lsPath = Request.ServerVariables("SCRIPT_NAME")
        arPath = Split(lsPath, "/")
        arPath(UBound(arPath,1)) = ""
        strGetFilePath = Join(arPath, "/")
    End Function
    'Regular Expression Parser
    Public Function strParseContent(strContent, strPattern)
        Set objRegExp = New RegExp
        objRegExp.IgnoreCase = True
        objRegExp.Multiline = True
        objRegExp.Global = True
        objRegExp.Pattern = strPattern
        Set myMatches = objRegExp.Execute(strContent)
        For Each myMatch In myMatches
            strParseContent = strParseContent & myMatch.Value & vbcrlf
        Next
    End Function
    'Content Puller
    Public Function strGetContent(strURL)
        'Suppress runtime errors so a failed request can be reported below
        On Error Resume Next
        'Create an instance of the MS XMLHTTP component
        Set xmlObj = Server.CreateObject("MSXML2.ServerXMLHTTP")
        'Open the connection synchronously (the optional Async argument is False) and send the request
        xmlObj.Open "GET", strURL, False
        Call xmlObj.Send()
        'Wait for up to 3 seconds if we've not gotten the data yet
        If xmlObj.readyState <> 4 Then xmlObj.waitForResponse 3
        'Did an error occur?  If so, use a default value for our data
        If Err.Number <> 0 Then
            strGetContent = "There was an error retrieving the remote page"
        Else
            'If we reach here, we know the server responded
            'Now check for a 200 status and a ready state 4
            If (xmlObj.readyState <> 4) Or (xmlObj.Status <> 200) Then
                'Abort the request
                xmlObj.Abort
                strGetContent = "Problem communicating with remote server..."
            Else
                strGetContent = injectFQDN(xmlObj.ResponseText, strURL)
                'response.Write(injectFQDN(strGetContent, strURL))
            End If
        End If
    End Function
    'Feed Puller
    Public Function strGetRSS(strURL, strFeedsToShow) 
	    'Let's set our object
	    dim xmlDom, nodeCol, oNode, oChildNode
	    set xmlDom = Server.CreateObject("MSXML2.Domdocument")
		    xmlDOM.async = False
		    'Set our HTTP Request
		    call xmlDom.setProperty("ServerHTTPRequest", true)
		    xmlDom.async = False
		    'Now we load the document
		    call xmlDom.load(strURL)
		    'Check for elements
		    if not xmlDom.documentElement is nothing then
			    set nodeCol = xmlDom.documentElement.selectNodes("channel/item")
				    'Start a count of the articles to display
				    i = 0			  
				    'Start to loop through each article
				    for each oNode in nodeCol
					    'This number sets the number of articles to display
					    if i < strFeedsToShow then
						    Response.Write("<div>" & vbCrLf)
						    'The Link
						    set oChildNode = oNode.selectSingleNode("link")
							    if not oChildNode is nothing then
								    strRSSLink = oChildNode.text
							    end if
						    set oChildNode = nothing
						    'The Title
						    set oChildNode = oNode.selectSingleNode("title")
							    if not oChildNode is nothing then
								    strRSSTitle = Server.HTMLEncode(oChildNode.text)
								    strGetRSS = strGetRSS & "<div class='rssTitle'><a href=""#"" onclick=""loadurl('/golfTipsMagModule/content.asp?url=" & server.URLEncode(strRSSLink) & "&pt="&Request.QueryString("c")&"&title="&server.URLEncode(strRSSTitle)&"','rssFull');return false;"">" & strRSSTitle & "</a></div>"
							    end if
						    set oChildNode = nothing
						    'Published Date
						    set oChildNode = oNode.selectSingleNode("pubDate")
							    if not oChildNode is nothing then
								    strRSSPubDate = Server.HTMLEncode(oChildNode.text)
								    strGetRSS = strGetRSS & "<div class='rssDate'>" & strRSSPubDate & "</div>" & vbCrLf
							    end if
						    set oChildNode = nothing
						    'The Description
						    set oChildNode = oNode.selectSingleNode("description")
							    if not oChildNode is nothing then
								    strRSSDesc = oChildNode.text
								    strGetRSS = strGetRSS & "<div class='rssDesc'>" & strRSSDesc & "</div>"
							    end if
						    set oChildNode = nothing
						    'Add 1 to the article count number
						    i = i + 1
						    strGetRSS = strGetRSS & "</div>" & vbCrLf
					    end if
				    next
			    set nodeCol = nothing
		    else
			    strGetRSS = strGetRSS & strPANError & vbCrLf
		    end if
	    set xmlDom = nothing
    End Function
    
    'Strip the FQDN for image and script injection
    Private Function stripFQDN(strURL)
        Dim objRegExp, myMatches
        Set objRegExp = New RegExp
        objRegExp.IgnoreCase = True   'makes upper-case TLD alternatives unnecessary
        objRegExp.Global = False      'only the first match (the host) is needed
        objRegExp.Pattern = "[a-zA-Z0-9]+([a-zA-Z0-9\-\.]+)?\.(com|org|net|mil|edu)"
        Set myMatches = objRegExp.Execute(strURL)
        'Return the host alone; a trailing vbCrLf would corrupt a rewritten src
        If myMatches.Count > 0 Then stripFQDN = myMatches(0).Value
        Set objRegExp = Nothing
    End Function
    'Inject the domain into src
    Private Sub injectFQDN(strString, strURL)
        Set objRegExp = New RegExp
            objRegExp.IgnoreCase = True
            objRegExp.Multiline = True
            objRegExp.Global = True
            objRegExp.Pattern = "src=""([^"":]*)"""
            ResultString = objRegExp.Replace(strString, "src=""http://" & stripFQDN(strURL) & "/$1""")
        set objRegExp = nothing
    End Sub
End Class
%>
 
page.asp:
<%
set objContent = new clsFeedPuller
 
    strPattern = "<!-- PRODUCER NOTE -->([\s\S]*?)<!-- RIGHT COLUMN -->"
    strContent = objContent.strGetContent("http://www.vinfolio.com/do/store/detail?vid=93869&utm_source=RSS&utm_medium=RSS")
    response.Write(objContent.strParseContent(strContent,strPattern))
    
    
set objContent = nothing
 
%>

 
Martin-Smith Commented:
injectFQDN should surely be a Function, not a Sub?
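
In VBScript a Sub cannot return a value, while a Function returns one by assigning to its own name. The following is a minimal sketch of that change, not the asker's final code. It assumes the host is passed in directly: the parameter name strHost and the host www.example.com are placeholders, whereas the original derives the host via stripFQDN.

```vbscript
'Sketch: a Function returns its result by assigning to its own name
Function injectFQDN(strString, strHost)
    Dim objRegExp
    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    objRegExp.Pattern = "src=""([^"":]*)"""
    'Assigning to any other variable (e.g. ResultString) would return nothing
    injectFQDN = objRegExp.Replace(strString, "src=""http://" & strHost & "/$1""")
    Set objRegExp = Nothing
End Function

'Usage inside an ASP page (www.example.com is a placeholder host)
Response.Write Server.HTMLEncode(injectFQDN("<img src=""img/a.gif"">", "www.example.com"))
```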
 
Martin-Smith Commented:
Did the above work?

Also, sorry I missed your earlier question about what the $1 is for.

The regular expression matches everything like

src="xyz"

where xyz is a string of any length that contains neither a " (the end delimiter) nor a : (which would indicate an absolute URL that shouldn't be adjusted).

The xyz part is captured into a "backreference" by enclosing it in parentheses. It is the first and only backreference in the expression.

The $1 in the replacement string means: substitute the value of that backreference.

If you want to learn more about regular expressions, I strongly recommend RegexBuddy.
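
As a small illustration of the backreference, here is a sketch that first lists what group 1 captures and then reuses it through $1. The sample markup and the hosts www.example.com and other.example are made up for the example.

```vbscript
Dim re, m, html
Set re = New RegExp
re.Global = True
'The parentheses make the attribute value capture group 1
re.Pattern = "src=""([^"":]*)"""
html = "<img src=""a/b.gif""> <img src=""http://other.example/c.gif"">"

'SubMatches(0) exposes what group 1 captured; only the relative URL matches
For Each m In re.Execute(html)
    Response.Write m.SubMatches(0) & "<br>"    'writes a/b.gif
Next

'In Replace, $1 stands for that same captured text
html = re.Replace(html, "src=""http://www.example.com/$1""")
Response.Write Server.HTMLEncode(html)
```

The second image is left alone because its value contains a colon, which the character class excludes.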
 
kevp75 (Author) Commented:
OK, it looks like it works for src="something", but what about src='something' and src=something?
 
Martin-Smith Commented:
Change the pattern to "src=(""|')?([^"":]*)\1"

Change $1 to $2.
 
kevp75 (Author) Commented:
Got it. Thanks.
 
kevp75 (Author) Commented:
I stand corrected. This still does not work for src='something.ext' and src=something.ext

Nor does it seem to work with anything other than images...

Any thoughts? Or should I re-open the question?
 
kevp75 (Author) Commented:
bueller?
 
kevp75 (Author) Commented:
bueller, bueller.....anyone?
 
Martin-Smith Commented:
It should work for single quotes.

The (""|') portion of the regex means: match either a double or a single quote.

You may need to tweak the regex to allow for spaces next to the "=" character, or something along those lines.

Your question only asked about src.

If you want to match, e.g., href as well, use the alternation character:

src|href=(""|')?([^":]*)\1
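
One caveat, offered as a sketch rather than a tested fix: alternation has the lowest precedence in a regular expression, so a pattern of this shape reads as "src" alone, or "href=" plus the rest. Grouping the attribute names keeps the rest of the pattern attached to both. The shortened patterns below are illustrative only.

```vbscript
Dim re
Set re = New RegExp
re.IgnoreCase = True

'Ungrouped: the alternatives are "src" and "href=x", so a bare "src" matches
re.Pattern = "src|href=x"
Response.Write re.Test("src") & "<br>"      'True

'Grouped: the "=x" suffix now applies to both attribute names
re.Pattern = "(?:src|href)=x"
Response.Write re.Test("src") & "<br>"      'False
Response.Write re.Test("src=x") & "<br>"    'True
```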
 
Martin-Smith Commented:
Try the following pattern:

"(?:src|href)[\s]*=[\s]*(""|')([^"":]*)\1"
 
kevp75 (Author) Commented:
Sorry about that...
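
For later readers: the patterns in this thread use [^":]* for the value, which does not exclude a single quote, whitespace, or ">", so a single-quoted or unquoted value can swallow the attributes that follow it. The following is a hedged sketch of one way to cover all three quoting styles. It is not the accepted solution, and makeAbsolute, strBase, and www.example.com are hypothetical names. It relies on the assumption that capture groups that did not take part in a match are replaced by an empty string.

```vbscript
'Hypothetical helper; strBase is assumed to look like "http://www.example.com"
Function makeAbsolute(strHtml, strBase)
    Dim re
    Set re = New RegExp
    re.IgnoreCase = True
    re.Global = True
    'Group 1: attribute name and "="
    'Group 2: double-quoted value, group 3: single-quoted, group 4: unquoted
    'Values containing ":" (absolute URLs) never match; "/?" drops one leading slash
    re.Pattern = "\b((?:src|href)\s*=\s*)(?:""/?([^"":]+)""|'/?([^':]+)'|/?([^\s""'>:]+)(?=[\s>]|$))"
    'Only one of $2, $3, $4 is non-empty per match; output is always double-quoted
    makeAbsolute = re.Replace(strHtml, "$1""" & strBase & "/$2$3$4""")
    Set re = Nothing
End Function

'Usage inside an ASP page
Dim strSample
strSample = "<img src=""a/b.gif""> <img src='c.png'> <script src=js/d.js></script>"
Response.Write Server.HTMLEncode(makeAbsolute(strSample, "http://www.example.com"))
```

Like the replacements earlier in the thread, this resolves every relative path against the site root rather than the feed's own folder, and it does not handle protocol-relative URLs (those starting with //) or fragment-only links such as href="#top".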