CRNorthAmerica

RSS XML feed problem - I need a URL to contain "&" and can't figure it out!!!

Hello All,

I have an RSS feed I have created: http://www.lajobhunter.com/lajh/jobfeedrss2.cfm

in each item there is a detailPage URL that looks like below:

http://www.lajobhunter.com/lajh/JobSeeker/search.cfm?page=position-description&JobID=%24%22%5DLL00%20%20%0A

In the URL, the part right after position-description comes out as "&amp;", but I need to output an actual & there. I have to escape it when building the document, since & in XML starts a new entity, but isn't there a way to output it so I can have a URL with a parameter?

Thanks,

Bill
David

Did you try it?  In some circumstances XML automatically converts the "&amp;" into a "&" character.
CRNorthAmerica

ASKER

That is the actual xml output there.  You can see it at that url
I am not completely sure of the environment.  But, if you are getting that URL to show up as you mentioned, can you not do a replace function on it:

<cfset CorrectedURL = replace(CurrentURL, "&amp;", "&", "ALL")>
#CorrectedURL#

Something like that...
Wherever you need to use that URL, just use ColdFusion's built-in function for reading URLs in the proper format:

URLDecode()

example:

<cfset CorrectedURL = URLDecode(CurrentURL)>

More info on that URLDecode() function here:
http://livedocs.macromedia.com/coldfusion/7/htmldocs/wwhelp/wwhimpl/common/html/wwhelp.htm?context=ColdFusion_Documentation&file=00000658.htm
XML documents should convert special characters, normally with xmlFormat() when creating values for XML nodes, to maintain the integrity of the document, since special characters can cause issues in XML; that is why the value is stored in that format. URLDecode() will convert those characters back to their original format for URLs. It will also convert other special characters back to their original state, which is why URLDecode() would be better, because it is meant for exactly that purpose (converting URLs): it'll change the "&amp;" back to &, the %20 back to a space, and convert back any other special characters that have been encoded, for use in the URL. You may also want to check out URLEncodedFormat(), which encodes URLs, similar to xmlFormat(); however, they have different purposes and variations....
URLDecode() does not process the "&amp;" string.  Rather it processes escape codes such as %20 = blank.  I tried it to make sure :--)

The replace() function (or replacenocase() function if you are not sure of the upper/lower case status of the &amp; string) may do it.  But I am not sure of the exact sequence of your processing.
I repeat (sorry) that URLDecode() does not seem to handle &amp;, but rather handles "%" escape codes.

Maybe I am wrong (so far I am outvoted!) but here is the code I used to make sure about it:

plain: http://www.this.com?aaa=b&amp;c=d%20for%20dog
decoded: #URLDecode("http://www.this.com?aaa=b&amp;c=d%20for%20dog")#
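
To make the difference concrete, here is a minimal sketch, assuming the string still contains the XML-escaped "&amp;" entity (the variable name escapedURL is only for illustration). As I understand it, URLDecode() only touches percent escapes, while replace() targets the entity itself. HTMLEditFormat() is there so the browser shows the literal characters instead of rendering the entity, which is what makes the two results look identical on screen.

```cfml
<!--- Hypothetical test string: an XML-escaped URL that also has %20 escapes --->
<cfset escapedURL = "http://www.this.com?aaa=b&amp;c=d%20for%20dog">

<cfoutput>
<!--- URLDecode() handles percent escapes such as %20; the &amp; entity is left alone --->
URLDecode: #HTMLEditFormat(URLDecode(escapedURL))#<br>
<!--- replace() turns the entity back into a plain ampersand; the %20 escapes are left alone --->
replace: #HTMLEditFormat(replace(escapedURL, "&amp;", "&", "ALL"))#
</cfoutput>
```

If both conversions are needed, the two calls can be chained, but in most cases an XML parser will already have unescaped the entity by the time the URL is read.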
sure it does, even using your example....

<cfset mystring = "http://www.this.com?aaa=b&c=d%20for%20dog">
<cfoutput>#URLDecode(mystring)#</cfoutput>

result is:
http://www.this.com?aaa=b&c=d for dog
You need to look at the source code of the output page, not at what is displayed (not here, but where you actually create it)
if you look at the page http://www.lajobhunter.com/lajh/jobfeedrss2.cfm
 now you will see what I mean, when the "&" sign is output it fails.
that is all from the source code, and not what is displayed

I am not sure what is supposed to happen in that URL, and where the error shows up (excuse me if I am being blind to the obvious).  Can you be specific about where to look, and how to get there?

But, in general, the trick is to be very aware of the environment into which your code is going.  For example, if you want to display a URL, you should convert all the "&" characters to "&amp;" before outputting them.  But if storing in a database for backend use later, you would probably just keep the "&".  Just think carefully of where the stuff is going to be used, and perhaps run some tests on what these different functions do.  It can be confusing at first, but after a bit it all makes sense.
It is being output. If I convert all the URLs to use &amp;, then that is how they are output, and they do not work as URL parameters (I can't click on the URL and have the parameter passed).
If you want to show the URL, you would enclose the visible version (using &amp;) with the <a href="true_version_with_&_character"> and </a>.  The part in the href does not display, so there is no problem.

Is this going in an email, or some other environment?  There is a way to solve it, no matter what!
how are you generating the XML?
opalcomp,
you mentioned "You need to look at the source code of the output page, not at what is displayed."

I thought it was for display purposes, which is the reason I mentioned what I did....

It is generated from ColdFusion; all the source code is below:

<cfsetting enablecfoutputonly="yes">
<cfquery name="rsJobFeed" datasource="lajhdb">
SELECT     dbo.LAJHRecruiterJobs.JobID, dbo.LAJHRecruiterJobs.JobInternalID, dbo.LAJHRecruiterJobs.JobTitle, dbo.LAJHRecruiterJobs.JobDescription,
                      dbo.LAJHRecruiterJobs.JobCity, dbo.LAJHRecruiterJobs.JobStateID, dbo.USStates.StateName
FROM         dbo.LAJHRecruiterJobs INNER JOIN
                      dbo.USStates ON dbo.LAJHRecruiterJobs.JobStateID = dbo.USStates.StateID
</cfquery>
<cfsavecontent variable="theXML">
<cfoutput><?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="2.0">
<channel>
      <title>LaJobHunter Job Feed</title>
      <link>http://www.lajobhunter.com/lajh</link>
      <description>Current LaJobHunter Job Listing</description>
      <language>en-us</language>
      <copyright>Copyright 2006 LaJobHunter</copyright>
      <docs>http://lajobhunter.com/lajh/rss</docs>
      <lastBuildDate>#dateFormat(now(), "ddd, dd mmm yyy")# #timeformat(now(),"HH:mm:ss")#PST</lastBuildDate>
</cfoutput>
<cfloop from="1" to="#rsJobFeed.RecordCount#" index="ctr">
<!--- cleanup and ensure all is xml compliant--->
<cfscript>
jobTitle = replace(rsJobFeed.JobTitle[ctr], "<", "&lt;", "ALL");
jobDescription = replace(rsJobFeed.jobDescription[ctr], "<", "&lt;", "ALL");
jobID = replace(rsJobFeed.JobID[ctr], "<", "&lt;", "ALL");
jobState = replace(rsJobFeed.StateName[ctr], "<", "&lt;", "ALL");
jobCity = replace(rsJobFeed.JobCity[ctr], "<", "&lt;", "ALL");
jobTitle = replace(rsJobFeed.JobTitle[ctr], "#chr(10)#", "", "ALL");
jobDescription = replace(rsJobFeed.jobDescription[ctr], "#chr(10)#", "", "ALL");
jobID = replace(rsJobFeed.JobID[ctr], "#chr(10)#", "", "ALL");
jobState = replace(rsJobFeed.StateName[ctr], "#chr(10)#", "", "ALL");
jobCity = replace(rsJobFeed.JobCity[ctr], "#chr(10)#", "", "ALL");
jobTitle = replace(rsJobFeed.JobTitle[ctr], "#chr(13)#", "", "ALL");
jobDescription = replace(rsJobFeed.jobDescription[ctr], "#chr(13)#", "", "ALL");
jobID = replace(rsJobFeed.JobID[ctr], "#chr(13)#", "", "ALL");
jobState = replace(rsJobFeed.StateName[ctr], "#chr(13)#", "", "ALL");
jobCity = replace(rsJobFeed.JobCity[ctr], "#chr(13)#", "", "ALL");
jobTitle = replace(rsJobFeed.JobTitle[ctr], "#chr(38)#", "&amp;", "ALL");
jobDescription = replace(rsJobFeed.jobDescription[ctr], "#chr(38)#", "&amp;", "ALL");
jobID = replace(rsJobFeed.JobID[ctr], "#chr(38)#", "'&'", "ALL");
jobState = replace(rsJobFeed.StateName[ctr], "#chr(38)#", "&amp;", "ALL");
jobCity = replace(rsJobFeed.JobCity[ctr], "#chr(38)#", "&amp;", "ALL");
</cfscript>
<!--- this is the actuall rss output --->
      <cfoutput>
      <item>
            <jobid>#jobID#</jobid>
            <jobtitle>#jobTitle#</jobtitle>
            <jobdescription>#jobdescription#</jobdescription>
            <detailPage>http://www.lajobhunter.com/lajh/JobSeeker/search.cfm?page=position-description&JobID=#URLEncodedFormat(Encrypt(JobID,"key"))#</detailPage>
            <jobState>#jobState#</jobState>
            <jobCity>#jobCity#</jobCity>
      </item>
      </cfoutput>
</cfloop>
<cfoutput>
</channel>
</rss>
</cfoutput>
</cfsavecontent>
<cffile action="write" file="#expandPath(".")#\jobFeed.xml" output="#theXML#">
<cfcontent type="text/xml">
<cfoutput>#theXML#</cfoutput>
see how this code works for ya....

<cfsetting enablecfoutputonly="yes">
<cfquery name="rsJobFeed" datasource="lajhdb">
SELECT     dbo.LAJHRecruiterJobs.JobID, dbo.LAJHRecruiterJobs.JobInternalID, dbo.LAJHRecruiterJobs.JobTitle, dbo.LAJHRecruiterJobs.JobDescription,
                      dbo.LAJHRecruiterJobs.JobCity, dbo.LAJHRecruiterJobs.JobStateID, dbo.USStates.StateName
FROM         dbo.LAJHRecruiterJobs INNER JOIN
                      dbo.USStates ON dbo.LAJHRecruiterJobs.JobStateID = dbo.USStates.StateID
</cfquery>
<cfsavecontent variable="theXML">
<cfoutput><?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="2.0">
<channel>
     <title>LaJobHunter Job Feed</title>
     <link>http://www.lajobhunter.com/lajh</link>
     <description>Current LaJobHunter Job Listing</description>
     <language>en-us</language>
     <copyright>Copyright 2006 LaJobHunter</copyright>
     <docs>http://lajobhunter.com/lajh/rss</docs>
     <lastBuildDate>#dateFormat(now(), "ddd, dd mmm yyy")# #timeformat(now(),"HH:mm:ss")#PST</lastBuildDate>
</cfoutput>
<cfloop from="1" to="#rsJobFeed.RecordCount#" index="ctr">
  <cfset theURL = "http://www.lajobhunter.com/lajh/JobSeeker/search.cfm?page=position-description&JobID=#URLEncodedFormat(Encrypt(JobID,"key"))#">
  <cfoutput>
     <item>
          <jobid>#xmlformat(jobID)#</jobid>
          <jobtitle>#xmlformat(jobTitle)#</jobtitle>
          <jobdescription>#xmlformat(jobdescription)#</jobdescription>
          <detailPage>#xmlformat(theURL)#</detailPage>
          <jobState>#xmlformat(jobState)#</jobState>
          <jobCity>#xmlformat(jobCity)#</jobCity>
     </item>
  </cfoutput>
</cfloop>
<cfoutput>
</channel>
</rss>
</cfoutput>
</cfsavecontent>
<cffile action="write" file="#expandPath(".")#\jobFeed.xml" output="#theXML#">
<cfcontent type="text/xml">
<cfoutput>#theXML#</cfoutput>
this is actually a little cleaner version of above...


<cfsetting enablecfoutputonly="yes">
<cfquery name="rsJobFeed" datasource="lajhdb">
SELECT     dbo.LAJHRecruiterJobs.JobID, dbo.LAJHRecruiterJobs.JobInternalID, dbo.LAJHRecruiterJobs.JobTitle, dbo.LAJHRecruiterJobs.JobDescription,
                      dbo.LAJHRecruiterJobs.JobCity, dbo.LAJHRecruiterJobs.JobStateID, dbo.USStates.StateName
FROM         dbo.LAJHRecruiterJobs INNER JOIN
                      dbo.USStates ON dbo.LAJHRecruiterJobs.JobStateID = dbo.USStates.StateID
</cfquery>
<cfsavecontent variable="theXML">
<cfoutput>
<?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="2.0">
  <channel>
     <title>LaJobHunter Job Feed</title>
     <link>http://www.lajobhunter.com/lajh</link>
     <description>Current LaJobHunter Job Listing</description>
     <language>en-us</language>
     <copyright>Copyright 2006 LaJobHunter</copyright>
     <docs>http://lajobhunter.com/lajh/rss</docs>
     <lastBuildDate>#dateFormat(now(), "ddd, dd mmm yyy")# #timeformat(now(),"HH:mm:ss")#PST</lastBuildDate>
       <cfloop from="1" to="#rsJobFeed.RecordCount#" index="ctr">
         <cfset theURL = "http://www.lajobhunter.com/lajh/JobSeeker/search.cfm?page=position-description&JobID=#URLEncodedFormat(Encrypt(JobID,"key"))#">
     <item>
          <jobid>#xmlformat(jobID)#</jobid>
          <jobtitle>#xmlformat(jobTitle)#</jobtitle>
          <jobdescription>#xmlformat(jobdescription)#</jobdescription>
          <detailPage>#xmlformat(theURL)#</detailPage>
          <jobState>#xmlformat(jobState)#</jobState>
          <jobCity>#xmlformat(jobCity)#</jobCity>
     </item>
       </cfloop>
  </channel>
</rss>
</cfoutput>
</cfsavecontent>
<cffile action="write" file="#expandPath(".")#\jobFeed.xml" output="#theXML#">
<cfcontent type="text/xml">
<cfoutput>#theXML#</cfoutput>
you want to use xmlFormat when populating xml values, you don't really have to check for every special character in that cfscript block, xmlFormat() takes care of that for you...
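
As a small illustration of that point, here is a sketch using a made-up value (rawTitle and its contents are assumptions, not data from the feed). One xmlFormat() call covers the characters that the chain of replace() calls was trying to handle. Note also that, as far as I can tell, each replace() in the original cfscript block starts again from the raw query value, so only the last replacement for each field would actually survive.

```cfml
<!--- Hypothetical sample value containing characters that are special in XML --->
<cfset rawTitle = "Sales & Marketing <Senior>">

<!--- xmlFormat() escapes &, <, > and quote characters in a single call --->
<cfset safeTitle = xmlFormat(rawTitle)>

<!--- The escaped value can now be placed inside an element without breaking the document --->
<cfoutput><jobtitle>#safeTitle#</jobtitle></cfoutput>
```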
That actually ends up giving me:

Query String:  
Diagnostics: Variable JOBID is undefined.
The error occurred on line 21.


which is not correct, never got that before
sorry, one more version, you also want to use trim() around your query values you want to output just to get rid of any whitespace leading and trailing that may come from the values in the database...

so here's maybe a better way, I also added the xmlformat() to your date outputs... you should always use xmlformat when creating an XML document...

<cfsetting enablecfoutputonly="yes">
<cfquery name="rsJobFeed" datasource="lajhdb">
SELECT     dbo.LAJHRecruiterJobs.JobID, dbo.LAJHRecruiterJobs.JobInternalID, dbo.LAJHRecruiterJobs.JobTitle, dbo.LAJHRecruiterJobs.JobDescription,
                      dbo.LAJHRecruiterJobs.JobCity, dbo.LAJHRecruiterJobs.JobStateID, dbo.USStates.StateName
FROM         dbo.LAJHRecruiterJobs INNER JOIN
                      dbo.USStates ON dbo.LAJHRecruiterJobs.JobStateID = dbo.USStates.StateID
</cfquery>
<cfsavecontent variable="theXML">
<cfoutput>
<?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="2.0">
  <channel>
     <title>LaJobHunter Job Feed</title>
     <link>http://www.lajobhunter.com/lajh</link>
     <description>Current LaJobHunter Job Listing</description>
     <language>en-us</language>
     <copyright>Copyright 2006 LaJobHunter</copyright>
     <docs>http://lajobhunter.com/lajh/rss</docs>
     <lastBuildDate>#xmlformat(dateFormat(now(), "ddd, dd mmm yyy"))# #xmlformat(timeformat(now(),"HH:mm:ss"))#PST</lastBuildDate>
       <cfloop from="1" to="#rsJobFeed.RecordCount#" index="ctr">
         <cfset theURL = "http://www.lajobhunter.com/lajh/JobSeeker/search.cfm?page=position-description&JobID=#URLEncodedFormat(Encrypt(JobID,"key"))#">
     <item>
          <jobid>#trim(xmlformat(jobID))#</jobid>
          <jobtitle>#trim(xmlformat(jobTitle))#</jobtitle>
          <jobdescription>#trim(xmlformat(jobdescription))#</jobdescription>
          <detailPage>#trim(xmlformat(theURL))#</detailPage>
          <jobState>#trim(xmlformat(jobState))#</jobState>
          <jobCity>#trim(xmlformat(jobCity))#</jobCity>
     </item>
       </cfloop>
  </channel>
</rss>
</cfoutput>
</cfsavecontent>
<cffile action="write" file="#expandPath(".")#\jobFeed.xml" output="#theXML#">
<cfcontent type="text/xml">
<cfoutput>#theXML#</cfoutput>
try this, it should correct the error, it wasn't recognizing your query values...


<cfsetting enablecfoutputonly="yes">
<cfquery name="rsJobFeed" datasource="lajhdb">
SELECT     dbo.LAJHRecruiterJobs.JobID, dbo.LAJHRecruiterJobs.JobInternalID, dbo.LAJHRecruiterJobs.JobTitle, dbo.LAJHRecruiterJobs.JobDescription,
                      dbo.LAJHRecruiterJobs.JobCity, dbo.LAJHRecruiterJobs.JobStateID, dbo.USStates.StateName
FROM         dbo.LAJHRecruiterJobs INNER JOIN
                      dbo.USStates ON dbo.LAJHRecruiterJobs.JobStateID = dbo.USStates.StateID
</cfquery>
<cfsavecontent variable="theXML">
<cfoutput>
<?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="2.0">
  <channel>
     <title>LaJobHunter Job Feed</title>
     <link>http://www.lajobhunter.com/lajh</link>
     <description>Current LaJobHunter Job Listing</description>
     <language>en-us</language>
     <copyright>Copyright 2006 LaJobHunter</copyright>
     <docs>http://lajobhunter.com/lajh/rss</docs>
     <lastBuildDate>#xmlformat(dateFormat(now(), "ddd, dd mmm yyy"))# #xmlformat(timeformat(now(),"HH:mm:ss"))#PST</lastBuildDate>
       <cfloop from="1" to="#rsJobFeed.RecordCount#" index="ctr">
         <cfset theURL = "http://www.lajobhunter.com/lajh/JobSeeker/search.cfm?page=position-description&JobID=#URLEncodedFormat(Encrypt(rsJobFeed.JobID[ctr],"key"))#">
     <item>
          <jobid>#trim(xmlformat(rsJobFeed.JobID[ctr]))#</jobid>
          <jobtitle>#trim(xmlformat(rsJobFeed.jobTitle[ctr]))#</jobtitle>
          <jobdescription>#trim(xmlformat(rsJobFeed.jobdescription[ctr]))#</jobdescription>
          <detailPage>#trim(xmlformat(theURL))#</detailPage>
          <jobState>#trim(xmlformat(rsJobFeed.StateName[ctr]))#</jobState>
          <jobCity>#trim(xmlformat(rsJobFeed.jobCity[ctr]))#</jobCity>
     </item>
       </cfloop>
  </channel>
</rss>
</cfoutput>
</cfsavecontent>
<cffile action="write" file="#expandPath(".")#\jobFeed.xml" output="#theXML#">
<cfcontent type="text/xml">
<cfoutput>#theXML#</cfoutput>
one issue now - go to http://www.lajobhunter.com/lajh/jobfeedrss.cfm

and you will see the XML parse error. My XML was clean before. If you look at the source code, you will see that in the detailPage URL it is not putting out an actual &; it is still outputting &amp;
Are you sure that is an error?  XML does not like raw ampersands, and escapes them.  You should be able to reverse the process when retrieving.  But trailblazzyr55 seems to know what he is doing, he can probably clarify further.
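
A minimal sketch of that round trip, assuming a made-up URL and a one-element document (theURL, itemXML and the example.com address are placeholders, not part of the real feed): the ampersand is escaped when the XML is written and comes back as a plain "&" once the XML is parsed, so whoever consumes the feed through a parser should get a working URL.

```cfml
<!--- Hypothetical URL with two query parameters --->
<cfset theURL = "http://www.example.com/search.cfm?page=position-description&JobID=123">

<!--- Writing: xmlFormat() escapes the ampersand, so the stored text contains &amp; --->
<cfset itemXML = "<item><detailPage>#xmlFormat(theURL)#</detailPage></item>">

<!--- Reading: xmlParse() unescapes it again, so XmlText holds the original URL --->
<cfset doc = xmlParse(itemXML)>

<!--- HTMLEditFormat() is only here so the browser displays the literal characters --->
<cfoutput>#HTMLEditFormat(doc.item.detailPage.XmlText)#</cfoutput>
```

In other words, seeing "&amp;" in the page source of the feed is expected; it only becomes a problem if something reads the raw text without parsing it as XML.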
I just opened http://www.lajobhunter.com/lajh/jobfeedrss.cfm and I do not see any errors. It seems to display the way you want, including the "&"... can you post the error it's giving you?
just to show you what I see, this is directly copy and pasted from the page you mentioned, it's only the first few nodes since there are so many....

  <?xml version="1.0" encoding="ISO-8859-1" ?>
- <rss version="2.0">
- <channel>
  <title>LaJobHunter Job Feed</title>
  <link>http://www.lajobhunter.com/lajh</link>
  <description>Current LaJobHunter Job Listing</description>
  <language>en-us</language>
  <copyright>Copyright 2006 LaJobHunter</copyright>
  <docs>http://lajobhunter.com/lajh/rss</docs>
  <lastBuildDate>Tue, 31 Oct 06 15:36:57PST</lastBuildDate>
- <item>
  <jobid>2072</jobid>
  <jobtitle>Health Care Rep</jobtitle>
  <jobdescription>Health Care Rep. Great income helping Doctors and Dentists get patients. P/T or F/T. Work from home or from our offices. Request interview at: www.911WorkAtHome.com</jobdescription>
  <detailPage>http://www.lajobhunter.com/lajh/JobSeeker/search.cfm?page=position-description&JobID=%24%22%5DLL00%20%20%0A</detailPage>
  <jobState>California</jobState>
  <jobCity>Los Angeles</jobCity>
  </item>
- <item>
  <jobid>2073</jobid>
  <jobtitle>Insurance sales agent</jobtitle>
  <jobdescription>Managing General Agents & Independent Health Insurance Agents We are a company that offers low cost discount health benefits to people who have limited insurance or people who cannot afford or have been denied traditional insurance coverage. We recently instituted a new program for Managing General Agents, where they can add our benefits to their portfolios. www.911USAdoctors.com * NO cost to enroll * NO cost for websites * NO cost for marketing materials (brochures etc) 20% to 30% LEVEL residuals for life of the business. 90% persistency rate. Managing General Agents must have at least 5 captive agents under them in order to qualify. We also have other programs for Independent Agents as well. Call for additional info: 321-206-1777 Mr. McCormack</jobdescription>
  <detailPage>http://www.lajobhunter.com/lajh/JobSeeker/search.cfm?page=position-description&JobID=%24%22%5DLL0%20%20%20%0A</detailPage>
  <jobState>California</jobState>
  <jobCity>los Angeles</jobCity>
  </item>

............

from both firefox and IE 7

XML Parsing Error: xml declaration not at start of external entity
Location: http://www.lajobhunter.com/lajh/jobfeedrss.cfm
Line Number 2, Column 1:<?xml version="1.0" encoding="ISO-8859-1" ?>
hey Tailblazzy - seriously appreciate your help on this, I have learned quite a bit from this on cleaner code.
it's probably something with IE7 or Firefox... works fine in IE 6, that's not to say there isn't an issue

check out:

http://www.notes.xythian.net/2005/02/17/not-at-start-of-external-entity/
http://wordpress.org/support/topic/81099

my feeling is that this may solve the problem...

<cffile action="write" file="#expandPath(".")#\jobFeed.xml" output="#trim(toString(theXML))#">
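
The same idea applies to what the .cfm page sends to the browser. The error "xml declaration not at start of external entity" usually means there is whitespace before the XML declaration, and a line break right after the opening cfoutput or cfsavecontent tag is enough to cause it. Below is a hedged sketch with a placeholder feed body (cleanXML and the "Example feed" content are assumptions), not necessarily the fix that was accepted here:

```cfml
<cfsavecontent variable="theXML">
<cfoutput><?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="2.0"><channel><title>Example feed</title></channel></rss>
</cfoutput>
</cfsavecontent>

<!--- Strip leading/trailing whitespace so the declaration is the very first thing in the string --->
<cfset cleanXML = trim(theXML)>

<!--- reset="yes" discards earlier output; keeping both tags on one line avoids a newline before the declaration --->
<cfcontent type="text/xml" reset="yes"><cfoutput>#cleanXML#</cfoutput>
```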
ASKER CERTIFIED SOLUTION
trailblazzyr55
also when you want to display it, read it from the file and parse it....

save it....

<cffile action="write" file="#expandPath(".")#\jobFeed.xml" output="#trim(toString(theXML))#">

then read it...

<cffile action="read" file="#expandPath(".")#\jobFeed.xml" variable="myXMLDoc">

and you can output it for display purposes with.... <cfdump var="#xmlParse(myXMLDoc)#">

or output how you were...

<cfoutput>#myXMLDoc#</cfoutput>
I don't actually see the output though, I see the feed page that firefox and ie display, and I see it in the source, but the area itself is blank
try going to: http://www.lajobhunter.com/lajh/jobFeed.xml

when you want to view the XML document, you should go directly to the XML file... you shouldn't be displaying XML from a ColdFusion page as if it were an XML document.

I also had a look at http://www.lajobhunter.com/lajh/jobFeed.xml on Mozilla too and it worked properly.