Correctly encode URLs in C# for W3C XHTML validation

Hi All,

Sorry, but I only have a basic understanding of encoding.

I am running the W3C validator on one of my pages, and one of the errors is this:

reference to external entity in attribute value

This is generally the sign of an ampersand that was not properly escaped for inclusion in an attribute, in a href for example. You will need to escape all instances of '&' into '&amp;'.

    * Line 528, column 489: reference to external entity in attribute value

      …woods/A912_SP333_02_UA483?fmt=jpeg&qlt=90&wid=245&hei=410&color=255,255,255&si…

So I assume I need to start urlencoding (especially the & ampersands) my (3rd party supplied) URLs for href/src etc in my html?

I have tried

(example src link is http://s7v1.scene7.com/is/image/Littlewoods/A912_SP333_02_UA483?fmt=jpeg&qlt=90&wid=245&hei=410)

HttpUtility.UrlPathEncode - but this does not touch the &'s

HttpUtility.UrlEncode - but this messes up the http:// by encoding that also

Can I just URL-encode everything after the http://? Is there a built-in function for this?

Have I understood this correctly: must you always URL-encode your URL paths in an XHTML document?

Thanks
Matt


wickedw Asked:

Paul MacDonald (Director, Information Systems) Commented:
"Can I just urlencode everything after the http://?"
Yes.
"is there a built in function for this"
No, but you can strip the prefix before URL-encoding the URL and re-add it afterwards.
"You must always URLENCODE your url paths in an XHTML html document."
I wouldn't generally consider this an issue unless I were passing the URL as a parameter in another URL.  Since you're just trying to pass validation, it really comes down to how important that is for you.
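For reference, the validator message quoted in the question asks for each '&' in an attribute value to be written as the entity '&amp;', which is HTML escaping rather than URL encoding. Below is a minimal C# sketch of that, assuming the URL is written into the markup from code. System.Net.WebUtility.HtmlEncode is used so the snippet runs outside ASP.NET (inside a Web Forms page, HttpUtility.HtmlAttributeEncode plays a similar role). The class and method names are hypothetical.

```csharp
using System;
using System.Net;

public static class AttributeEscaping
{
    // Hypothetical helper: HTML-escapes a URL so it can be placed in an
    // href/src attribute. Only markup-significant characters (&, <, >, quotes)
    // become entities; the scheme, slashes, '?' and '=' are left alone.
    public static string ForAttribute(string url)
    {
        return WebUtility.HtmlEncode(url);
    }

    // Hypothetical helper: builds an <img> tag around the escaped URL.
    public static string ImgTag(string url)
    {
        return "<img src=\"" + ForAttribute(url) + "\" alt=\"\" />";
    }
}
```

With the example link from the question, ForAttribute turns each & into &amp; and changes nothing else. The browser decodes the entity before requesting the URL, so the image server should still receive the original query string.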
wickedw (Author) Commented:
Hi Paulmacd,

That's great. Do you know of a regular expression that I can use to grab everything up to the ? so I can remove the start, encode the rest, and add the start again?

Thanks
Matt
Paul MacDonald (Director, Information Systems) Commented:
I'm not a C# guy and it's not clear if you're using codebehind, but you could just REPLACE(url, "http://", "")
You could do the whole thing on one line like:
Dim strEncodedURL as String = "http://" & Server.UrlEncode( Replace(url, "http://", "") )
 
wickedw (Author) Commented:
Yeah, thanks Paul, but the only trouble with that is it encodes the /'s after the domain, say http://abc.com/a/b/page?blah=blah& ...

I need to strip out everything before the ? and encode the rest.

No worries, I'll sort it. You can have the points :)
wickedw (Author) Commented:
A code example would have been great.
Paul MacDonald (Director, Information Systems) Commented:
I don't think you'll run into any problems URLEncoding the whole thing.  
Certainly you can use something like:
Uri.UriSchemeHttp & Uri.SchemeDelimiter & Request.Url.Authority & Server.UrlEncode(Request.Url.PathAndQuery)
Lastly, you could brute force it:
Dim strOldURL As String = "http://abc.com/a/b/page?blah=blah& "
Dim intStart As Integer = strOldURL.IndexOf(CChar("?"))
Dim strNewURL As String = strOldURL.Substring(0, intStart) & "?" & Server.UrlEncode(strOldURL.Substring(intStart + 1))
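Since the question is about C#, here is a rough C# equivalent of the brute-force approach above, offered as a sketch rather than a drop-in replacement. It assumes System.Net.WebUtility.UrlEncode as a stand-in for Server.UrlEncode so that it runs outside a page, and the class and method names are hypothetical. It also returns the URL unchanged when there is no ?, a case in which the Substring call in the VB version would throw.

```csharp
using System;
using System.Net;

public static class QueryEncoding
{
    // Hypothetical helper mirroring the VB snippet: keeps everything up to and
    // including the first '?', then URL-encodes the remainder.
    public static string EncodeAfterQuestionMark(string url)
    {
        int start = url.IndexOf('?');

        // No query string (or nothing after the '?'): return the URL unchanged
        // instead of letting Substring fail on a -1 index.
        if (start < 0 || start == url.Length - 1)
        {
            return url;
        }

        string prefix = url.Substring(0, start + 1); // includes the '?'
        string query = url.Substring(start + 1);

        // Note: this also percent-encodes the '=' and '&' separators.
        return prefix + WebUtility.UrlEncode(query);
    }
}
```

Because the = and & separators are percent-encoded along with everything else, it is worth checking that the target server still reads the parameters as intended before relying on this.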
wickedw (Author) Commented:
Thanks Paul,

I went down the brute-force route as you suggested. Other posts on Stack Overflow seemed to advise against encoding the lot, and this seemed the best compromise. Thanks for your help :)
Paul MacDonald (Director, Information Systems) Commented:
No one sees the code but us anyway.  As long as it works...
Question has a verified solution.
