I want to check URLs, for example: http://abc.com/ggg/add.htm
How do I check the syntax of a URL?
Thanks in advance!
Call InternetCanonicalizeUrl().
>> use regex to determine valid syntax. ( regular expression )
For that, you'll probably need this:
http://tools.ietf.org/rfc/
(RFC: Uniform Resource Identifier (URI): Generic Syntax)
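As a rough illustration of the regex approach, here is a minimal sketch using std::regex from the standard library (swapped in so the example needs nothing installed; Boost.Regex would look much the same). The splitting expression is the one given in Appendix B of RFC 3986. The function name looksLikeHttpUrl and the strict host pattern are my own assumptions, not something from the RFC, and the check is purely syntactic: it says nothing about whether the page exists.

```cpp
#include <cassert>
#include <cctype>
#include <regex>
#include <string>

// Hypothetical helper (not part of any library): returns true when `url`
// has the shape http(s)://host[:port][/path][?query][#fragment].
// Syntax only; the server is never contacted.
bool looksLikeHttpUrl(const std::string& url) {
    // Splitting expression from RFC 3986, Appendix B.
    // Group 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
    static const std::regex parts(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
    // Assumption: a deliberately strict authority pattern, i.e. a dotted
    // host name of letters, digits and hyphens plus an optional numeric
    // port. User info and IPv6 literals are rejected by this sketch.
    static const std::regex authorityPattern(
        R"(^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*(:[0-9]{1,5})?$)");

    std::smatch m;
    if (!std::regex_match(url, m, parts)) return false;

    // Scheme names are case-insensitive, so compare in lower case.
    std::string scheme = m[2].str();
    for (char& c : scheme) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    if (scheme != "http" && scheme != "https") return false;

    // Raw spaces are never legal in a URL.
    if (url.find(' ') != std::string::npos) return false;

    const std::string authority = m[4].str();
    return std::regex_match(authority, authorityPattern);
}
```

With this, looksLikeHttpUrl("http://abc.com/ggg/add.htm") is accepted, while "abc.com/ggg/add.htm" (no scheme) and "http:///add.htm" (no host) are rejected. Tighten or loosen the host pattern to suit what you want to allow.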
Howdy....
I think that we misread the question. Checking to see if a URL is valid (as the original question asks) goes well beyond checking to see if the URL is properly formatted.
Hi star6868,
To see if a URL is valid, you're going to need to connect to the server at the address in the URL and issue a GET request for the page named in the URL. Using your example, http://abc.com/ggg/add.htm, the host name is abc.com and the page is /ggg/add.htm.
First, you're going to have to send the host name to DNS to retrieve the IP address.
Next, create a client (socket) connection to the IP address on port 80. (If the URL specified a different port, use it.)
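Before the DNS lookup and the connect, the URL has to be pulled apart into a host name, a port, and a path. A minimal sketch of that step, in plain standard C++, might look like the following. HttpTarget and splitHttpUrl are hypothetical names of mine, and the sketch assumes a lower-case "http://" prefix and a plain host name (no user info, no IPv6 literal).

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the pieces needed to open the connection.
struct HttpTarget {
    std::string host;  // name to hand to the DNS lookup
    int port = 80;     // default HTTP port unless the URL names another
    std::string path;  // what goes after GET
};

// Splits "http://host[:port][/path]". Returns false when the prefix is
// missing, the host is empty, or the port is not a number in 1..65535.
bool splitHttpUrl(const std::string& url, HttpTarget& out) {
    const std::string prefix = "http://";
    if (url.compare(0, prefix.size(), prefix) != 0) return false;

    // The host[:port] part runs up to the first '/', '?' or '#'.
    const std::size_t start = prefix.size();
    const std::size_t end = url.find_first_of("/?#", start);
    const std::string authority =
        (end == std::string::npos) ? url.substr(start)
                                   : url.substr(start, end - start);

    HttpTarget t;
    const std::size_t colon = authority.rfind(':');
    if (colon == std::string::npos) {
        t.host = authority;
    } else {
        t.host = authority.substr(0, colon);
        const std::string digits = authority.substr(colon + 1);
        if (digits.empty() || digits.size() > 5 ||
            digits.find_first_not_of("0123456789") != std::string::npos) {
            return false;
        }
        t.port = std::stoi(digits);
        if (t.port < 1 || t.port > 65535) return false;
    }
    if (t.host.empty()) return false;

    // The fragment is never sent to the server; an absent path becomes "/".
    std::string rest = (end == std::string::npos) ? "" : url.substr(end);
    rest = rest.substr(0, rest.find('#'));
    if (rest.empty() || rest[0] != '/') rest = "/" + rest;
    t.path = rest;

    out = t;
    return true;
}
```

For http://abc.com/ggg/add.htm this yields host abc.com, port 80 and path /ggg/add.htm. The host goes to the resolver, the port to the connect call, and the path into the GET line.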
Next, issue a GET for /ggg/add.htm. This can be a bit tricky. If the server still supports the oldest version of HTTP, you can issue just the command:
GET /ggg/add.htm
Otherwise, you'll have to supply a version:
GET /ggg/add.htm HTTP/1.0
Note that as newer versions are specified, the exchange becomes more conversational. Older versions were one request, one answer. Newer versions often want additional information before the page is returned. The answer from the server will have to be parsed to see if it is a valid page or if an error code is returned.
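To make the request and response handling a little more concrete, here is a minimal sketch of the two text-handling pieces: building an HTTP/1.0 GET and pulling the status code out of the first line of the reply. It deliberately leaves out the socket calls, which depend on your platform and components. buildGetRequest and parseStatusCode are hypothetical names, and the Host header is my addition: HTTP/1.0 does not require it, but servers that host several sites on one address generally need it.

```cpp
#include <cassert>
#include <string>

// Builds the text of an HTTP/1.0 GET request. Every line ends in CRLF and
// an empty line marks the end of the request.
std::string buildGetRequest(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.0\r\n" +
           "Host: " + host + "\r\n" +
           "\r\n";
}

// Extracts the three-digit status code from a reply that starts with a
// status line such as "HTTP/1.0 200 OK". Returns -1 when the reply does
// not start that way (for example, a bare page sent in answer to the
// version-less GET).
int parseStatusCode(const std::string& response) {
    if (response.compare(0, 5, "HTTP/") != 0) return -1;

    // Only the first line matters here.
    const std::string line = response.substr(0, response.find('\n'));
    const std::size_t sp = line.find(' ');
    if (sp == std::string::npos || line.size() < sp + 4) return -1;

    const std::string code = line.substr(sp + 1, 3);
    if (code.find_first_not_of("0123456789") != std::string::npos) return -1;

    // The code must be followed by a space, a CR, or the end of the line.
    if (line.size() > sp + 4 && line[sp + 4] != ' ' && line[sp + 4] != '\r') {
        return -1;
    }
    return std::stoi(code);
}
```

A code in the 200 range means the page was returned, 404 means the server has no such page, and codes in the 300 range are redirects that you may or may not want to follow. How to treat each of those is a policy decision for your application.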
You included the C++ Builder zone. It has a lot of built-in tools that make this pretty easy. Using the IDE, you can drop a TClient on the form, set the Host and Port properties of the TClient object (Host is the domain name, so the DNS lookup is done for you), and connect to the server. Then send the GET xxxxx.
You still have to handle the response, but about 1 page of code will get you to the point where you're ready to do that.
Good Luck,
Kent
star6868
Regular Expressions in C++ with Boost.Regex
http://www.onlamp.com/pub/
http://www.boost.org/libs/
Comment #22430900 contains a working solution, so I am not sure why this should be deleted.
by: deadrop, Posted on 2008-08-27 at 21:08:48, ID: 22331470
use regex to determine valid syntax. ( regular expression )