How do I remove all non-URL-friendly characters in a string?

Posted on 2011-04-22
Last Modified: 2012-05-11
I have a string, e.g. The cat sat on the mat

I'd like this to form a URL page name, e.g. The_cat_sat_on_the_mat

But is there an easier and more accurate way to ensure that ALL other non-URL-friendly characters are removed?

My solution at the minute is this ...

title.Replace(" ", "_").Replace(".", "").Replace(",", "").Replace("-", "").Replace("&", "").Replace("$", "")

But I'll be there all day handling <, >, ?, |, (, ) and so on. Is there a regular expression I can use which will keep only 0-9 and a-z and remove all other characters?

And if possible, could I get this in a working example?

Question by:Webbo_1980
    LVL 142

    Expert Comment

    by:Guy Hengel [angelIII / a3]
    LVL 4

    Accepted Solution

    Note that a regular expression will only find matches; it can validate whether a given string is a valid URL or not.

    For removing unwanted characters from a string, you can use the simple logic given below. As you mentioned that a valid URL name will contain only "a-z", "A-Z" or "0-9", here is a C# function for it (it also keeps spaces and underscores). You can alter it to allow any other character.
    // Keeps only A-Z, a-z, 0-9, space and underscore; every other character is dropped.
    private string ValidateURL(string value)
    {
        string returnVal = "";
        char[] charArray = value.ToCharArray();
        for (int i = 0; i < charArray.Length; i++)
        {
            int asciVal = (int)charArray[i];
            // 65-90 = A-Z, 97-122 = a-z, 48-57 = 0-9, 32 = space, 95 = underscore
            if ((asciVal >= 65 && asciVal <= 90) || (asciVal >= 97 && asciVal <= 122) || (asciVal >= 48 && asciVal <= 57) || asciVal == 32 || asciVal == 95)
                returnVal = returnVal + charArray[i];
        }
        return returnVal;
    }
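
A variation on the same character-by-character idea, as a minimal sketch rather than a definitive implementation: it uses a StringBuilder instead of repeated string concatenation and turns spaces into underscores so the result matches the page name asked for in the question. The class name UrlNameFilter and method name ToPageName are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Text;

public static class UrlNameFilter
{
    // Hypothetical helper: keeps ASCII letters, digits and underscores,
    // turns each space into an underscore, and drops every other character.
    public static string ToPageName(string value)
    {
        var result = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            bool isAsciiLetterOrDigit =
                (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');

            if (isAsciiLetterOrDigit || c == '_')
                result.Append(c);
            else if (c == ' ')
                result.Append('_');
            // Any other character is silently dropped.
        }
        return result.ToString();
    }
}
```

With this sketch, UrlNameFilter.ToPageName("The cat sat on the mat") gives "The_cat_sat_on_the_mat". Note that dropped punctuation between two spaces leaves two underscores in a row, e.g. "Tom & Jerry" becomes "Tom__Jerry".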


    LVL 74

    Expert Comment

    by:käµfm³d 👽

    That sure is a lot of code just to sanitize a string  = )


    I would have suggested this sooner, but I misread the question initially.
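    string value = "The cat sat on the mat";
    // Verbatim string (@) so that \w reaches the regex engine. Replaces every non-word character with _.
    // Note: in .NET, \w also matches non-ASCII letters and digits unless RegexOptions.ECMAScript is used.
    string cleansed = System.Text.RegularExpressions.Regex.Replace(value, @"[^\w]", "_");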
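
For the "working example" requested in the question, the one-liner above could be wrapped up roughly as follows. This is a sketch under assumptions, not the poster's code: the names UrlSlug and ToSlug are hypothetical, the character class is spelled out as A-Z, a-z, 0-9 so that only ASCII letters and digits survive, and each run of other characters is collapsed into a single underscore instead of one underscore per character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlSlug
{
    // Hypothetical helper: builds a URL-friendly page name from a title.
    public static string ToSlug(string title)
    {
        // Replace each run of characters outside A-Z, a-z, 0-9 with one underscore.
        string slug = Regex.Replace(title, "[^A-Za-z0-9]+", "_");

        // Remove any underscore left at the start or end (e.g. from trailing punctuation).
        return slug.Trim('_');
    }
}
```

With this sketch, UrlSlug.ToSlug("The cat sat on the mat") gives "The_cat_sat_on_the_mat" and UrlSlug.ToSlug("Tom & Jerry, Inc.") gives "Tom_Jerry_Inc". If the punctuation should be deleted outright rather than turned into a separator, the pattern and replacement string would need to be adjusted.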
