• Status: Solved

string manipulation - extract number from URL

Hi, I have to extract a number from a URL.

The URLs are all different but look something like this:
http://sermon.net/somename/sermonid/2673850
It could also look very different, but the number will be the same.
What I need to do is extract the number:
2673850

If the number goes up to 8 digits, the code needs to allow for this too.
Could someone give me the C# code that can extract this, please?

Thanks
Asked by: websss
2 Solutions
 
JosephEricDavis Commented:
String test = "http://sermon.net/somename/sermonid/2673850";
test = test.Substring(test.LastIndexOf("/") + 1);
 
JosephEricDavis Commented:
This code works only under the assumption that the number is always the last thing in the URL and is preceded by a '/'.
 
websss (Author) Commented:
Thanks, but that assumption is incorrect.
The number could be halfway through the URL.
 
websss (Author) Commented:
Here is another example:
http://somename.sermon.net/da/2668218/play
 
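JosephEricDavis Commented:
Try this:

Int32 test = getNumberFromURL("http://sermon.net/somename/sermonid/2673850");

with the method below.

The method splits the URL on '/' and returns the first segment that parses as an integer.
It returns -1 if no such segment is found.
// Returns the first '/'-separated segment of the URL that parses as an Int32, or -1 if none does.
// An 8-digit number fits comfortably in an Int32 (maximum 2,147,483,647).
public Int32 getNumberFromURL(String url)
{
    string[] segments = url.Split(new char[] { '/' });
    Int32 result;
    foreach (String str in segments)
    {
        // TryParse fails on empty or non-numeric segments such as "http:" or "somename"
        if (Int32.TryParse(str, out result))
        {
            return result;
        }
    }
    return -1;
}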
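
A possible variation on the split approach, offered only as a sketch and not as part of the posted answer: the method above splits the raw string, so a port or query string attached to a segment would stop that segment from parsing. Assuming the input is always an absolute URL, System.Uri can do the splitting, and NumberStyles.None restricts parsing to plain digits. The names UrlNumber and FromPath are made up for illustration.

```csharp
using System;
using System.Globalization;

public static class UrlNumber
{
    // Hypothetical helper (not from the thread): returns the first path
    // segment made up only of digits, or -1 if there is none or the
    // string is not an absolute URL.
    public static int FromPath(string url)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri))
        {
            return -1;
        }

        // Uri.Segments yields pieces such as "/", "somename/", "2673850";
        // the query string and host are not included.
        foreach (string segment in uri.Segments)
        {
            string candidate = segment.TrimEnd('/');
            int result;
            // NumberStyles.None rejects signs, whitespace and separators.
            if (int.TryParse(candidate, NumberStyles.None,
                             CultureInfo.InvariantCulture, out result))
            {
                return result;
            }
        }
        return -1;
    }

    public static void Main()
    {
        Console.WriteLine(FromPath("http://sermon.net/somename/sermonid/2673850"));
        Console.WriteLine(FromPath("http://somename.sermon.net/da/2668218/play"));
    }
}
```

With the two URLs from this thread, the calls in Main return 2673850 and 2668218. Numbers longer than an Int32 can hold fail to parse and are skipped, which is fine for the 7 to 8 digit IDs discussed here.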

 
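wdosanjos Commented:
Please try the following:

// Requires: using System.Text.RegularExpressions;
// Matches the first run of digits that follows a '/' and is followed by '/' or the end of the URL.
// Value is an empty string if nothing matches.
string number = Regex.Match(url, @"(?<=/)\d+(?=/|$)").Value;  // where url contains your url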
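
To make the one-liner easier to try out, here is a minimal self-contained sketch around the same pattern. The class name RegexUrlNumber and the method Find are assumptions added for illustration; only the pattern itself comes from the line above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexUrlNumber
{
    // Same pattern as above: digits preceded by '/' and followed by
    // '/' or the end of the string.
    private static readonly Regex NumericSegment =
        new Regex(@"(?<=/)\d+(?=/|$)");

    // Hypothetical helper: returns the matched digits, or null when no
    // segment consists only of digits.
    public static string Find(string url)
    {
        Match match = NumericSegment.Match(url);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(Find("http://sermon.net/somename/sermonid/2673850"));
        Console.WriteLine(Find("http://somename.sermon.net/da/2668218/play"));
        Console.WriteLine(Find("http://sermon.net/somename/sermonid/2673s850/play") ?? "(no match)");
    }
}
```

The first two calls return "2673850" and "2668218". The third URL has no all-digit segment, so Find returns null. Checking match.Success keeps "no match" distinct from a real value, whereas the one-liner yields an empty string in that case.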

 
anarki_jimbel Commented:
Honestly, I'm not a fan of Regex. It's too slow; much slower than my code below, I'd bet. And I don't know regular expressions well :)

JosephEricDavis's code will work. However, it performs many string and string-parsing operations and can also be slow. In some cases that does matter. Really, people working on the web should try to write FAST code.

That is what I tried to do. Believe me, iterating through characters is much faster than regex. I have no string operations apart from returning the result as a substring. If no "numeric" segment is found, the result is an empty string.
private void button1_Click(object sender, EventArgs e)
{
    // sample input taken from this thread; replace with your own URL
    string url = "http://somename.sermon.net/da/2668218/play";

    // these two indices designate a segment between slashes '/'
    int currentSplitCharPosition = -1;
    int previousSplitCharPosition = -1;
    // true while every char seen in the current segment is a digit;
    // once false, it stays false until the next segment starts
    bool currentSegmentIsNumeric = true;
    // result
    string numericPartStr = "";

    // iterate through characters; i == url.Length acts as a virtual closing '/'
    // so that the last character of the URL is checked like any other
    for (int i = 0; i <= url.Length; i++)
    {
        if (i == url.Length || url[i] == '/')
        {
            previousSplitCharPosition = currentSplitCharPosition;
            currentSplitCharPosition = i;
            int length = currentSplitCharPosition - previousSplitCharPosition - 1;
            if (currentSegmentIsNumeric && length > 0)
            {
                numericPartStr = url.Substring(previousSplitCharPosition + 1, length);
                break;
            }
            currentSegmentIsNumeric = true;
        }
        else if (url[i] < '0' || url[i] > '9')
        {
            // not a digit: this segment cannot be the number
            currentSegmentIsNumeric = false;
        }
    }
    System.Diagnostics.Debug.WriteLine("String = '" + numericPartStr + "'");
}

 
websss (Author) Commented:
Thanks, all.
Joe's solution was the one implemented.
 
wdosanjos Commented:
Out of curiosity, based on @anarki_jimbel's comments, I ran a benchmark of the three approaches. Here are the results after running each solution 3,000,000 times (1,000,000 iterations over three test URLs).

Case 1: 00:00:04.8145487   (@JosephEricDavis, String Split - Accepted Solution) ~5 secs
Case 2: 00:00:10.9864231   (@wdosanjos, Regular Expression) ~11 secs
Case 3: 00:00:01.1221370   (@anarki_jimbel, Character iteration) ~1 sec

Here is the code I used for the benchmark:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Diagnostics;

namespace Benchmark
{
    public class Program
    {
        private const int ITERATIONS = 1000000;

        public static void Main(string[] args)
        {
            string[] tests = new string[] { 
                "http://sermon.net/somename/sermonid/2673850", 
                "http://sermon.net/somename/sermonid/2673850/play", 
                "http://sermon.net/somename/sermonid/2673s850/play" 
            };

            Stopwatch sw = new Stopwatch();

            // Case 1
            sw.Reset();
            sw.Start();
            for (int i = 0; i < ITERATIONS; i++)
            {
                foreach (var test in tests)
                {
                    getNumberFromURL1(test);
                }
            }
            sw.Stop();
            Console.WriteLine("Case 1: {0}", sw.Elapsed);

            // Case 2
            sw.Reset();
            sw.Start();
            for (int i = 0; i < ITERATIONS; i++)
            {
                foreach (var test in tests)
                {
                    getNumberFromURL2(test);
                }
            }
            sw.Stop();
            Console.WriteLine("Case 2: {0}", sw.Elapsed);

            // Case 3
            sw.Reset();
            sw.Start();
            for (int i = 0; i < ITERATIONS; i++)
            {
                foreach (var test in tests)
                {
                    getNumberFromURL3(test);
                }
            }
            sw.Stop();
            Console.WriteLine("Case 3: {0}", sw.Elapsed);

            // 
            Console.ReadKey();
        }

        public static Int32 getNumberFromURL1(String url)
        {
            string[] segments = url.Split(new char[] { '/' });
            Int32 result;
            foreach (String str in segments)
            {
                if (Int32.TryParse(str, out result))
                {
                    return result;
                }
            }
            return -1;
        }

        private static Regex pattern = new Regex(@"(?<=/)\d+(?=/|$)");

        public static Int32 getNumberFromURL2(string url)
        {
            Match match = pattern.Match(url);

            return match.Success ? Int32.Parse(match.Value) : -1;
        }

        public static Int32 getNumberFromURL3(string url)
        {
            // these two indices designate a segment between slashes '/'
            int currentSplitCharPosition = -1;
            int previousSplitCharPosition = -1;
            // true while all chars in a segment are digits; once false, it cannot change until the next segment starts
            bool currentSegmentIsNumeric = true;
            //result
            string numericPartStr = "";

            // iterate through characters
            for(int i  = 0; i< url.Length;i++)
            {
                char ch = url[i];
                if (ch == '/' || i==url.Length-1)
                {
                    previousSplitCharPosition = currentSplitCharPosition;
                    currentSplitCharPosition = i;
                    if (currentSegmentIsNumeric)
                    {
                        if (currentSplitCharPosition - previousSplitCharPosition > 1)
                        {
                            int startIndex = previousSplitCharPosition+1;
                            int length;
                            
                            if(i==(url.Length-1))
                            {
                                length = currentSplitCharPosition - previousSplitCharPosition;
                            }
                            else
                            {
                                length = currentSplitCharPosition - previousSplitCharPosition-1;
                            }
                            
                            numericPartStr = url.Substring(startIndex, length);
                            
                            break; 
                        }

                    }
                    currentSegmentIsNumeric = true;
                }
                else if (ch >= 48 && ch <= 57) // digit
                {
                    // valid character - do nothing; if all digits in this segment, 
                    // currentSegmentIsNumeric will stay True
                }
                else
                {
                    currentSegmentIsNumeric = false;
                }
            }

            return (numericPartStr.Length > 0) ? Int32.Parse(numericPartStr) : -1;
        }
    }
}


Thanks for the insight @anarki_jimbel.
 
anarki_jimbel Commented:
Thank you, wdosanjos.
The test was interesting.