Parse this string

I want to parse this string in C# and get the values for User, BusinessNameId, ToBusinessNameId, PatientId and ToProviderId

User: hw@gmail.com BusinessNameId: 1973 ToBusinessNameId: 1973 PatientId: 1506 ToProviderId: 1608


I think I need to use IndexOf

string test = "User: xyz@gmail.com BusinessNameId: 1973 ToBusinessNameId: 1973 PatientId: 1506 ToProviderId: 1608";

test.IndexOf("User:") but then how do I get xyz@gmail.com that comes right after it?

Or

test.IndexOf("BusinessNameId") but how do I get 1973?

There will always be a space between the strings. (I can use * or any other character if that's better; it doesn't have to be a space.)
Camillia asked:
Ryan Chong commented:
try this:

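using System;

public string Between(string str, string firstString, string lastString = null)
{
    // Locate the start marker; return an empty string if it is missing.
    int start = str.IndexOf(firstString, StringComparison.Ordinal);
    if (start < 0) return string.Empty;
    int pos1 = start + firstString.Length;

    // Look for the end marker only after the start marker; with no end marker, read to the end of the string.
    int pos2 = lastString == null ? str.Length : str.IndexOf(lastString, pos1, StringComparison.Ordinal);
    if (pos2 < 0) pos2 = str.Length;

    return str.Substring(pos1, pos2 - pos1);
}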

then:

string test = "User: xyz@gmail.com BusinessNameId: 1973 ToBusinessNameId: 1974 PatientId: 1506 ToProviderId: 1608";
string User = Between(test, "User:", "BusinessNameId:").Trim();
string BusinessNameId = Between(test, "BusinessNameId:", "ToBusinessNameId:").Trim();
string ToBusinessNameId = Between(test, "ToBusinessNameId:", "PatientId:").Trim();
string PatientId = Between(test, "PatientId:", "ToProviderId:").Trim();
string ToProviderId = Between(test, "ToProviderId:").Trim();

Dirk Strauss (Senior Full Stack Developer) commented:
There are many ways to parse a string. If I were you, I would definitely look at using regular expressions in C#. If you are new to regular expressions, then have a look at Regular Expressions Succinctly by Syncfusion. To test your Regular Expression patterns, have a look at Regex Storm.

With regular expressions you will find the match you are looking for, provided your pattern is correct. The .NET Framework has built-in support for regular expressions. Have a look at this sample code for extracting all email addresses from a body of text:

using System.IO;
using System.Text.RegularExpressions;
using System.Text;

class MailExtracter
{

    public static void ExtractEmails(string inFilePath, string outFilePath)
    {
        string data = File.ReadAllText(inFilePath); //read File 
        //instantiate with this pattern 
        Regex emailRegex = new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*",
            RegexOptions.IgnoreCase);
        //find items that matches with our pattern
        MatchCollection emailMatches = emailRegex.Matches(data);

        StringBuilder sb = new StringBuilder();

        foreach (Match emailMatch in emailMatches)
        {
            sb.AppendLine(emailMatch.Value);
        }
        //store to file
        File.WriteAllText(outFilePath, sb.ToString());
    }
}

You can read more here at Stack Overflow: extract all email address from a text using c#

You will notice that the only part of the code that is RegEx is this:

Regex emailRegex = new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*", RegexOptions.IgnoreCase);
//find items that matches with our pattern
MatchCollection emailMatches = emailRegex.Matches(data);
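
Applied to the string in the question, the same idea might look like the sketch below. It is only an illustration, not code from this thread: the class name RecordParser and the pattern are assumptions, and the pattern assumes the five fields always appear in this order, that the user value contains no spaces, and that the IDs are all digits.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RecordParser
{
    // Hypothetical pattern: one named group per field, in the order used in the question.
    private static readonly Regex RecordRegex = new Regex(
        @"^User:\s*(?<User>\S+)\s+" +
        @"BusinessNameId:\s*(?<BusinessNameId>\d+)\s+" +
        @"ToBusinessNameId:\s*(?<ToBusinessNameId>\d+)\s+" +
        @"PatientId:\s*(?<PatientId>\d+)\s+" +
        @"ToProviderId:\s*(?<ToProviderId>\d+)\s*$");

    // Returns the Match; check Success before reading any of the groups.
    public static Match Parse(string input)
    {
        return RecordRegex.Match(input);
    }
}
```

A caller would check m.Success on the result of RecordParser.Parse(test) and then read each value by name, for example m.Groups["User"].Value or m.Groups["PatientId"].Value.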

Camillia (author) commented:
thanks, let me take a look
Chris Stanyon (WebDev) commented:
Regex is the way to go here. Based on your string, you could split the input on a space that's not preceded by a colon, and then split each result on the space to create a dictionary of key/value pairs.

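using System.Collections.Generic;
using System.Text.RegularExpressions;

public IDictionary<string, string> splitString(string input) {

	var dict = new Dictionary<string, string>();

	// Split on whitespace that is not preceded by a colon, giving items such as "User: hw@gmail.com"
	foreach (var item in Regex.Split(input, @"(?<!:)\s"))
	{
		// Split each item into its key ("User:") and its value ("hw@gmail.com")
		var pair = Regex.Split(item, @"\s");
		// Skip anything that is not a key/value pair, e.g. an empty item from a trailing space
		if (pair.Length < 2) continue;
		dict[pair[0].Trim(':')] = pair[1];
	}

	return dict;
}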

You can then use it like so:

var results = splitString("User: hw@gmail.com BusinessNameId: 1973 ToBusinessNameId: 1973 PatientId: 1506 ToProviderId: 1608");
Console.WriteLine( results["User"] );
Console.WriteLine( results["BusinessNameId"] );
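
The dictionary holds every value as a string, and indexing with a key that is not present throws a KeyNotFoundException. If the IDs are needed as numbers, a small helper along these lines could sit on top of the dictionary. This is a sketch only: ParsedFields and GetInt are hypothetical names, not part of the code above.

```csharp
using System.Collections.Generic;

public static class ParsedFields
{
    // Hypothetical helper: returns the named field as an int,
    // or null when the key is missing or the value is not a whole number.
    public static int? GetInt(IDictionary<string, string> fields, string key)
    {
        string raw;
        int value;
        if (fields.TryGetValue(key, out raw) && int.TryParse(raw, out value))
        {
            return value;
        }
        return null;
    }
}
```

With the results dictionary from the usage example, ParsedFields.GetInt(results, "PatientId") would give the ID as an int, while a missing key or a non-numeric value such as the User field gives null rather than an exception.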

Camillia (author) commented:
Thanks, I'll try them tonight
sarabande commented:
"I would definitely look at using regular expressions in C#"

In my opinion, regular expressions are ineffective for normal string parsing. They are hard to read, expensive to process, highly complex, and unsafe, because they allow changes to the current code whose effects can be neither fully tested nor fully foreseen, even by experienced writers.

Ryan has shown how the string can be parsed simply, provided that all the keywords are present and the string contains nothing but the five key-value pairs. If you put the values into <>, you could change Ryan's Between function to:

public string Between(string str, string key)
{
    string val = "";
    string findkey = key + "=";
    int pos1 = str.IndexOf(findkey);
    // The next block is necessary because "BusinessNameId" is a substring of "ToBusinessNameId":
    // if the match does not start a word, search again for the key preceded by a space.
    if (pos1 > 0 && str[pos1 - 1] != ' ')
    {
        findkey = " " + findkey;
        pos1 = str.IndexOf(findkey, pos1 + 1);
    }
    if (pos1 >= 0)
    {
        // The value must open with '<' directly after "key=" and close at the next '>'.
        int pos2 = pos1 + findkey.Length;
        int pos3 = str.IndexOf('>', pos2);
        if (pos2 < str.Length && str[pos2] == '<' && pos3 > pos2)
        {
            val = str.Substring(pos2 + 1, pos3 - pos2 - 1);
        }
    }
    return val;
}

This would allow you to extract key-value pairs from any text, as long as they match the pattern "key=<value>".

string test = "PatientId=<1506> some other text. ToBusinessNameId=<1974> User=<xyz@gmail.com>  BusinessNameId=<1973>  ";
string User = Between(test, "User");
string BusinessNameId = Between(test, "BusinessNameId");
string ToBusinessNameId = Between(test, "ToBusinessNameId");
string PatientId = Between(test, "PatientId");
// ToProviderId is not in the test string, so it gets an empty string.
string ToProviderId = Between(test, "ToProviderId");

Sara
Camillia (author) commented:
I'll take a look. I can do <>
Chris Stanyon (WebDev) commented:
Have to disagree with SaraBande on this one ...

RegEx is perfect for parsing strings. Your string is a known format, so a single call to the method will give you a complete indexed 'model' of your string, rather than having to call the same method over and over again. Regex may not be the fastest in certain cases, but given your needs, the call to regex is not going to perform with any noticeable lag at all. The method I suggested is easily unit testable, so not sure why it seems complex, unsafe and untestable!

This may be subjective, but reading a method that loops over a regex result seems a lot cleaner to read than the pos1, pos2, pos3. Also less chance of bugs creeping in (there are potential bugs in Sara's code)

As can be seen from this question, there are many ways to skin a cat, and it's always good to have options :)
Camillia (author) commented:
Yeah, performance is not an issue here. I'll try the method Chris has and post back.
ASP.NET
Question has a verified solution.