C#: Regex help

trevor1940
Given the code below:

using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string Sentance = @"pest, irritant, nag, nuisance, Colloq pain, pain in the neck or Brit taboo arse or US taboo ass; Slang US nudge";
        string[] Colloq = new string[] { "Brit", "US", "Australian", "Canadian", "New Zealand" };
        string[] Labels = new string[] { "Colloq", "Slang", "Taboo", "Archaic", "Old-fashioned" };

        string[] Words = Sentance.Split(',');
        foreach (var word in Words)
        {
            // True when this part contains any of the known labels
            // (string.Contains takes a single string, not an array)
            if (Labels.Any(label => word.Contains(label)))
            {
                // everything up to the next ; is that Label
                // I need to capture the Label, Colloq and word; treat 'or' as separating words
            }
            else
            {
                // AlternateWords and thesauri are declared elsewhere in the larger project
                AlternateWords alternateWords = new AlternateWords()
                {
                    AlertnateWord = word.Trim()
                };
                thesauri.alternateWords.Add(alternateWords);
            }
        }
    }
}



I need the following output:
AlertnateWord pest
AlertnateWord irritant
AlertnateWord nag
AlertnateWord nuisance

AlertnateWord Colloq pain
AlertnateWord Colloq pain in the neck
AlertnateWord Colloq Brit taboo arse
AlertnateWord Colloq US taboo ass
AlertnateWord Slang US nudge



Sorry, I don't know how else to explain this.
Éric Moreau, Senior .Net Consultant
Top Expert 2016

Commented:
Why would Colloq repeat on rows 7 to 9 when it is not in the original array?
Éric Moreau, Senior .Net Consultant
Top Expert 2016

Commented:
And how are AlternateWords and thesauri declared?

Author

Commented:
Sorry, this is part of a bigger project. I don't have the exact code, as it's at work; I extracted this as a demo.

thesauri is a list of class thesaurus and AlternateWords.
thesauri has other properties like term, lexical and example.

"Why would Colloq repeat on rows 7 to 9?"

The label Colloq is set with "pain" on line 6. This remains true until either a new Colloq label or a new block of words / sentence begins.

Éric Moreau, Senior .Net Consultant
Top Expert 2016

Commented:
Does it have to be a regex, or can it be just pure C# code?
ǩa̹̼͍̓̂ͪͤͭ̓u͈̳̟͕̬ͩ͂̌͌̾̀ͪf̭̤͉̅̋͛͂̓͛̈m̩̘̱̃e͙̳͊̑̂ͦ̌ͯ̚d͋̋ͧ̑ͯ͛̉Glanced up at my screen and thought I had coded the Matrix...  Turns out, I just fell asleep on the keyboard.
Most Valuable Expert 2011
Top Expert 2015

Commented:
What's wrong with the code you already have? Why do you think regex is the answer?

Author

Commented:
"Does it have to be a regex or just pure C# code?"
I don't really care, as long as the parts are isolated and I know which Colloq or Label was found.

"What's wrong with the code you already have?"

The code I had only extracts the list of alternate words.
When I wrote it, I hadn't realised there could be different AlternateWords colloquialisms.

For the example I gave, the term is "bother" (verb). There are 8 sets of alternate words, and I've dealt with the other 7.

"Why do you think regex is the answer?"



Only because I need both the Colloq / Label, which can be pattern matched, and the next word(s).
If there is another way of achieving the objective, then I'm open to suggestions.
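
A minimal sketch of the regex route, in case it helps frame the discussion. It handles one phrase at a time (one comma- or "or"-separated piece) and pulls out an optional leading label, an optional region, and the remaining word(s). The class name PhraseMatcher, the method name Parse and the pattern itself are assumptions made for illustration; only the label and region lists come from the question. It also assumes C# 7 or later for the tuple return type.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhraseMatcher
{
    // Hypothetical pattern: optional label, optional region, then the word(s).
    // Matching is case-sensitive, so a lowercase "taboo" stays part of the word.
    private static readonly Regex Pattern = new Regex(
        @"^(?:(?<label>Colloq|Slang|Taboo|Archaic|Old-fashioned)\s+)?" +
        @"(?:(?<region>Brit|US|Australian|Canadian|New Zealand)\s+)?" +
        @"(?<word>.+)$");

    // Returns the three pieces; Label and Region are empty strings when absent.
    public static (string Label, string Region, string Word) Parse(string phrase)
    {
        var m = Pattern.Match(phrase.Trim());
        if (!m.Success)
        {
            return ("", "", "");
        }
        // A named group that did not take part in the match has an empty Value.
        return (m.Groups["label"].Value, m.Groups["region"].Value, m.Groups["word"].Value);
    }
}

public static class PhraseMatcherDemo
{
    public static void Main()
    {
        foreach (var phrase in new[] { "pest", " Colloq pain", "Brit taboo arse", " Slang US nudge" })
        {
            var (label, region, word) = PhraseMatcher.Parse(phrase);
            Console.WriteLine($"label='{label}' region='{region}' word='{word}'");
        }
    }
}
```

On its own this does not carry a label forward from one phrase to the next; it only isolates the parts of a single phrase, so the splitting on commas, "or" and semicolons would still have to happen around it.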

Author

Commented:
I've just realised that lines 6-9 of the output should probably read something like:

AlertnateWord Label = Colloq word = pain
AlertnateWord Label = Colloq word = pain in the neck
AlertnateWord Label = Colloq word = pain in the Colloq = Brit Label taboo arse
AlertnateWord Label = Colloq word = pain in the Colloq = US Label taboo ass
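
One possible way to get structured output like this without a regex is to keep the current label in a variable while walking the text. The sketch below assumes the rule described earlier in the thread: a label applies from where it appears until the next semicolon. The names Entry, LabelledParser, TryStrip and Parse are hypothetical, and the sketch does not try to rebuild "pain in the arse" from "pain in the neck or Brit taboo arse"; it only records the label, the region and the remaining text for each alternative.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record for one alternate word.
public class Entry
{
    public string Label { get; set; } = "";   // e.g. "Colloq"; empty when none applies
    public string Region { get; set; } = "";  // e.g. "Brit"; empty when none applies
    public string Word { get; set; } = "";
}

public static class LabelledParser
{
    private static readonly string[] Labels = { "Colloq", "Slang", "Taboo", "Archaic", "Old-fashioned" };
    private static readonly string[] Regions = { "Brit", "US", "Australian", "Canadian", "New Zealand" };

    // If text starts with one of the prefixes followed by a space, removes it
    // from text and reports which prefix was found. Matching is case-sensitive.
    private static bool TryStrip(ref string text, string[] prefixes, out string found)
    {
        foreach (string prefix in prefixes)
        {
            if (text.StartsWith(prefix + " ", StringComparison.Ordinal))
            {
                found = prefix;
                text = text.Substring(prefix.Length + 1).TrimStart();
                return true;
            }
        }
        found = "";
        return false;
    }

    public static List<Entry> Parse(string sentence)
    {
        var result = new List<Entry>();
        foreach (string block in sentence.Split(';'))
        {
            string currentLabel = ""; // assumption: a label ends at the next ';'
            foreach (string part in block.Split(','))
            {
                foreach (string alternative in part.Split(new[] { " or " }, StringSplitOptions.None))
                {
                    string text = alternative.Trim();
                    if (text.Length == 0) continue;

                    // A leading label replaces the one carried so far.
                    if (TryStrip(ref text, Labels, out string label)) currentLabel = label;
                    TryStrip(ref text, Regions, out string region);

                    result.Add(new Entry { Label = currentLabel, Region = region, Word = text });
                }
            }
        }
        return result;
    }
}

public static class LabelledParserDemo
{
    public static void Main()
    {
        string sentence = "pest, irritant, nag, nuisance, Colloq pain, pain in the neck or Brit taboo arse or US taboo ass; Slang US nudge";
        foreach (Entry entry in LabelledParser.Parse(sentence))
        {
            Console.WriteLine($"Label = {entry.Label} | Region = {entry.Region} | Word = {entry.Word}");
        }
    }
}
```

Because the comparison is case-sensitive, the lowercase "taboo" in the sample text is left inside the word rather than replacing the carried Colloq label. Whether it should instead be treated as a second label is a requirements question the thread does not settle.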


Senior .Net Consultant
Top Expert 2016
Commented:
I have something that works as per my tests!
        private void button2_Click(object sender, EventArgs e)
        {
            listBox1.Items.Clear();

            string Sentance = @"pest, irritant, nag, nuisance, Colloq pain, pain in the neck or Brit taboo arse or US taboo ass; Slang US nudge";
            //string[] Colloq = new string[] { "Brit", "US", "Australian", "Canadian", "New Zealand" };
            string[] Labels = new string[] { "Colloq", "Slang", "Taboo", "Archaic", "Old-fashioned" };

            string strRemaining = Sentance;
            while (!string.IsNullOrWhiteSpace(strRemaining))
            {
                // Take everything up to the next comma (or the rest of the string)
                int intSeparator = strRemaining.IndexOf(',');
                if (intSeparator < 0) intSeparator = strRemaining.Length;
                string strPart = strRemaining.Substring(0, intSeparator);

                string strFirstWord = strPart.Trim().Split(' ')[0];

                // Labels.Contains on an array requires "using System.Linq;"
                if (Labels.Contains(strFirstWord))
                {
                    // The part starts with a label: output it, then apply the label
                    // to every " or "-separated alternative up to the next semicolon
                    listBox1.Items.Add(strPart.Trim());
                    strRemaining = strRemaining == strPart ? string.Empty : strRemaining.Substring(strPart.Length + 1);
                    int intSemiColon = strRemaining.IndexOf(';');
                    if (intSemiColon < 0) intSemiColon = strRemaining.Length;
                    foreach (string strSubPart in strRemaining.Substring(0, intSemiColon).Split(new string[] { " or " }, StringSplitOptions.None))
                    {
                        if (!string.IsNullOrWhiteSpace(strSubPart)) listBox1.Items.Add(strFirstWord + " " + strSubPart.Trim());
                    }

                    // Skip past the semicolon; if there is none, nothing is left to process
                    // (avoids an out-of-range Substring when a labelled section has no ';')
                    strRemaining = intSemiColon >= strRemaining.Length ? string.Empty : strRemaining.Substring(intSemiColon + 1);
                }
                else
                {
                    // Plain alternate word with no label
                    listBox1.Items.Add(strPart.Trim());

                    strRemaining = intSeparator >= strRemaining.Length ? string.Empty : strRemaining.Substring(intSeparator + 1);
                }
            }
        }


Author

Commented:
Thanks.
I'm using this approach for other lines of text.
It has revealed errors and missing data in my original code, and resulted in a rethink of the requirements.
It helped me prove to the powers that be that what they wanted wasn't possible.
