Solved

RegEx Help

Posted on 2014-04-28
Last Modified: 2014-04-29
I want to look for matches in a string, and those matches may contain alphanumeric values as well as special characters. Here is the example source:

%*)
bright
bright light
bright future

The expression is

\W(bright|bright light|bright future)\W

It completely ignores the %*) and finds "bright" twice, with no "bright light" or "bright future". Just bright. Can someone please help?

Thanks
JS
Question by:jimmysaunders
18 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 40027887
(bright light|bright future|bright|%\*\))
 

Author Comment

by:jimmysaunders
ID: 40027895
\W(bright light|bright future|bright|%\*\))\W

This only finds bright light.
 
LVL 35

Expert Comment

by:Dan Craciun
ID: 40027899
Try this:
(bright( light| future)?|%\*\))
 

Author Comment

by:jimmysaunders
ID: 40027912
@"\W(bright( light| future)?|%\*\))\W"

Finds bright lights and lights. Also, at run time I'm getting the words from a database, so there is a lot more to do if that is the only way to go.
 
LVL 84

Expert Comment

by:ozo
ID: 40027921
Are you saying that none of them find "bright future"?
What are the characters immediately preceding and following "bright future"?
 
LVL 15

Expert Comment

by:WalkaboutTigger
ID: 40027931
What do you want it to return - "%*)" ?

If that's the case, try

\W|_
 

Author Comment

by:jimmysaunders
ID: 40027937
Thanks for the responses folks.

@WalkaboutTigger: I want it to find all four words.

@ozo: Yes, none of them found "bright future". In the source string, the characters following it are \r\n (it's a file whose contents I am searching, hence the line feed).
 
LVL 15

Expert Comment

by:WalkaboutTigger
ID: 40027947
What language are you using this in?
And do you only wish to find those 4 specific strings, or others as well?
 

Author Comment

by:jimmysaunders
ID: 40027954
C#. And I am using those words just as an example. The list is huge.
 

Author Comment

by:jimmysaunders
ID: 40027960
Maybe the original code will help:

var regex = new Regex(string.Format(@"\W({0})\W", strKey), RegexOptions.IgnoreCase);

var matches = regex.Matches(filecontents);
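One possible reason the `\W({0})\W` pattern misses entries: `\W` has to consume a real character on each side, so a key at the very start or end of the contents has nothing to match against, and the delimiter consumed at the end of one match is no longer available to begin the next. A minimal sketch, assuming zero-width lookarounds are acceptable in place of `\W` (the `BoundaryDemo` class and `CountMatches` helper are hypothetical names, not part of the original code):

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Hypothetical helper: counts matches of an already-escaped alternation
    // using zero-width checks instead of consuming \W on each side.
    public static int CountMatches(string escapedAlternation, string contents)
    {
        // (?<!\w) : not preceded by a word character (also true at start of input)
        // (?!\w)  : not followed by a word character (also true at end of input)
        string pattern = string.Format(@"(?<!\w)({0})(?!\w)", escapedAlternation);
        return Regex.Matches(contents, pattern, RegexOptions.IgnoreCase).Count;
    }

    // Same count with the consuming \W...\W form, for comparison.
    public static int CountMatchesConsuming(string escapedAlternation, string contents)
    {
        string pattern = string.Format(@"\W({0})\W", escapedAlternation);
        return Regex.Matches(contents, pattern, RegexOptions.IgnoreCase).Count;
    }
}
```

For the input "bright bright", `CountMatchesConsuming("bright", ...)` finds nothing because neither occurrence has a non-word character on both sides, while `CountMatches("bright", ...)` returns 2. Note that a symbol key glued directly to a word character would still be rejected by these lookarounds.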
 
LVL 35

Expert Comment

by:Dan Craciun
ID: 40027964
OK.
Ignore the regular expressions for the moment, and please state in words what you want to achieve.
 

Author Comment

by:jimmysaunders
ID: 40027997
I have a list of words, phrases, and symbols in a table in a database. I have a flat file, and I want to find all the words, phrases, and symbols from the table that occur in that file. The filecontents variable in the above example is the string that contains the contents of the file I am searching, and strKey is the pipe-delimited word list from the table, which can be, for example, something like:

(bright|bright future|bright light)


It works fine as long as it's a single word and contains no special characters. But phrases and symbols are not working.
 

Author Comment

by:jimmysaunders
ID: 40028015
Interestingly, it finds the phrases that are unique but in the above example it finds "bright" and then "bright " and another "bright "
 
LVL 15

Expert Comment

by:WalkaboutTigger
ID: 40028038
So you essentially have two arrays (the flat file and the table from the database) and you are attempting to determine intersections.

Using a regular expression for this will dramatically affect performance.

Likely the fastest method of solving this scenario is to first sort both arrays and then, using the db array as your authoritative source, iterate through the contents of the flat file array and determine whether each entry is stored in the db.

Are you trying to keep the matches or the differences?

What language do you want this written in, or is pseudo-code sufficient?
 
LVL 35

Expert Comment

by:Dan Craciun
ID: 40028039
That's normal. 'bright' is the first in the alternation, so the moment it finds 'bright' the match is complete.

You would need to reverse the alternation so that it finds the rest of the forms. See ozo's solution.
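Since the keys come from a database at run time, the ordering can be done in code rather than by hand. A minimal sketch, assuming the keys arrive as a plain list of strings (the `KeyPattern` class and its `Build` and `FindAll` helpers are hypothetical names): it sorts longer keys first so that "bright" cannot win over "bright light", and passes each key through `Regex.Escape` so that symbols such as `%*)` are matched literally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeyPattern
{
    // Hypothetical helper: builds a pipe-delimited alternation from raw keys.
    public static string Build(IEnumerable<string> keys)
    {
        var parts = keys
            .OrderByDescending(k => k.Length)   // longer phrases are tried first
            .Select(k => Regex.Escape(k));      // metacharacters become literals
        return string.Join("|", parts);
    }

    // Hypothetical helper: returns every matched value in order of appearance.
    public static List<string> FindAll(IEnumerable<string> keys, string contents)
    {
        var regex = new Regex("(" + Build(keys) + ")", RegexOptions.IgnoreCase);
        return regex.Matches(contents).Cast<Match>().Select(m => m.Value).ToList();
    }
}
```

With the four example keys and the example source from the question, `FindAll` returns "%*)", "bright", "bright light", and "bright future", in that order. This sketch does not add any word-boundary check around the alternation; that is a separate choice.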
 

Author Comment

by:jimmysaunders
ID: 40028049
@WalkaboutTigger: I'm using C# and I am trying to keep the matches.
 
LVL 15

Accepted Solution

by:WalkaboutTigger (earned 1200 total points)
ID: 40028102
foreach (string str in strArray)
{
    // Reset the flag for each entry so an earlier match does not hide later misses.
    bool matchFound = false;
    foreach (string str2 in strArray2)
    {
        if (str == str2)
        {
            matchFound = true;
            Console.WriteLine("a match has been found");
            break; // One match is enough for this entry.
        }
    }

    if (!matchFound)
    {
        Console.WriteLine("no match found");
    }
}


or, in fewer lines

foreach (string str in strArray)
{
    if(strArray2.Contains(str))
    {
       Console.WriteLine("a match has been found");
    }
    else
    {
       Console.WriteLine("no match found");
    }
}


A regular expression would take far, far, FAR longer to execute.
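For the whole-entry comparison shown in the loops above, a hash set lookup is a common alternative that avoids scanning the second array for every entry. This is a sketch of a swapped-in technique, not the code from this answer: the `ExactMatcher` class and `Intersect` helper are hypothetical names, and the case-insensitive comparer is an assumption chosen to mirror `RegexOptions.IgnoreCase` in the original code (the loops above compare with `==`, which is case-sensitive).

```csharp
using System;
using System.Collections.Generic;

public static class ExactMatcher
{
    // Hypothetical helper: returns the file entries that also appear in the key list.
    public static List<string> Intersect(IEnumerable<string> fileEntries, IEnumerable<string> dbKeys)
    {
        // Assumption: case-insensitive matching is wanted.
        var keySet = new HashSet<string>(dbKeys, StringComparer.OrdinalIgnoreCase);
        var matches = new List<string>();
        foreach (string entry in fileEntries)
        {
            if (keySet.Contains(entry))
            {
                matches.Add(entry); // keep the match, as the author asked
            }
        }
        return matches;
    }
}
```

Like the loops above, this only finds entries that equal a key exactly; it does not find a phrase embedded in a longer line of the file.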
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 40028472
Off Topic
"A regular expression would take far, far, FAR longer to execute."
That's an exaggeration, I think. Depending on how the regex is written, a regex could theoretically outperform nested loops.