Solved

Need regex help!!!

Posted on 2009-05-07
Last Modified: 2012-05-06
I need to extract items 3, 5, and 6 from the markup below:

<ul>
    <p id="XXX" XXX="test" class="something title test XXX" id="XXX" XXX="test">tester 2 XXX tester</p>
        <XXX>
           tester 3 XXX Care is a type of care that allows people facing
        </p>
        <XXX>
            testXXXer 4 Care is a type of care that allows people facing
        </p>
        <XXX>
            tester 5 XXX Care is a type of 6 XXX care that allows people facing
        </p>

Currently the regex I have (seen below) gets 2, 3, 5, and 6.  How do I get it to stop hitting 2?
(?<!<[^>]*(class="[^"]*title[^"]*")[^>]*)

(?<=(>[^<>]*))

(\sXXX\s)

Question by:abemiester
11 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 24327852
([^<>2]*[356]([^<>2]*)
 
LVL 84

Expert Comment

by:ozo
ID: 24327858
([^<>2]*[356][^<>2]*)
 

Author Comment

by:abemiester
ID: 24327883
I apologize for the confusion.  I'm trying to match "XXX"; the numbers I refer to are the ones in front of "XXX".
 

Author Comment

by:abemiester
ID: 24327896
And to clarify, those are the ONLY instances of "XXX" I want to match.  If you look at the source, you can see there are several unnumbered instances of XXX as well that must not be matched.
 
LVL 84

Expert Comment

by:ozo
ID: 24327957
(?:\b[356]\b\s*)(\w+)

 

Author Comment

by:abemiester
ID: 24328227
Close.  Let me explain what I am trying to achieve.

I want to match any "XXX" that is not inside an element whose class includes "title".  In addition, the XXX cannot be part of the tag itself (for example, an attribute name or value).

Example (Should not match):
<p XXX="id" id="XXX">test</p>

Example (Should not match):
<p class="test XXX title">test</p>

Example (Should not match):
<p class="test title">XXX</p>

Example (Should match):
<p class="test">XXX</p>

Does this help clarify what I'm trying to do?
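
The four examples above can be turned into a small executable check. The sketch below is not one of the regexes from this thread: it swaps in Python's standard `html.parser` to express the same rule (count XXX only in text that is not inside any element carrying the class "title"). The names `XxxFinder` and `count_matches` are made up for illustration, the whole-word test `\bXXX\b` is an assumption, and the depth tracking assumes well-formed markup with no unclosed void tags such as <br>.

```python
import re
from html.parser import HTMLParser


class XxxFinder(HTMLParser):
    """Counts whole-word XXX occurrences in text outside class="title" elements."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one bool per open element: does it have class "title"?
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (valueless attribute), hence the `or ""`.
        classes = (dict(attrs).get("class") or "").split()
        self.stack.append("title" in classes)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Only text nodes arrive here, so tag names and attributes never count.
        if not any(self.stack):
            self.count += len(re.findall(r"\bXXX\b", data))


def count_matches(html):
    finder = XxxFinder()
    finder.feed(html)
    finder.close()
    return finder.count


print(count_matches('<p XXX="id" id="XXX">test</p>'))       # 0
print(count_matches('<p class="test XXX title">test</p>'))  # 0
print(count_matches('<p class="test title">XXX</p>'))       # 0
print(count_matches('<p class="test">XXX</p>'))             # 1
```

Any candidate regex for this question should give the same answers on these four inputs.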
 
LVL 4

Expert Comment

by:orbitus
ID: 24329166
How about...

(\d+)\sXXX(?!.+?</p>)
 

Author Comment

by:abemiester
ID: 24330060
The digit is really irrelevant.  The only reason I put it in the text was so I could refer to it in this post.

I want the XXX that is NEXT to 3, 5, and 6.  I don't care about the numbers 3, 5, or 6.  I just included them so I could explain which XXX I wanted to match.  I think that the original regex I provided is a good place to start.

  Also, it can be any HTML tag, not just <p>.  I'm trying to match XXX that is not in ANY tag with the class "title" and XXX that is not an attribute or name of ANY tag.
 
LVL 84

Expert Comment

by:ozo
ID: 24330397
(?:<(\w+)\b[^>]*class="[^"]*title[^"]*"[^>]*>[\s\S]*?<\/\1|[\S\s])*?(XXX)(?![^<>]*>)
 

Author Comment

by:abemiester
ID: 24330473
That's almost it, ozo!  The only problem is that it matches the XXX with a 4 in front of it.  If we can stop matching that one, we're set.
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 24330584
(?:<(\w+)\b[^>]*class="[^"]*title[^"]*"[^>]*>[\s\S]*?<\/\1|[\S\s])*?\b(XXX)\b(?![^<>]*>)
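
As a rough check, the accepted pattern can be run over the sample markup from the question. This is only a sketch and assumes Python's `re` module, since the thread never names the regex engine in use. The helper `labelled_hits` is a made-up name; it reports the digit printed in front of each matched XXX so the hits are easy to compare with the numbering in the question.

```python
import re

# The accepted pattern, unchanged apart from being split over two lines.
PATTERN = re.compile(
    r'(?:<(\w+)\b[^>]*class="[^"]*title[^"]*"[^>]*>[\s\S]*?<\/\1|[\S\s])*?'
    r'\b(XXX)\b(?![^<>]*>)'
)

# Sample markup copied from the question.
SAMPLE = '''<ul>
    <p id="XXX" XXX="test" class="something title test XXX" id="XXX" XXX="test">tester 2 XXX tester</p>
        <XXX>
           tester 3 XXX Care is a type of care that allows people facing
        </p>
        <XXX>
            testXXXer 4 Care is a type of care that allows people facing
        </p>
        <XXX>
            tester 5 XXX Care is a type of 6 XXX care that allows people facing
        </p>
'''


def labelled_hits(text):
    """Return the digit that sits two characters before each matched XXX."""
    hits = []
    for match in PATTERN.finditer(text):
        start = match.start(2)         # group 2 is the XXX itself
        hits.append(text[start - 2])   # "<digit><space>XXX" in the sample
    return hits


print(labelled_hits(SAMPLE))  # ['3', '5', '6']
```

Group 2 holds the matched XXX. The leading non-capturing group only consumes text, swallowing a whole class="title" element in one step when it meets one, and the trailing lookahead rejects an XXX that sits inside a tag.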
Question has a verified solution.
