Solved

Need regex help!!!

Posted on 2009-05-07
11
345 Views
Last Modified: 2012-05-06
I need to extract items 3,5, and 6 from the markup below:

<ul>
    <p id="XXX" XXX="test" class="something title test XXX" id="XXX" XXX="test">tester 2 XXX tester</p>
        <XXX>
           tester 3 XXX Care is a type of care that allows people facing
        </p>
        <XXX>
            testXXXer 4 Care is a type of care that allows people facing
        </p>
        <XXX>
            tester 5 XXX Care is a type of 6 XXX care that allows people facing
        </p>

Currently the regex I have (seen below) gets 2,3,5, and 6.  How do I get it to stop hitting 2?
(?<!<[^>]*(class="[^"]*title[^"]*")[^>]*)

(?<=(>[^<>]*))

(\sXXX\s)

Open in new window

0
Comment
Question by:abemiester
  • 5
  • 5
11 Comments
 
LVL 84

Expert Comment

by:ozo
Comment Utility
([^<>2]*[356]([^<>2]*)
0
 
LVL 84

Expert Comment

by:ozo
Comment Utility
([^<>2]*[356][^<>2]*)
0
 

Author Comment

by:abemiester
Comment Utility
I apologize for the confusion.  I'm trying to match "XXX" the numbers i refer to are the ones infront of "XXX".
0
 

Author Comment

by:abemiester
Comment Utility
And to clarify those are the ONLY instances of "XXX" i want to match.  If you look at the source you can see there are several unnumbered instances of XXX as well that must not be matched.
0
 
LVL 84

Expert Comment

by:ozo
Comment Utility
(?:\b[356]\b\s*)(\w+)
0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 

Author Comment

by:abemiester
Comment Utility
Close.  Let me explain what I am tryign to achieve.

I want to match any "XXX" that is not inside of a tag that uses the class "title".  In addition the XXX cannot be inside of a tag.

Example (Should not match):
<p XXX="id" id="XXX">test</p>

Example(Should not match):
<p class="test XXX title">test</p>

Example(Should not match):
<p class="test title">XXX</p>

Example(Should match):
<p class="test">XXX</p>

Does this help clarify what i'm trying to do?
0
 
LVL 4

Expert Comment

by:orbitus
Comment Utility
How about...

(\d+)\sXXX(?!.+?</p>)
0
 

Author Comment

by:abemiester
Comment Utility
The digit is really irrelevant.  The only reason i put it in the text was so I could refer to it in this post.

I want the XXX that is NEXT to 3,5, and 6.  I don't care about the numbers 3,5, or 6.  I just included them so i could explain which XXX i wanted to match.  I think that the original regex i provided is a good place to start.

  Also it can be any html tag.  Not just <p>.  Trying to match XXX that is not in ANY tag with the class "title" and XXX that is not an attribute of or ANY tag.
0
 
LVL 84

Expert Comment

by:ozo
Comment Utility
(?:<(\w+)\b[^>]*class="[^"]*title[^"]*"[^>]*>[\s\S]*?<\/\1|[\S\s])*?(XXX)(?![^<>]*>)
0
 

Author Comment

by:abemiester
Comment Utility
That's almost it ozo!  Only problem is that it matches the XXX with a 4 infront of it.  If we can stop matching that one we're set.
0
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
Comment Utility
(?:<(\w+)\b[^>]*class="[^"]*title[^"]*"[^>]*>[\s\S]*?<\/\1|[\S\s])*?\b(XXX)\b(?![^<>]*>)
0

Featured Post

Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

Join & Write a Comment

by Batuhan Cetin Regular expression is a language that we use to edit a string or retrieve sub-strings that meets specific rules from a text. A regular expression can be applied to a set of string variables. There are many RegEx engines for u…
I have been reconstructing a PHP-based application that has grown into a full blown interface system over the last ten years by a developer that has now gone into business for himself building websites. I am not incredibly fond of writing PHP code o…
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now