Regular Exp Help! Matching where _NOT_ ?=lookahead
Posted on 2006-05-30
I need some help with a regexp.
I am doing screen scrapes and need to filter out the last match. Here is an example of the code and the hits:
import re
# I understand that the .+ is greedy; I need to find a way to make it less so: lookaheads?
match = re.findall(r'[0-9]+\.html">.+</a>', s)
for sMatch in match:
    print(sMatch)
#====================== OUTPUT =====
165810536.html">Free Wrought Iron Railings</a>
165807325.html">2 Free Baby Seats! (Get 'em now!)</a>
165806607.html">Free Brookstone Dustbuster- Needs Fixing</a>
100.html">next 100 postings</a> #<----- This line needs to _not_ be a match
For reference, the site I'm scraping is Craigslist.org.
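
One possible approach, sketched here under assumptions: a negative lookahead `(?!...)` placed right after `">` rejects any link whose text starts with "next N postings", and `[^<]+` replaces the greedy `.+` so the link text cannot run past the closing tag. The sample string `s` below is hypothetical (the real page markup, including the `href` values, may differ), and the phrase "next N postings" is assumed to be the only thing that distinguishes the unwanted link.

```python
import re

# Hypothetical sample of the scraped HTML; the real page markup may differ.
s = '''<a href="/zip/165810536.html">Free Wrought Iron Railings</a>
<a href="/zip/165807325.html">2 Free Baby Seats! (Get 'em now!)</a>
<a href="/zip/165806607.html">Free Brookstone Dustbuster- Needs Fixing</a>
<a href="index100.html">next 100 postings</a>'''

# (?!next [0-9]+ postings) is a negative lookahead: the match is abandoned
# if the link text begins with "next <digits> postings".
# [^<]+ matches the link text up to the next "<", so it cannot overrun </a>.
pattern = re.compile(r'[0-9]+\.html">(?!next [0-9]+ postings)[^<]+</a>')

matches = pattern.findall(s)
for m in matches:
    print(m)
```

If the posting IDs are always long numbers, requiring a minimum digit count (for example `[0-9]{5,}`) might be a simpler filter, though that relies on an assumption about the ID format that the post does not confirm.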