Solved

Need one Standard Regular Expression string

Posted on 2013-02-06
Medium Priority
208 Views
Last Modified: 2013-10-15
I am using a program G-mapper to create a sitemap.xml file.  However, I do not want it to index certain pages.  

The program allows the filtering out of pages using Standard Regular Expressions.  Their help page offers RegEx help links to the following sites:
   
Following is an example of an aspx page that I do not want indexed.
   http://www.companysite.com/ca/anaheim/6008-e.-calle-cedro/4641217/?sorigin=hb

For the record, the above URL is a profile page for a Real Estate listing.  We only list properties in California, so /ca/ is considered static text.  
   http://www.companysite.com/ca/{city}/{address}/{propertyID}/{variable}.  


Based on the above URL, I do not want to crawl any /ca/{city}/{address} pages.   But I am okay with it crawling other sub directory city pages such as /ca/{city}/housingmarkettrends.  

So, in layman's terms, below is the pattern that I think I need to trap.  For ease of reading, I have broken each piece of the URL string out onto its own row:

1. http://www.companysite.com/ca/
2. followed by {any string of chars, including special chars such as hyphens and periods, that ends with a forward slash}
3. followed by {string of chars that begins with a digit (zero through nine) and ends with a forward slash}
4. followed by {string of chars that contains only digits (zero through nine) and ends with a forward slash}
5. followed by {string of chars that begins with a question mark and ends with a forward slash}

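For readers following along, the five pieces above can be written out as a commented pattern. This is only a sketch in Python's `re` syntax, not G-Mapper's own dialect (which the thread does not document). The name `LISTING_URL` is made up for illustration, and treating piece 5 as an optional query string is an assumption based on the example URL, which ends in `?sorigin=hb` with no trailing slash.

```python
import re

# Hypothetical name; each fragment maps to one numbered piece of the breakdown.
LISTING_URL = re.compile(r"""
    http://www\.companysite\.com/ca/   # 1. fixed prefix (dots escaped to be literal)
    [^/?]+/                            # 2. city: anything up to the next slash
    [0-9][^/?]*/                       # 3. address: starts with a digit
    [0-9]+/                            # 4. property ID: digits only
    (?:\?.*)?                          # 5. query string, assumed optional
""", re.VERBOSE)

listing = "http://www.companysite.com/ca/anaheim/6008-e.-calle-cedro/4641217/?sorigin=hb"
# Hypothetical URL following the /ca/{city}/housingmarkettrends shape.
trends = "http://www.companysite.com/ca/anaheim/housingmarkettrends"

print(bool(LISTING_URL.fullmatch(listing)))  # True: should be filtered out
print(bool(LISTING_URL.fullmatch(trends)))   # False: should still be crawled
```
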
FYI, I was provided an expression that seems to be legal, but the program seems to ignore it.  Maybe it's not a STANDARD Regular Expression???

   http://www.companysite.com/ca/[^?&/]+/\d[^?&/]*/\d+/
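
As a sanity check, the expression above can be tried outside G-Mapper. The sketch below uses Python's `re` module, which is not necessarily the engine G-Mapper uses. The `trends` URL is a made-up example of the `/ca/{city}/housingmarkettrends` shape. The last two lines illustrate one possible reason a tool might appear to ignore the pattern: if it requires the expression to match the whole URL rather than a prefix, the trailing `?sorigin=hb` makes the match fail. Whether G-Mapper behaves that way is an assumption, not something confirmed in this thread.

```python
import re

# The expression exactly as quoted in the question.
PATTERN = r"http://www.companysite.com/ca/[^?&/]+/\d[^?&/]*/\d+/"

listing = "http://www.companysite.com/ca/anaheim/6008-e.-calle-cedro/4641217/?sorigin=hb"
# Hypothetical URL following the /ca/{city}/housingmarkettrends shape.
trends = "http://www.companysite.com/ca/anaheim/housingmarkettrends"

# Prefix match: the pattern only has to match from the start of the URL.
print(bool(re.match(PATTERN, listing)))  # True
print(bool(re.match(PATTERN, trends)))   # False

# Whole-string match: fails because of the trailing query string...
print(bool(re.fullmatch(PATTERN, listing)))          # False
# ...unless the pattern is allowed to consume the rest of the URL.
print(bool(re.fullmatch(PATTERN + r".*", listing)))  # True
```
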

I look forward to any expert advice on the topic.  Best Regards.
Question by:PAEWINS
6 Comments
 
LVL 35

Assisted Solution

by:Terry Woods
Terry Woods earned 300 total points
ID: 38861787
One thing you could try is changing each \d to [0-9]
 
LVL 20

Assisted Solution

by:simon3270
simon3270 earned 300 total points
ID: 38863329
Also, the "+" (match one or more times) is not part of POSIX Basic regexes, and neither is "?" (match 0 or 1 times); the only repetition operator you can rely on there is "*" (match 0 or more times).  You can get the "+" effect of, for example, "[^?&/]+" with:
    [^?&/][^?&/]*
and "\d+" with:
    [0-9][0-9]*

So, combining Terry's suggestion with mine, you get:
     http://www.companysite.com/ca/[^?&/][^?&/]*/[0-9][^?&/]*/[0-9][0-9]*/
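
To check that the rewrite does not change what gets matched, both forms can be compared on a few URLs. This is a sketch in Python, used here only as a convenient test bench; it says nothing about how G-Mapper itself interprets either pattern. The sample URLs other than the listing from the question are invented, and `ORIGINAL`, `REWRITTEN`, and `samples` are illustrative names. Note that Python's `\d` also accepts non-ASCII digits, so the two forms are only equivalent for plain ASCII URLs like these.

```python
import re

ORIGINAL = r"http://www.companysite.com/ca/[^?&/]+/\d[^?&/]*/\d+/"
REWRITTEN = r"http://www.companysite.com/ca/[^?&/][^?&/]*/[0-9][^?&/]*/[0-9][0-9]*/"

samples = [
    # Listing URL from the question.
    "http://www.companysite.com/ca/anaheim/6008-e.-calle-cedro/4641217/?sorigin=hb",
    # Hypothetical URLs that should not be filtered.
    "http://www.companysite.com/ca/anaheim/housingmarkettrends",
    "http://www.companysite.com/ca/anaheim/",
    "http://www.companysite.com/ca//6008/1/",
]

results = []
for url in samples:
    old = bool(re.match(ORIGINAL, url))
    new = bool(re.match(REWRITTEN, url))
    # Both forms should agree on every sample.
    assert old == new
    results.append(new)

print(results)  # [True, False, False, False]
```
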
 

Author Comment

by:PAEWINS
ID: 39553195
I am closing this old issue.

 
LVL 20

Expert Comment

by:simon3270
ID: 39553785
Were our suggestions useful at the time?
 

Accepted Solution

by:PAEWINS
PAEWINS earned 0 total points
ID: 39553838
No.  But thanks.
 

Author Closing Comment

by:PAEWINS
ID: 39573143
No solution provided.  But I appreciate the attempts.