digitalpacman

ASP & VBScript RegEx Problem

Hi,

I know how to use RegEx pretty well, but this problem is mind-boggling!

I am forced to use VBScript because of where the ASP page is hosted, and VBScript RegEx does not support lookbehind!

I need to find a text string that is NOT directly after an id="

This means if I'm looking for ABC, then it would not find it in:   test id="ABC"
But it would find it in:   test ABC

With lookbehind this is cake, but I just can't find a way around it for VBScript!

I tried using word boundaries and lookaheads for the exact match, but none of it works, because the lookahead just fails, consumes no characters, and the engine goes on to the next character. So eventually it gets to just after the " mark, thinking nothing's wrong!

Anyone got any tips on how to simulate lookbehind? =(
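
To make the failure described above concrete, here is a minimal VBScript sketch. It assumes the script is run under cscript (in an ASP page, `WScript.Echo` would become `Response.Write`), and the pattern and sample strings are illustrative only, not the asker's real ones.

```vbscript
Option Explicit

Dim re
Set re = New RegExp

' A negative lookahead in front of the word asks
' "does id=" start HERE?", not "does id=" end just before here?"
re.Pattern = "(?!id="")ABC"

' At the position of the A, the upcoming text is ABC, not id=",
' so the lookahead succeeds and ABC is matched anyway.
WScript.Echo re.Test("test id=""ABC""")   ' returns True (the unwanted match)
WScript.Echo re.Test("test ABC")          ' returns True (the wanted match)
```

Both calls report a match, which is why a lookahead alone cannot stand in for the missing lookbehind.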
ASKER CERTIFIED SOLUTION
WMIF

WMIF

another option is to place the = and " in brackets and negate them.

regex.pattern = "[^=""]abc"
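
A quick way to see what this pattern does and does not match is a small sketch like the following. It assumes cscript as the host (use `Response.Write` in ASP), and the sample strings are made up for illustration.

```vbscript
Option Explicit

Dim re, matches, m
Set re = New RegExp
re.Pattern = "[^=""]abc"   ' one character that is neither = nor ", then abc

WScript.Echo re.Test("test id=""abc""")  ' returns False: abc is preceded by "
WScript.Echo re.Test("test abc")         ' returns True
WScript.Echo re.Test("abc")              ' returns False: nothing precedes abc

' The negated class consumes a character, so the match includes it.
Set matches = re.Execute("test abc")
Set m = matches(0)
WScript.Echo "[" & m.Value & "]"         ' [ abc]
```

Note that the match includes the preceding character, so a plain replace would swallow it too unless that character is captured and put back.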
SOLUTION
digitalpacman

ASKER

I like your solution, clockwatcher, but sadly I cannot use it. What I'm actually looking for is a long OR statement of different commands, so I have no way to replace them back using your method. Well, unless I do them one at a time and record all the necessary information, but that is just too much.

WMIF- Your pattern will not work. What is going to be entered is limitless. Spaces, no spaces, special characters, doesn't matter. I have to find that exact word every single time. Like finding "the" in a blog, without skipping run-together typos like "andthe".

WMIF- Your second pattern will not work, because RegEx (god knows freakin why) does not allow you to negate entire groups. What yours says is negate ONE " or ONE =. That means "abc" won't match. Also, doing [^id=""]{4} won't work, because what if the user puts idd"ABC? It won't match, and that's valid.

The only ways I found to do this were the replace method explained above, or a lookbehind, which is not supported.

What I ended up choosing to do:
(?:id="ABC")|(ABC)

This forces the RegEx to consume the id="ABC" in a non-capturing group, and if ABC is ever found outside the id it is put inside a capturing group. So then I loop through all the returned matches and skip the ones that have no captured group. (sort of)
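
The alternation idea described above could be sketched in VBScript roughly like this. The function name `ReplaceOutsideId` and its parameters are hypothetical, `wordList` is assumed to be a regex-safe alternation such as "NAME|OTHER", and the first branch here swallows `id="WORD` without requiring the closing quote, which is one possible reading of the approach.

```vbscript
Option Explicit

' Replaces every word from wordList with replacement,
' except where the word directly follows id=".
Function ReplaceOutsideId(text, wordList, replacement)
    Dim re, m, result, pos
    Set re = New RegExp
    re.Global = True
    ' Branch 1 swallows id="WORD without capturing it;
    ' branch 2 captures a bare WORD in SubMatches(0).
    re.Pattern = "id=""(?:" & wordList & ")|(" & wordList & ")"

    result = ""
    pos = 1   ' 1-based position of the first character not yet copied
    For Each m In re.Execute(text)
        ' Copy the untouched text between the previous match and this one.
        result = result & Mid(text, pos, m.FirstIndex + 1 - pos)
        If m.SubMatches(0) = "" Then
            result = result & m.Value      ' id="WORD: keep as is
        Else
            result = result & replacement  ' bare WORD: swap it
        End If
        pos = m.FirstIndex + 1 + m.Length  ' FirstIndex is 0-based
    Next
    ReplaceOutsideId = result & Mid(text, pos)
End Function

WScript.Echo ReplaceOutsideId("<div id=""NAME"">Greetings NAME</div>", "NAME", "digitalpacman")
' <div id="NAME">Greetings digitalpacman</div>
```

Because the id="WORD branch is tried first at each position, the engine consumes the protected occurrence before the bare-word branch ever gets a chance at it.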

Thanks for your help!

can you give more examples of the data you want returned and the data you don't want returned?  it's tough to build a regex with only one sample.  the reason i suggested those patterns is that they do work with that sample.  you are right that it only negates ONE = or ONE ", but in front of that ABC in your sample, my pattern worked.

sorry, got interrupted.  should have been:

but in front of that ABC in your sample there is only one character.  i was assuming that this was html code and that there may or may not be a quote around the value.  i built my pattern to work with that.

Sure, I'll give you some examples, uhm..

I want to find all occurrences of NAME that are not inside an id attribute, and replace them with digitalpacman.


<div id="NAME">Greetings NAME, how are you today? Would you like some coffee?</div>
<div id="NAME">Greetings digitalpacman, how are you today? Would you like some coffee?</div>

What kind of NAME do you enjoy?
What kind of digitalpacman do you enjoy?

<a href="NAME/fniishthelink.asp">NAME</a>
<a href="digitalpacman/fniishthelink.asp">NAME</a>

<img src="../NAME/wooo.asp"/>
<img src="../digitalpacman/wooo.asp"/>

itdoesntklju2489rfjsio7u213760-941328990274oi12lk4;4e[rfpsps[NAMERokdfgjhdjkfh324987384293oprsfsNAMEmatterlsjkfdhskfdoesntmatterowu489324y4w]f\rwrfsdf87rw84rs7f8<NAME78fs8d4 5f642w3894rw64r5e3id="NAME"

itdoesntklju2489rfjsio7u213760-941328990274oi12lk4;4e[rfpsps[digitalpacmanRokdfgjhdjkfh324987384293oprsfsdigitalpacmanmatterlsjkfdhskfdoesntmatterowu489324y4w]f\rwrfsdf87rw84rs7f8<digitalpacman78fs8d4 5f642w3894rw64r5e3id="NAME"


P.S. This has to be completely BUG proof: no matter what string is being searched, it must never produce an error, whatever the circumstance. The only leeway you get is that the id attribute will always have "" marks around its value if it's the string you are searching for.

I am not only searching for one specific string, I am searching for any occurrence of a long list of them.

My regex would have this in it:      .Pattern = "[^id=""](THISSTRING|THATSTRING|ORTHISTOO|MAYBETHISASWELL)"
The lookbehind would be: .Pattern = "(?<!(?:id=""))(THISSTRING|THATSTRING|ORTHISTOO|MAYBETHISASWELL)"
PS. I'm willing to give you both split points, even if my answer isn't found

i don't have time to set up the testing right now for an asp page, but i do know that asp.net supports lookahead/lookbehind.  could it be an option to write an asp.net page and call that from the asp page?

Nope. Can't use ASP.NET, I already thought about that option.

What I did was just grab all of the matches and loop through them, checking whether each one is preceded by id=" in an ASP VBScript function instead of inside the regex.
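
A rough sketch of that final approach, under a few assumptions: the helper names `FollowsIdAttr` and `ReplaceUnlessAfterId` are hypothetical, `wordList` is a regex-safe alternation such as "NAME|OTHER", and the check for id=" is case-sensitive. It is not the asker's actual code, just one way the described loop might look.

```vbscript
Option Explicit

' True when the four characters just before the 0-based offset
' firstIndex in text are exactly id=" (case-sensitive comparison).
Function FollowsIdAttr(text, firstIndex)
    FollowsIdAttr = False
    If firstIndex >= 4 Then
        ' Mid is 1-based: 0-based offset firstIndex - 4 is position firstIndex - 3.
        FollowsIdAttr = (Mid(text, firstIndex - 3, 4) = "id=""")
    End If
End Function

Function ReplaceUnlessAfterId(text, wordList, replacement)
    Dim re, m, result, pos
    Set re = New RegExp
    re.Global = True
    re.Pattern = wordList   ' plain alternation, no lookaround at all

    result = ""
    pos = 1   ' 1-based position of the first character not yet copied
    For Each m In re.Execute(text)
        result = result & Mid(text, pos, m.FirstIndex + 1 - pos)
        If FollowsIdAttr(text, m.FirstIndex) Then
            result = result & m.Value      ' directly after id=": leave alone
        Else
            result = result & replacement
        End If
        pos = m.FirstIndex + 1 + m.Length
    Next
    ReplaceUnlessAfterId = result & Mid(text, pos)
End Function

WScript.Echo ReplaceUnlessAfterId("<img src=""../NAME/wooo.asp""/> id=""NAME""", "NAME", "digitalpacman")
' <img src="../digitalpacman/wooo.asp"/> id="NAME"
```

Doing the "lookbehind" with `Mid` keeps the pattern itself trivial, at the cost of rebuilding the output string by hand.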