  • Status: Solved

ASP & VBScript RegEx Problem

Hi,

I know how to use RegEx pretty well, but this problem is mind boggling!

I am forced to use VBScript for the ASP page, and VBScript RegEx does not support lookbehind!

I need to find a text string that is NOT directly preceded by id="

This means that if I'm looking for ABC, it would not be found in: test id="ABC"
But it would be found in: test ABC

With lookbehind this is cake, but I just can't find a way around it for VBScript!

I tried using word boundaries and lookaheads for the exact match. None of it works, because the attempt just fails, consumes no characters, and moves on to the next character. So eventually it gets past the " mark thinking nothing's wrong!

Anyone got any tips on how to simulate lookbehind? =(
Asked:
digitalpacman
 
WMIF commented:
what about just forcing a space in front?  do you have other scenarios that would kill this option?  it works perfectly in your example above.

regex.pattern = "\sabc"

WMIF commented:
another option is to place the = and " in brackets and negate them.

regex.pattern = "[^=""]abc"

clockwatcher commented:
Not sure whether you just wanted to check for existence or do a replace-- checking for existence is much easier.  Below is an example of both.  I know you won't like the replace (and admittedly it's a hack).  But w/o a look-behind, it's the easiest way to do it--- and in practice you wouldn't run into a problem with it.

Assuming you can't make any assumptions about the text being searched and that you would want to match this "is" and all the 'is's in between and just not match this: id="is".  The cheater's way is to do something like this:

option explicit

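' Returns true when pattern occurs at least once outside of id="pattern:
' it counts all matches and compares that with the number preceded by id=".
function patternExists(searchIn, pattern)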

   dim re,c1,c2
   set re = new RegExp
   re.pattern = "id=""" & pattern
   re.global = true
   c1 = re.execute(searchIn).count
   re.pattern = pattern
   c2 = re.execute(searchIn).count

   patternExists = c2 > c1

end function

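' Replaces every match of pattern with replacewith, except matches directly
' preceded by id=". Those are swapped for a placeholder first and restored last.
' ByVal keeps the caller's string from being modified by the assignments below.
function patternReplace(ByVal searchIn, pattern, replacewith)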

   dim boguspattern,re
      
   boguspattern = "~~THE_ODDS~OF_EVeR_having_THIS_PATTERN_occur_NAtURALLY_Are#INFINITESMAL"

   set re = new RegExp
   re.ignorecase = true
   re.global=true

   re.pattern = pattern
   boguspattern = re.replace(boguspattern,"")

   re.pattern = "id=""" & pattern
   searchIn = re.replace(searchIn, boguspattern)
   
   re.pattern = pattern
   searchIn = re.replace(searchIn, replacewith)
   
   re.pattern = boguspattern
   re.ignorecase = false
   patternReplace = re.replace(searchIn, "id=""" & pattern)

end function


dim s
s = "Assuming you can't make any assumptions about the text being searched and that you would want to match this ""is"" and all the 'is's in between and just not match this: id=""is"".  The cheater's way is to do something like this:"

response.write patternReplace(s, "is", "isnot")

' Writes "Nope": the only "is" here sits directly after id="
if patternExists("id=""is""", "is") then
   response.write "Yep"
else
   response.write "Nope"
end if


' Writes "Yep": the first "is" is not preceded by id="
if patternExists("is id=""is""", "is") then
   response.write "Yep"
else
   response.write "Nope"
end if
 
digitalpacman (author) commented:
I like your solution, clockwatcher, but sadly I cannot use it. What I'm actually looking for is a long OR list of different commands, so I have no way to replace them back using your method. Well, unless I do them one at a time and record all the necessary information, but that is just too much.

WMIF- Your pattern will not work. What is going to be entered is limitless. Spaces, no spaces, special characters, doesn't matter. I have to find that exact word every single time. Like finding "the" in a blog, without skipping run-together typos such as "andthe".

WMIF- Your second pattern will not work, because RegEx (god knows freakin why) does not allow you to negate entire groups. What yours says is negate ONE " or ONE =. That means abc directly after a quote, as in "abc", won't match. Also, doing [^id=""]{4} won't work, because what if the user puts idd"ABC? It won't match, and that's valid.
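
A small sketch of the character-class problem described above, for anyone who wants to try it. FirstMatch is a hypothetical helper written only for this illustration, and the sample strings are made up.

```vbscript
Option Explicit

' Hypothetical helper: returns the text of the first match of pattern
' in searchIn, or an empty string when there is no match.
Function FirstMatch(searchIn, pattern)
    Dim re, matches
    Set re = New RegExp
    re.Pattern = pattern
    Set matches = re.Execute(searchIn)
    If matches.Count > 0 Then
        FirstMatch = matches(0).Value
    Else
        FirstMatch = ""
    End If
End Function

' [^=""] is a character class: it matches exactly ONE character that is
' neither = nor ", and that character becomes part of the match.
' FirstMatch("test ABC", "[^=""]ABC")     -> " ABC"  (the space is consumed too)
' FirstMatch("ABC", "[^=""]ABC")          -> ""      (nothing in front of ABC)
' FirstMatch("say ""ABC""", "[^=""]ABC")  -> ""      (rejected, though not after id=")
```

So the class both swallows an extra character and rejects valid hits, which is why it cannot stand in for a lookbehind.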

The only way to do this I found was the replace method explained above, or a lookbehind which is not supported.

What I ended up choosing to do:
(?:id="ABC)|(ABC)

This makes the RegEx consume id="ABC without capturing it, and if ABC is ever found outside an id it ends up in a capturing group. So then I loop through all the returned matches and skip the ones whose capturing group is empty. (sort of)
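
A minimal sketch of that alternation idea, not the exact code used here. ReplaceOutsideId and its parameter names are hypothetical. It assumes wordPattern is a plain list like A|B with no capturing groups of its own, that it can never match an empty string, and that id=" is always written in lower case.

```vbscript
Option Explicit

' Hypothetical helper: replaces every match of wordPattern with replaceWith,
' except matches that come directly after id="
Function ReplaceOutsideId(searchIn, wordPattern, replaceWith)
    Dim re, matches, m, result, lastPos
    Set re = New RegExp
    re.Global = True
    ' First alternative consumes id="WORD without capturing anything.
    ' Second alternative captures a WORD found anywhere else.
    re.Pattern = "id=""(?:" & wordPattern & ")|(" & wordPattern & ")"
    Set matches = re.Execute(searchIn)

    result = ""
    lastPos = 1   ' Mid is 1-based, Match.FirstIndex is 0-based
    For Each m In matches
        ' Copy the untouched text between the previous match and this one.
        result = result & Mid(searchIn, lastPos, m.FirstIndex + 1 - lastPos)
        If Len(m.SubMatches(0) & "") = 0 Then
            ' The capturing group did not take part: this is id="WORD, keep it.
            result = result & m.Value
        Else
            result = result & replaceWith
        End If
        lastPos = m.FirstIndex + m.Length + 1
    Next
    ' Append whatever follows the last match.
    ReplaceOutsideId = result & Mid(searchIn, lastPos)
End Function
```

Because the id="WORD alternative is tried first at each position, a word inside an id is consumed before the bare-word alternative can reach it, which is the effect a lookbehind would otherwise give.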

Thanks for your help!

WMIF commented:
can you give more examples of the data you want returned and the data you don't want returned?  it's tough to build a regex with only one sample.  the reason i suggested those patterns is because they do work with that sample.  you are right that it only negates ONE = or ONE ", but in front of that ABC in your sample, my pattern worked.

WMIF commented:
sorry, got interrupted.  should have been:

but in front of that ABC in your sample there is only one character.  i was assuming that this was html code and that there may or may not be a quote around the value.  i built my pattern to work with that.

digitalpacman (author) commented:
Sure I'll give you some examples uhm..

I want to find all occurrences of NAME and replace them with digitalpacman, except inside an id attribute


<div id="NAME">Greetings NAME, how are you today? Would you like some coffee?</div>
<div id="NAME">Greetings digitalpacman, how are you today? Would you like some coffee?</div>

What kind of NAME do you enjoy?
What kind of digitalpacman do you enjoy?

<a href="NAME/fniishthelink.asp">NAME</a>
<a href="digitalpacman/fniishthelink.asp">NAME</a>

<img src="../NAME/wooo.asp"/>
<img src="../digitalpacman/wooo.asp"/>

itdoesntklju2489rfjsio7u213760-941328990274oi12lk4;4e[rfpsps[NAMERokdfgjhdjkfh324987384293oprsfsNAMEmatterlsjkfdhskfdoesntmatterowu489324y4w]f\rwrfsdf87rw84rs7f8<NAME78fs8d4 5f642w3894rw64r5e3id="NAME"

itdoesntklju2489rfjsio7u213760-941328990274oi12lk4;4e[rfpsps[digitalpacmanRokdfgjhdjkfh324987384293oprsfsdigitalpacmanmatterlsjkfdhskfdoesntmatterowu489324y4w]f\rwrfsdf87rw84rs7f8<digitalpacman78fs8d4 5f642w3894rw64r5e3id="NAME"


P.S. This has to be completely bug-proof: no matter what string is being searched, it must never produce an error. The only leeway you get is that the id attribute will always have quote marks around its value if that value is the string you are searching for.

I am not only searching for one specific string, I am searching for an occurrence of any of a long list of them.

My regex would have this in it:      .Pattern = "[^id=""](THISSTRING|THATSTRING|ORTHISTOO|MAYBETHISASWELL)"
The lookbehind would be: .Pattern = "(?<!(?:id=""))(THISSTRING|THATSTRING|ORTHISTOO|MAYBETHISASWELL)"

digitalpacman (author) commented:
PS. I'm willing to give you both split points, even if my answer isn't found

WMIF commented:
i don't have time to set up the testing right now for an asp page, but i do know that asp.net supports lookahead/lookbehind.  could it be an option to write an asp.net page and call that from the asp page?

digitalpacman (author) commented:
Nope. Can't use ASP.NET, I already thought about that option.

What I did was just grab all of the matches and loop through them, checking whether each one comes right after id=" in an ASP VBScript function instead of inside the regex.
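
A rough sketch of that kind of loop, not the code actually used. ReplaceUnlessAfterId and its parameter names are hypothetical, and it assumes id=" is always written in lower case and that wordPattern never matches an empty string.

```vbscript
Option Explicit

' Hypothetical helper: finds every match of wordPattern, then decides in
' plain VBScript whether the four characters before the match are id="
Function ReplaceUnlessAfterId(searchIn, wordPattern, replaceWith)
    Dim re, matches, m, result, lastPos, prefix
    Set re = New RegExp
    re.Global = True
    re.Pattern = wordPattern
    Set matches = re.Execute(searchIn)

    result = ""
    lastPos = 1   ' Mid is 1-based, Match.FirstIndex is 0-based
    For Each m In matches
        ' Copy the untouched text between the previous match and this one.
        result = result & Mid(searchIn, lastPos, m.FirstIndex + 1 - lastPos)

        ' Look at the four characters directly in front of the match, if any.
        prefix = ""
        If m.FirstIndex >= 4 Then prefix = Mid(searchIn, m.FirstIndex - 3, 4)

        If prefix = "id=""" Then
            result = result & m.Value       ' directly after id=": leave it alone
        Else
            result = result & replaceWith
        End If
        lastPos = m.FirstIndex + m.Length + 1
    Next
    ' Append whatever follows the last match.
    ReplaceUnlessAfterId = result & Mid(searchIn, lastPos)
End Function
```

The regex stays a simple list of words, and the "not after id=" rule lives in ordinary string code, so no lookbehind support is needed.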