Help with ColdFusion / Cold Fusion Regular Expression

I need to build a regular expression that finds nested blocks of text. Here is an example:

This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}

So what I need to extract are:
{nice|cool}
{accomplish|finish}

I will then replace each one with a randomly selected option (either nice or cool, then either accomplish or finish),

which will leave me with, for example:

This is a {great|cool} way to {finish|my task}

I will then need to extract: {great|cool} and {finish|my task}
... and do the same.

The trick is that I need the innermost nested ones first.

I don't speak REGEX :(  so right now I'm doing it with FIND() and similar functions, but I'm not able to get the nested ones (the first example).

Any help from regular expression gurus would be much appreciated!

Thanks


drgdrg Asked:

Superdave Commented:
A true regular expression can't match arbitrarily nested braces on its own, even in theory. The exception would be if CF has some kind of regular expression extension that makes its patterns more powerful than what should properly be called REs, but I'm guessing it doesn't. Microsoft .NET apparently does.

You need to scan through the string until you find the first closing }; that way, you find the innermost nest first. Then either look backwards for the matching {, or keep track of the {'s as you go. You can track them on a stack data structure, or make a recursive call whenever you find a {, passing the position of the { as one of the function's parameters. Once you find the }, you have the innermost nest: pick one of the alternative strings and return, so that the next outer level of recursion goes on looking for the next }.
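
A minimal sketch of the stack idea described above, written in Python rather than CF purely for illustration. The function name `spin_with_stack` and the `rng` parameter are made up for this example, and it assumes the braces are balanced and that | only ever separates alternatives.

```python
import random

def spin_with_stack(text, rng=random):
    """Resolve nested {a|b} blocks innermost-first in a single pass."""
    out = []    # output characters built so far
    opens = []  # stack of positions in `out` where an unclosed { sits
    for ch in text:
        if ch == "{":
            opens.append(len(out))
            out.append(ch)
        elif ch == "}" and opens:
            start = opens.pop()
            # Everything after the { is brace-free by now, because any
            # nested block inside it was already replaced by its pick.
            inner = "".join(out[start + 1:])
            del out[start:]
            out.extend(rng.choice(inner.split("|")))
        else:
            out.append(ch)
    return "".join(out)

print(spin_with_stack("This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}"))
```

Each run prints one random combination, with the inner block always resolved before the block that contains it.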

käµfm³d 👽 Commented:
Continuing along Superdave's track: you are not going to accomplish the recursion with a regex alone; you will need to write a recursive function. That function can use a regex to match your pattern.

I believe the following pattern will work for you, but you'll need to set it up to be used recursively. I do not speak CF, so I cannot work it into a suitable function for you  :(
{[^{}]+}
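
To see what that pattern picks out, here is a small Python check (Python is used only as a neutral test bed, not as a CF equivalent). The braces are backslash-escaped for safety, and the names `INNERMOST` and `blocks` are just for this example.

```python
import re

# A { followed by one or more non-brace characters, then a }.
# Because braces are excluded from the middle, only innermost blocks match.
INNERMOST = re.compile(r"\{[^{}]+\}")

text = "This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}"
blocks = INNERMOST.findall(text)
print(blocks)  # ['{nice|cool}', '{accomplish|finish}']
```

The outer blocks are skipped on this pass because they still contain braces; they only become matchable once the inner blocks have been replaced.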


käµfm³d 👽 Commented:
If CF supports capture groups, then you could use the following to capture the inner text of the bracketed string. The capture groups are numbered from left to right, starting at 1.

Using the pattern below with your first example of

    This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}

"{nice|cool}" should be matched and the capture groups should read as

    1:  nice
    2:  cool

You don't have to use the capture groups, but they would give you the strings to choose from already extracted.
{([^{}]+)\|([^{}]+)}
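
A quick way to check the capture groups, sketched in Python with the braces escaped (the names `PAIR`, `m` and `m3` are only for this example). It also shows one thing to watch for: with more than two alternatives, the first group is greedy and swallows everything up to the last |.

```python
import re

PAIR = re.compile(r"\{([^{}]+)\|([^{}]+)\}")

m = PAIR.search("This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}")
print(m.group(0), m.group(1), m.group(2))  # {nice|cool} nice cool

# With three alternatives the split is not one group per option.
m3 = PAIR.search("{red|green|blue}")
print(m3.group(1), m3.group(2))  # red|green blue
```

If blocks can hold more than two options, it is probably simpler to capture the whole inner text as one group and split it on | afterwards.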



käµfm³d 👽 Commented:
I believe this would be an example of using a backreference. Unless I'm completely wrong about the CF syntax, the code below would change

    This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}

to

    This is a {great|nice} way to {{accomplish|finish}|my task}

To use backreferences, you specify the syntax as "\#", where "#" corresponds to the number of the capture group, starting at 1. For the example below, "\1" picked the first capture group, which corresponded to "nice".
REReplace("This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}", "{([^{}]+)\|([^{}]+)}","\1", "ONE")
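
For comparison, the same single replacement sketched in Python, where `count=1` plays the role that the "ONE" scope plays in the call above (the variable names are just for this example):

```python
import re

text = "This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}"

# Replace only the first innermost block with its first capture group.
once = re.sub(r"\{([^{}]+)\|([^{}]+)\}", r"\1", text, count=1)
print(once)  # This is a {great|nice} way to {{accomplish|finish}|my task}
```

Note that a backreference always picks the same alternative, so a random choice still has to be made in code.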


drgdrg Author Commented:
What about (and my syntax is completely wrong) searching for something like this:

{*{*}*}

Again, I don't know regular expressions, but I mean something that starts with a bracket, then "something" (or nothing), then another opening bracket, then something, then a close, then something, then a close.

But even if we can do that, I'm just getting {*{*}*} and not {*}, correct?

Even that wouldn't be so bad, because I could just do a different call over the results.
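
One possible reading of the {*{*}*} idea as a real pattern, sketched in Python (the names `ONE_LEVEL` and `outer` are made up for this example). It confirms the guess above: this shape returns the outer blocks, not the inner ones, and it only covers blocks with exactly one nested block inside.

```python
import re

# {, optional text, a nested {...} with no braces inside, optional text, }
ONE_LEVEL = re.compile(r"\{[^{}]*\{[^{}]*\}[^{}]*\}")

text = "This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}"
outer = ONE_LEVEL.findall(text)
print(outer)  # ['{great|{nice|cool}}', '{{accomplish|finish}|my task}']
```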

drgdrg Author Commented:
Sorry, several posts came in between me writing this and clicking submit... I'll look at the examples provided.

Thanks

drgdrg Author Commented:
Thank you Kaufmed, your second regular expression worked for me.
The CF syntax was off, but the RE was spot on.

For any CF people out there, here is the code I used:

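<CFOUTPUT>
    <CFSET String = "This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}">
    String=#String#<BR><BR>

    <!--- REFind gives the position of the opening brace of the first innermost block, or 0 if there is none --->
    <CFSET start = REFind("{([^{}]+)\|([^{}]+)}", String, 1) + 1>
    <!--- Default to an empty string so the output line below does not fail when nothing matches --->
    <CFSET choices = "">
    <CFIF start GT 1>
        <!--- The first closing brace after that position ends the innermost block --->
        <CFSET end = Find("}", String, start + 1)>
        <CFSET choices = Mid(String, start, end - start)>
    </CFIF>
    Choices = #choices#
</CFOUTPUT>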

This returned: nice|cool

After that, it's just looping.
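
For anyone who wants to see the loop spelled out, here is a rough sketch in Python rather than CF, so treat it as an outline of the logic and not as the code I actually ran. The names `spin_with_regex` and `rng` are made up for this example, and it assumes braces are balanced.

```python
import random
import re

# Innermost block: braces with no other braces inside. The whole inner
# text is captured as one group and split on | afterwards.
INNERMOST = re.compile(r"\{([^{}]*)\}")

def spin_with_regex(text, rng=random):
    """Replace innermost blocks with a random option until none are left."""
    def pick(match):
        return rng.choice(match.group(1).split("|"))

    while True:
        # subn returns the new string and how many blocks were replaced
        text, replaced = INNERMOST.subn(pick, text)
        if replaced == 0:
            return text

print(spin_with_regex("This is a {great|{nice|cool}} way to {{accomplish|finish}|my task}"))
```

Each pass removes one level of nesting, so the example string needs two replacing passes plus a final pass that finds nothing.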

käµfm³d 👽 Commented:
NP. Glad to help  :)

Thanks for cutting me some slack on the CF  ;)