Christian de Bellefeuille asked:

Stripping piece of text with RegEx

I would like to be able to remove a part of text using RegEx

Example:
"[BONJOUR]Everything here should be removed as well[/BONJOUR]This part should remain"
Would become "This part should remain".

I've tried many things, but none work:
[BONJOUR].+[/BONJOUR]
\x5bBONJOUR\x5d.+\x5b/BONJOUR\x5d

Thanks for your help.
Regular Expressions, C++

Last comment by pepr, 8/22/2022 - Mon

SOLUTION
kaufmed

Log in or sign up to see answer
pepr

As kaufmed said. Anyway, regular expressions are not powerful enough for more general cases like this because there is no way to describe nested pair structures.
kaufmed

@pepr

Actually, some regex libraries support balancing groups, which can be used to match nested structures--albeit in a more complicated fashion. I think the Boost regex engine does, but I'm not 100% on that.
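
To make the nesting point concrete, here is a minimal sketch of a non-regex alternative that tracks nesting depth by hand. The helper name stripNestedBonjour and its handling of malformed input are assumptions made for illustration; nothing like it appears in the thread.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): removes [BONJOUR]...[/BONJOUR]
// blocks, including nested ones, by counting how many blocks are open.
std::string stripNestedBonjour(const std::string& text)
{
    const std::string open = "[BONJOUR]";
    const std::string close = "[/BONJOUR]";
    std::string out;
    int depth = 0;       // number of currently open [BONJOUR] blocks
    std::size_t i = 0;

    while (i < text.size()) {
        if (text.compare(i, open.size(), open) == 0) {
            ++depth;                       // entering a (possibly nested) block
            i += open.size();
        } else if (depth > 0 && text.compare(i, close.size(), close) == 0) {
            --depth;                       // leaving the innermost open block
            i += close.size();
        } else {
            if (depth == 0) {
                out += text[i];            // keep only text outside every block
            }
            ++i;
        }
    }
    // Assumed policy: a stray closing tag outside any block is kept as text,
    // and an unclosed opening tag swallows the rest of the input.
    return out;
}
```

For an input such as "[BONJOUR]a[BONJOUR]b[/BONJOUR]c[/BONJOUR]This part should remain", this returns "This part should remain", whereas a lazy .+? pattern would stop at the first closing tag and leave "c[/BONJOUR]" behind.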
ASKER
Christian de Bellefeuille

@kaufmed: It's close, but it's still not it. The result I get with your expression is:
[R]This part should remain
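
One plausible reading of that output: if the backslashes never reach the regex engine, [BONJOUR] and [/BONJOUR] are parsed as character classes rather than literal tags. The sketch below reproduces that situation. It uses std::regex instead of boost::regex so that it builds without Boost, and the function name is made up for illustration.

```cpp
#include <regex>
#include <string>

// Illustrative helper (not from the thread): applies the pattern the way the
// regex engine sees it when every backslash has been lost along the way.
std::string stripWithUnescapedBrackets(const std::string& text)
{
    // [BONJOUR]  -> one character out of B, O, N, J, U, R
    // .+?        -> as few characters as possible, but at least one
    // [/BONJOUR] -> one character out of /, B, O, N, J, U, R
    static const std::regex classes("[BONJOUR].+?[/BONJOUR]");
    return std::regex_replace(text, classes, "");
}
```

Called with the sample string from the question, this should return "[R]This part should remain": the engine removes short runs such as "BON" and "JOU" plus everything from the first "R" up to the "/", and leaves the rest alone.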
SOLUTION
Derek Jensen

Log in or sign up to see answer
ASKER
Christian de Bellefeuille

@bigdogdman: Yes, I've tried. The only difference between your version and kaufmed's version is the \ right before /BONJOUR. But I get the same result.

But there is something different if I test your expression with this website. It seems to work there.

But does it have anything to do with Boost? I'm using this library for RegEx. I thought RegEx was a standard... no matter which language or library I use, I was expecting the same results.

Here's my test code:
void testBoostRegex()
{
    std::string wStr = "[BONJOUR]Everything here should be removed as well[/BONJOUR]This part should remain";
    boost::regex wExp("\[BONJOUR\].+?\[\/BONJOUR\]");
    cout << boost::regex_replace(wStr, wExp, "") << endl;
    return;
}

ASKER
Christian de Bellefeuille

@bigdogdman: My bad. I forgot to double the \. With the following expression, it works:

    boost::regex wExp("\\[BONJOUR\\].+?\\[\\/BONJOUR\\]");

Thanks a lot!
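
For readers hitting the same problem: the doubling is needed because the C++ compiler processes backslash escapes in an ordinary string literal before the regex engine ever sees the text. A C++11 raw string literal is one way to avoid it. The following is a sketch under two assumptions: it uses std::regex in place of boost::regex to stay dependency-free, and the function names are invented for this example.

```cpp
#include <regex>
#include <string>

// The pattern as an ordinary literal: every regex backslash is written twice.
std::string escapedPattern() { return "\\[BONJOUR\\].+?\\[\\/BONJOUR\\]"; }

// The same pattern as a raw string literal: backslashes are written once.
std::string rawPattern() { return R"(\[BONJOUR\].+?\[\/BONJOUR\])"; }

// Removes every [BONJOUR]...[/BONJOUR] block from the input.
// std::regex stands in for boost::regex here (an assumption, see above).
std::string stripBonjour(const std::string& text)
{
    static const std::regex tagged(rawPattern());
    return std::regex_replace(text, tagged, "");
}
```

Both functions produce the same pattern text, so the choice between them is purely about readability of the source code.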
ASKER CERTIFIED SOLUTION
Log in to continue reading
kaufmed

I'm not sure that bigdogdman's post should be the accepted answer here, since it's really just a copy of what I posted. If anything, pepr's last comment should be the answer since it goes into detail about the need for double-escaping of the backslash--something I mistakenly assumed would be understood given the target language of C++.
pepr

@kaufmed: I was late :) Anyway, things are as they are, and it is OK.
Derek Jensen

Sorry kaufmed, I didn't mean to steal any points from you; my bad. :">
I'll try to remember to credit you (or anyone) from now on when I'm merely offering an adjustment to their suggestion.
kaufmed

It's not so much the points as the correctness. The OP stated that my suggestion didn't work (which we now know was due to lack of escaping). Your suggestion assumes that the issue is with the forward slash, which it is not. The only languages that require a forward slash to be escaped are those which use pattern delimiters (e.g. PHP, Perl, JavaScript, etc.). So in a technical sense, your answer is a repeat of my answer.

I see now that the OP posted info regarding the double-backslash as well.
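
The claim about the forward slash can be checked with a small sketch. It assumes std::regex (default ECMAScript grammar) behaves like Boost here, which the thread does not confirm, and the helper names are invented for the example.

```cpp
#include <regex>
#include <string>

// Illustrative helper: removes every match of `pattern` from `text`.
std::string stripWith(const std::string& pattern, const std::string& text)
{
    return std::regex_replace(text, std::regex(pattern), "");
}

// Closing tag written with a plain forward slash.
std::string stripSlashUnescaped(const std::string& text)
{
    return stripWith(R"(\[BONJOUR\].+?\[/BONJOUR\])", text);
}

// Closing tag written with an escaped forward slash.
std::string stripSlashEscaped(const std::string& text)
{
    return stripWith(R"(\[BONJOUR\].+?\[\/BONJOUR\])", text);
}
```

On the sample string from the question, both variants should return "This part should remain", since \/ is simply an escaped literal slash when no pattern delimiter is involved.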
ASKER
Christian de Bellefeuille

I just want to point out that:
pepr's reply happened after I had accepted the answer. I would have given him some points for the additional information that he provided.
bigdogdman didn't just "copy & paste" your version; he showed the exact version that works. In kaufmed's version, a \ was missing before the /BONJOUR]. It wouldn't feel right toward E-E readers if I accepted a "partially working" version just because you answered first.
I found the double backslash myself; you can see it from the timestamps.

I'll ask the moderator to split the points equally between the 3 of you, or reopen the question so I can accept the answer properly.

By the way, there will be a similar question in the next hour. This question was "oversimplified". If I try to adapt it to the real situation I'm facing, it still doesn't work.
kaufmed

Don't sweat it. I've said my piece, and you have your answer. Nobody lost a limb. It's a good day for everyone  ; )
ASKER
Christian de Bellefeuille

I just want everyone to feel OK with the answer. I didn't know that it was different for PHP/Perl/JavaScript. When I posted this question, I put "boost" in the tags because that is the library I use (boost::regex_replace, to be precise). And I posted this under Regular Expressions and "C++ Languages".

But ... I was testing it on the myregextester web site, because I thought I had made a mistake in my usage of the Boost library.

I know how it feels when someone accepts the wrong answer or when the OP didn't specify things.
ASKER
Christian de Bellefeuille

There's no way to ask a related question anymore, so here's the link for those who are interested in giving more details:
https://www.experts-exchange.com/Programming/Languages/Regular_Expressions/Q_28301459.html
pepr

Well, thanks for the points. Anyway, you should know I am not doing it for the points. (It is a game.) I am learning and repeating via searching for the answer. That's it. :)

Have a nice time (all of you).