Problems w/ String, Regex, and infinite loop

jessed asked
Medium Priority
Last Modified: 2010-07-27
The current project I work on deals with submitting certain HTML-like
resources to a homemade database we have made.  The HTML-like resources
have tags in them like
<Filename> </Filename>   and
<Title> </Title>
Along the way, we create HTML files from these resources by parsing the

Recently, a coworker of mine created a template language (as a C++
class) that, given one of these resources and a template which he
created, turns one of these resources into an actual HTML file.  (let me
know if I'm boring anyone ;)

Anyway, to get to the point, he reads in each line of the template file
and stores it in a variable of type String( the "super" string class ).
He then uses the contains() method in the Regex class to determine if
any keywords are located in it.  The problem is that, during the course
of looping through this file, a line is read in that causes Regex() to
go into an infinite loop (or darn near close).

The line read in is the following (he reads into the String() buffer he
has set up until he sees a ">"):


My coworker reads this in a buffer called line (a member of the String()
class) and then issues the following command:
    ending = line.at ((Regex)"[Ee][Nn][Dd][Ii][Nn][Gg]=\"[^\"]*\"");
to determine if the line has an ending attribute in it.
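For readers without the libg++ String and Regex classes, here is a minimal sketch of the same check using std::regex from the C++ standard library. This is a stand-in, not the code from the project: the function name find_ending_attribute is hypothetical, and std::regex::icase replaces the hand-written [Ee][Nn]... character classes.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first ending="..." attribute found in
// `line`, or an empty string if there is none. std::regex is used here as a
// stand-in for the libg++ Regex class from the question.
std::string find_ending_attribute(const std::string& line) {
    // icase makes the keyword match in any letter case, so the pattern does
    // not need one character class per letter.
    static const std::regex pattern("ending=\"[^\"]*\"", std::regex::icase);
    std::smatch match;
    if (std::regex_search(line, match, pattern)) {
        return match.str(0);  // the whole matched attribute, quotes included
    }
    return std::string();
}
```

If this version terminates promptly on the problem line, that would point at the old rx library or at the buffer handed to it rather than at the pattern itself.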

Unfortunately, this decides to go into the "infinite" loop on this
line.  The output from gdb is the following:

#0  rx_bitset_difference (size=8, a=0x11fffece0, b=0x140125b64) at
#1  0x120022cc8 in compute_super_edge (rx=0x14001a7a0,
    csetout=0x11fffece0, superstate=0x1400243a0, chr=224 'ý') at
#2  0x120022f80 in rx_handle_cache_miss (rx=0x14001a7a0,
    chr=0 '\000', data=0x1400243a0) at rx.c:3853
#3  0x12001ef90 in rx_search (rxb=0x14001a7a0, startpos=0, range=161,
    stop=161, total_size=161, get_burst=0x1200260d0
    back_check=0x120026250 <re_search_2_back_check>,
    fetch_char=0x120026440 <re_search_2_fetch_char>,
    regs=0x140010ac0, resume_state=0x0, save_state=0x0) at rx.h:3589
#4  0x120026548 in re_search_2 (rxb=0x8,
    string1=0x11fffece0 "*", '*' <repeats 31 times>, size1=1,
    string2=0x1400243a0 "\201", size2=0, startpos=537026768, range=161,
    regs=0x140010ac0, stop=161) at rx.c:6444
#5  0x12001ccec in Regex::search (this=0xffffffffffffffff,
    s=0x140000658 '\001' <repeats 200 times>..., len=-3,
    startpos=0) at Regex.cc:105
#6  0x120019664 in String::at () at String.h:640
#7  0x12000eec0 in createHTML (dbmlLoc={rep = 0x1400071e0}, hdtLoc={
      rep = 0x140009d10}, htmlLoc={rep = 0x140009ef0}) at hdt.cxx:163
#8  0x1200109ec in main (argc=4, argv=0x11ffffce8) at createHTML.cxx:34

So, in other words, I have no idea what's crashing.

I know this is probably a fairly specific question.  If anyone has any
idea what could be crashing this, I do have the entire source of the
program to send (I thought no one would want to see all of it on
Usenet).  BTW, this is using g++ 2.7.2 on OSF 4.0.

An "infinite" loop is caused when you have the wrong termination condition for your loop.

for example:
for (unsigned x = 2; x != 1; x += 2) printf("");
would never end, because x is always even and so never equals 1. I have not seen your source code, but something like this is usually the reason I get infinite loops.  An annoying but easily fixed problem.


This comment is completely unrelated to my question.  Yeah, most 10-year-olds know what an infinite loop is!

You said that you are getting stuck in an infinite loop, and you also said that you did not know where it is crashing. A 10-year-old would know that if you are getting stuck in an infinite loop, then that is probably where you are crashing.

I don't mean to insult or anything, but sometimes the most obvious answers are just too hard to see until someone points them out.  Believe me I have been there.

As an interesting test, I'd like to see what happens when you change the regular expression to match the actual case of the input string so that it reads ending= instead of [Ee][Nn]...

Anyway, I suspect the problem is in the attempt to match the double quotes.  Before I get into the answer, anyone doing regex stuff should stop reading this and go buy Mastering Regular Expressions by Jeffrey E.F. Friedl.  

Jeff talks about this problem extensively throughout the book.  It would appear that your infinite loop is the result of backtracking.  He goes on to explain how "unrolling the loop" avoids the never-ending match (pp. 162-176).

Jeff offers this as a solution to match a double-quoted string with possibly escaped characters embedded:


You will have to "translate" this to your flavor of regex.
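The expression from the book did not survive in this post, so what follows is only a sketch of the commonly cited unrolled form, translated to std::regex (ECMAScript syntax) as an assumption about which flavor you might use. The function name find_quoted_string is hypothetical. The idea is "opening quote, a run of ordinary characters, then any number of (escaped character + run of ordinary characters), closing quote", so no character can be matched by more than one part of the pattern.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: finds the first double-quoted string in `text`,
// allowing backslash-escaped characters (including \") inside it.
// On success, stores the match (quotes included) in `out` and returns true.
bool find_quoted_string(const std::string& text, std::string& out) {
    // "            opening quote
    // [^"\\]*      ordinary characters: anything except quote or backslash
    // (?:\\.[^"\\]*)*   zero or more: one escaped char, then ordinary chars
    // "            closing quote
    static const std::regex quoted(R"re("[^"\\]*(?:\\.[^"\\]*)*")re");
    std::smatch m;
    if (!std::regex_search(text, m, quoted)) {
        return false;
    }
    out = m.str(0);
    return true;
}
```

Because the "ordinary" class excludes both the quote and the backslash, the two alternatives never overlap, which is what keeps the engine from retrying the same text in different ways.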


That regexp /[Ee][Nn][Dd][Ii][Nn][Gg]="[^"]*"/
doesn't seem like it ought to cause excessive backtracking.
What is the line which causes it to loop?
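To illustrate why this pattern should not need much backtracking, here is a rough sketch of the same match written as a plain scan with no regex library at all. The function name scan_ending_value is hypothetical and this is not code from the project. Once the keyword has matched, [^"]* followed by " can only end at the first quote that follows, so there is nothing for a matcher to retry.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: looks for ending="..." (keyword in any letter case).
// On success, stores the text between the quotes in `value` and returns true.
bool scan_ending_value(const std::string& line, std::string& value) {
    const std::string key = "ending=\"";  // lowercase keyword plus opening quote
    for (std::size_t start = 0; start + key.size() <= line.size(); ++start) {
        // Compare the keyword case-insensitively at this position.
        std::size_t i = 0;
        while (i < key.size() &&
               std::tolower(static_cast<unsigned char>(line[start + i])) == key[i]) {
            ++i;
        }
        if (i != key.size()) {
            continue;  // keyword does not start here; try the next position
        }
        // Equivalent of [^"]*" : the first quote after the keyword closes it.
        const std::size_t open_end = start + key.size();
        const std::size_t close = line.find('"', open_end);
        if (close == std::string::npos) {
            return false;  // no quote remains, so no later position can match
        }
        value = line.substr(open_end, close - open_end);
        return true;
    }
    return false;
}
```

If a scan like this finishes instantly on the problem line while the Regex call does not, the input buffer or the rx library is a more likely suspect than the expression.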
