Problems w/ String, Regex, and infinite loop

Posted on 1997-08-12
Last Modified: 2010-07-27
The current project I work on deals with submitting certain HTML-like
resources to a homemade database.  The HTML-like resources
have tags in them like
<Filename> </Filename>   and
<Title> </Title>
Along the way, we create HTML files from these resources by parsing the

Recently, a coworker of mine created a template language (as a C++
class) that, given one of these resources and a template which he
created, turns one of these resources into an actual HTML file.  (let me
know if I'm boring anyone ;)

Anyway, to get to the point, he reads in each line of the template file
and stores it in a variable of type String( the "super" string class ).
He then uses the contains() method in the Regex class to determine if
any keywords are located in it.  The problem is that, during the course
of looping through this file, a line is read in that causes Regex() to
go into an infinite loop (or darn near close).

The line read in is the following (he reads into the String() buffer he
has set up until he sees a ">"):


My coworker reads this into a buffer called line (an instance of the
String class) and then issues the following command:
    ending = ((Regex)"[Ee][Nn][Dd][Ii][Nn][Gg]=\"[^\"]*\"");
to determine if the line has an ending attribute in it.
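
For readers without libg++, the same kind of check can be sketched with the standard library. This is only a minimal sketch, assuming std::regex (C++11) in place of the libg++ Regex class; the helper name has_ending_attribute is hypothetical and is not taken from the original program.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if the line contains an
// ending="..." attribute, in any mix of upper and lower case.
// std::regex::icase stands in for the [Ee][Nn][Dd]... character classes.
bool has_ending_attribute(const std::string& line) {
    static const std::regex ending("ending=\"[^\"]*\"", std::regex::icase);
    return std::regex_search(line, ending);
}
```

With an invented tag such as <Foo ENDING="done">, has_ending_attribute should return true, and false for a line with no ending attribute. If this version finishes quickly on the problem line, that would point at the old Regex library or at how it is being called rather than at the pattern itself.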

Unfortunately, this decides to go into the "infinite" loop on this
line.  The output from gdb is the following:

#0  rx_bitset_difference (size=8, a=0x11fffece0, b=0x140125b64) at
#1  0x120022cc8 in compute_super_edge (rx=0x14001a7a0,
    csetout=0x11fffece0, superstate=0x1400243a0, chr=224 'ý') at
#2  0x120022f80 in rx_handle_cache_miss (rx=0x14001a7a0,
    chr=0 '\000', data=0x1400243a0) at rx.c:3853
#3  0x12001ef90 in rx_search (rxb=0x14001a7a0, startpos=0, range=161,
    stop=161, total_size=161, get_burst=0x1200260d0
    back_check=0x120026250 <re_search_2_back_check>,
    fetch_char=0x120026440 <re_search_2_fetch_char>,
    regs=0x140010ac0, resume_state=0x0, save_state=0x0) at rx.h:3589
#4  0x120026548 in re_search_2 (rxb=0x8,
    string1=0x11fffece0 "*", '*' <repeats 31 times>, size1=1,
    string2=0x1400243a0 "\201", size2=0, startpos=537026768, range=161,
    regs=0x140010ac0, stop=161) at rx.c:6444
#5  0x12001ccec in Regex::search (this=0xffffffffffffffff,
    s=0x140000658 '\001' <repeats 200 times>..., len=-3,
    startpos=0) at
#6  0x120019664 in String::at () at String.h:640
#7  0x12000eec0 in createHTML (dbmlLoc={rep = 0x1400071e0}, hdtLoc={
      rep = 0x140009d10}, htmlLoc={rep = 0x140009ef0}) at hdt.cxx:163
#8  0x1200109ec in main (argc=4, argv=0x11ffffce8) at createHTML.cxx:34

So, in other words, I have no idea what's crashing.

I know this is probably a fairly specific question.  If anyone has any
idea what could be crashing this, I do have the entire source of the
program to send (I thought no one would want to see all of it on
Usenet).  BTW, this is using g++ 2.7.2 on OSF 4.0.
Question by: jessed

Expert Comment

ID: 1167135
An "infinite" loop is caused when you have the wrong termination condition for your loop.

for example:
for (int x=2; x!=1; x++) printf("");
would never end. I have not seen your source code, but something like this is usually the reason I get infinite loops.  An annoying but easily fixed problem.

Author Comment

ID: 1167136
This comment is completely unrelated to my question.  Yeah, most 10-year-olds know what an infinite loop is!

Expert Comment

ID: 1167137
You said that you are getting stuck in an infinite loop, and you also said that you did not know where it is crashing. A 10-year-old would know that if you are getting stuck in an infinite loop, then that is probably where you are crashing.

I don't mean to insult or anything, but sometimes the most obvious answers are just too hard to see until someone points them out.  Believe me I have been there.

Accepted Solution

peter_vc earned 200 total points
ID: 1167138
As an interesting test, I'd like to see what happens when you change the regex expression to match the actual case of the input string so that it reads ending= instead of [Ee][Nn]...

Anyway, I suspect the problem is in the attempt to match the double quotes.  Before I get into the answer, anyone doing regex stuff should stop reading this and go buy Mastering Regular Expressions by Jeffrey E.F. Friedl.  

Jeff talks about this problem extensively throughout the book.  It would appear that your infinite loop is the result of backtracking.  He goes on to explain how "unrolling the loop" avoids the never-ending match (pg. 162-176).

Jeff offers this as a solution to match a double-quoted string with possibly escaped characters embedded:


You will have to "translate" this to your flavor of regex.
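
The pattern itself did not survive in this copy of the thread, so it is not reproduced here. As a minimal sketch of the "unrolling the loop" idea in general, here is one way it might look with std::regex and its default ECMAScript syntax. The pattern below is the commonly seen "normal run, then (escape plus normal run) repeated" form, and is an assumption rather than a quote from the book or from this post; the function name first_quoted is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first double-quoted string found in
// `text` (quotes included), or an empty string if there is none.
// Pattern shape: opening quote, a run of "normal" characters (anything
// except a quote or a backslash), then zero or more repetitions of
// (one backslash-escaped character followed by another normal run),
// then the closing quote.
std::string first_quoted(const std::string& text) {
    static const std::regex quoted(R"re("[^"\\]*(?:\\.[^"\\]*)*")re");
    std::smatch m;
    return std::regex_search(text, m, quoted) ? m.str(0) : std::string();
}
```

The point of this shape is that the normal run and the escape alternative can never match the same character, so the engine has very few ways to retry a failed match.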

Expert Comment

ID: 1167139
That regexp /[Ee][Nn][Dd][Ii][Nn][Gg]="[^"]*"/
doesn't seem like it ought to cause excessive backtracking.
What is the line which causes it to loop?

