Solved

C++: how do I profile code that uses a regex library?

Posted on 2012-12-30
Last Modified: 2013-04-01
I have a small test program that exercises a regex library.

The two main lines are:
...
   if (xregcomp(preg, pattern, REG_ICASE | REG_EXTENDED) == 0) {
       int ret = xregexec(preg, buffer, 1, &pmatch, 0);
       std::cout << "ret=" << ret << " pmatch.rm_so=" << pmatch.rm_so
                 << ", pmatch.rm_eo=" << pmatch.rm_eo << std::endl;
   }

For some reason xregexec takes several extra seconds that it should not need at all!

My project references the xregex project, which uses the regex.c library from the
"Extended regular expression matching and search library,
   version 0.12".
...

The question: how can I profile this and find out where the time is being wasted?
regex.c
xregex.h
Question by:longjumps
10 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 38731028
What is in pattern and buffer when it takes the extra seconds?
 
LVL 1

Author Comment

by:longjumps
ID: 38731055
ozo, I attached both

Regex
(OR|[|][|]|AND|[&][&]|HAVING|WHERE)([[:space:]]*|/[*].*[*]/)*[('\"]*[[:space:]]*([^('\"[:space:]]+)[[:space:]]*[)'\"]*[[:space:]]*=[[:space:]]*[N]*[[:space:]]*[('\"]*[[:space:]]*\3

The buffer is attached as well.
rule122-regex.txt
toparsw.txt
 
LVL 84

Expert Comment

by:ozo
ID: 38731183
There can be exponentially many ways for
([[:space:]]*|/[*].*[*]/)*
to match. It may take a long time to check all of them.
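
To make "exponentially many" concrete, here is a minimal sketch that counts how many ways a run of n whitespace characters can be divided among iterations of the outer `*`. It only models the `[[:space:]]*` alternative and ignores empty iterations and the comment alternative, so it is a lower bound on the ambiguity, not a model of any particular engine. The name `count_splits` is made up for this illustration.

```cpp
#include <cstdint>

// Number of ways to split a run of n whitespace characters into
// consecutive non-empty pieces, one piece per iteration of the outer '*'.
// The recursion mirrors what a naive backtracking matcher would enumerate.
std::uint64_t count_splits(unsigned n) {
    if (n == 0) return 1;  // nothing left to consume: one way (stop iterating)
    std::uint64_t ways = 0;
    for (unsigned first = 1; first <= n; ++first) {
        // 'first' characters go to this iteration, the rest to later ones
        ways += count_splits(n - first);
    }
    return ways;
}
```

Every extra whitespace character doubles the count (2^(n-1) splits for n characters), which is why a buffer with long runs of whitespace before a failing position can be much slower than a far larger buffer without them. Whether an engine really visits all of these splits depends on its implementation.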
 
LVL 1

Author Comment

by:longjumps
ID: 38731197
Yes, but why does this slowness happen only for this buffer?
With any other buffer it does not happen.
 
LVL 84

Expert Comment

by:ozo
ID: 38731231
Does changing
([[:space:]]*|/[*].*[*]/)*
to
([[:space:]]|/[*][^*]*[*]+([^/*][^*]*[*]+)*/)*
make a difference?
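
A quick way to sanity-check the proposed clause on its own is to anchor it and run it against a few strings. The sketch below assumes a POSIX system and uses the standard `regcomp`/`regexec` from `<regex.h>` rather than the `xregcomp`/`xregexec` wrappers from the question; the helper name `only_space_and_comments` is made up for this illustration.

```cpp
#include <regex.h>

// Returns true if text consists solely of whitespace and complete
// /* ... */ comments, using the proposed clause anchored with ^ and $.
bool only_space_and_comments(const char* text) {
    static const char* kPattern =
        "^([[:space:]]|/[*][^*]*[*]+([^/*][^*]*[*]+)*/)*$";
    regex_t re;
    if (regcomp(&re, kPattern, REG_EXTENDED) != 0) {
        return false;  // pattern failed to compile
    }
    regmatch_t m;
    int rc = regexec(&re, text, 1, &m, 0);
    regfree(&re);
    return rc == 0;
}
```

Unlike `/[*].*[*]/`, this form cannot run past the first `*/`, and each iteration of the outer `*` has to consume at least one character, so there are far fewer ways to match the same text.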
 
LVL 1

Author Comment

by:longjumps
ID: 38732042
I am checking your proposed regex substitution.

However, why does this specific attached buffer slow performance down so significantly?
The same expression works very fast with other buffers, even ones that are megabytes in size. Why?
 
LVL 84

Expert Comment

by:ozo
ID: 38732084
With REG_ICASE, the "and" in
beland
matches (OR|[|][|]|AND|[&][&]|HAVING|WHERE).
We then match ([[:space:]]*|/[*].*[*]/)*.
When the
[('\"]*[[:space:]]*([^('\"[:space:]]+)[[:space:]]*[)'\"]*[[:space:]]*=
fails to match, a human or a sufficiently clever match engine may realize that
backtracking to find a different way to match the ([[:space:]]*|/[*].*[*]/)*
won't make a difference to the success of the entire match,
but a more naive match engine
(such as one optimized for tight loops without extra complicated checks)
would go back and try them all.
(And, in theory, it could even find an infinite number of ways to match it,
since each iteration of the outer * can match the empty string.)
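
The REG_ICASE effect is easy to reproduce in isolation. This is a minimal sketch that assumes a POSIX system and uses the standard `regcomp`/`regexec` from `<regex.h>` instead of the `xreg*` wrappers from the question; `keyword_offset` is a made-up helper name.

```cpp
#include <regex.h>

// Returns the start offset of the first keyword match in text,
// or -1 if there is no match (or the pattern fails to compile).
int keyword_offset(const char* text, bool ignore_case) {
    regex_t re;
    int flags = REG_EXTENDED | (ignore_case ? REG_ICASE : 0);
    if (regcomp(&re, "(OR|[|][|]|AND|[&][&]|HAVING|WHERE)", flags) != 0) {
        return -1;
    }
    regmatch_t m;
    int rc = regexec(&re, text, 1, &m, 0);
    regfree(&re);
    return rc == 0 ? static_cast<int>(m.rm_so) : -1;
}
```

With `ignore_case` set, `keyword_offset("beland", true)` finds the keyword inside an ordinary word, so the expensive part of the pattern is attempted at positions that have nothing to do with SQL keywords. That is one reason a particular buffer can be slow while others are not.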
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 38732096
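[('\"]*[[:space:]]*
in combination with the preceding regexp clause looks like it could also contribute
to multiplying the number of ways to match.
Perhaps you could try instead
([('\"]+[[:space:]]*)?
The same goes for
[[:space:]]*[)'\"]*[[:space:]]*
which you might try replacing with
[[:space:]]*([)'\"]+[[:space:]]*)?
Note that each added pair of parentheses is a capturing group, so the \3 backreference
at the end of the pattern has to be renumbered to keep pointing at the
([^('\"[:space:]]+) group.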
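
Putting the suggestions from this thread together, the revised pattern could be assembled roughly as below. This is a sketch under several assumptions: it uses the standard POSIX `regcomp`/`regexec` rather than the `xreg*` wrappers, it relies on the library accepting backreferences in extended syntax (the GNU implementation does), it treats the `\"` in the posted pattern as C string escaping for a plain `"`, and it allows optional whitespace after the closing quote as the original clause did. `matches_revised` is a made-up helper name.

```cpp
#include <regex.h>
#include <string>

// Revised pattern, built piece by piece. The group numbers in the
// comments show why the trailing backreference becomes \5 here.
static const std::string kRevisedPattern =
    std::string(R"re((OR|[|][|]|AND|[&][&]|HAVING|WHERE))re")  // group 1
    + R"re(([[:space:]]|/[*][^*]*[*]+([^/*][^*]*[*]+)*/)*)re"  // groups 2, 3
    + R"re(([('"]+[[:space:]]*)?)re"                           // group 4
    + R"re(([^('"[:space:]]+))re"                              // group 5
    + R"re([[:space:]]*([)'"]+[[:space:]]*)?)re"               // group 6
    + R"re(=[[:space:]]*[N]*[[:space:]]*[('"]*[[:space:]]*\5)re";

// Returns true if the revised pattern matches anywhere in text.
bool matches_revised(const char* text) {
    regex_t re;
    if (regcomp(&re, kRevisedPattern.c_str(),
                REG_EXTENDED | REG_ICASE) != 0) {
        return false;  // pattern failed to compile
    }
    regmatch_t m;
    int rc = regexec(&re, text, 1, &m, 0);
    regfree(&re);
    return rc == 0;
}
```

Calling `matches_revised` on a few short strings that should and should not match (for example a comparison of a value with itself versus a comparison of two different values) is a cheap way to confirm the rewritten pattern still accepts the same inputs before timing it against the slow buffer.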
 
LVL 1

Author Comment

by:longjumps
ID: 38777197
Checking the solution.
 
LVL 1

Author Closing Comment

by:longjumps
ID: 39039530
A workaround: the solution changes the regex, not the code.

Question has a verified solution.