farzanj
Regular expression optimization
Experts:
I am parsing the usual SIP (Session Initiation Protocol) logs. I wrote a regular expression to extract certain values, and it works fine on normal records.
But on an unexpected record it backtracks seemingly forever before it finds out that the pattern does not match.
Exceptional Record:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<< BYE <<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
SipTrxnManager::TrxnMsgReceivedEvHandler
sip:ujunwaonline%99yahoo.com@99.993.99.999:99995;transport=tls;pin=9999CC3A SIP/2.0
From: "iokongwu@company.com"<sip:iokongwu%99company.com@sip.voip.coname.com>;tag=9999999999
To: "ujuawnoilen@yahoo.com"<sip:ujunwaonline%99yahoo.com@sip.voip.coname.com>;tag=9999999999
Call-ID: s...
...aderjmimaupqdpyafutgw@sip.voip.coname.com
CSeq: 4 BYE
Via: SIP/2.0/TCP 992.99.994.99:9999;branch=z9hG4bK-9999999999992-999999999
Route: <sip:sip-robinhood5.voip5.coname:9999;transport=TCP;lr>;context=routing
Record-Route: <sip:sip-robinhood7.voip7.coname:9999;lr>
Record-Route: <sip:iokongwu%99company.com@sip.voip.coname.com;lr>
Route: <sip:ujunwaonline%99yahoo.com@sip.voip.coname.com;lr>
Contact: "iokongwu@company.com"<sip:iokongwu%99company.com@99.995.997.99:99998;transport=tls;pin=FFFF9999>
User-Agent: Video Chat Client v1.0.3
X-Completed: 7PP7 9999 9999 C0A99998
Reason: CSR;cause=1;text=Success;AFE=7PP7 9999 9999 C0A99998;VfxRxBitrate=991.99;VfxTxBitrate=998.99
Max-Forwards: 99
Content-Length: 0
Normal Record:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
<< BYE <<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
SipTrxnManager::TrxnMsgReceivedEvHandler
sip:ujunwaonline%99yahoo.com@99.993.99.999:99995;transport=tls;pin=9999CC3A SIP/2.0
From: "iookwu@company.com"<sip:iokongwu%99company.com@sip.voip.coname.com>;tag=9999999999
To: "ujunwaonline@yahoo.com"<sip:ujunwaonline%99yahoo.com@sip.voip.coname.com>;tag=9999999999
Call-ID: s...
...adieulcroxxacela@sip.voip.coname.com
CSeq: 5 BYE
Via: SIP/2.0/TCP 992.99.994.99:9999;branch=z9hG4bK-9999999999999999999999
Route: <sip:sip-roundrobin5.voip5.coname:9999;transport=TCP;lr>;context=routing
Record-Route: <sip:sip-roundrobin7.voip7.coname:9999;lr>
Record-Route: <sip:iokongwu%99company.com@sip.voip.coname.com;lr>
Route: <sip:ujunwaonline%99yhaoo.com@sip.voip.coname.com;lr>
Contact: "iokongwu@company.com"<sip:iokongwu%99company.com@99.995.997.99:99998;transport=tls;pin=FFFF9999>
User-Agent: Video Chat Client v1.0.3
X-Completed: 7PP7 99CA 99BB C0A99998
Reason: CSR;cause=1;text=Success;AFE=TPPT 1C26 01CF 000B 1A42 0001 0004F3E 0A89E8BC <;VfxRxBitrate=206.93;VfxTxBitrate=0.00
Max-Forwards: 99
Content-Length: 0
Regular Expression:
(@values) = $line =~ m/[^F]*?
From[:][^<]+<sip:([^>]+).*? #1. Orig_Device_PIN (From)
To[:][^<]+<sip:([^>]+).*? #2. Dest_Device_PIN (To)
Call-ID[:]\s*(\S+).*? #3. Call-ID
User-Agent[:].*?([0-9.]+).*? #4. Agent version
Reason[:].*?cause=(\d+).*? #5. Term_Cause
AFE=\S(\S\S)\S #6. Call_Type
\s(\w+) #7. Call Setup Time*
\s(\w+) #8. ICE check time*
\s(\w+) #9. Start to Invite Time*
\s(\w+) #10. Invite to 180*
\s(\w+) #11. Time 180 to 200*
\s(\w+) #12. Total Call Time*
[^V]*VfxRxBitrate=([.0-9]+) #13. VfxRxBitrate
/sx;
Questions:
1. How can I make the match fail immediately on the unexpected record?
2. Is there an "optimization tool", like the database query optimization tools, that would show me the recursions (backtracking) and suggest improvements?
3. Suppose I matched a pattern and got $1, $2, etc. Is there a special array variable that contains all the matches?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Ozo, what did you change in my regex? Everything appears to be exactly the same.
Second, I was thinking that I could put the match in an if condition: if the match succeeds I proceed, otherwise I skip printing the output.
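A minimal sketch of the if-condition idea, assuming a simplified single-header pattern rather than the full regex from the question. The helper name `describe_cseq` and the output format are made up for illustration. In list context a successful match returns its captures, so the assignment itself can serve as the condition:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a printable summary when the CSeq header is
# found, or nothing at all so the caller can skip the record.
sub describe_cseq {
    my ($line) = @_;

    # A list assignment used as a condition yields the number of values
    # returned by the match: 2 on success here, 0 on failure.
    if (my ($number, $method) = $line =~ /^CSeq:\s*(\d+)\s+(\w+)/m) {
        return "CSeq $number ($method)";
    }
    return;    # no match, nothing to print
}

for my $line ("CSeq: 4 BYE", "Max-Forwards: 99") {
    my $text = describe_cseq($line);
    print "$text\n" if defined $text;    # prints only "CSeq 4 (BYE)"
}
```

The same shape should work with `(@values) = $line =~ m/.../sx` from the question: wrap the assignment in `if (...)` and print inside the block.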
ASKER
Is there a command to analyze a regular expression, for example to show how many recursions (backtracking steps) it performs?
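On analysis: Perl's `re` pragma has a debug mode (`use re 'debug';`) that prints the compiled form of a pattern and a trace of each match attempt to STDERR. It reports what the engine did; it does not suggest rewrites. For a rough count of how often the engine reaches a given point, one option is to plant a `(?{ ... })` code block in the pattern as a probe. The sketch below assumes a cut-down pattern (only the Reason part of the regex from the question), and the helper name `count_attempts` is hypothetical:

```perl
use strict;
use warnings;

# A package variable keeps the probe simple to use from inside the pattern.
our $attempts = 0;

# Hypothetical helper: match a reduced pattern against $text and report how
# many times the engine arrived at the probe before succeeding or giving up.
sub count_attempts {
    my ($text) = @_;
    $attempts = 0;
    my $matched = $text =~ m{
        Reason: .*? cause=(\d+) .*?
        (?{ $attempts++ })          # probe: counts every arrival at this point
        VfxRxBitrate=([.0-9]+)
    }sx;
    my $count = $attempts;
    return ($matched ? 1 : 0, $count);
}

my ($ok, $count) = count_attempts(
    "Reason: CSR;cause=1;text=Success;VfxRxBitrate=n/a");
print "matched=$ok probe reached $count times\n";
```

The exact count depends on the Perl version and its internal optimizations, so treat it as a relative measure: if the number grows much faster than the input length, the pattern is backtracking heavily.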
ASKER
Solution? Sorry, I could not follow your first post.
ASKER
Ok, that is good :)
So there are no RE optimization tools or commands that do the analysis?
ASKER
Thanks for your time, Ma'am.
ASKER
No optimization tools :(
No other ways to prevent recursion :(
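One way to limit the backtracking, sketched here as an alternative to the single large pattern and not as the approach the experts posted: split the record into header lines first, then test each field with a small pattern that cannot wander across the rest of the record. The helper names `parse_headers` and `extract_afe_fields` are hypothetical, and the "one type word plus six timing words" rule is read off the regex in the question:

```perl
use strict;
use warnings;

# Hypothetical helper: index header lines by name, keeping the first
# occurrence of repeated headers such as Route.
sub parse_headers {
    my ($record) = @_;
    my %header;
    for my $line (split /\r?\n/, $record) {
        next unless $line =~ /^([A-Za-z-]+):\s*(.*)$/;
        $header{$1} = $2 unless exists $header{$1};
    }
    return \%header;
}

# Hypothetical helper: returns (call type, six timing words) or an empty
# list when the Reason header does not have the expected shape.
sub extract_afe_fields {
    my ($record) = @_;
    my $reason = parse_headers($record)->{Reason};
    return unless defined $reason;

    # [^;]* stops at the end of the AFE field, so there is little to backtrack.
    my ($afe) = $reason =~ /\bAFE=([^;]*)/ or return;

    my @words = split ' ', $afe;
    return unless @words >= 7;          # short AFE field: reject at once

    my ($type) = $words[0] =~ /^\S(\S\S)\S$/ or return;
    return ($type, @words[1 .. 6]);
}
```

With the two records from the question, the normal Reason header yields the call type `PP` followed by six words, while the exceptional one, which carries only four words after `AFE=`, is rejected by the word-count check without any retrying.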
ASKER
Related question:
https://www.experts-exchange.com/questions/27393250/if-condition-with-RE-match.html
Please help
$1 is the same as substr($line, $-[1], $+[1] - $-[1]) or $values[0]
$2 is the same as substr($line, $-[2], $+[2] - $-[2]) or $values[1]
$3 is the same as substr($line, $-[3], $+[3] - $-[3]) or $values[2]
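The equivalences above can be checked with a short sketch. It assumes a simplified CSeq pattern in place of the full regex, and the helper name `captures_by_offset` is made up for illustration. `@-` holds the start offset of each group and `@+` the end offset, both set by the last successful match:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captures twice, once from the list-context
# match and once rebuilt from the offset arrays @- and @+.
sub captures_by_offset {
    my ($line) = @_;

    # List context: the match returns ($1, $2, ...).
    my @values = $line =~ /^CSeq:\s*(\d+)\s+(\w+)/ or return;

    # Group N spans from offset $-[N] up to (not including) offset $+[N].
    my @rebuilt = map { substr($line, $-[$_], $+[$_] - $-[$_]) }
                  1 .. scalar @values;

    return (\@values, \@rebuilt);
}

my ($values, $rebuilt) = captures_by_offset('CSeq: 4 BYE');
print "values:  @$values\n";     # values:  4 BYE
print "rebuilt: @$rebuilt\n";    # rebuilt: 4 BYE
```

So the array asked about in question 3 is simply the list returned by the match in list context. Perl 5.26 and later also expose the captures of the last successful match as `@{^CAPTURE}`.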