pepr

Perl vs. Python -- regular expressions

Hi,

The question http:Q_27777566.html (Should I first learn Perl or Python?) led to a discussion of regular expressions in Perl and Python.  The purpose of this question is to summarize up-to-date information on the subject.  If you are an expert in Python and/or Perl, please suggest or write benchmarks and help collect references related to the subject.

Let http://stuffivelearned.org/doku.php?id=programming:general:phpvspythonvsperl be the starting point (although it is rather outdated).

Thanks,
    Petr
FishMonger

I don't know Python, but I was reading over that thread and the stats link you posted, and I wondered whether the Perl regexes would be faster if we used qr().  I haven't run any benchmarks and don't have time right now to test it.
ASKER CERTIFIED SOLUTION
kaufmed

pepr

ASKER

I do not use Perl these days, but qr// makes sense when the same regular expression is reused many times (e.g. in a loop).  Then it should be faster than using a plain q// string as the pattern.  qr// in Perl should be the equivalent of re.compile in Python, so it makes sense to compare these variants in the benchmarks.
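
On the Python side, the comparison between a reused compiled pattern and a pattern string passed on every call could be sketched roughly as follows. The sample lines and the pattern are made up for illustration and are not taken from the linked benchmark. Note that the module-level functions such as re.search cache recently compiled patterns, so the gap is likely to come mostly from the extra lookup on each call rather than from recompilation.

```python
import re
import timeit

# Hypothetical sample data and pattern, chosen only for illustration.
LINES = ["GET /index.html 200", "POST /login 302", "GET /missing 404"] * 1000
PATTERN = r"\s(4\d\d|5\d\d)$"


def count_uncompiled(lines):
    """Pass the pattern string to re.search() on every call."""
    return sum(1 for line in lines if re.search(PATTERN, line))


def count_compiled(lines):
    """Compile once, then reuse the pattern object (comparable to qr// in Perl)."""
    rex = re.compile(PATTERN)
    return sum(1 for line in lines if rex.search(line))


if __name__ == "__main__":
    # Time both variants on the same input; absolute numbers depend on the machine.
    for func in (count_uncompiled, count_compiled):
        seconds = timeit.timeit(lambda: func(LINES), number=20)
        print(f"{func.__name__}: {seconds:.3f} s")
```

Both functions return the same count, so only the timing differs between the two variants.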
pepr

ASKER

@wilcoxon: Let's focus on just one or two examples to start with.  What Perl version do you use?
At home, I think all of my installs are Perl 5.14.2 (5.16.x is the latest).
pepr

ASKER

I guess the question did not attract enough attention, so it is probably not useful or important. If you do not object, I am going to ask for deletion.
That's fine.  I was curious but have been very busy lately.  I'll still likely play around with it at some point.  Did you see anything that was obviously inefficient in the Python scripts on the original link?
pepr

ASKER

Well, there is probably not enough Python code there for it to be completely wrong. But the way the loop is written is rather strange:
...
fh = open(logfile)
while True:
    line = fh.readline()
    if not line: break
    for r in res:
        m = r.search(line)
        if m:
            counter += 1
            break
fh.close()
...

I would write it as:
...
f = open(logfile)
for line in f:
    for r in res:
        m = r.search(line)
        if m:
            counter += 1
            break
f.close()
...

or using the with construct:
...
with open(logfile) as f:
    for line in f:
        for r in res:
            m = r.search(line)
            if m:
                counter += 1
                break
...

When using the same regular expression in a loop, there is no reason to use the uncompiled one. (But this is the same in any language.)
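
A self-contained version of the counting loop, with the patterns compiled once up front, might look like the sketch below. The two patterns and the sample lines are hypothetical, since the benchmark's actual expressions are not shown in this thread; the function name count_matching_lines is likewise invented for the example.

```python
import re

# Hypothetical patterns; the original benchmark's expressions are not shown here.
res = [re.compile(p) for p in (r"ERROR", r"timeout after \d+ ms")]


def count_matching_lines(lines, patterns):
    """Count lines matched by at least one of the compiled patterns."""
    counter = 0
    for line in lines:
        # any() stops at the first pattern that matches, like the inner break.
        if any(r.search(line) for r in patterns):
            counter += 1
    return counter


# Hypothetical log lines standing in for the contents of logfile.
sample = [
    "INFO start\n",
    "ERROR disk full\n",
    "WARN timeout after 250 ms\n",
    "ERROR timeout after 10 ms\n",
]
print(count_matching_lines(sample, res))
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example inside a with open(logfile) as f: block.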
pepr

ASKER

@HonorGod: OK. Then it should lead to something tangible. We should decide on some examples for testing regular expression capabilities.
Agreed.
pepr

ASKER

I have just split the points hastily to close the question. A really Happy New Year to everyone :)
Thank you, and thanks for the assist and the points.

Good luck & have a Happy New Year.