Solved

Python urllib.urlopen() performance question

Posted on 2001-09-09
Last Modified: 2012-06-21
Hello!  I have a Python script that retrieves the HTML text from a web page via urlopen() from urllib and then processes the result line by line via readline() calls to the returned object.

I noticed that the line-by-line read seems to be very slow on Win98SE, whereas the exact same code (and same interpreter) runs very quickly on Win2k Pro.

The skeleton of the code in question is as follows:

[...]
webPage = urllib.urlopen(startURL)  # returns a file-like object

# Read one line at a time; readline() returns '' at end of data
line = webPage.readline()

while line != '':
    line = webPage.readline()
[...]

The actual retrieval (the call to urlopen) seems to go quickly, so I suspect the problem lies in how I'm calling or using readline().  Any suggestions or known issues with this?


Thanks,

AP9
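
One way to check the suspicion above is to time the line-by-line loop on its own, separately from the urlopen() call. The sketch below is only an illustration written for Python 3: the helper name time_line_reads and the in-memory stand-in fake_response are hypothetical and not part of the original script. Passing the real object returned by urlopen() in place of the stand-in would measure the actual read loop.

```python
import io
import time


def time_line_reads(response):
    """Return (line_count, seconds) for reading response line by line."""
    start = time.perf_counter()
    count = 0
    line = response.readline()
    while line:  # readline() returns an empty value at end of data
        count += 1
        line = response.readline()
    return count, time.perf_counter() - start


# Hypothetical stand-in for the object returned by urlopen().
fake_response = io.BytesIO(b"<html>\n<body>\n</body>\n</html>\n")
line_count, elapsed = time_line_reads(fake_response)
print(line_count, "lines read in", elapsed, "seconds")
```

Running the same measurement on both machines would show whether the difference really sits in the read loop.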
Question by:ap9

Expert Comment

by:CJ_S
ID: 6469840
That is a common way of reading a file. I do not know much about Python, but usually there is also a readall method that you can try:

line = webPage.readall()

It might be faster...

Regards,
CJ

Author Comment

by:ap9
ID: 6472564
Hello, CJ!  Thanks for the suggestion, but readall() isn't a valid method in the returned object.  The documentation says that it is a "file-like" object but not a real file object.

I think there is a method called "readlines()" which reads in all the lines, but I've tried it before and it's still slow.
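
A related option is to pull the whole body in with a single read() call and split it into lines in memory, so the loop never goes back to the network object. This is a minimal Python 3 sketch, not something tested in this thread: read_all_lines and fake_response are hypothetical names, and whether it is any faster on the affected machine is an open question.

```python
import io


def read_all_lines(response):
    """Read the whole body with one read() call, then split it into lines."""
    body = response.read()    # one bulk read instead of many readline() calls
    return body.splitlines()  # split in memory; line endings are dropped


# Hypothetical stand-in for the object returned by urlopen().
fake_response = io.BytesIO(b"first\r\nsecond\nthird")
lines = read_all_lines(fake_response)
print(lines)
```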

Accepted Solution

by:CJ_S (earned 300 total points)
ID: 6473580
Then I do not think you can overcome the problem if you intend to keep using the same object. Maybe Python provides another object that does the same job. Or, if you can use the XMLHTTP object, for example, that might solve the speed problem.

regards,
CJ
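
As one possible "other object" within Python itself, the page could be saved to disk first and then read back from an ordinary file. The sketch below assumes Python 3, where urllib.request.urlretrieve is kept as a legacy function; fetch_to_file_then_read and the temporary paths are hypothetical, and a file:// URL stands in for startURL so the example runs without a network. Nothing in the thread confirms that this avoids the slowdown.

```python
import os
import pathlib
import tempfile
import urllib.request


def fetch_to_file_then_read(url, dest_path):
    """Download url to dest_path, then read the local copy line by line."""
    urllib.request.urlretrieve(url, dest_path)  # copies the resource to disk
    with open(dest_path, "rb") as local_file:   # a real, buffered file object
        return [line.rstrip(b"\r\n") for line in local_file]


# Demo without a network: a local file:// URL stands in for startURL.
work_dir = tempfile.mkdtemp()
source = pathlib.Path(work_dir, "page.html")
source.write_bytes(b"<html>\n<body>hi</body>\n</html>\n")
copy_path = os.path.join(work_dir, "copy.html")
lines = fetch_to_file_then_read(source.as_uri(), copy_path)
print(lines)
```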

Author Comment

by:ap9
ID: 6485868
That's the thing, though -- on Win2k it works very quickly.  But I'll see about approaching it from another direction.  Thanks.
Question has a verified solution.
