Solved

Python urllib.urlopen() performance question

Posted on 2001-09-09
4
768 Views
Last Modified: 2012-06-21
Hello!  I have a Python script that retrieves the HTML text from a web page via urlopen() from urllib and then processes the result line by line via readline() calls to the returned object.

I noticed that the line-by-line read seems to be very slow on Win98SE, whereas the exact same code (and same interpreter) runs very quickly on Win2k Pro.

The snippet skeleton code in question is as follows:

[...]
webPage = urllib.urlopen(startURL)

line = webPage.readline()

while line != '':
   line = webPage.readline()
[...]

The actual retrieval (the call to urlopen) seems to go quickly, so I suspect something with how I'm calling or using readline().  Any suggestions or known issues with this?


Thanks,

AP9
0
Comment
Question by:ap9
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
4 Comments
 
LVL 22

Expert Comment

by:CJ_S
ID: 6469840
It's a common way of reading a file. I do not know much about Python, but usually there is also a readall method, you can try it.

line = webPage.readall()

It might be faster...

Regards,
CJ
0
 

Author Comment

by:ap9
ID: 6472564
Hello, CJ!  Thanks for the suggestion, but readall() isn't a valid method in the returned object.  The documentation says that it is a "file-like" object but not a real file object.

I think there is a method called "readlines()" which reads in all the lines, but I've tried it before and it's still slow.
0
 
LVL 22

Accepted Solution

by:
CJ_S earned 100 total points
ID: 6473580
Then I do not think you can overcome the problem if you intend to keep on using the same object. Maybe Python provides another object that does the same. Or if you can use, for example, the XMLHTTP object, you'd solve the speed-problem.

regards,
CJ
0
 

Author Comment

by:ap9
ID: 6485868
That's the thing, though -- on Win2k it works very quickly.  But I'll see about approaching it from another direction.  Thanks.
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Displaying an arrayList in a listView using the default adapter is rarely the best solution. To get full control of your display data, and to be able to refresh it after editing, requires the use of a custom adapter.
Entering a date in Microsoft Access can be tricky. A typo can cause month and day to be shuffled, entering the day only causes an error, as does entering, say, day 31 in June. This article shows how an inputmask supported by code can help the user a…
An introduction to basic programming syntax in Java by creating a simple program. Viewers can follow the tutorial as they create their first class in Java. Definitions and explanations about each element are given to help prepare viewers for future …

749 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question