Solved

Module for HTTP range retrieval

Posted on 2009-07-15
Last Modified: 2012-05-07
I have been using LWP::Simple and WWW::Mechanize to retrieve files from HTTP servers.

Some of the files are very large and I only need a section of them, so I would like to use the range retrieval capability that most HTTP/1.1 servers offer.

Is there a Perl module that currently supports this, and if so, how?

Thanks
Question by:drunnels
 
LVL 14

Expert Comment

by:flob9
ID: 24860368

Try "max_size":
use strict;
use warnings;
use LWP::UserAgent;

# max_size caps how many bytes of the response body are kept (here, the first 500).
my $browser = LWP::UserAgent->new;
$browser->max_size(500);
my $url = 'http://www.google.com/';
my $response = $browser->get($url);

print $response->content;


Author Comment

by:drunnels
ID: 24860506
Thanks, but I'm not trying to limit the size from the beginning of the file; I want to be able to specify a starting point. For instance, I may have a 500 MB file and want the download to start 400 MB into it and continue to the end of the file.
 
LVL 7

Expert Comment

by:Fairlight2cx
ID: 24861091

 
LVL 7

Expert Comment

by:Fairlight2cx
ID: 24861214
Actually, HTTP::Range may not be exactly what you need, since it's geared toward segmenting a download. But its docs show that it uses the Range and Content-Range headers of the HTTP protocol. Those are described in section 14 of RFC 2616, available at ftp://ftp.rfc-editor.org/in-notes/rfc2616.txt.

You should be able to reach your goal with plain LWP, however: use the header() or push_header() methods of HTTP::Request to add the Range header to your request.
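As a rough sketch of that suggestion applied to the case in the question (start 400 MB in and read to the end), the request could be built like this. The URL and the helper name byte_range are made up for illustration, and "400 MB" is assumed to mean 400 * 1024 * 1024 bytes. A range of the form "bytes=N-" with no end offset asks for everything from byte N onward. The script only prints the request and sends nothing over the network.

```perl
use strict;
use warnings;
use HTTP::Request;

# Hypothetical helper: build the value for an HTTP Range header.
# $start is a zero-based byte offset; $end is optional and inclusive.
sub byte_range {
    my ($start, $end) = @_;
    return defined $end ? "bytes=$start-$end" : "bytes=$start-";
}

# Hypothetical URL, used only for illustration.
my $url = 'http://example.com/big-file.bin';

my $req = HTTP::Request->new(GET => $url);

# Open-ended range: from 400 MB into the file through to the end.
$req->header(Range => byte_range(400 * 1024 * 1024));

# Show the request that would be sent; no network traffic happens here.
print $req->as_string;
```

To actually send it, pass $req to LWP::UserAgent->new->request($req).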
 
LVL 14

Accepted Solution

by:flob9
ID: 24861708

use LWP::UserAgent;
use HTTP::Request;

# Ask for only the first 100 bytes (offsets 0-99, inclusive).
my $req = HTTP::Request->new(GET => "http://cdimage.debian.org/debian-cd/5.0.2/i386/iso-cd/debian-502-i386-netinst.iso");
$req->header(Range => "bytes=0-99");
my $res = LWP::UserAgent->new->request($req);
print $res->as_string;

Response:

HTTP/1.1 206 Partial Content
Connection: close
Date: Wed, 15 Jul 2009 17:22:13 GMT
Accept-Ranges: bytes
Age: 3481
ETag: "d871c8-9608000-46d7ab1025380"
Server: Apache/2.2.9 (Unix)
Content-Length: 100
Content-Range: bytes 0-99/157319168
Content-Type: application/octet-stream
Last-Modified: Mon, 29 Jun 2009 11:07:10 GMT
Client-Date: Wed, 15 Jul 2009 17:22:13 GMT
Client-Peer: 130.239.18.138:80
Client-Response-Num: 1
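The Content-Range header in that response reports which bytes were returned and the total size of the file. A minimal sketch for pulling those numbers out follows; the function name parse_content_range is hypothetical, and it only handles the "bytes first-last/total" form shown above.

```perl
use strict;
use warnings;

# Parse a Content-Range value such as "bytes 0-99/157319168".
# Returns (first, last, total); total is undef when the server sends "*".
# Returns an empty list if the value is missing or not in this form.
sub parse_content_range {
    my ($value) = @_;
    return unless defined $value;
    my ($first, $last, $total) = $value =~ m{^bytes\s+(\d+)-(\d+)/(\d+|\*)$}
        or return;
    undef $total if $total eq '*';
    return ($first, $last, $total);
}

# The header value from the response shown above.
my ($first, $last, $total) = parse_content_range('bytes 0-99/157319168');
printf "got bytes %d-%d of %d\n", $first, $last, $total;
```

With a live response, the value would come from $res->header('Content-Range'). It is worth checking that $res->code is 206 first: a server that ignores the Range header normally answers 200 and sends the whole file.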


Author Closing Comment

by:drunnels
ID: 31603784
Thanks. This was exactly what I needed. The only thing I'd add to your answer is that to get just the content, one would add:
my $content = $res->content;  # public accessor for the body, rather than reading $res->{'_content'} directly
