Solved

Using cookies with WWW::Mechanize...

Posted on 2003-12-04
Last Modified: 2012-05-04
I am trying to screen-scrape a web site. My code works fine for sites that don't involve cookies; however, the site I am trying to access requires that the browser have cookies enabled in order to log in.

What is the best way to go about solving this problem? I have considered:

1. Setting the Cookie header of my agent by hand to the value the site requires.

2. Somehow getting the Mechanize agent to have cookies enabled, but I am not sure how to go about doing this. I have tried giving it an empty cookie_jar object from HTTP::Cookies, but that did not seem to work.

I am fairly new to Perl and would appreciate any hints and tips anyone might have.

Many thanks,
James

Question by:jamesbuckney
Accepted Solution

by:
jmcg earned 125 total points
ID: 9879763
That's curious. The code for WWW::Mechanize sets up the UserAgent with a cookie jar by default, so it should behave as if cookies are enabled.
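
To illustrate the point about the default cookie jar, here is a minimal sketch, assuming a reasonably current WWW::Mechanize and HTTP::Cookies are installed. The variable names are only illustrative. It checks that a freshly created agent already has a jar, and shows one way to pass a jar explicitly if you want control over it (note that the module name is HTTP::Cookies, plural).

```perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Cookies;

# A plain agent: the constructor is expected to attach an
# in-memory cookie jar without being asked to.
my $mech = WWW::Mechanize->new();
print "default jar class: ", ref( $mech->cookie_jar ), "\n";

# Supplying a jar explicitly is optional. It is mainly useful if you
# want to inspect the cookies or share them between agents.
my $jar       = HTTP::Cookies->new();
my $mech_own  = WWW::Mechanize->new( cookie_jar => $jar );

# After some requests, this dumps whatever cookies were collected.
print "cookies so far:\n", $mech_own->cookie_jar->as_string, "\n";
```

If the first print shows an empty class name, cookies really are disabled for that agent; otherwise the login problem probably lies elsewhere, for example in the form fields being submitted.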

Have you taken a look at the WWW::Mechanize::Examples files?

http://search.cpan.org/~petdance/WWW-Mechanize-0.70/lib/WWW/Mechanize/Examples.pod

There they show some examples of getting past login screens and filling out forms automatically.
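
In the same spirit as those examples, a login flow might be sketched as follows. This is not taken from the Examples file: the sub name `login_and_fetch`, the field names `username` and `password`, and the assumption that the login form is the first form on the page are all hypothetical, so check the real page's HTML before relying on them.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: logs in via the first form on $login_url,
# then fetches $target_url using the same agent (and thus the same cookies).
sub login_and_fetch {
    my ( $login_url, $user, $pass, $target_url ) = @_;

    # autocheck => 1 makes failed requests die instead of failing quietly.
    my $mech = WWW::Mechanize->new( autocheck => 1 );

    # Any cookies the site sets on this page are stored in the agent's jar.
    $mech->get($login_url);

    # Field names below are placeholders, not the real site's names.
    $mech->submit_form(
        form_number => 1,
        fields      => { username => $user, password => $pass },
    );

    # The stored session cookie is sent back automatically on this request.
    $mech->get($target_url);
    return $mech->content;
}

# Example call (placeholder URLs and credentials, commented out so
# nothing is fetched when the file is merely loaded):
# my $html = login_and_fetch(
#     'http://www.example.com/login', 'james', 'secret',
#     'http://www.example.com/members',
# );
```

The key design point is that one agent object is reused for every request, so the cookies received at login are presented again on later pages without any manual header handling.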
Expert Comment

by:jmcg
ID: 9910050
Welcome to Experts-Exchange, James,

I know that there was a problem with email notifications going out around 4 December (I certainly missed quite a few), so perhaps you are thinking we ignored your first question because Experts Exchange never seemed to contact you again. Maybe you'll get a notification this time and revisit your question.
Expert Comment

by:jmcg
ID: 10218587
Nothing has happened on this question in more than 7 weeks. It's time for cleanup!

My recommendation, which I will post in the Cleanup topic area, is to
accept answer by jmcg [grade B] (it's correct but whether it solves the problem is hard to know, asker abandoned question).

Please leave any comments here within the next seven days.

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!

jmcg
EE Cleanup Volunteer