Solved

Perl Web Automation

Posted on 2011-09-20
Last Modified: 2013-11-10
Hi Experts,

I'm wanting to do some web automation using Perl, but I'm struggling, especially since this site will not allow me to login or perform some of the vital steps without JavaScript.  Does that mean I'll need WWW::Scripter::Plugin::JavaScript?  It also has to handle cookies.


Here are some of the key steps:

1. Login at: http://portal.ccli.com/
    Yes, that's the real URL, so you can visit that login page if you like, but sorry I can't give you a username/password.  If you view source, you'll see the User ID and Password input tags are:
       <input name="ctl00$cph1$txtUserId" type="text" maxlength="20" id="ctl00_cph1_txtUserId" style="width:215px;" />
       <input type="password" name="password" style="width:215px;" MaxLength="20" value="" />

2. On the next page, I have to click the "Launch Copy Report" link.  This is what it looks like:
        <a id="ctl00_cph1_lnkOLCR" class="applink" href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;ctl00$cph1$lnkOLCR&quot;, &quot;&quot;, false, &quot;&quot;, &quot;http://www.ccli.com/CopyReport/Login.cfm&quot;, false, true))">Launch Copy Report</a>

3. On the next page, I need to go to this page:
      http://www.ccli.com/CopyReport/SongCopyright.cfm?song_id=12345
    where I will supply the "12345" which could be any number.

4. Click the "Enter into Copy Report" button (image):
      <input TYPE="image" SRC="Images/Buttons/EnterIntoCopyReport.gif" onClick="enterSong();">

5. On the next page, type "1" into this field:
      <input type="text" size="10" name="project" value="0" tabindex="1" class="Highlighted">

6. Select the "CCLI Number" option from a dropdown:
      <select name="search_type" size="1" onchange="setDefaultMethod(this);">
          <option value="TitleOnly">Title</option>
          <option value="Title">Title &amp; AKA</option>
          <option value="Text">Lyrics</option>
          <option>Author</option>
          <option>Catalog</option>
          <option>Theme</option>
          <option selected>CCLI Number</option>
      </select>

7. Click the "Save" button (image):
      <input type="image" src="Images/Buttons/Save.gif" onclick="document.forms[0].nextPage.value='SongListReportSession.cfm?finished=true';" tabindex="4">

If I can get the above working, I can hopefully handle the rest.


Here's my first attempt at the code, but in addition to other things, it doesn't handle the site's requirement for JavaScript:
use WWW::Mechanize;

$m = WWW::Mechanize->new();

# Step 1. Login at: http://portal.ccli.com/
$m->get('http://portal.ccli.com/');
#print $m->content;
$m->set_visible('myUsername','myPassword');
$m->submit();
#print $m->content;

# Step 2.  On the next page, I have to click the "Launch Copy Report" link.
# I don't know how to do this, as it's got JavaScript.

# Step 3.  On the next page, I need to go to this page:
$songno = 12345;
$m->get("http://www.ccli.com/CopyReport/SongCopyright.cfm?song_id=$songno");

# Step 4. Click the "Enter into Copy Report" button (image):
# I don't know how to do this, as it's got JavaScript.

# Step 5. On the next page, type "1" into this field:
$m->field('project','1');

# Step 6. Select the "CCLI Number" option from a dropdown:
# I don't know how to do this.

# Step 7. Click the "Save" button (image):
# I don't know how to do this, as it's got JavaScript.



Can someone help me get these main steps working, please?

Thanks.
tel2
Question by:tel2
8 Comments
 
LVL 23

Expert Comment

by:nemws1
ID: 36567836
Well, first off, to handle cookies you just need to add a cookie jar, which is only 2 additional lines:
 
use WWW::Mechanize;
use HTTP::Cookies;

$m = WWW::Mechanize->new();
$m->cookie_jar(HTTP::Cookies->new());

# Step 1. Login at: http://portal.ccli.com/
......



As for the rest, JavaScript doesn't really matter, if you know what the JavaScript is calling/sending back on the server side.  I'll often use the Firefox "Live HTTP Headers" plugin to help with this (or a packet sniffer like Wireshark, but a lot of people don't know how to use sniffer software).  Live HTTP Headers really should let you get past steps #2, #4, #6, and #7.  For each one you'll need the URL and the arguments being sent to it (these should include all the form elements/info).  You don't need to worry about any of the Cookie headers if you're using the cookie jar. ;-)

Without having a login to see the code behind it, it's hard to help further.
 
LVL 11

Author Comment

by:tel2
ID: 36570259
Thanks for that, nemws1.

When you say:
    "JavaScript doesn't really matter, if you know what the JavaScript is calling/sending back on the server side."
Did you see my original comment:
    "...this site will not allow me to login or perform some of the vital steps without JavaScript"?
To test this, if you turn JavaScript off in Firefox, then browse to http://portal.ccli.com/, you will not be prompted for User ID / Password.  Instead, you'll get the error message:
    "Javascript & cookies are required for this site to function."
Try it if you like.  How do I get past that?

Thanks.
tel2
 
LVL 23

Accepted Solution

by:
nemws1 earned 500 total points
ID: 36570573
Javascript is all client-side.  The server has no idea what you're doing w/ Javascript on your client.  Yes, if you disable it, it won't work, because the server is expecting a certain response, but if you know what that response is going to be, you can script it with WWW::Mechanize without using Javascript.

For example, perhaps the page uses JavaScript to dynamically generate the FORM element named "password".  When you submit the form in Firefox (with JavaScript turned on and "Live HTTP Headers" running), you'll see the client submitting the form and passing the "password" variable with a value (along with all the other INPUT elements).  You just need to figure out all the variables that the browser sends to the server when something is submitted.

When I tried just now with a fake username/password, "Live HTTP Headers" gave me the following POST data (right after the Content-Length: header):

__LASTFOCUS=&__EVENTTARGET=&__EVENTARGUMENT=&__VIEWSTATE=%2FwEPDwUKMTMxNjk4NDk1NA9kFgJmD2QWAgIDD2QWAmYPFgIeBFRleHQFHlNvbmdTZWxlY3QgLyBDb3B5IFJlcG9ydCBMb2dpbmQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgEFGGN0bDAwJGNwaDEkY2hrUmVtZW1iZXJNZQ%3D%3D&ctl00%24cph1%24fldVerificationCode=&ctl00%24cph1%24txtUserId=spam&password=blah&ctl00%24cph1%24btnLogin=Login



Which I can then deconstruct into the following form elements that I would need for WWW::Mechanize:
 
__LASTFOCUS=
__EVENTTARGET=
__EVENTARGUMENT=
__VIEWSTATE=%2FwEPDwUKMTMxNjk4NDk1NA9kFgJmD2QWAgIDD2QWAmYPFgIeBFRleHQFHlNvbmdTZWxlY3QgLyBDb3B5IFJlcG9ydCBMb2dpbmQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgEFGGN0bDAwJGNwaDEkY2hrUmVtZW1iZXJNZQ%3D%3D
ctl00%24cph1%24fldVerificationCode=
ctl00%24cph1%24txtUserId=spam
password=blah
ctl00%24cph1%24btnLogin=Login



Granted, that __VIEWSTATE looks a little hairy, but it's probably not even needed.  I would try it with just the two fields holding the username and password (spam & blah).
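
The deconstruction above can also be done mechanically.  Here is a minimal sketch in core Perl, assuming the captured data is an ordinary application/x-www-form-urlencoded body.  The sub names form_decode and parse_post_body are made up for illustration, and the sample body is a shortened stand-in for the real capture (the long __VIEWSTATE value is left out):

```perl
use strict;
use warnings;

# Decode one application/x-www-form-urlencoded component:
# '+' stands for a space, and %XX is a byte written in hex.
sub form_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split a captured POST body into an ordered list of [name, value] pairs.
sub parse_post_body {
    my ($body) = @_;
    my @pairs;
    for my $chunk (split /&/, $body) {
        my ($name, $value) = split /=/, $chunk, 2;
        $value = '' unless defined $value;    # a field sent without '='
        push @pairs, [ form_decode($name), form_decode($value) ];
    }
    return @pairs;
}

# Shortened, hypothetical stand-in for the captured body.
my $body = '__EVENTTARGET=&ctl00%24cph1%24txtUserId=spam'
         . '&password=blah&ctl00%24cph1%24btnLogin=Login';

# Print one decoded field per line.
printf "%-24s = '%s'\n", @$_ for parse_post_body($body);
```

The decoded names (for example ctl00$cph1$txtUserId rather than ctl00%24cph1%24txtUserId) are the ones that appear in the page source as the "name" attributes.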
 
LVL 11

Author Comment

by:tel2
ID: 36570826
That's great, nemws1

Thanks to your suggestions, the code is now logging in (step 1)!

Here's the code now:
use WWW::Mechanize;
use HTTP::Cookies;

$m = WWW::Mechanize->new();
$m->cookie_jar(HTTP::Cookies->new());

# Step 1. Login at: http://portal.ccli.com/
$m->get('http://portal.ccli.com/');
$m->set_visible('myUserID', 'myPassword');
#$m->submit();
$m->click('ctl00$cph1$btnLogin');
#print $m->content;

# Step 2.  On the next page, I have to click the "Launch Copy Report" link.
$m->click_button(number => 4);
print $m->content;


My next problem is how to click the "Launch Copy Report" button on the next page.
As mentioned in my original post, the code for that button looks like this:
    <a id="ctl00_cph1_lnkOLCR" class="applink" href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;ctl00$cph1$lnkOLCR&quot;, &quot;&quot;, false, &quot;&quot;, &quot;http://www.ccli.com/CopyReport/Login.cfm&quot;, false, true))">Launch Copy Report</a>
So, how do I click it?
I don't see a "name" for it, so maybe I can click it by button number.  This is the 4th button on the page, as confirmed by: lynx -dump step2.htm
So, I tried:
    $m->click_button(number => 4);
but that gives the error:
    "Can't call method "click" on an undefined value at .../WWW/Mechanize.pm line 1770."
So, I tried:
    $m->get('http://www.ccli.com/CopyReport/Login.cfm');
but that seemed to fail and take me back to the initial login page.

Any ideas how I should click the "Launch Copy Report" button?

I've attached the entire web page, in case it's of use.
step2.txt
 
LVL 23

Expert Comment

by:nemws1
ID: 36570929
I guess I wouldn't even try to use a click_button(), but just call $m->submit_form(); with whatever fields you get from Live HTTP Headers when you click on the button.  If you've set up the cookie jar, it usually is pretty straightforward.

I can't tell from the code you attached what the 'name' value is either, but it has to have a name & value (that's the way web forms work!).  Again, Live HTTP Headers will be able to give you the name in a heartbeat.
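
One reason there is no obvious name: "Launch Copy Report" is an <a> link, not a form input, which is presumably why click_button() finds no 4th button.  Its href calls ASP.NET's WebForm_DoPostBackWithOptions, and the arguments to WebForm_PostBackOptions are, in order: eventTarget, eventArgument, validation, validationGroup, actionUrl, trackFocus, clientSubmit.  In a browser that script points the form's action at actionUrl, stores eventTarget in the hidden __EVENTTARGET field, and submits the form.  The sketch below only extracts those arguments from the href using core Perl; the sub name parse_postback_href and the hash keys are made up for illustration, and nothing here has been tested against the real site:

```perl
use strict;
use warnings;

# Pull the arguments out of an ASP.NET postback href such as
#   javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(...))
# Returns a hash of named arguments, or an empty list if the href
# does not look like such a call.
sub parse_postback_href {
    my ($href) = @_;
    $href =~ s/&quot;/"/g;    # the page source HTML-escapes the quotes
    my ($args) = $href =~ /WebForm_PostBackOptions\((.*?)\)\)/
        or return;

    # Each argument is either a double-quoted string or a bare true/false.
    my @values;
    while ($args =~ /\G\s*(?:"([^"]*)"|(true|false))\s*,?/gc) {
        push @values, defined $1 ? $1 : $2;
    }

    my @names = qw(event_target event_argument validation validation_group
                   action_url track_focus client_submit);
    return unless @values == @names;

    my %options;
    @options{@names} = @values;
    return %options;
}

# The href from step 2, in single quotes so Perl leaves the '$' alone.
my $href = 'javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions('
         . '&quot;ctl00$cph1$lnkOLCR&quot;, &quot;&quot;, false, &quot;&quot;, '
         . '&quot;http://www.ccli.com/CopyReport/Login.cfm&quot;, false, true))';

my %o = parse_postback_href($href);
print "POST to $o{action_url} with __EVENTTARGET=$o{event_target}\n";
```

With WWW::Mechanize, the equivalent of the click would then be something along the lines of setting the __EVENTTARGET field on the current form to the event target, changing the form's action to the action URL (HTML::Form's action method), and calling submit().  That is an assumption about what the server expects, not a confirmed fix, and setting a hidden field may produce a "readonly" warning.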
 
LVL 11

Author Comment

by:tel2
ID: 36571469
Hi again nemws1,

Thanks for that.  I have installed Live HTTP Headers into Firefox, and I:
- Logged in to the first page (step 1) via Firefox.
- Started Live HTTP Headers.
- Clicked the "Launch Copy Report" link.
And attached is the Live HTTP Headers log (I have replaced my user ID with "myUserID" and my password with "myPassword", as is my custom).

I then tried adding to my script, various combinations of get commands, including this one:
    $m->get('http://www.ccli.com/CopyReport/Login.cfm?ctl00%24cph1%24hdnUrchinHiddenField=&FromLogin=Yes&javascript=yes&cookies=yes&browser=yes&LoginID=myUserID&Password=myPassword&aliasLoginID=myUserID&isCRA=0&ctl00%24cph1%24chkCopyReporter=on');
and checked the resulting page with this:
    print $m->content;
but alas, I'm back to the login page.
I also tried replacing the "%24"s with "$"s (should I do that?), but still no joy.
I'm not sure if I've got the stuff before the 1st parameter right (i.e. "http://www.ccli.com/CopyReport/Login.cfm?").

What do you recommend?

Can't I just click that 4th button (the "Launch Copy Report" image) somehow?  If not, why not?

If you want to test things yourself, I can email you a temporary password if you go to my profile page, and send me a message by clicking the "Hire Me" link on the left.

Thanks again for your time.
step2b.txt
 
LVL 23

Expert Comment

by:nemws1
ID: 36574449
First, yes, leave the %24s - don't use '$' instead.

Secondly, my e-mail is expexch@emptec.com - send me a temporary password and I'll see what I can come up with. ;-)
 
LVL 11

Author Comment

by:tel2
ID: 36908705
Hi nemws1,

Thanks for the above help, and for trying to help further.  The points are yours.  Although we didn't get it all working, at least I learned a few things (e.g. about Firefox's "Live HTTP Headers" plugin).

In the end I ran out of time and did the automation with Firefox's iMacros extension, though I would still like to know how to do this with Perl, because of the flexibility that offers.

> First, yes, leave the %24s - don't use '$' instead.
The reason I asked about this is, before trying:
    $m->set_visible('myUserID', 'myPassword');
I tried:
    $m->field('ctl00%24cph1%24txtUserId','myUserID');
    $m->field('password','myPassword');
because "Live HTTP Headers" showed the "%24"s, but that failed, so I changed the "%24"s to "$"s, like this:
    $m->field('ctl00$cph1$txtUserId','myUserID');
    $m->field('password','myPassword');
and that worked, so I figured I might need to do that elsewhere, too.  I then changed it to:
    $m->set_visible('myUserID', 'myPassword');
for simplicity.
So when should I change them and when shouldn't I?
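
For background on that question: "%24" is simply the percent-encoded form of "$" (hex 24), as used in a URL or POST body on the wire.  WWW::Mechanize's field() expects the name as it appears in the page's HTML (with "$") and does the encoding itself on submit, whereas a URL or body typed out by hand has to be encoded already.  A minimal sketch of the encoding step in core Perl, assuming plain ASCII/byte input; the sub names form_encode and build_form_data are made up for illustration:

```perl
use strict;
use warnings;

# Percent-encode one form component: every character outside the
# unreserved set (letters, digits, '-', '_', '.', '~') becomes %XX.
# Assumes the input is ASCII or already a byte string.
sub form_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Build a query string or POST body from decoded name/value pairs.
sub build_form_data {
    my @pairs = @_;
    my @parts;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        push @parts, form_encode($name) . '=' . form_encode($value);
    }
    return join '&', @parts;
}

# Single quotes matter here: inside double quotes Perl would try to
# interpolate $cph1 and $txtUserId as variables.
print build_form_data(
    'ctl00$cph1$txtUserId' => 'myUserID',
    'password'             => 'myPassword',
), "\n";
# prints: ctl00%24cph1%24txtUserId=myUserID&password=myPassword
```

So the decoded name goes into field() or set_visible(), and the encoded form only belongs in a hand-built URL or body.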

And why is it that on that first login page, I had to do:
    $m->click('ctl00$cph1$btnLogin');
and couldn't just do:
    $m->submit();
?

Thanks again.
tel2