tdkulig

PHP Screen Scraping

I manage a large number of domains and need to keep track of them from an interface other than the one provided by the registrar, so I decided to grab the info via PHP.

The PHP code below works for HTTPS connections and handles cookies and sessions fine; I use it often. The issue seems to be that the URL used to log in is a redirect, and that is where things fall apart.

The data below the PHP code is the relevant output from the Live HTTP Headers Firefox plugin. If I try to replay it to see what's going on, I just end up at a blank page. I also tried after clearing the browser's cache and cookies.

The POST content is the username, the password, and the x/y coordinates of where the login button was clicked.
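
To make the body format concrete, here is a minimal sketch of building it with http_build_query, which percent-encodes the "@" the same way the browser did in the capture further down. The field names (acct_name, password, x, y) come from that capture; the helper name build_login_body is hypothetical, made up for this illustration.

```php
<?php
// Hypothetical helper: builds the application/x-www-form-urlencoded login body.
// Field names are taken from the captured POST; x and y are the click coordinates.
function build_login_body(string $username, string $password, int $x = 0, int $y = 0): string
{
    return http_build_query([
        'acct_name' => $username,
        'password'  => $password,
        'x'         => $x,
        'y'         => $y,
    ]);
}

// Prints: acct_name=user%40domain.com&password=password&x=27&y=1
echo build_login_body('user@domain.com', 'password', 27, 1), "\n";
```

Encoding matters here because a raw "@" or "&" in a username or password would otherwise be sent unescaped and could be misread by the server.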

My goal is to log in successfully and get my domain information.

I am already aware that I can get domain info using automated whois scripts if I know which domains exist, but other users also register domains, so I would still have to log on to the registrar to keep track of things.

Anyone following this thread can create a user account at name.com to test things. Just know that they will block your IP after 10 unsuccessful login attempts.

<?php
 
 $url_login = 'https://www.name.com/account/login.php';
 $url_content = 'https://www.name.com/management/list_domain.php';
 $dh_username = 'user@domain.com';
 $dh_password = 'password';
 
 // URL-encode the credentials so characters such as '@' survive the POST
 $dh_post = 'acct_name='.urlencode($dh_username).'&password='.urlencode($dh_password).'&x=0&y=0';
 $ch = curl_init();
 curl_setopt($ch, CURLOPT_URL, $url_login);
 curl_setopt($ch, CURLOPT_POST, 1);
 curl_setopt($ch, CURLOPT_POSTFIELDS, $dh_post);
 curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
 $store = curl_exec($ch); // response to the login POST
 
 // Reuse the same handle (and its session cookies) to fetch the domain list with a GET
 curl_setopt($ch, CURLOPT_HTTPGET, 1);
 curl_setopt($ch, CURLOPT_URL, $url_content);
 $content = curl_exec($ch);
 curl_close($ch);
 
//Do whatever I need to with the content
echo $content;
?>
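
One way to confirm whether the login URL really answers with a redirect is to ask curl for the response headers (CURLOPT_HEADER set to 1) and look at the status line and any Location header. The sketch below only does the parsing step; parse_redirect is a hypothetical helper name, and the sample header text is made up for illustration, not a response captured from name.com.

```php
<?php
// Hypothetical helper: extracts the first status code and the Location header
// (if any) from a raw block of HTTP response headers.
function parse_redirect(string $rawHeaders): array
{
    $status = null;
    $location = null;
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $line) {
        if ($status === null && preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        } elseif (stripos($line, 'Location:') === 0) {
            $location = trim(substr($line, strlen('Location:')));
        }
    }
    return ['status' => $status, 'location' => $location];
}

// Made-up sample headers for illustration only, not a captured response
$sample = "HTTP/1.1 302 Found\r\nLocation: /some/other/page.php\r\nContent-Length: 0\r\n\r\n";
print_r(parse_redirect($sample));
```

A 3xx status together with a Location value would suggest the script has to follow the redirect itself or let curl do so.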
 
https://www.name.com/account/login_u.php?redir_location=/

POST /account/login_u.php?redir_location=/ HTTP/1.1
Host: www.name.com
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.6) Gecko/2009020911 Ubuntu/8.10 (intrepid) Firefox/3.0.6
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: https://www.name.com/
Cookie: __utmz=225248517.1235958945.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=225248517.4480966728181928000.1235958945.1235967785.1236006591.3; PHPSESSID=lam68h994dpglr725vh0qefvs0; __utmb=225248517.3.10.1236006591; __utmc=225248517
Content-Type: application/x-www-form-urlencoded
Content-Length: 63

acct_name=user%40domain.com&password=password&x=27&y=1
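
For anyone who wants to experiment, here is a rough sketch of curl options that mirror the captured request: the login_u.php URL, the Referer, the User-Agent, and redirect following. It is untested against name.com and may not fix the blank page. The helper name login_curl_options is hypothetical, and the redirect limit of 5 is an arbitrary assumption. The capture also shows the browser already holding a PHPSESSID cookie before the POST, so a script may need to request the home page first with the same cookie file; that is a guess, not something confirmed here.

```php
<?php
// Sketch only: curl options mirroring the captured browser request.
// login_curl_options() is a hypothetical helper, not part of any library.
function login_curl_options(string $postBody, string $cookieFile): array
{
    return [
        CURLOPT_URL            => 'https://www.name.com/account/login_u.php?redir_location=/',
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $postBody,
        CURLOPT_REFERER        => 'https://www.name.com/',
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.6) Gecko/2009020911 Ubuntu/8.10 (intrepid) Firefox/3.0.6',
        CURLOPT_FOLLOWLOCATION => true,         // follow any redirect sent after the POST
        CURLOPT_MAXREDIRS      => 5,            // arbitrary cap (assumption)
        CURLOPT_COOKIEJAR      => $cookieFile,  // cookies are written here when the handle is closed
        CURLOPT_COOKIEFILE     => $cookieFile,  // cookies are read from here; also enables cookie handling
        CURLOPT_RETURNTRANSFER => true,         // return the body instead of printing it
    ];
}

// Usage (performs a real request, so it is left commented out):
// $ch = curl_init();
// curl_setopt_array($ch, login_curl_options('acct_name=user%40domain.com&password=password&x=27&y=1', 'cookie.txt'));
// $html = curl_exec($ch);
// curl_close($ch);
```

Returning the options as an array keeps the settings in one place and lets them be applied with curl_setopt_array in a single call.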

Programming, Web Applications, PHP

Last comment by tdkulig, 8/22/2022 (Mon)