savantmarketing
asked on
Perl WWW-Mechanize issue when downloading CSV files
I'm trying to use Perl and WWW::Mechanize to log in to webmail and download an attachment (a CSV file).
The script is able to get the file but the problem is, I'm getting garbled characters on the CSV file. This does not happen if I download the CSV file using a browser.
Here is the code that I am using:
use WWW::Mechanize;
use HTTP::Cookies;
$mech = WWW::Mechanize->new();
$cook = $mech->cookie_jar(HTTP::Cookies->new(file => "cookies.txt", autosave => 1,));
$mech->get('http://mail.mydomain.com/login.php');
$mech->form_number(1);
$mech->field('login_username' => 'username');
$mech->field('secretkey' => 'pass');
$mech->click();
$mech->add_header('Content-Type' => 'text/plain', 'charset' => 'utf-8');
$mech->get('http://mail.mydomain.com');
$mech->follow_link( text => "emaillink", n => 1);
$mech->follow_link( text => "Download", n => 2);
$output = $mech->content();
open(OUTFILE, ">file.csv");
print OUTFILE "$output";
close(OUTFILE);
Below is what I see when I open the CSV file. I just changed the file extension from .csv to .txt so it could be loaded in a browser:
- - - - - -
http://traffic-director.net/testfiles/test.txt
I'm really not sure what these characters are, so I couldn't do any cleanup programmatically.
Thanks in advance.
ASKER
Yeah, the headers say that it uses UTF-8.
When I download the file using a browser and open it in Notepad, though, the encoding shows as Unicode.
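There are 2 bytes representing every character, which is what you would expect from UTF-16 (the encoding Notepad labels "Unicode"). Do you want to convert it to ASCII? You will lose information if any characters fall outside the ASCII range (code points above 127). Otherwise you just need a viewer that can display Unicode files.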
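One way to check the "2 bytes per character" observation is to look for a byte-order mark (BOM) at the start of the downloaded bytes. The following is only a minimal sketch: the helper name `guess_encoding_from_bom` and the sample bytes are made up for illustration, and the check assumes the file actually starts with a BOM (it also does not try to tell UTF-32 apart from UTF-16).

```perl
use strict;
use warnings;

# Hypothetical helper: guess an encoding from the byte-order mark at the
# start of a byte string. Returns an Encode-style name, or undef when no
# BOM is recognised. UTF-32 is deliberately not handled in this sketch.
sub guess_encoding_from_bom {
    my ($bytes) = @_;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return;
}

# Made-up sample standing in for the download: "a,b" as UTF-16LE with a BOM.
my $sample = "\xFF\xFEa\x00,\x00b\x00";
my $guess  = guess_encoding_from_bom($sample);
print defined $guess ? "Looks like $guess\n" : "No BOM found\n";
```

If the result is one of the UTF-16 variants, the extra byte after each letter would explain why the file looks garbled when it is read as a single-byte encoding.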
ASKER
Here is what's happening.
Even if I download the file manually, I still get the same issue. I need to open the file in Notepad, go to File > Save As, and change the file encoding from Unicode to ANSI or UTF-8.
Question:
Is it possible to change the encoding programmatically using Perl or PHP?
use Text::Unidecode;
$unaccented = unidecode($output);
print OUTFILE $unaccented;
ASKER
Thanks for the suggestion.
I tried the code and I'm getting almost the same results. The only difference is that the garbled characters are being written in between each letter on the CSV file.
ASKER CERTIFIED SOLUTION
ASKER
That did it.
Plus I had to specify the correct encoding when writing to the file itself.
Ex:
open(OUTFILE, ">:utf8", "file.csv");
Thanks for your help.
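The accepted answer is not visible in this thread, so the following is only a guess at the general approach, not the expert's actual solution. It assumes the download is UTF-16 with a BOM (as the Notepad "Unicode" label suggests), decodes the raw bytes with the core Encode module, and writes them back out through a UTF-8 layer as in the `open` call above. The helper names and the sample bytes are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Decode raw UTF-16 bytes into Perl characters. The 'UTF-16' decoder reads
# the BOM to pick the byte order and dies if no BOM is present.
sub utf16_bytes_to_text {
    my ($bytes) = @_;
    return decode('UTF-16', $bytes);
}

# Write text to a file as UTF-8 using a three-argument open.
sub write_utf8 {
    my ($path, $text) = @_;
    open(my $out, '>:encoding(UTF-8)', $path) or die "Cannot open $path: $!";
    print {$out} $text;
    close($out) or die "Cannot close $path: $!";
    return;
}

# Made-up stand-in for the raw response body ("a,b\n" as UTF-16LE with a BOM).
# In the script above, the raw bytes would come from something like
# $mech->response->content rather than a literal.
my $raw = "\xFF\xFEa\x00,\x00b\x00\n\x00";
write_utf8('file.csv', utf16_bytes_to_text($raw));
```

The important detail is that the bytes are decoded once on the way in and encoded once on the way out. Writing undecoded UTF-16 bytes through a `:utf8` layer would keep the NUL bytes between the letters.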
What do you get when you download the file with a browser?