How to export mostly text fields from mysql to CSV and import to MS-EXCEL , keeping fields aligned.

Hi Team,

  I'm just trying to find an elegant way to script an export to csv for one of my mysql tables whose fields are mostly text, some of which are "varchar" and "longtext" field type.

My problem is that in the fields that are multiline (since they contain paragraphs), when I use the command below:

 SELECT * INTO OUTFILE '/tmp/tabledump.csv' FIELDS TERMINATED BY '|'  ENCLOSED BY '"' LINES TERMINATED BY '\N' FROM ChangeRecord

Open in new window


...once I import it to MS Excel, (even if I do it manually by doing a data import, specify delimiter, date format, etc), then excel gets confused with the fields which contain line returns and puts these multi line fields into separate cells, thus totally destroying the format of the table.

*However, if I do it from phpmyadmin, then I do "export" --> "CSV for MS Excel"  --> "Save as file"...then "go"....phpmyadmin allows me to open the table directly into excel with all the text fields intact.  Even text fields containing multiple paragraphs STAY within one cell, thus preserving the entire data table structure and all fields and row align perfectly.

Is there a workaround to the export utility in CLI in mysql so I can do what the phpmyadmin does correctly?  Alternatively, is there a command line for phpmyadmin that allows me to do the export CSV for MS excel option?

Thanks very much.
rleyba828Asked:
Who is Participating?
 
Ray PaseurCommented:
Here is the script, updated to use MySQLi.  I just tested it and got good output in the CSV file.  But when I clicked the CSV file to open it in Excel, Excel (strike)discarded(/strike) hid the data after the line feeds.  Feh.  There may be an Excel setting that I don't know about.

<?php // RAY_db_to_csv.php
error_reporting(E_ALL);
echo "<pre>";


// DEMONSTRATE HOW TO EXPORT A TABLE IN CSV FORMAT


// SET YOUR TABLE NAME HERE - OR MAYBE USE THE URL "GET" ARGUMENT?
$table_name = '???';


// DATABASE CONNECTION AND SELECTION VARIABLES - GET THESE FROM YOUR HOSTING COMPANY
$db_host = "localhost"; // PROBABLY THIS IS OK
$db_name = "??";
$db_user = "??";
$db_word = "??";

// OPEN A CONNECTION TO THE DATA BASE SERVER AND SELECT THE DB
$mysqli = new mysqli($db_host, $db_user, $db_word, $db_name);

// DID THE CONNECT/SELECT WORK OR FAIL?
if ($mysqli->connect_errno)
{
    $err
    = "CONNECT FAIL: "
    . $mysqli->connect_errno
    . ' '
    . $mysqli->connect_error
    ;
    trigger_error($err, E_USER_ERROR);
}


// OPEN THE CSV FILE - PUT YOUR FAVORITE NAME HERE
$csv = 'EXPORT_' . date('Ymdhis') . "_$table_name" . '.csv';
$fp  = fopen($csv, 'w');
if (!$fp) trigger_error("UNABLE TO OPEN $csv", E_USER_ERROR);


// GET THE COLUMN NAMES
$sql = "SHOW COLUMNS FROM $table_name";
if (!$res = $mysqli->query($sql))
{
    $err
    = 'QUERY FAILURE:'
    . ' ERRNO: '
    . $mysqli->errno
    . ' ERROR: '
    . $mysqli->error
    . ' QUERY: '
    . $sql
    ;
    trigger_error($err, E_USER_ERROR);
}
if ($res->num_rows == 0)
{
    trigger_error("WTF? $table_name HAS NO COLUMNS", E_USER_ERROR);
}
else
{
    while ($show_columns = $res->fetch_object())
    {
        $my_columns[] = $show_columns->Field;
    }
    // var_dump($my_columns);
}

// WRITE THE COLUMN NAMES TO THE CSV
if (!fputcsv($fp, $my_columns)) trigger_error("FAILURE IN WRITING COLUMN NAMES TO $csv", E_USER_ERROR);


// GET THE ROWS OF DATA
$sql = "SELECT * FROM $table_name";
if (!$res = $mysqli->query($sql))
{
    $err
    = 'QUERY FAILURE:'
    . ' ERRNO: '
    . $mysqli->errno
    . ' ERROR: '
    . $mysqli->error
    . ' QUERY: '
    . $sql
    ;
    trigger_error($err, E_USER_ERROR);
}

// ITERATE OVER THE DATA SET http://php.net/manual/en/mysqli-result.fetch-row.php
while ($row = $res->fetch_row())
{
    // WRITE THE COMMA-SEPARATED VALUES.
    if (!fputcsv($fp, $row)) trigger_error("FAILURE IN WRITING DATA TO $csv", E_USER_ERROR);
}

// ALL DONE
fclose($fp);


// SHOW THE CLIENT A LINK
echo "<p><a href=\"$csv\">$csv</a></p>" . PHP_EOL;

Open in new window

0
 
Marco GasiFreelancerCommented:
I think this problem is related to the different new line symbols which Linux and windows use: in Linux the new line symbol or carriage return is "\n" but in Windows it is "\r\n". So in your script you should replace "\n" with "\r\n". You should do this within the array of records  you get from the database before to save them in a cs file or you should do it after you have saved the file, but the better choice would be the first thing. If you post your script here I can elp to write this little adjustement (which phpMyAdmin does automatically on windows systems, I suppose).

Otherwise, once you saved your csv file you can do something like:

<?php
$fname = 'mycsv.csv';
$str = file_get_contents('mycsv.csv');
$str = str_replace("\n", "\r\n", $str);
file_put_contents($fname, $str);

Open in new window

0
 
rleyba828Author Commented:
OK...I think you got something there..... I opened the resulting csv file into notepad++  and turned on display all characters....  I found something we might get started on.

1. Within the SAME text field, when there are new paragraphs within that same field, paragraphs and new lines  always end with  \[cr][lf].  This is consistent everywhere.

2. On the other hand, at the end of every record, the last field is only terminated by [cr][lf]].

I suppose we can have a short bash script that will do a global search and replace and turn all \[cr][lf] into normal new lines.  maybe just \n ? ...  and then leave all [cr]lf] as is.

Would you know of a bash technique to do this?

Thanks.
0
Cloud Class® Course: C++ 11 Fundamentals

This course will introduce you to C++ 11 and teach you about syntax fundamentals.

 
rleyba828Author Commented:
I tried doing this.....
 sed "s/\\r\n/\n/"  dump.csv > dump-a.csv 

Open in new window

 but all it did was change \[cr][lf] into [cr][lf].  What I want is to change \[cr][lf] into [lf].   Any suggestion?

Thanks.
0
 
Marco GasiFreelancerCommented:
 sed "s/\r\n/\n/"  dump.csv > dump-a.csv 

Open in new window

0
 
rleyba828Author Commented:
....I actually need the first \.  

So I need to convert 3 characters   --> "\"  + "[cr]" + "[lf]" into only 1 character which is "[lf]".

Researching...I also tried to do this.....
cat dump.csv | tr '\\r\n' '\n' > dumpb.csv

Open in new window


So in the command above, I wanted to convert \ and \r and \n into just one \n but tr is considering the first \ as an escape character and ends up converting every letter "r" into a newline character  (\n) which isn't what I want.

Thanks.
0
 
Ray PaseurCommented:
In PHP you can use str_replace("\r\n", "\r", $str) to remove the EOL character, leaving only the CR character.  You might also find that PHP fputcsv() can be used to your advantage.  The default settings are usually OK.

This script is old and should be updated to get off the obsolete MySQL extension, but it has worked for me in the past.

<?php // RAY_db_to_csv.php
error_reporting(E_ALL);
echo "<pre>\n";


// DEMONSTRATE HOW TO EXPORT A TABLE IN CSV FORMAT


// SET YOUR TABLE NAME HERE - OR MAYBE USE THE URL GET ARGUMENT?
$table_name = '???';


// CONNECTION AND SELECTION VARIABLES FOR THE DATABASE
$db_host = "localhost"; // PROBABLY THIS IS OK
$db_name = "??";        // GET THESE FROM YOUR HOSTING COMPANY
$db_user = "??";
$db_word = "??";

// OPEN A CONNECTION TO THE DATA BASE SERVER
if (!$db_connection = mysql_connect("$db_host", "$db_user", "$db_word"))
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>NO DB CONNECTION: ";
    echo "<br/> $errmsg <br/>";
}

// SELECT THE MYSQL DATA BASE
if (!$db_sel = mysql_select_db($db_name, $db_connection))
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>NO DB SELECTION: ";
    echo "<br/> $errmsg <br/>";
    die('NO DATA BASE');
}

// OPEN THE CSV FILE - PUT YOUR FAVORITE NAME HERE
$csv = 'EXPORT_' . date('Ymdhis') . "_$table_name" . '.csv';
$fp  = fopen($csv, 'w');

// GET THE COLUMN NAMES
$sql = "SHOW COLUMNS FROM $table_name";
if (!$res = mysql_query($sql))
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>QUERY FAIL: ";
    echo "<br/>$sql <br/>";
    die($errmsg);
}
if (mysql_num_rows($res) == 0)
{
    die("WTF? $table_name HAS NO COLUMNS");
}
else
{
    // MAN PAGE: http://php.net/manual/en/function.mysql-fetch-assoc.php
    while ($show_columns = mysql_fetch_assoc($res))
    {
        $my_columns[] = $show_columns["Field"];
    }
    // var_dump($my_columns); ACTIVATE THIS TO SEE THE COLUMNS
}

// WRITE THE COLUMN NAMES TO THE CSV
if (!fputcsv($fp, $my_columns)) die('DISASTER');

// GET THE ROWS OF DATA
$sql = "SELECT * FROM $table_name";
$res = mysql_query($sql);
if (!$res)
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>QUERY FAIL: ";
    echo "<br/>$sql <br/>";
    die($errmsg);
}

// ITERATE OVER THE DATA SET
while ($row = mysql_fetch_row($res))
{
    // WRITE THE COMMA-SEPARATED VALUES.  MAN PAGE http://php.net/manual/en/function.fputcsv.php
    if (!fputcsv($fp, $row)) die('CATASTROPHE');
}

// ALL DONE
fclose($fp);

// SHOW THE CLIENT A LINK
echo "<p><a href=\"$csv\">$csv</a></p>\n";

Open in new window

0
 
GaryCommented:
Try

SELECT * FROM ChangeRecord INTO OUTFILE '/tmp/tabledump.csv' FIELDS ESCAPED BY '""' TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\r\n'; 

Open in new window

0
 
rleyba828Author Commented:
Hi cathal,

    I tried your code above but it actually changed all the \[cr][lf] into "[cr][lf] everywhere in the file.  So I am back where I started, I still need a way to change \[cr][lf] into [lf].

Ray,
   I will go by your comment that when you opened the file in excel using your script, it discarded the characters after the line feed.

*I'll do a post in the PERL section, I'm sure this is a just a simple regex global substitution thing....I know it's tricky because of the mix of \ and control characters [cr][lf].

Thanks.  I'll let you know what I find.
0
 
Ray PaseurCommented:
It appears that Excel is violating the RFC.
http://tools.ietf.org/html/rfc4180#page-2
https://developers.google.com/analytics/solutions/articles/gdataAnalyticsCsv

But that is not so.  It complies and handles the multiline document correctly.  I just hides this fact behind the opaque veil of "Alignment->Wrap Text."  WTF, Microsoft?

Here is the data base table that I used for the test, with some quotes and newlines thrown in:Table use in the testHere is the CSV file that was created by the PHP script:The CSVHere is the initial view when the CSV is opened in Excel, after I expanded the columns enough to "see everything".  Note cell C5:Initial ViewNow look at the "Alignment" box when I select cell C5:Wrap Text is illuminatedWhen I clicked on "Wrap Text" ...Before "Wrap Text" is firedLo and Behold!  The data was there, correct, all along.The CSV was imported correctly, but the data was hidden.  Feh.
0
 
rleyba828Author Commented:
Hi Ray,

   thanks for your great detective work above...  I will go ahead and try your script.  In the meantime, I will most likely sanitize my data and send you the problematic file so you can have a better look.

   thanks and regards
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.