Avatar of rleyba828
rleyba828Flag for Australia asked on

How to export mostly text fields from mysql to CSV and import to MS-EXCEL , keeping fields aligned.

Hi Team,

  I'm just trying to find an elegant way to script an export to csv for one of my mysql tables whose fields are mostly text, some of which are "varchar" and "longtext" field type.

My problem is that in the fields that are multiline (since they contain paragraphs), when I use the command below:

 SELECT * INTO OUTFILE '/tmp/tabledump.csv' FIELDS TERMINATED BY '|'  ENCLOSED BY '"' LINES TERMINATED BY '\N' FROM ChangeRecord

Open in new window


...once I import it to MS Excel, (even if I do it manually by doing a data import, specify delimiter, date format, etc), then excel gets confused with the fields which contain line returns and puts these multi line fields into separate cells, thus totally destroying the format of the table.

*However, if I do it from phpmyadmin, then I do "export" --> "CSV for MS Excel"  --> "Save as file"...then "go"....phpmyadmin allows me to open the table directly into excel with all the text fields intact.  Even text fields containing multiple paragraphs STAY within one cell, thus preserving the entire data table structure and all fields and row align perfectly.

Is there a workaround to the export utility in CLI in mysql so I can do what the phpmyadmin does correctly?  Alternatively, is there a command line for phpmyadmin that allows me to do the export CSV for MS excel option?

Thanks very much.
PHP

Avatar of undefined
Last Comment
rleyba828

8/22/2022 - Mon
Marco Gasi

I think this problem is related to the different new line symbols which Linux and windows use: in Linux the new line symbol or carriage return is "\n" but in Windows it is "\r\n". So in your script you should replace "\n" with "\r\n". You should do this within the array of records  you get from the database before to save them in a cs file or you should do it after you have saved the file, but the better choice would be the first thing. If you post your script here I can elp to write this little adjustement (which phpMyAdmin does automatically on windows systems, I suppose).

Otherwise, once you saved your csv file you can do something like:

<?php
$fname = 'mycsv.csv';
$str = file_get_contents('mycsv.csv');
$str = str_replace("\n", "\r\n", $str);
file_put_contents($fname, $str);

Open in new window

ASKER
rleyba828

OK...I think you got something there..... I opened the resulting csv file into notepad++  and turned on display all characters....  I found something we might get started on.

1. Within the SAME text field, when there are new paragraphs within that same field, paragraphs and new lines  always end with  \[cr][lf].  This is consistent everywhere.

2. On the other hand, at the end of every record, the last field is only terminated by [cr][lf]].

I suppose we can have a short bash script that will do a global search and replace and turn all \[cr][lf] into normal new lines.  maybe just \n ? ...  and then leave all [cr]lf] as is.

Would you know of a bash technique to do this?

Thanks.
ASKER
rleyba828

I tried doing this.....
 sed "s/\\r\n/\n/"  dump.csv > dump-a.csv 

Open in new window

 but all it did was change \[cr][lf] into [cr][lf].  What I want is to change \[cr][lf] into [lf].   Any suggestion?

Thanks.
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
SOLUTION
Marco Gasi

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
See how we're fighting big data
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
ASKER
rleyba828

....I actually need the first \.  

So I need to convert 3 characters   --> "\"  + "[cr]" + "[lf]" into only 1 character which is "[lf]".

Researching...I also tried to do this.....
cat dump.csv | tr '\\r\n' '\n' > dumpb.csv

Open in new window


So in the command above, I wanted to convert \ and \r and \n into just one \n but tr is considering the first \ as an escape character and ends up converting every letter "r" into a newline character  (\n) which isn't what I want.

Thanks.
Ray Paseur

In PHP you can use str_replace("\r\n", "\r", $str) to remove the EOL character, leaving only the CR character.  You might also find that PHP fputcsv() can be used to your advantage.  The default settings are usually OK.

This script is old and should be updated to get off the obsolete MySQL extension, but it has worked for me in the past.

<?php // RAY_db_to_csv.php
error_reporting(E_ALL);
echo "<pre>\n";


// DEMONSTRATE HOW TO EXPORT A TABLE IN CSV FORMAT


// SET YOUR TABLE NAME HERE - OR MAYBE USE THE URL GET ARGUMENT?
$table_name = '???';


// CONNECTION AND SELECTION VARIABLES FOR THE DATABASE
$db_host = "localhost"; // PROBABLY THIS IS OK
$db_name = "??";        // GET THESE FROM YOUR HOSTING COMPANY
$db_user = "??";
$db_word = "??";

// OPEN A CONNECTION TO THE DATA BASE SERVER
if (!$db_connection = mysql_connect("$db_host", "$db_user", "$db_word"))
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>NO DB CONNECTION: ";
    echo "<br/> $errmsg <br/>";
}

// SELECT THE MYSQL DATA BASE
if (!$db_sel = mysql_select_db($db_name, $db_connection))
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>NO DB SELECTION: ";
    echo "<br/> $errmsg <br/>";
    die('NO DATA BASE');
}

// OPEN THE CSV FILE - PUT YOUR FAVORITE NAME HERE
$csv = 'EXPORT_' . date('Ymdhis') . "_$table_name" . '.csv';
$fp  = fopen($csv, 'w');

// GET THE COLUMN NAMES
$sql = "SHOW COLUMNS FROM $table_name";
if (!$res = mysql_query($sql))
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>QUERY FAIL: ";
    echo "<br/>$sql <br/>";
    die($errmsg);
}
if (mysql_num_rows($res) == 0)
{
    die("WTF? $table_name HAS NO COLUMNS");
}
else
{
    // MAN PAGE: http://php.net/manual/en/function.mysql-fetch-assoc.php
    while ($show_columns = mysql_fetch_assoc($res))
    {
        $my_columns[] = $show_columns["Field"];
    }
    // var_dump($my_columns); ACTIVATE THIS TO SEE THE COLUMNS
}

// WRITE THE COLUMN NAMES TO THE CSV
if (!fputcsv($fp, $my_columns)) die('DISASTER');

// GET THE ROWS OF DATA
$sql = "SELECT * FROM $table_name";
$res = mysql_query($sql);
if (!$res)
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>QUERY FAIL: ";
    echo "<br/>$sql <br/>";
    die($errmsg);
}

// ITERATE OVER THE DATA SET
while ($row = mysql_fetch_row($res))
{
    // WRITE THE COMMA-SEPARATED VALUES.  MAN PAGE http://php.net/manual/en/function.fputcsv.php
    if (!fputcsv($fp, $row)) die('CATASTROPHE');
}

// ALL DONE
fclose($fp);

// SHOW THE CLIENT A LINK
echo "<p><a href=\"$csv\">$csv</a></p>\n";

Open in new window

ASKER CERTIFIED SOLUTION
Log in to continue reading
Log In
Sign up - Free for 7 days
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
SOLUTION
Log in to continue reading
Log In
Sign up - Free for 7 days
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
ASKER
rleyba828

Hi cathal,

    I tried your code above but it actually changed all the \[cr][lf] into "[cr][lf] everywhere in the file.  So I am back where I started, I still need a way to change \[cr][lf] into [lf].

Ray,
   I will go by your comment that when you opened the file in excel using your script, it discarded the characters after the line feed.

*I'll do a post in the PERL section, I'm sure this is a just a simple regex global substitution thing....I know it's tricky because of the mix of \ and control characters [cr][lf].

Thanks.  I'll let you know what I find.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Ray Paseur

It appears that Excel is violating the RFC.
http://tools.ietf.org/html/rfc4180#page-2
https://developers.google.com/analytics/solutions/articles/gdataAnalyticsCsv

But that is not so.  It complies and handles the multiline document correctly.  I just hides this fact behind the opaque veil of "Alignment->Wrap Text."  WTF, Microsoft?

Here is the data base table that I used for the test, with some quotes and newlines thrown in:Table use in the testHere is the CSV file that was created by the PHP script:The CSVHere is the initial view when the CSV is opened in Excel, after I expanded the columns enough to "see everything".  Note cell C5:Initial ViewNow look at the "Alignment" box when I select cell C5:Wrap Text is illuminatedWhen I clicked on "Wrap Text" ...Before "Wrap Text" is firedLo and Behold!  The data was there, correct, all along.The CSV was imported correctly, but the data was hidden.  Feh.
ASKER
rleyba828

Hi Ray,

   thanks for your great detective work above...  I will go ahead and try your script.  In the meantime, I will most likely sanitize my data and send you the problematic file so you can have a better look.

   thanks and regards