Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people, just like you, are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
Solved

Importing data from a tab-delimited TXT file using PHP

Posted on 2009-05-12
11
494 Views
Last Modified: 2013-12-13
I'm writing a PHP-based app that requires that data be imported into a MySQL datbase from a tab delimited TXT file.

The import file will initially be in XLS format, .. but will then be exported to tab-delimited TXT format using Excel's "Save As" functionality.

One of the problems that I know I'm going to run in to is that some of the data in the TXT file will contain quotation marks as qualifiers (ie: typically wrapped around individual column values that contains commas in them, ... such as item descriptions and dollar values, etc.).  

I really don't want to have to involve any kind of "custom excel macro" or anything of the sort if I can help it, .. since that's always proven to be an annoyance.

So .. here is what I'd like to accomplish.  I need a PHP-based script (preferably something procedural) that has these features:

1) Reads the contents of a tab-delimited TXT file (which already resides on the server) which contains 11 specifically named columns in it -- (ie: ItemName, ItemDescription, ReservePrice ...)

2) Checks to see if a piece of column data has quotation mark qualifiers surrounding it, .. and if so, .. it removes them from the data.

3) Checks to see if a piece of column data starts with a "$" symbol, .. and if so .. remove the "$" symbol, and also remove any commas that might exist ($2,400.00 --> 2400.00)

4) Insert the column data into the database -- 1 row at a time -- using the "sanitized" data.

Thanks!
- Yvan





0
Comment
Question by:egoselfaxis
  • 5
  • 3
  • 3
11 Comments
 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 24365924
Most of what you want can be accomplished very easily with the LOAD DATA INFILE statement in MySQL. Define a holding table with the same number of columns as you have in your TAB delimited file, then read the data with the statement

LOAD DATA INFILE 'filename' INTO TABLE tableName
   FIELDS TERMINATED BY '\t'
   OPTIONALLY  ENCLOSED BY '"'
   LINES TERMINATED BY '\n'

More at http://dev.mysql.com/doc/refman/5.1/en/load-data.html

Then run a query to update the column(s) with $ and , in it

UPDATE tableName set moneyField = REPLACE( moneyField, '$', '' );
UPDATE tableName set moneyField = REPLACE( moneyField, ',', '' );


More at http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_replace

Once this data is sanitised you can copy it to the target table with an INSERT/SELECT

INSERT INTO targetTable
    SELECT * FROM tableName

See http://dev.mysql.com/doc/refman/5.1/en/insert-select.html
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24365952
This might be easier than you think. mySQL has a special command which can be used for importing csv files, called LOAD DATA INFILE.

http://dev.mysql.com/doc/refman/5.0/en/load-data.html

The statement you need is something like:

LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY '\t' ENCLOSED BY '"'
  LINES TERMINATED BY '\n';

If you do this from PHP, you must double the backslashes and escape the double quote character:

$sql = "LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY '\\t' ENCLOSED BY '\"'
  LINES TERMINATED BY '\\n';";

Regarding the dollar values ($2,400.00 --> 2400.00), this should be removed by excel, check if it is in the csv file after export from excel. If they are there, you can remove them from the mysql table after import:

update tbl_name set dollarcol=replace(dollarcol,'$','');
update tbl_name set dollarcol=replace(dollarcol,',','');
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24366004
Right... it seems bportlock and me are would do this pretty much the same way! :)
0
Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 

Author Comment

by:egoselfaxis
ID: 24366053
Well, although I appreciate the excellent suggestion, .. since I've already invested a lot of time in developing a PHP-based front end, it's important that my solution be an extension of the application I've already developed (My clients will need to be able to manage the import process by themselves, and will not be using any kind of MySQL client.).  

Can this LOAD DATA INIFILE statement be executed using PHP?  If so, how?


- Yvan
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24366177
>> Can this LOAD DATA INIFILE statement be executed using PHP?  If so, how?

Yes, it can.
$sql = "LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY '\\t' ENCLOSED BY '\"'
  LINES TERMINATED BY '\\n';";
mysql_query($sql) or die(mysql_error());

Open in new window

0
 

Author Comment

by:egoselfaxis
ID: 24367125
Unfortunately, .. I'm not going to be able to use this in this particular instance, as I can not dictate to the client what columns they need to include in their import file, ... or the specific order.  Rather, they have given me an import file, and asked me to develop an import module based on that.

I've made some progress on my PHP import script since I posted my initial question (see attached code).  However, I'm having a problem with not being able to figure out how to properly handle the end of a row (ie: new line).  How do I adapt my script to handle the new lines?   I suspect that my use of the explode function might not be adequate in this situation, .. and that my foreach loop might actually need to be nested inside another loop of some sort.  Please advise.

Thanks!
- Yvan
        $importfile = $datadir . $_POST['importfile'];
 
        $fd = fopen($importfile, 'r');
 
        $theData = fread($fd, filesize($importfile));
 
        fclose ($fd);
 
        $delimiter = "\t";
 
        $splitcontents = explode($delimiter, $theData);
 
        $counter = "";
 
        foreach ( $splitcontents as $data ) {
 
                $counter++;
 
                echo "<strong>Column #" . $counter . ": </strong> " . $data . "<br/><br/>";                
 
                if ($counter == '11') {
 
                    echo "<h3>New Row</h3>";
 
                    $counter = "";
                }
        
        }

Open in new window

0
 
LVL 34

Accepted Solution

by:
Beverley Portlock earned 500 total points
ID: 24368005
If I was doing this I would have my own temporary holding table that I would do the LOAD DATA INFILE into and when I am finished massaging the data I would then transfer it to the "Live" table. Is the ability to have your own work table available? If so then you only need to modify the last step.

Otherwise you need to look at fgetcsv  http://www.php.net/fgetcsv and there are lots of examples there.
0
 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 24368018
@cxr - just compared our solutions. Almost a copyright violation there.....

;-)

Cheers

brian
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24368096
Yes, I wonder if there was some telepathic forces involved... ;)
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24368242
>> How do I adapt my script to handle the new lines?

I agree with bportlock (surprise!), you should use fgetcsv() to read the file. However, to answer your question regarding your latest snippet, you could use the file() function to read the file into an array:

$data = file($importfile);
foreach($data as $line) {
  $columns = explode("\t",$line);
}

http://php.net/manual/en/function.file.php

You could also read it into a string (like you allready do) and explode on "\n" first to get lines, loop over the lines and explode on "\t" to get the fields.

Beware that both of these solutions fails if any column contains a \t (tab) character. This is why fgetcsv() is a better solution, this function reads csv files correctly, even if the columns contains the separator.
0
 

Author Comment

by:egoselfaxis
ID: 24386946
bportlock -- the fgetcsv() has made things sooo much easier for me.  Thanks so much!

- Yvan
0

Featured Post

Networking for the Cloud Era

Join Microsoft and Riverbed for a discussion and demonstration of enhancements to SteelConnect:
-One-click orchestration and cloud connectivity in Azure environments
-Tight integration of SD-WAN and WAN optimization capabilities
-Scalability and resiliency equal to a data center

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A little background as to how I came to I design this code: Around 5 years ago I designed an add-in that formatted Excel files to a corporate standard, applying different cell colours and font type depending on whether the cells contained inputs,…
Password hashing is better than message digests or encryption, and you should be using it instead of message digests or encryption.  Find out why and how in this article, which supplements the original article on PHP Client Registration, Login, Logo…
This Micro Tutorial will demonstrate the scrolling table in Microsoft Excel using the INDEX function.
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…

840 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question