Merge csv and tsv files with PHP

I'm trying to merge all the csv or tsv files in a directory (only one type at a time) into a "merge".csv or .tsv, and insert the source file name as the first field in each record.

By inserting that field it seems to mess up the formatting of the csv or tsv data.  See sample files and outputs attached.

Note: in the attached files, all of the tsv files have been renamed to .jpg, since Experts Exchange won't allow uploading tsv files!?

<?php

    set_include_path('c:');

  $path='C:/Documents and Settings/Steve/My Documents/a_Holijoli/a_Projects/TCH-Group/SampleFiles/';
  $pathdir = opendir($path); 

    $outFile = "$path merge.csv";
    $fout = fopen($outFile, 'w');

    

    while (false !== ($file = readdir($pathdir))) 
    {
        $fullname="${path}${file}";

        
       if(stristr($file, '.csv') == TRUE) 
       {
           echo "...> $fullname<br/>";  
            
           $recArr = file("$fullname");
            
            foreach($recArr as $rec)
            {
                $newrec="$file.\t.$rec";
                fwrite($fout,$newrec);
                echo $newrec;
              
            }
        }    
    }
 
?>


samples.zip
stevelucy Asked:

karoldvl Commented:
I fixed some minor issues here. See if that's what you wanted.
<?php
  set_include_path('c:');

  $path='C:/Documents and Settings/Steve/My Documents/a_Holijoli/a_Projects/TCH-Group/SampleFiles/';
  $pathdir = opendir($path); 

  $outFile = "${path}merge.csv";
  $fout = fopen($outFile, 'w');

  while (false !== ($file = readdir($pathdir))) 
  {
      $fullname="${path}${file}";

       
      if(stristr($file, '.csv') == TRUE && stristr($file, 'merge.csv') == FALSE) 
      {
         echo "...> $fullname<br/>";  
          
         $recArr = file("$fullname");
          
          foreach($recArr as $rec)
          {
              $newrec="$file\t$rec";
              fwrite($fout,$newrec);
              echo $newrec;
            
          }
      }
      
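  }

  // Close the output file and the directory handle so all records are flushed to disk.
  fclose($fout);
  closedir($pathdir);
?>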
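As an alternative to concatenating strings, the same merge can be sketched with PHP's built-in fgetcsv() and fputcsv(), which parse each record into fields and re-quote them on output, so the added file name becomes a proper first field. This is only a sketch: the function name mergeDelimitedFiles and its parameters are hypothetical, and it assumes every input file is already in one ASCII-compatible encoding such as UTF-8 (fgetcsv() will not parse UTF-16 data correctly).

```php
<?php
/**
 * Hypothetical helper (not from the original thread): merge every file in
 * $dir that has extension $ext into $outFile, prepending the source file
 * name as the first field of each record. Returns the number of records written.
 * Assumes the inputs share one ASCII-compatible encoding such as UTF-8.
 */
function mergeDelimitedFiles(string $dir, string $ext, string $outFile, string $delimiter): int
{
    $out = fopen($outFile, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot open $outFile for writing");
    }

    $rows = 0;
    // glob() matches case-sensitively on most non-Windows systems.
    $files = glob(rtrim($dir, '/\\') . '/*.' . $ext) ?: [];

    foreach ($files as $file) {
        // Never read the merge file back into itself.
        if (realpath($file) === realpath($outFile)) {
            continue;
        }
        $in = fopen($file, 'r');
        if ($in === false) {
            continue;
        }
        while (($fields = fgetcsv($in, 0, $delimiter, '"', '\\')) !== false) {
            if ($fields === [null]) {
                continue; // fgetcsv() reports a blank line as [null]
            }
            array_unshift($fields, basename($file));
            fputcsv($out, $fields, $delimiter, '"', '\\');
            $rows++;
        }
        fclose($in);
    }

    fclose($out);
    return $rows;
}
```

For the tab-separated case the call might look like mergeDelimitedFiles($path, 'tsv', $path . 'merge.tsv', "\t"), with $path as defined in the script above.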

stevelucy (Author) Commented:
I still get the same problem: the existing formatting of the csv data gets lost. Attached is a screenshot of Excel loading the result, plus the actual file.

csvtest.jpg
merge.csv
karoldvl Commented:
Try recreating your data files in a uniform character encoding, for example UTF-8 for all of the files.
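One way to check whether the files really differ in encoding is to look for a byte-order mark at the start of each one. The sketch below is an assumption-laden illustration, not something from the thread: detectBomEncoding is a hypothetical name, it only recognises the UTF-8 and UTF-16 marks (UTF-32 is not distinguished), and a null result does not prove a file is UTF-8, only that it carries no mark.

```php
<?php
/**
 * Hypothetical helper: guess an encoding from the byte-order mark at the
 * start of $bytes. Returns null when no BOM is found, which can mean plain
 * ASCII, BOM-less UTF-8, or a legacy code page.
 */
function detectBomEncoding(string $bytes): ?string
{
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8';
    }
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        return 'UTF-16LE';
    }
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        return 'UTF-16BE';
    }
    return null;
}
```

Only the first few bytes are needed, so a call such as detectBomEncoding(file_get_contents($fullname, false, null, 0, 4)) avoids loading the whole file.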

stevelucy (Author) Commented:
These files are all downloads from the Google Keyword tool or Yahoo Site Explorer.

It's funny because Google provides csv files but in notepad it looks like they're tab separated - not a comma to be found anywhere.

How would I convert all of them to UTF-8?

thanks for your help
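Since a file named .csv can turn out to be tab separated, it may be safer to look at the data than to trust the extension. A minimal sketch, assuming a single sample line is representative and that only tab and comma are candidates (guessDelimiter is a hypothetical name):

```php
<?php
/**
 * Hypothetical helper: pick tab or comma as the delimiter, whichever
 * occurs more often in the sample line. Ties fall back to comma.
 */
function guessDelimiter(string $line): string
{
    return substr_count($line, "\t") > substr_count($line, ',') ? "\t" : ',';
}
```

This is a crude heuristic: it counts characters inside quoted fields too, and it expects the line to be in an ASCII-compatible encoding.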
karoldvl Commented:
You can use Notepad++:
http://notepad-plus.sourceforge.net/

Just open an appropriate file and use "Encoding->Convert to UTF-8 without BOM" and save.
stevelucy (Author) Commented:
Hey that did it!  But one more problem.  I have about 300 of these files - how could I do a batch conversion - any idea?

If not, I'll just go ahead and accept as solution - thanks!

karoldvl Commented:
I don't know if Notepad++ macro recording could handle something like that. Probably not.

I'm certain that iconv with some batch script would do, but it's probably too much work.

SourceForge has something of this sort:
http://sourceforge.net/projects/uni-transmuter/

or if you want it PHP based:
http://sourceforge.net/projects/batchconvert/
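For what it's worth, the iconv route can also be sketched directly in PHP with the built-in iconv() function. The following is a minimal sketch under stated assumptions, not a tested tool: convertFilesToUtf8 and its parameters are hypothetical, the caller must already know the source encoding (for example 'UTF-16LE'), and every file in the directory is assumed to share it. Converted copies go to a separate directory so the originals stay untouched.

```php
<?php
/**
 * Hypothetical helper: convert every file in $srcDir with extension $ext
 * from $fromEncoding to UTF-8 without a BOM, writing the copies to $destDir.
 * Returns the list of files written. Files that fail to convert are skipped.
 * Note: passing the same directory for $srcDir and $destDir overwrites the originals.
 */
function convertFilesToUtf8(string $srcDir, string $ext, string $fromEncoding, string $destDir): array
{
    $converted = [];
    $files = glob(rtrim($srcDir, '/\\') . '/*.' . $ext) ?: [];

    foreach ($files as $file) {
        $raw = file_get_contents($file);
        if ($raw === false) {
            continue;
        }
        $utf8 = iconv($fromEncoding, 'UTF-8', $raw);
        if ($utf8 === false) {
            continue; // input was not valid in $fromEncoding
        }
        // A converted byte-order mark shows up as EF BB BF; drop it.
        if (strncmp($utf8, "\xEF\xBB\xBF", 3) === 0) {
            $utf8 = substr($utf8, 3);
        }
        $target = rtrim($destDir, '/\\') . '/' . basename($file);
        if (file_put_contents($target, $utf8) !== false) {
            $converted[] = $target;
        }
    }
    return $converted;
}
```

With 300 files it is probably worth running this on a handful first and opening the results in Excel before converting the rest.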
stevelucy (Author) Commented:
thanks!