
Script processes files and combines them into one, but I need the output to maintain separate files

sara_bellum asked
Last Modified: 2012-05-05
I inherited a PHP script that reformats CSV files to make them more human-readable; however, the output is a single file.  I need the script to produce one output file for each input file in the directory ($myDir), and to retain the same filenames if possible (hr1_archive.csv, hr2_archive.csv etc).  I'm running this script from a Linux command line.

Top Expert 2007

Commented:
This should work:
<?php
 
/*read configuration file*/
$config_name='cchrc_convert.cfg';
$cchrc_header=array();
$cchrc_data=array();
 
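$config=config_fetch($config_name);
$myDir='/home/my_user/my_data';

/* file names are used as-is, so run the script from $myDir (as the original script assumes) */
if ($dh = opendir($myDir)) {
    while (($csv_file = readdir($dh)) !== false) {
        /* skip directories, the config file and PHP scripts */
        if (!is_file($csv_file) || $csv_file == $config_name || stristr($csv_file, '.php')) {
            continue;
        }
        /* start each file with empty arrays so earlier data does not carry over */
        $cchrc_header = array();
        $cchrc_data = array();
        /* csv_convert expects an array of lines without quotes or line endings */
        $lines = str_replace(array('"', "\r", "\n"), '', file($csv_file));
        csv_convert($lines, $cchrc_header, $cchrc_data);
        csv_output($cchrc_header, $cchrc_data, $csv_file);
    }
    closedir($dh);
}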
 
function config_fetch($config_name)
{
    $config_file=file($config_name);
    $config=array();
 
    if ($config_file) foreach ($config_file as $line)
    {
        list($key, $value)=explode('=', $line);
        $key=trim($key); $value=trim($value);
        if ($key=='' || $key[0]=='#') continue;
 
        switch ($key)
        {
            default: {
                $config[$key]=$value;
            } break;
            case 'Filter': {
                list($fkey, $fvalue)=explode(',', $value);
                $fkey=trim($fkey); $fvalue=trim($fvalue);
                $config['Filter'][$fkey]=$fvalue;
            } break;
            case 'ColumnKey': {
                $config['ColumnKey']=explode(',', $value);
            } break;
            case 'ColumnValue': {
                $config['ColumnValue']=explode(',', $value);
            } break;
        }
    }
 
    return $config;
}
 
/* read csv data files */
function csv_fetch($myDir)
{
    global $config_name;
 
    echo "Loading files:\n";
 
    if ($dh=opendir($myDir))
    {
        $csv_file=array();
 
        while (($file=readdir($dh))!==false) {
            if ($file==$config_name) continue;
            else if (stristr($file, '.php')) continue;
            else if ($file=='data.csv') continue;
            else if ($file=='.') continue;
            else if ($file=='..') continue;
 
            echo "$file\n";
            $csv_file=array_merge($csv_file, file($file));
        }
 
        if ($csv_file) foreach ($csv_file as $key=>$line) {
            $line=str_replace('"', '', $line);
            $line=str_replace(chr(13).chr(10), '', $line);
            $csv_file[$key]=$line;
        }
 
        closedir($dh);
    }
 
    echo "\n";
 
    return $csv_file;
}
 
/* convert */
function csv_convert($csv_file, &$cchrc_header, &$cchrc_data)
{
    global $config;
 
    echo "Converting data: .";
 
    $element='';
    $count=1;
    $records=count($csv_file);
    if ($csv_file) foreach ($csv_file as $line)
    {
        $csv_datum=explode(',', $line);
 
        /* element */
        if ($csv_datum[0]==$config['HeaderName']) {
            $element=$csv_datum[1];
        }
 
        /* reset element */
        if (count($csv_datum)==0) {
            $element='';
        }
 
        /* filter */
        if ($config['Filter']) foreach ($config['Filter'] as $key=>$value) {
            if ($csv_datum[0]==$key && $csv_datum[1]==$value) {
                $element='';
            }
        }
 
        /* push an item onto an element */
        if ($element!='' && count($csv_datum)==$config['ColumnCount'])
        {
            $key='';
            if ($config['ColumnKey']) foreach ($config['ColumnKey'] as $key_num) {
                if ($key!='') $key.=' '; $key.=$csv_datum[$key_num];
            }
            $value='';
            if ($config['ColumnValue']) foreach ($config['ColumnValue'] as $value_num) {
                if ($value!='') $value.=' '; $value.=$csv_datum[$value_num];
            }
 
            $cchrc_header[$key]=$key;
            $cchrc_data[$element][$key]=$value;
        }
 
        if ($count%1000==0) echo ".";
 
        $count++;
    }
    echo " Done\n";
 
    /* sort data */
    echo "   Sorting data: .";
 
    usort($cchrc_header, 'sort_date'); /* callback name must be a quoted string */
    ksort($cchrc_data);
 
    echo " Done\n\n";
    echo "Finished, file saved as 'data.csv'\n";
}
 
function sort_date($a, $b)
{
    // omitted (not an issue)
}
 
function csv_output($cchrc_header, $cchrc_data, $dst)
{
    global $config;
 
    $output=fopen($dst, 'w');
 
    /* Headers */
    fwrite($output, "\"Date\"");
    if ($cchrc_data) foreach ($cchrc_data as $name => $data)
    {
        fwrite($output, ",\"$name\"");
    }
    fwrite($output, "\n");
 
    /*  */
    if ($cchrc_header) foreach ($cchrc_header as $date)
    {
        fwrite($output, "\"$date\"");
        if ($cchrc_data) foreach ($cchrc_data as $name => $data)
        {
            fwrite($output, ",\"".$data[$date]."\"");
        }
        fwrite($output, "\n");
    }
 
    fclose($output);
}
 
function print_array($array, $name='')
{
    echo "$name:";
    print_r($array);
}


CERTIFIED EXPERT
Expert of the Year 2008
Top Expert 2008

Commented:
>>and retain the same filenames if possible
So you want to overwrite your original files? If I understand you correctly, the input files have a .csv extension and will be overwritten after processing. Make sure you back up your originals first, in case something goes wrong!
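
A minimal sketch of taking that backup from PHP before running the converter, assuming the CSV files sit directly in one directory. The function name `backup_csv_files` and the `backup` subdirectory are hypothetical and not part of the original script.

```php
<?php
/* Hypothetical helper: copy every *.csv file in $dir into $dir/$backupName.
   Returns the sorted list of copied file names, or false if the backup
   directory cannot be created. */
function backup_csv_files($dir, $backupName = 'backup')
{
    $backupDir = $dir . '/' . $backupName;
    if (!is_dir($backupDir) && !mkdir($backupDir, 0755, true)) {
        return false;
    }

    $files = glob($dir . '/*.csv');
    if ($files === false) {
        $files = array();
    }

    $copied = array();
    foreach ($files as $path) {
        if (copy($path, $backupDir . '/' . basename($path))) {
            $copied[] = basename($path);
        }
    }
    sort($copied);
    return $copied;
}
```

Calling `backup_csv_files('/home/my_user/my_data');` once before the conversion would leave untouched copies in the `backup` subdirectory. Because that subdirectory lives inside the data directory, the converter's directory loop needs to skip it.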
CERTIFIED EXPERT
Expert of the Year 2008
Top Expert 2008

Commented:
Try the attached code. NOTE: I'm prefixing the output files with "sara_" to preserve your original files in case something does not work as expected. Once it is working as desired AND you want to overwrite the original csv files, you can just change:
csv_output($cchrc_header, $cchrc_data, 'sara_' . $info['file']);

to:
csv_output($cchrc_header, $cchrc_data, $info['file']);
<?php
/*read configuration file*/
$config_name='cchrc_convert.cfg';
$cchrc_header=array();
$cchrc_data=array(); 
$config=config_fetch($config_name);
$myDir='/home/my_user/my_data';
$csv_file=csv_fetch($myDir);
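foreach( $csv_file as $info ){
        /* reset per file so each output file holds only its own data */
        $cchrc_header=array();
        $cchrc_data=array();
        csv_convert($info['data'], $cchrc_header, $cchrc_data);
        csv_output($cchrc_header, $cchrc_data, 'sara_' . $info['file']);
}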
function config_fetch($config_name)
{
        $config_file=file($config_name);
        $config=array(); 
        if ($config_file) foreach ($config_file as $line)
        {
                list($key, $value)=explode('=', $line);
                $key=trim($key); $value=trim($value);
                if ($key=='' || $key[0]=='#') continue; 
                switch ($key)
                {
                        default: {
                                $config[$key]=$value;
                        } break;
                        case 'Filter': {
                                        list($fkey, $fvalue)=explode(',', $value);
                                        $fkey=trim($fkey); $fvalue=trim($fvalue);
                                        $config['Filter'][$fkey]=$fvalue;
                        } break;
                        case 'ColumnKey': {
                                        $config['ColumnKey']=explode(',', $value);
                        } break;
                        case 'ColumnValue': {
                                        $config['ColumnValue']=explode(',', $value);
                        } break;
                }
        } 
        return $config;
} 
/* read csv data files */
function csv_fetch($myDir)
{
        global $config_name; 
        echo "Loading files:\n"; 
        if ($dh=opendir($myDir))
        {
                $csv_file=array(); 
                while (($file=readdir($dh))!==false) {
                        if ($file==$config_name) continue;
                        else if (stristr($file, '.php')) continue;
//                        else if ($file=='data.csv') continue;
                        else if ($file=='.') continue;
                        else if ($file=='..') continue; 
                        echo "$file\n";
					//$csv_file=array_merge($csv_file, file($file));
					$csv_file[]=array('file'=>$file,'data'=>file($file));
                } 
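                /* strip quotes and line endings from every line; write back through
                   the index, since assigning to $info would only change a copy */
                foreach ($csv_file as $i => $info) {
                        foreach ($info['data'] as $key => $line) {
                                $line=str_replace('"', '', $line);
                                $line=rtrim($line, "\r\n");
                                $csv_file[$i]['data'][$key]=$line;
                        }
                }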
                closedir($dh);
        } 
        echo "\n"; 
        return $csv_file;
} 
/* convert */
function csv_convert($csv_file, &$cchrc_header, &$cchrc_data)
{
        global $config; 
        echo "Converting data: ."; 
        $element='';
        $count=1;
        $records=count($csv_file);
        if ($csv_file) foreach ($csv_file as $line)
        {
                $csv_datum=explode(',', $line); 
                /* element */
                if ($csv_datum[0]==$config['HeaderName']) {
                        $element=$csv_datum[1];
                } 
                /* reset element */
                if (count($csv_datum)==0) {
                        $element='';
                } 
                /* filter */
                if ($config['Filter']) foreach ($config['Filter'] as $key=>$value) {
                        if ($csv_datum[0]==$key && $csv_datum[1]==$value) {
                                $element='';
                        }
                } 
                /* push an item onto an element */
                if ($element!='' && count($csv_datum)==$config['ColumnCount'])
                {
                        $key='';
                        if ($config['ColumnKey']) foreach ($config['ColumnKey'] as $key_num) {
                                if ($key!='') $key.=' '; $key.=$csv_datum[$key_num];
                        }
                        $value='';
                        if ($config['ColumnValue']) foreach ($config['ColumnValue'] as $value_num) {
                                if ($value!='') $value.=' '; $value.=$csv_datum[$value_num];
                        } 
                        $cchrc_header[$key]=$key;
                        $cchrc_data[$element][$key]=$value;
                } 
                if ($count%1000==0) echo "."; 
                $count++;
        }
        echo " Done\n"; 
        /* sort data */
        echo "   Sorting data: ."; 
        usort($cchrc_header, 'sort_date'); /* callback name must be a quoted string */
        ksort($cchrc_data); 
        echo " Done\n\n";
        echo "Finished, file saved as 'data.csv'\n";
} 
function sort_date($a, $b)
{
        // omitted (not an issue)
} 
function csv_output($cchrc_header, $cchrc_data, $output_file="data.csv")
{
        global $config; 
        $output=fopen($output_file, 'w'); 
        /* Headers */
        fwrite($output, "\"Date\"");
        if ($cchrc_data) foreach ($cchrc_data as $name => $data)
        {
                fwrite($output, ",\"$name\"");
        }
        fwrite($output, "\n"); 
        /*  */
        if ($cchrc_header) foreach ($cchrc_header as $date)
        {
                fwrite($output, "\"$date\"");
                if ($cchrc_data) foreach ($cchrc_data as $name => $data)
                {
                        fwrite($output, ",\"".$data[$date]."\"");
                }
                fwrite($output, "\n");
        } 
        fclose($output);
} 
function print_array($array, $name='')
{
        echo "$name:";
        print_r($array);
} 
?>


Author

Commented:
Thanks very much hielo, the script runs and produces a bunch of sara_files.  Unfortunately, all they contain is the word "Date" so I should have included that function, which I copy below.  

I had hoped to address the date issue separately, since when I ran the original script, I got an error:  Sorting data: .PHP Notice:  Use of undefined constant sort_date - assumed 'sort_date' in /home/my_user/my_data/cchrc_convert.php on line 139.  I think the script made an assumption because the output in the original data.csv was sorted by date - the problem was that a) it was only one file and b) it overwrote all the data from the first dates processed, leaving only 8 and 9 July.
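
For context on that notice: PHP 5 and 7 emit it when a bare word such as `sort_date` appears where a value is expected. PHP looks for a constant of that name, finds none, and falls back to the string 'sort_date', which is why the sort still ran. PHP 8 turns the same code into a fatal error. A small sketch of the quoted form follows, using an illustrative comparator `by_length` that is not part of the script.

```php
<?php
/* Illustrative comparator (not the script's sort_date): order strings by length. */
function by_length($a, $b)
{
    return strlen($a) - strlen($b);
}

$items = array('ccc', 'a', 'bb');

/* Quoted: passes the function name as a callable string. */
usort($items, 'by_length');

print_r($items);
```

Applied to the script, the call would read `usort($cchrc_header, 'sort_date');`.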

I also have this error when running the original script: PHP Notice:  Undefined offset:  1 in /home/my_user/my_data/cchrc_convert.php on line 21.  Again, since the script ran and produced a data file in the correct format, I saw it as a non-critical error.
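
If line 21 of the original file is the `list($key, $value)=explode('=', $line);` call in config_fetch (an assumption, since the original line numbering is not shown here), the notice would come from blank or comment lines in the config file: they contain no '=', so explode() returns a single element and there is no offset 1 to assign to `$value`. A minimal sketch of a guard, using a hypothetical helper name `parse_config_line`:

```php
<?php
/* Hypothetical helper: split "key = value" into array(key, value).
   Returns null for blank lines, "#" comments, and lines without '='. */
function parse_config_line($line)
{
    $line = trim($line);
    if ($line === '' || $line[0] === '#' || strpos($line, '=') === false) {
        return null;
    }
    /* Limit of 2 keeps any further '=' characters inside the value. */
    list($key, $value) = explode('=', $line, 2);
    return array(trim($key), trim($value));
}

var_dump(parse_config_line("ColumnCount = 5\n"));
var_dump(parse_config_line("# a comment\n"));
```

Checking for '=' before calling list() avoids the notice without changing which settings are read.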

I see the above as separate questions, but I have to be able to get to some level of output in the sara files before asking them.  Thanks again.

function sort_date($a, $b)
{
        /* build a sortable string from the date parts, padding the first two
           fields to two digits (the original "<9" test left 9 unpadded, so 9
           sorted after 10) */
        list($a_date, $a_time)=explode(' ', $a);
        $a_date=explode('/', $a_date);
        $a_test=$a_date[2].sprintf('%02d', $a_date[1]).sprintf('%02d', $a_date[0]).$a_time;
 
        list($b_date, $b_time)=explode(' ', $b);
        $b_date=explode('/', $b_date);
        $b_test=$b_date[2].sprintf('%02d', $b_date[1]).sprintf('%02d', $b_date[0]).$b_time;
 
        if ($a_test==$b_test) return 0;
        return ($a_test>$b_test)?1:-1;
}
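
Output that contains only the "Date" header means csv_convert never matched a row. One way that can happen, offered as a possibility rather than a confirmed diagnosis, is that quotes and line endings are still present in the lines it receives. That is the result when a cleanup loop assigns to its foreach variable, which is only a copy of the array element. The sketch below uses made-up sample data to show the difference.

```php
<?php
/* Made-up sample in the shape csv_fetch returns: one entry per file. */
$files = array(
    array('file' => 'hr1_archive.csv', 'data' => array("\"Name\",\"Pump\"\r\n")),
);

/* Assigning to the loop variable changes only a copy of the element. */
foreach ($files as $info) {
    $info['data'] = str_replace('"', '', $info['data']);
}
$unchanged = $files[0]['data'][0];   /* still has quotes and CRLF */

/* Writing back through the index changes the stored array. */
foreach ($files as $i => $info) {
    $clean = str_replace(array('"', "\r", "\n"), '', $info['data']);
    $files[$i]['data'] = $clean;
}
$changed = $files[0]['data'][0];     /* quotes and line ending removed */

var_dump($unchanged, $changed);
```

Iterating by reference (`foreach ($files as &$info)`) is another option, but writing through the index avoids leaving a dangling reference after the loop.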


Top Expert 2007

Commented:
If you need new files, just change the line
csv_output($cchrc_header, $cchrc_data, $csv_file);
to
csv_output($cchrc_header, $cchrc_data, 'new_' . $csv_file);
in my code.

Author

Commented:
It worked!!  Thanks very much.  For reasons which I don't yet understand, hernst42's solution didn't work for me.  I will now endeavor to understand how you fixed this :)
