sara_bellum

Script processes files and combines them into one, but I need the output to maintain separate files

I inherited a PHP script that reformats CSV files to make them more human-readable; however, the output is a single file. I need the output to contain the same number of files as the directory ($myDir), with the same filenames if possible (hr1_archive.csv, hr2_archive.csv, etc.). I'm running this script from a Linux command line.
reformat.txt
hernst42

This should work:
<?php
 
/*read configuration file*/
$config_name='cchrc_convert.cfg';
$cchrc_header=array();
$cchrc_data=array();
 
$config=config_fetch($config_name);
$myDir='/home/my_user/my_data';
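 
if ($dh = opendir($myDir)) {
    while (($csv_file = readdir($dh)) !== false) {
        /* skip directories, the config file and PHP scripts */
        if (!is_file("$myDir/$csv_file") || $csv_file == $config_name || stristr($csv_file, '.php')) {
            continue;
        }
        /* read and clean this file only; reset the results so files are not merged */
        $lines = str_replace(array('"', chr(13).chr(10)), '', file("$myDir/$csv_file"));
        $cchrc_header = array();
        $cchrc_data = array();
        csv_convert($lines, $cchrc_header, $cchrc_data);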
        csv_output($cchrc_header, $cchrc_data, $csv_file);
    }
    closedir($dh);
}
 
function config_fetch($config_name)
{
    $config_file=file($config_name);
    $config=array();
 
    if ($config_file) foreach ($config_file as $line)
    {
        list($key, $value)=explode('=', $line);
        $key=trim($key); $value=trim($value);
        if ($key=='' || $key[0]=='#') continue;
 
        switch ($key)
        {
            default: {
                $config[$key]=$value;
            } break;
            case 'Filter': {
                list($fkey, $fvalue)=explode(',', $value);
                $fkey=trim($fkey); $fvalue=trim($fvalue);
                $config['Filter'][$fkey]=$fvalue;
            } break;
            case 'ColumnKey': {
                $config['ColumnKey']=explode(',', $value);
            } break;
            case 'ColumnValue': {
                $config['ColumnValue']=explode(',', $value);
            } break;
        }
    }
 
    return $config;
}
 
/* read csv data files */
function csv_fetch($myDir)
{
    global $config_name;
 
    echo "Loading files:\n";
 
    if ($dh=opendir($myDir))
    {
        $csv_file=array();
 
        while (($file=readdir($dh))!==false) {
            if ($file==$config_name) continue;
            else if (stristr($file, '.php')) continue;
            else if ($file=='data.csv') continue;
            else if ($file=='.') continue;
            else if ($file=='..') continue;
 
            echo "$file\n";
            $csv_file=array_merge($csv_file, file($file));
        }
 
        if ($csv_file) foreach ($csv_file as $key=>$line) {
            $line=str_replace('"', '', $line);
            $line=str_replace(chr(13).chr(10), '', $line);
            $csv_file[$key]=$line;
        }
 
        closedir($dh);
    }
 
    echo "\n";
 
    return $csv_file;
}
 
/* convert */
function csv_convert($csv_file, &$cchrc_header, &$cchrc_data)
{
    global $config;
 
    echo "Converting data: .";
 
    $element='';
    $count=1;
    $records=count($csv_file);
    if ($csv_file) foreach ($csv_file as $line)
    {
        $csv_datum=explode(',', $line);
 
        /* element */
        if ($csv_datum[0]==$config['HeaderName']) {
            $element=$csv_datum[1];
        }
 
        /* reset element */
        if (count($csv_datum)==0) {
            $element='';
        }
 
        /* filter */
        if ($config['Filter']) foreach ($config['Filter'] as $key=>$value) {
            if ($csv_datum[0]==$key && $csv_datum[1]==$value) {
                $element='';
            }
        }
 
        /* push an item onto an element */
        if ($element!='' && count($csv_datum)==$config['ColumnCount'])
        {
            $key='';
            if ($config['ColumnKey']) foreach ($config['ColumnKey'] as $key_num) {
                if ($key!='') $key.=' '; $key.=$csv_datum[$key_num];
            }
            $value='';
            if ($config['ColumnValue']) foreach ($config['ColumnValue'] as $value_num) {
                if ($value!='') $value.=' '; $value.=$csv_datum[$value_num];
            }
 
            $cchrc_header[$key]=$key;
            $cchrc_data[$element][$key]=$value;
        }
 
        if ($count%1000==0) echo ".";
 
        $count++;
    }
    echo " Done\n";
 
    /* sort data */
    echo "   Sorting data: .";
 
    usort($cchrc_header, 'sort_date');
    ksort($cchrc_data);
 
    echo " Done\n\n";
    echo "Finished, file saved as 'data.csv'\n";
}
 
function sort_date($a, $b)
{
    // omitted (not an issue)
}
 
function csv_output($cchrc_header, $cchrc_data, $dst)
{
    global $config;
 
    $output=fopen($dst, 'w');
 
    /* Headers */
    fwrite($output, "\"Date\"");
    if ($cchrc_data) foreach ($cchrc_data as $name => $data)
    {
        fwrite($output, ",\"$name\"");
    }
    fwrite($output, "\n");
 
    /*  */
    if ($cchrc_header) foreach ($cchrc_header as $date)
    {
        fwrite($output, "\"$date\"");
        if ($cchrc_data) foreach ($cchrc_data as $name => $data)
        {
            fwrite($output, ",\"".$data[$date]."\"");
        }
        fwrite($output, "\n");
    }
 
    fclose($output);
}
 
function print_array($array, $name='')
{
    echo "$name:";
    print_r($array);
}

hielo
>>and retain the same filenames if possible
So you want to overwrite your original files? If I understand you correctly, the input files have a .csv extension and after processing these files will be overwritten. Make sure you back up your originals first, in case something goes wrong!
Try the attached code. NOTE: I'm prefixing the output files with "sara_" to "preserve" your original files in case something did not work as expected. Once it is working as desired AND you want to overwrite the original csv files, you can just change:
csv_output($cchrc_header, $cchrc_data, 'sara_' . $info['file']);

to:
csv_output($cchrc_header, $cchrc_data, $info['file']);
<?php
/*read configuration file*/
$config_name='cchrc_convert.cfg';
$cchrc_header=array();
$cchrc_data=array(); 
$config=config_fetch($config_name);
$myDir='/home/my_user/my_data';
$csv_file=csv_fetch($myDir);
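foreach( $csv_file as $info ){
        /* start each file with empty results so data from earlier files is not carried over */
        $cchrc_header=array();
        $cchrc_data=array();
        csv_convert($info['data'], $cchrc_header, $cchrc_data);
        csv_output($cchrc_header, $cchrc_data, 'sara_' . $info['file']);
}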
function config_fetch($config_name)
{
        $config_file=file($config_name);
        $config=array(); 
        if ($config_file) foreach ($config_file as $line)
        {
                list($key, $value)=explode('=', $line);
                $key=trim($key); $value=trim($value);
                if ($key=='' || $key[0]=='#') continue; 
                switch ($key)
                {
                        default: {
                                $config[$key]=$value;
                        } break;
                        case 'Filter': {
                                        list($fkey, $fvalue)=explode(',', $value);
                                        $fkey=trim($fkey); $fvalue=trim($fvalue);
                                        $config['Filter'][$fkey]=$fvalue;
                        } break;
                        case 'ColumnKey': {
                                        $config['ColumnKey']=explode(',', $value);
                        } break;
                        case 'ColumnValue': {
                                        $config['ColumnValue']=explode(',', $value);
                        } break;
                }
        } 
        return $config;
} 
/* read csv data files */
function csv_fetch($myDir)
{
        global $config_name; 
        echo "Loading files:\n"; 
        if ($dh=opendir($myDir))
        {
                $csv_file=array(); 
                while (($file=readdir($dh))!==false) {
                        if ($file==$config_name) continue;
                        else if (stristr($file, '.php')) continue;
//                        else if ($file=='data.csv') continue;
                        else if ($file=='.') continue;
                        else if ($file=='..') continue; 
                        echo "$file\n";
                        //$csv_file=array_merge($csv_file, file($file));
                        $csv_file[]=array('file'=>$file,'data'=>file("$myDir/$file"));
                }
                /* strip quotes and CRLF line endings; write back by index, since $info is only a copy */
                foreach ($csv_file as $i=>$info) {
                        foreach ($info['data'] as $key=>$line) {
                                $line=str_replace('"', '', $line);
                                $line=str_replace(chr(13).chr(10), '', $line);
                                $csv_file[$i]['data'][$key]=$line;
                        }
                }
                closedir($dh);
        } 
        echo "\n"; 
        return $csv_file;
} 
/* convert */
function csv_convert($csv_file, &$cchrc_header, &$cchrc_data)
{
        global $config; 
        echo "Converting data: ."; 
        $element='';
        $count=1;
        $records=count($csv_file);
        if ($csv_file) foreach ($csv_file as $line)
        {
                $csv_datum=explode(',', $line); 
                /* element */
                if ($csv_datum[0]==$config['HeaderName']) {
                        $element=$csv_datum[1];
                } 
                /* reset element */
                if (count($csv_datum)==0) {
                        $element='';
                } 
                /* filter */
                if ($config['Filter']) foreach ($config['Filter'] as $key=>$value) {
                        if ($csv_datum[0]==$key && $csv_datum[1]==$value) {
                                $element='';
                        }
                } 
                /* push an item onto an element */
                if ($element!='' && count($csv_datum)==$config['ColumnCount'])
                {
                        $key='';
                        if ($config['ColumnKey']) foreach ($config['ColumnKey'] as $key_num) {
                                if ($key!='') $key.=' '; $key.=$csv_datum[$key_num];
                        }
                        $value='';
                        if ($config['ColumnValue']) foreach ($config['ColumnValue'] as $value_num) {
                                if ($value!='') $value.=' '; $value.=$csv_datum[$value_num];
                        } 
                        $cchrc_header[$key]=$key;
                        $cchrc_data[$element][$key]=$value;
                } 
                if ($count%1000==0) echo "."; 
                $count++;
        }
        echo " Done\n"; 
        /* sort data */
        echo "   Sorting data: ."; 
        usort($cchrc_header, 'sort_date');
        ksort($cchrc_data); 
        echo " Done\n\n";
        echo "Finished, file saved as 'data.csv'\n";
} 
function sort_date($a, $b)
{
        // omitted (not an issue)
} 
function csv_output($cchrc_header, $cchrc_data, $output_file="data.csv")
{
        global $config; 
        $output=fopen($output_file, 'w'); 
        /* Headers */
        fwrite($output, "\"Date\"");
        if ($cchrc_data) foreach ($cchrc_data as $name => $data)
        {
                fwrite($output, ",\"$name\"");
        }
        fwrite($output, "\n"); 
        /*  */
        if ($cchrc_header) foreach ($cchrc_header as $date)
        {
                fwrite($output, "\"$date\"");
                if ($cchrc_data) foreach ($cchrc_data as $name => $data)
                {
                        fwrite($output, ",\"".$data[$date]."\"");
                }
                fwrite($output, "\n");
        } 
        fclose($output);
} 
function print_array($array, $name='')
{
        echo "$name:";
        print_r($array);
} 
?>

sara_bellum (ASKER)

Thanks very much hielo, the script runs and produces a bunch of sara_files.  Unfortunately, all they contain is the word "Date" so I should have included that function, which I copy below.  

I had hoped to address the date issue separately, since when I ran the original script, I got an error:  Sorting data: .PHP Notice:  Use of undefined constant sort_date - assumed 'sort_date' in /home/my_user/my_data/cchrc_convert.php on line 139.  I think the script made an assumption because the output in the original data.csv was sorted by date - the problem was that a) it was only one file and b) it overwrote all the data from the first dates processed, leaving only 8 and 9 July.

I also have this error when running the original script: PHP Notice:  Undefined offset:  1 in /home/my_user/my_data/cchrc_convert.php on line 21.  Again, since the script ran and produced a data file in the correct format, I saw it as a non-critical error.
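An "Undefined offset: 1" notice is what `list($key, $value)=explode('=', $line);` in config_fetch() produces when a config line contains no `=` (a blank line or a `#` comment, for example). Whether line 21 of the original file is that statement cannot be confirmed from the thread, so treat this as a likely cause rather than a diagnosis. A minimal sketch of a guarded parse, using a hypothetical helper named parse_config_line() that is not part of the original script:

```php
<?php
/* Hypothetical helper (not in the original script): returns array(key, value),
   or null for blank lines and '#' comments. */
function parse_config_line($line)
{
    $line = trim($line);
    if ($line === '' || $line[0] === '#') {
        return null;
    }
    /* limit 2 keeps any further '=' inside the value;
       array_pad supplies an empty value when the line has no '=' at all */
    $parts = array_pad(explode('=', $line, 2), 2, '');
    return array(trim($parts[0]), trim($parts[1]));
}
```

config_fetch() could skip a null result with `continue` and run its existing `switch` on the returned key and value.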

I see the above as separate questions, but I have to be able to get to some level of output in the sara files before asking them.  Thanks again.

function sort_date($a, $b)
{
        /* split "date time", then rebuild the date as field 2, field 1, field 0 so it compares in order */
        list($a_date, $a_time)=explode(' ', $a);
        $a_date=explode('/', $a_date);
        /* zero-pad the first two date fields to two digits */
        $a_date[0]=str_pad($a_date[0], 2, '0', STR_PAD_LEFT);
        $a_date[1]=str_pad($a_date[1], 2, '0', STR_PAD_LEFT);
        $a_test=$a_date[2].$a_date[1].$a_date[0].$a_time;
 
        list($b_date, $b_time)=explode(' ', $b);
        $b_date=explode('/', $b_date);
        $b_date[0]=str_pad($b_date[0], 2, '0', STR_PAD_LEFT);
        $b_date[1]=str_pad($b_date[1], 2, '0', STR_PAD_LEFT);
        $b_test=$b_date[2].$b_date[1].$b_date[0].$b_time;
 
        if ($a_test==$b_test) return 0;
        return ($a_test>$b_test)?1:-1;
}
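
The "undefined constant sort_date" notice comes from passing the callback to usort() as a bare word; usort() expects the function name as a quoted string (or another callable). The sketch below shows that call with an alternative comparator that zero-pads every field with sprintf(), including the time. The names date_key_to_sortable() and compare_date_keys() are hypothetical, and the "d/m/Y H:i" key layout is an assumption inferred from the field order sort_date() uses, not something the thread states.

```php
<?php
/* Hypothetical helper: turns "d/m/Y H:i" into a fixed-width "YYYYMMDDHHMM" string.
   The key layout is an assumption; adjust the indexes if the data uses m/d/Y. */
function date_key_to_sortable($key)
{
    $parts = explode(' ', trim($key), 2);
    $date = array_pad(explode('/', $parts[0]), 3, 0);
    $time = array_pad(isset($parts[1]) ? explode(':', $parts[1]) : array(), 2, 0);
    return sprintf('%04d%02d%02d%02d%02d', $date[2], $date[1], $date[0], $time[0], $time[1]);
}

/* Hypothetical comparator for usort(): fixed-width strings sort correctly with strcmp() */
function compare_date_keys($a, $b)
{
    return strcmp(date_key_to_sortable($a), date_key_to_sortable($b));
}

$keys = array('10/7/2009 9:00', '9/7/2009 10:30', '9/7/2009 9:15');
usort($keys, 'compare_date_keys');   /* callback name passed as a string */
print_r($keys);
```

Because every field is padded to a fixed width, "9:15" sorts before "10:30" on the same day, which a plain string comparison of unpadded times would get wrong.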

If you need new files, just change the line
csv_output($cchrc_header, $cchrc_data, $csv_file);
to
csv_output($cchrc_header, $cchrc_data, 'new_' . $csv_file);
in my code.
ASKER CERTIFIED SOLUTION
hielo
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
It worked!!  Thanks very much.  For reasons which I don't yet understand, hernst42's solution didn't work for me.  I will now endeavor to understand how you fixed this :)
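
The accepted solution is not visible in this thread, so the following is only a general sketch of the one-output-per-input pattern the question asks for, not a reconstruction of hielo's fix. It assumes a hypothetical helper process_each_csv() and a caller-supplied $transform callback standing in for the csv_convert()/csv_output() pair. Each file is read, cleaned and transformed on its own, and written to a separate output directory under its original name, so the originals are never overwritten.

```php
<?php
/* Hypothetical helper: runs $transform on the lines of each .csv file in $srcDir
   and writes the result to $dstDir under the same file name. */
function process_each_csv($srcDir, $dstDir, $transform)
{
    $written = array();
    foreach (scandir($srcDir) as $file) {
        $path = $srcDir . '/' . $file;
        /* skip directories and anything that is not a .csv file */
        if (!is_file($path) || strtolower(pathinfo($file, PATHINFO_EXTENSION)) !== 'csv') {
            continue;
        }
        $lines = file($path, FILE_IGNORE_NEW_LINES);
        if ($lines === false) {
            continue;
        }
        /* strip quotes and any leftover carriage returns, as the original script does */
        $lines = str_replace(array('"', "\r"), '', $lines);
        /* the result is built from this file's lines only, so nothing carries over */
        $result = call_user_func($transform, $lines);
        file_put_contents($dstDir . '/' . $file, implode("\n", $result) . "\n");
        $written[] = $file;
    }
    return $written;
}
```

Writing to a second directory also keeps the output files out of the directory being scanned, so a later run does not pick them up as input.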