QuintusSmit

change multiple spaces to one space - php

Hi

This follows on my previous question here:
http://tinyurl.com/35k9wnq 

I need to replace a variable number of spaces in each line with a single space, so that a line like this:
192.168.99.44            25h/day
becomes
192.168.99.44 25h/day

I am reading the lines into an array with this script:

<?PHP
$file_handle = fopen("/var/squish/squish.conf", "rb");
while (!feof($file_handle)) {
    $line_of_text = fgets($file_handle);
    if (substr($line_of_text, 0, 1) != '#') {
        $parts = explode(' ', $line_of_text);
        print $parts[0] . $parts[1] . "<BR>";
    }
}
fclose($file_handle);
?>

Because I am splitting each line on a single space, the varying number of spaces between the fields makes things go haywire.
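
For readers without access to the accepted answer, here is a minimal sketch of one way to handle this. It is not the hidden solution, just an illustration: instead of collapsing the spaces first and then calling explode(), it splits each line directly on runs of whitespace with preg_split(). The helper name split_fields() and the sample lines are made up for the example; in the real script the lines would come from fgets() as in the question.

```php
<?php
// Hypothetical helper (not from the thread): split a line into fields
// on any run of whitespace, however long.
function split_fields(string $line): array
{
    // trim() removes the trailing newline that fgets() leaves on the line.
    // PREG_SPLIT_NO_EMPTY drops empty pieces, so a blank line yields [].
    return preg_split('/\s+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
}

// Made-up sample lines standing in for /var/squish/squish.conf.
$lines = [
    "# a comment line\n",
    "192.168.99.44            25h/day\n",
    "192.168.99.45 10h/day\n",
];

foreach ($lines as $line) {
    if (substr($line, 0, 1) === '#') {
        continue; // skip comment lines, as the original script does
    }
    $parts = split_fields($line);
    if (count($parts) >= 2) {
        print $parts[0] . ' ' . $parts[1] . "<BR>\n";
    }
}
```

Note that \s also matches tabs, which may or may not be wanted here; use '/ +/' instead to treat only spaces as separators.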
ASKER CERTIFIED SOLUTION
mcuk_storm
This solution is only available to members.
Regular expressions are terribly slow.

Use a straightforward solution instead.

$n will be the new string.

$c=file_get_contents('/var/squish/squish.conf');
$n='';

$prev=false;
$size=strlen($c);
for($i=0; $i<$size; ++$i) {
	if ($c[$i]==' ') {
		if (!$prev) {
			$prev=true;
			$n.=' ';
		}
	} else {
		$n.=$c[$i];
		$prev=false;
	}
}


I don't think you are doing regexps justice. I have just compared the following two scripts:

1:
<?php
$c=file_get_contents('testdata');
$n='';

$prev=false;
$size=strlen($c);
for($i=0; $i<$size; ++$i) {
        if ($c[$i]==' ') {
                if (!$prev) {
                        $prev=true;
                        $n.=' ';
                }
        } else {
                $n.=$c[$i];
                $prev=false;
        }
}
echo $n;
?>


2:
<?php

$n='';
$data = file('testdata');
foreach($data as $line) {
        $n .= preg_replace('/[ ]+/',' ',$line);
}
echo $n;
?>


Script 1 execution:
time php < test.php | md5sum
989438e5c3647807a7b552c7212515c4  -

real    0m22.861s
user    0m22.378s
sys     0m0.164s


Script 2 execution:
time php < regexp.php | md5sum  
989438e5c3647807a7b552c7212515c4  -

real    0m0.317s
user    0m0.208s
sys     0m0.105s


The test dataset was a 10,000-row file, each row containing two segments of data separated by as many spaces as the row number. To rule out the difference between file() and file_get_contents(), I ran tests with just those statements; file() is about 0.1 s slower than file_get_contents(). The MD5 sums show that the two outputs are identical.
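
For anyone who wants to repeat the comparison without creating a test file, here is a rough sketch that builds similar data in memory and times both approaches with microtime(). The function names, the field text, and the row count of 2000 are arbitrary choices for this example, not the setup used for the figures above, so the timings it prints will differ.

```php
<?php
// Character-by-character approach from the thread.
function collapse_loop(string $c): string
{
    $n = '';
    $prev = false; // true while we are inside a run of spaces
    $size = strlen($c);
    for ($i = 0; $i < $size; ++$i) {
        if ($c[$i] == ' ') {
            if (!$prev) {
                $prev = true;
                $n .= ' '; // keep only the first space of each run
            }
        } else {
            $n .= $c[$i];
            $prev = false;
        }
    }
    return $n;
}

// Regular-expression approach from the thread, applied to the whole string.
function collapse_regex(string $c): string
{
    return preg_replace('/[ ]+/', ' ', $c);
}

// Build test data: row i has two fields separated by i spaces.
$rows = [];
for ($i = 1; $i <= 2000; ++$i) {
    $rows[] = 'left' . $i . str_repeat(' ', $i) . 'right' . $i;
}
$data = implode("\n", $rows) . "\n";

$start = microtime(true);
$a = collapse_loop($data);
$loopSeconds = microtime(true) - $start;

$start = microtime(true);
$b = collapse_regex($data);
$regexSeconds = microtime(true) - $start;

printf(
    "loop:  %.4f s\nregex: %.4f s\nsame output: %s\n",
    $loopSeconds,
    $regexSeconds,
    $a === $b ? 'yes' : 'no'
);
```

Comparing $a and $b directly plays the same role as the md5sum check above: it confirms both approaches produce identical output before the timings are compared.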
@mcuk_storm:

Hm, that is probably because the regex module is written in C and runs as compiled machine code, whereas my loop is interpreted by PHP. In general, though, my approach would be faster.

My mistake. I guess that is a C++ assumption of mine that doesn't hold for PHP.
@Rok-Kralj
Regexps can be very inefficient depending on the complexity of the patterns and how they are written (though the alternative code is usually just as complex), but for simple patterns they are generally well optimised. Unfortunately they carry the stigma of being slow, but that is not always the case.
QuintusSmit

ASKER

Thanks for all the responses. Regular expressions will be fine, as this file will never have more than 150 lines.