  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 985
  • Last Modified:

How to detect binary chars in a file using preg_match in PHP 5?

What pattern could I use to detect whether a file contains non-printable chars using preg_match (or ereg)?

The logic in the conditional below could be reversed depending upon the easiest pattern.

Thanks in advance.
$fileBuffer = file_get_contents($filePath);	
$pattern = '/pattern??/';
$result = preg_match($pattern, $fileBuffer);
if (false === $result)
    return "binary file";
else
    return "text file";

Asked by: DigitalDave1
 
LordOfPorts commented:
The is_binary() function (http://us2.php.net/is_binary) might be of interest.
 
LordOfPorts commented:
My mistake, sorry, is_binary is available starting with PHP 6.
 
DigitalDave1 (author) commented:
Yes, I saw is_binary(), but we are running PHP 5.x.

 
Ray Paseur commented:
I use a "clean_string()" function to remove not only binary characters, but also unwanted characters.  The code snippet just tests for numbers, but you can add all the alpha and special characters to the REGEX.

So something like this...

$str = "12345";
if (!is_clean_numeric_string($str)) die("BAD NUMBER!");

HTH, ~Ray
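function is_clean_numeric_string($string) // Q-N-D IS IT NUMERIC?
{
   // Collapse runs of spaces and trim. preg_replace() stands in for
   // ereg_replace(), which is deprecated as of PHP 5.3.
   $str = trim(preg_replace('/ +/', ' ', $string));

   // Return a strict boolean so that a valid input such as "0" is not
   // treated as a failure by the !is_clean_numeric_string() test.
   return (preg_match('/^[0-9]+$/', $str) === 1);
}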

 
LordOfPorts commented:
Try using is_string http://us2.php.net/manual/en/function.is-string.php on $fileBuffer:
$fileBuffer = file_get_contents($filePath);     
 
$result = is_string($fileBuffer);
 
if (false === $result)
    return "binary file";
else
    return "text file";

 
Ray Paseur commented:
For a more expanded view of things, the pattern [\x00-\x1f] matches all control characters including the NUL.
 
DigitalDave1 (author) commented:
I worked out a preg_match pattern that tests for non-printing chars while leaving out the whitespace controls \n, \r, \t, etc. (\x09 through \x0D):

$pattern = '/[\x00-\x08\x0E-\x1F\x7F]/';

Thanks for the clues that led to this idea.
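
For anyone landing here later, a minimal sketch of how that pattern might be wired into a check. The function name looks_binary() is hypothetical and not from the thread. Note that preg_match() returns 1 on a match, 0 on no match, and FALSE only on error, so the result is compared against 1 rather than against false as in the question's conditional. Bytes above \x7F are not in the character class, so UTF-8 text is not flagged by this test.

```php
<?php
// Sketch only: looks_binary() is a hypothetical helper name.
function looks_binary($buffer)
{
    // Control characters other than \x09-\x0D (tab, LF, VT, FF, CR), plus DEL.
    $pattern = '/[\x00-\x08\x0E-\x1F\x7F]/';

    // preg_match() returns 1 on a match, 0 on no match, FALSE on error.
    return preg_match($pattern, $buffer) === 1;
}

// Ordinary text with line breaks and tabs is not flagged.
var_dump(looks_binary("plain text\r\n\twith tabs"));

// A buffer containing a NUL byte is flagged.
var_dump(looks_binary("PNG\x00\x1A data"));
```

With file_get_contents() this reads the whole file into memory, so for large files it may be enough to test only the first few kilobytes.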

 
Ray Paseur commented:
Thanks for the points -- it's a good question! ~Ray