Solved

How to detect binary chars in a file using preg_match in PHP 5?

Posted on 2009-04-10
Last Modified: 2012-05-06
What pattern could I use to detect whether a file contains non-printable chars using preg_match (or ereg)?

The logic in the conditional below could be reversed depending upon the easiest pattern.

Thanks in advance.
$fileBuffer = file_get_contents($filePath);
$pattern = '/pattern??/';
$result = preg_match($pattern, $fileBuffer);
if (false === $result)
    return "binary file";
else
    return "text file";

Question by:DigitalDave1
8 Comments
 
LVL 19

Expert Comment

by:LordOfPorts
The is_binary http://us2.php.net/is_binary function might be of interest.
 
LVL 19

Expert Comment

by:LordOfPorts
My mistake, sorry, is_binary is available starting with PHP 6.
 

Author Comment

by:DigitalDave1
Yes I saw is_binary(). But we are running PHP 5.x.

 
LVL 108

Assisted Solution

by:Ray Paseur
Ray Paseur earned 500 total points
I use a "clean_string()" function to remove not only binary characters, but also unwanted characters.  The code snippet just tests for numbers, but you can add all the alpha and special characters to the REGEX.

So something like this...

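$str = "12345";
// COMPARE STRICTLY AGAINST FALSE SO THAT A VALID "0" IS NOT REJECTED
if (is_clean_numeric_string($str) === FALSE) die("BAD NUMBER!");

HTH, ~Ray
function is_clean_numeric_string($string) // Q-N-D IS IT NUMERIC?
{
   // preg_replace() STANDS IN FOR ereg_replace(), WHICH WAS REMOVED IN PHP 7
   $str = trim(preg_replace("/ +/", " ", $string)); // COLLAPSE RUNS OF SPACES
   $new = preg_replace("/[^0-9]/", "?", $str);      // MARK EVERY NON-DIGIT WITH "?"
   if ($new !== $str)
   {
      return FALSE;    // AT LEAST ONE NON-DIGIT WAS FOUND
   } else {
      return ( $new ); // THE CLEANED STRING OF DIGITS
   }
}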


 
LVL 19

Expert Comment

by:LordOfPorts
Try using is_string http://us2.php.net/manual/en/function.is-string.php on $fileBuffer:
$fileBuffer = file_get_contents($filePath);
$result = is_string($fileBuffer);
if (false === $result)
    return "binary file";
else
    return "text file";

 
LVL 108

Accepted Solution

by:Ray Paseur
Ray Paseur earned 500 total points
For a broader view of things, the pattern [\x00-\x1f] matches every ASCII control character from NUL (byte 0) through byte 31, which also takes in tab, line feed, and carriage return.
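A quick check of how wide that range is. This is only a minimal sketch, and has_c0_control() is a made-up helper name, not a PHP built-in or anything from this thread. Because \t, \n and \r all sit below \x20, ordinary multi-line text matches as well:

```php
<?php
// Hypothetical helper: true when $text holds any byte in the range 0x00-0x1F.
function has_c0_control($text)
{
    // Single quotes leave \x00 and \x1f for PCRE to interpret.
    return preg_match('/[\x00-\x1f]/', $text) === 1;
}

var_dump(has_c0_control("line one\nline two"));   // bool(true): the newline matches
var_dump(has_c0_control("no control characters")); // bool(false)
var_dump(has_c0_control("nul\x00byte"));           // bool(true)
```

So on its own this range flags nearly every real text file; the whitespace controls have to be carved out if the goal is to tell text from binary.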
 

Author Comment

by:DigitalDave1
Worked out a preg_match pattern that tests for non-printing chars while leaving out \n, \r, \t, etc.:

$pattern = '/[\x00-\x08\x0E-\x1F\x7F]/';

Thanks for the clues that led to this idea.
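For anyone landing here later, one way the pattern above might be wrapped up, as a rough sketch only. looks_binary() is a hypothetical name, and returning null on a regex failure is an assumption of this sketch rather than anything decided in the thread. Note that preg_match() returns 1 on a match, 0 on no match, and false only on error, so the test is against 1 and not against false:

```php
<?php
// Hypothetical helper: true if $buffer contains a control byte other than
// \t, \n, \v, \f or \r (0x09-0x0D), or the DEL byte 0x7F.
function looks_binary($buffer)
{
    $pattern = '/[\x00-\x08\x0E-\x1F\x7F]/';
    $result  = preg_match($pattern, $buffer);

    if ($result === false) {
        // The match itself failed; null here means "could not decide".
        return null;
    }
    return $result === 1;
}

var_dump(looks_binary("plain text\r\n\twith tabs")); // bool(false)
var_dump(looks_binary("PNG\x00\x01"));               // bool(true)
```

With a real file this would be called as looks_binary(file_get_contents($filePath)). Bytes of 0x80 and above are not touched by the pattern, so UTF-8 or Latin-1 text still counts as text.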


 
LVL 108

Expert Comment

by:Ray Paseur
Thanks for the points -- it's a good question! ~Ray
