Solved

How to detect binary chars in a file using preg_match in PHP 5?

Posted on 2009-04-10
Last Modified: 2012-05-06
What pattern could I use to detect whether a file contains non-printable chars using preg_match (or ereg)?

The logic in the conditional below could be reversed depending upon the easiest pattern.

Thanks in advance.
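$fileBuffer = file_get_contents($filePath);
$pattern = '/pattern??/';
$result = preg_match($pattern, $fileBuffer);
// preg_match() returns 1 on a match, 0 on no match, and false only on error,
// so test for 1 (assuming the pattern matches non-printable chars)
if (1 === $result)
    return "binary file";
else
    return "text file";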

Question by:DigitalDave1
8 Comments
 
LVL 19

Expert Comment

by:LordOfPorts
ID: 24120236
The is_binary http://us2.php.net/is_binary function might be of interest.
 
LVL 19

Expert Comment

by:LordOfPorts
ID: 24120246
My mistake, sorry, is_binary is available starting with PHP 6.
 

Author Comment

by:DigitalDave1
ID: 24120249
Yes I saw is_binary(). But we are running PHP 5.x.


 
LVL 110

Assisted Solution

by:Ray Paseur
Ray Paseur earned 500 total points
ID: 24120253
I use a "clean_string()" function to remove not only binary characters, but also unwanted characters.  The code snippet just tests for numbers, but you can add all the alpha and special characters to the REGEX.

So something like this...

$str = "12345";
if (!is_clean_numeric_string($str)) die("BAD NUMBER!");

HTH, ~Ray
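function is_clean_numeric_string($string) // Q-N-D IS IT NUMERIC?
{
   // ereg_replace() is deprecated as of PHP 5.3 and removed in PHP 7,
   // so the PCRE equivalent preg_replace() is used here.
   // Collapse runs of spaces into one, then trim both ends.
   $str = trim(preg_replace('/ +/', ' ', $string));
   // Replace every character that is not a digit with "?".
   $new = preg_replace('/[^0-9]/', '?', $str);

   // If the replacement changed anything, the string held non-digits.
   if ($new !== $str)
   {
      return FALSE;
   }
   return $new; // NOTE: "0" is falsy in PHP; compare with === FALSE if that matters
}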

 
LVL 19

Expert Comment

by:LordOfPorts
ID: 24120255
Try using is_string http://us2.php.net/manual/en/function.is-string.php on $fileBuffer:
$fileBuffer = file_get_contents($filePath);     
 
$result = is_string($fileBuffer);
 
if (false === $result)
    return "binary file";
else
    return "text file";
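A quick check, using made-up byte strings in place of real file contents, suggests that is_string cannot separate the two cases: file_get_contents returns a string for any readable file, whatever bytes it holds, and false only when the read fails. The variable names and sample data below are illustrative assumptions, not part of the original suggestion.

```php
<?php
// Hypothetical samples standing in for what file_get_contents() would return.
$binary = "\x89PNG\r\n\x1a\n"; // first bytes of a PNG header
$text   = "plain text\n";

// is_string() looks only at the PHP type, not at the bytes inside.
$binaryIsString = is_string($binary);
$textIsString   = is_string($text);

var_dump($binaryIsString); // bool(true)
var_dump($textIsString);   // bool(true)
```

Both values are strings, so the conditional above would report "text file" for either one.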

 
LVL 110

Accepted Solution

by:Ray Paseur
Ray Paseur earned 500 total points
ID: 24120266
For a more expanded view of things, the pattern [\x00-\x1f] matches all control characters including the NUL.
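As a small illustration of that pattern (a sketch with made-up sample strings), note that the range \x00-\x1f also covers tab, line feed and carriage return, so ordinary multi-line text matches as well as text containing a NUL:

```php
<?php
$pattern = '/[\x00-\x1f]/';

// Hypothetical sample strings, not taken from any real file.
$samples = array(
    'plain'     => 'hello world',
    'multiline' => "hello\nworld",
    'with_nul'  => "hello\x00world",
);

$hits = array();
foreach ($samples as $name => $text) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    $hits[$name] = preg_match($pattern, $text);
}

print_r($hits); // plain => 0, multiline => 1, with_nul => 1
```

If line breaks and tabs should count as text, the range needs to skip them.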
 

Author Comment

by:DigitalDave1
ID: 24126414
Worked out a preg_match pattern that tests for non-printing chars but excludes \n, \r, \t, etc.:

$pattern = '/[\x00-\x08\x0E-\x1F\x7F]/';

Thanks for the clues that led to this idea.
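For anyone landing here later, this is roughly how the pattern could be wired into the original check. It is a minimal sketch: the function names buffer_has_binary_chars and classify_file are hypothetical, and the "unreadable file" branch is an added assumption. The pattern skips \x09 through \x0D (tab, line feed, vertical tab, form feed, carriage return) and treats every other control char, plus DEL, as binary.

```php
<?php
// Hypothetical helper: true if the buffer holds any control char
// other than \x09-\x0D, or the DEL char \x7F.
function buffer_has_binary_chars($buffer)
{
    $pattern = '/[\x00-\x08\x0E-\x1F\x7F]/';
    // preg_match() returns 1 on a match, 0 on no match, false on error,
    // so compare against 1 rather than against false.
    return preg_match($pattern, $buffer) === 1;
}

// Hypothetical wrapper around the check from the question.
function classify_file($filePath)
{
    $buffer = file_get_contents($filePath);
    if ($buffer === false) {
        return "unreadable file"; // assumption: the read failure is reported separately
    }
    return buffer_has_binary_chars($buffer) ? "binary file" : "text file";
}
```

Usage would be something like echo classify_file($filePath);. One caveat: text saved as UTF-16 contains NUL bytes for ASCII chars, so it would be reported as binary by this test.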


 
LVL 110

Expert Comment

by:Ray Paseur
ID: 24126507
Thanks for the points -- it's a good question! ~Ray