Richard Korts asked:

remove unprintable characters

Can someone provide a reliable way in PHP to remove unprintable characters from a string?

This works some of the time, but not all of the time:

$addr = preg_replace('/[\x00-\x1F\x80-\xFF]/','', $addr);

Thanks
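
One thing worth noting about the pattern above: it leaves \x7F (DEL) in place, which is also unprintable. A minimal sketch of an alternative that keeps only printable ASCII (space through tilde) follows. The function name strip_unprintable is an assumption for illustration and does not come from the thread.

```php
<?php
// Hypothetical helper name (not from the thread): keep only printable ASCII.
function strip_unprintable(string $input): string
{
    // Without the /u modifier the pattern is applied byte by byte, so every
    // byte of a multibyte UTF-8 character is removed as well.
    // preg_replace() returns null on error, hence the fallback to ''.
    return preg_replace('/[^\x20-\x7E]/', '', $input) ?? '';
}

var_dump(strip_unprintable("Stoney\x7F Acres\r\n"));
```

If accented letters must survive, this byte-oriented approach is probably too aggressive and a different pattern would be needed.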
Julian Hansen

When does it not work?
I think I would turn the regex around, maybe more like this...
http://www.laprbass.com/RAY_temp_rkorts.php

<?php // RAY_temp_rkorts.php

$rgx
= '#'        // REGEX DELIMITER
. '['        // START CHARACTER CLASS
. '^'        // NEGATION - MATCH NONE OF THESE
. 'A-Z'      // ALPHABET
. '0-9'      // NUMBERS
. ' ._()-'   // SPECIAL CHARACTERS
. ']'        // END OF CHARACTER CLASS
. '#'        // REGEX DELIMITER
. 'i'        // CASE-INSENSITIVE
;

// TEST
$str = 'This String has two illegal characters!?';
$new = preg_replace($rgx, '', $str);
echo '<pre>';
var_dump($str);
var_dump($new);


The strategy here is to "accept only known good values."  Just add the characters you want to keep into the regular expression, following the pattern shown here.
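
The "accept only known good values" strategy could be wrapped in a small reusable function, roughly as sketched below. The function name keep_allowed_chars and the sample input are assumptions for illustration; the character class is the one from the script above.

```php
<?php
// Hypothetical helper (not from the thread): remove everything that is not
// on the whitelist of letters, digits, space, and . _ ( ) -
function keep_allowed_chars(string $input): string
{
    // The hyphen is the last character in the class, so it is a literal hyphen.
    $cleaned = preg_replace('#[^A-Z0-9 ._()-]#i', '', $input);
    // preg_replace() returns null on error; fall back to an empty string.
    return $cleaned ?? '';
}

var_dump(keep_allowed_chars("16076 Stoney\x00 Acres\xA0 Road!?"));
```

To allow another character, add it inside the brackets, keeping the hyphen last.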
Richard Korts (ASKER)

I want to allow A - Z only (it's a street name).

So I would remove lines 8 - 9 (the '0-9' and ' ._()-' lines)?

Correct?
Yes, but be sure you don't kill things like 5th Avenue!
To Ray,

The problem MUST be something else. I get this on about half the data:

address: 16076+Stoney+Acres+Road+Poway+CA

Notice: Trying to get property of non-object in /homepages/38/d170218105/htdocs/GVHA/geocode.php on line 7

Notice: Undefined offset: 1 in /homepages/38/d170218105/htdocs/GVHA/geocode.php on line 12

Notice: Undefined offset: 1 in /homepages/38/d170218105/htdocs/GVHA/geocode.php on line 56

The code (with your geocoding as a function) is attached.

The other (about half) is fine
geocode.php
To Ray,

I left the regex EXACTLY like you had it & added a + sign.
ASKER CERTIFIED SOLUTION
Ray Paseur

Ray,

The prior method I was trying to use used a timing method to pause if the result was not returned. Do you think I should just build in a pause of, say, 2 seconds between calls?
Ray,

I put sleep(1) at the end of the loop.

Works perfectly.

Richard
Requires a pause (I used sleep(1)) between consecutive calls to the geocoder.
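
A rough sketch of the pause-between-calls approach is shown below. The real geocoder request is not shown in this thread, so $geocode is a hypothetical callable standing in for it, and geocode_all is an assumed name. The is_object() check is an extra precaution, on the assumption that the "non-object" notices came from lookups that returned nothing usable.

```php
<?php
// Sketch only: $geocode is a hypothetical stand-in for the real geocoder call.
function geocode_all(array $addresses, callable $geocode, int $pauseSeconds = 1): array
{
    $results = [];
    $remaining = count($addresses);
    foreach ($addresses as $address) {
        $result = $geocode($address);
        // Record a failed lookup as null instead of reading properties from it.
        $results[$address] = is_object($result) ? $result : null;
        // Pause between consecutive calls, but not after the last one.
        if (--$remaining > 0) {
            sleep($pauseSeconds);
        }
    }
    return $results;
}
```

The caller can then skip or retry any address whose entry is null rather than indexing into a missing result.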
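+ is a reserved character in regex (it means "match 1 or more"), so outside a character class put a \ in front of it, i.e. \+. Inside a character class such as [A-Z0-9+] the + is already literal, and escaping it there is harmless.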
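
A short sketch of how + behaves in the two positions, using the sample address from earlier in the thread. The variable names are assumptions for illustration.

```php
<?php
// Inside a character class, + is a literal plus sign and is kept by the whitelist.
$inClass = preg_replace('#[^A-Z0-9 ._()+-]#i', '', '16076+Stoney+Acres+Road!?');

// Outside a class, an unescaped + is a quantifier, so a literal plus is written \+.
$spaced = preg_replace('#\+#', ' ', '16076+Stoney+Acres+Road');

var_dump($inClass, $spaced);
```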