Solved

Function to remove some characters and words from string

Posted on 2011-10-26
Last Modified: 2012-06-21
Hi,

I need a good function to remove all of the following characters from a string:

Bad Characters to remove: &:/\?¿%"!()=¡*.-[]<,>;

And also remove some words:
$badwords = "http,www,cache,test"

Could someone please help me with a good function?

Thank you
Question by:Fernanditos
LVL 31

Expert Comment

by:Marco Gasi
ID: 37036119
Try this, Fernanditos:

<?php
function CleanString($str){
    // Replace each unwanted character with a space ('#' is the regex delimiter)
    $str = preg_replace('#&|:|/|\\\\|¿|%|"|!|\(|\)|=|¡|\*|\.|-|\[|\]|<|,|>|;|_|´|\+|\{|\}|`|¨|\?#', ' ', $str);
    // Replace single quotes as well
    $str = preg_replace('/\'/', ' ', $str);
    // Collapse runs of whitespace into one space
    $str = preg_replace('/\s{2,}/', ' ', $str);
    $str = preg_replace('/http|www|cache|death/', '', $str); //this gives "  this is a test ácido mañana"
    return trim($str);
}
$str= "http://www.this-is-a-test/ácido )  .    --´    (?¿¿!><\"''''_:::`+`+[]¨{mañana";
echo "Cleaned string is " . CleanString($str);
?>

Cheers
LVL 111

Accepted Solution

by:
Ray Paseur earned 2000 total points
ID: 37038856
This looks curiously like the other question that was answered with this script.  Looks like the only meaningful difference is "death" vs "test" in the collection of $badwords.

This question goes to a central issue in set theory, in a way that puts you at constant risk.  It is axiomatic that you accept only known good values when you receive data from external sources.  When you try to do it backwards, you exclude only known bad values.  The important part of these two different approaches is the phrase "only known."  It is possible for you to know all of the good values because you define them yourself.  It is not possible for you to know all of the bad values because the bad guys define them and the bad values can be anything - you have no control.  So the correct approach is to ignore or discard anything that does not fit your definition of "good values."  Anything else and you are setting yourself up for a catastrophe.

Good luck with it, ~Ray
<?php // RAY_temp_fernanditos.php
error_reporting(E_ALL);

// THE TEST DATA
$str = "http://www.this-is-a-test/ácido )  .    --´    (?¿¿!><\"''''_:::`+`+[]¨{mañana";

// THE DESIRED RESULT
$out = 'This is a test ácido mañana';

// MAKE A TEST
$new = cleanTerm($str);

if ($new === $out) echo "SUCCESS";

// THE FUNCTION
function cleanTerm($str)
{
    // CREATE THE REGEX ON THE FIRST ENTRY TO THE FUNCTION
    static $regexp;
    if (empty($regexp))
    {
        $english = range('A', 'Z');
        $numbers = range('0', '9');
        // ACCENTED LETTERS AS SINGLE BYTES: THIS ASSUMES ISO-8859-1 INPUT, NOT UTF-8
        $accents = range(chr(192), chr(255));
        $merged  = array_merge($english, $numbers, $accents);
        $regexp  = '#[^' . implode('', $merged) . ']#i';
    }

    // GET RID OF BAD CHARACTERS
    $new = preg_replace($regexp, ' ', $str);

    // GET RID OF THE BAD WORDS (A FOOL'S ERRAND BUT WE WILL SHOW IT ANYWAY)
    $badwords = "http,www,cache,death";
    $words    = explode(',', $badwords);
    foreach ($words as $w)
    {
        $new = preg_replace( '#' . preg_quote($w, '#') . '#i', '', $new);
    }

    // GET RID OF EXCESS WHITESPACE
    $new = trim(preg_replace('/\s+/', ' ', $new));

    // CAPITALIZE THE FIRST LETTER
    return ucfirst($new);
}
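
The script above builds its character class from single bytes, so it fits ISO-8859-1 input. If the input is UTF-8 instead, one possible variant of the same "accept only known good values" idea uses Unicode property classes. This is only a sketch under that assumption: the function name `cleanTermUtf8` and its `$badwords` parameter are hypothetical and not part of the original answer, and it skips the final `ucfirst()` step.

```php
<?php
// Hypothetical UTF-8 variant of cleanTerm(); assumes $str is valid UTF-8.
function cleanTermUtf8(string $str, array $badwords = ['http', 'www', 'cache', 'death']): string
{
    // Keep only letters (\p{L}), combining marks (\p{M}) and digits (\p{N});
    // every run of anything else becomes a single space.
    $new = preg_replace('/[^\p{L}\p{M}\p{N}]+/u', ' ', $str);
    if ($new === null) {
        // preg_replace() returns null on error, for example on malformed UTF-8.
        return '';
    }

    // Remove the unwanted words, case-insensitively.
    foreach ($badwords as $w) {
        $new = preg_replace('/' . preg_quote($w, '/') . '/iu', '', $new);
    }

    // Collapse whitespace and trim the ends.
    return trim(preg_replace('/\s+/u', ' ', $new));
}

// Same test data as above, saved as UTF-8.
echo cleanTermUtf8("http://www.this-is-a-test/ácido )  .    --´    (?¿¿!><\"''''_:::`+`+[]¨{mañana");
// prints: this is a test ácido mañana
```

The `u` modifier makes PCRE treat the pattern and subject as UTF-8, so an accented letter is matched as one character, not as two separate bytes.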


Author Comment

by:Fernanditos
ID: 37039101
Ray, this question is similar but different, because here I am only trying to remove specific characters: "&:/\?¿%"!()=¡*.-[]<,>;" and some defined words.

Either way, I am happy with all the solutions provided because I am learning a lot.

@marqusG your solution is working great for me, and I will test the second one now too.

Thank you
LVL 111

Expert Comment

by:Ray Paseur
ID: 37039128
A general design-and-thought process for developing a regular expression is available in this article.  Note the parallel construction of test data and the regex.
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_7830-A-Quick-Tour-of-Test-Driven-Development.html
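
As a rough illustration of building test data and the function side by side, the sketch below keeps a small table of inputs and expected results next to a function that removes exactly the characters and words listed in the question. The function name `stripListed` and the test cases are hypothetical, made up for this example and not taken from the article. The sketch assumes the file and the input are UTF-8.

```php
<?php
// Hypothetical function under test: removes the characters and words listed in the question.
function stripListed(string $str): string
{
    // The characters from the question, one array element each.
    $badChars = ['&', ':', '/', '\\', '?', '¿', '%', '"', '!', '(', ')', '=',
                 '¡', '*', '.', '-', '[', ']', '<', ',', '>', ';'];
    $badWords = ['http', 'www', 'cache', 'test'];

    $new = str_replace($badChars, ' ', $str);       // listed characters become spaces
    $new = str_ireplace($badWords, '', $new);       // listed words are dropped, case-insensitively
    return trim(preg_replace('/\s+/u', ' ', $new)); // collapse runs of whitespace
}

// Test data and expected results, written alongside the function.
$cases = [
    'http://www.example.com/' => 'example com',
    'a-b;c'                   => 'a b c',
    '¿Qué?'                   => 'Qué',
    'my test (cache)'         => 'my',
];

foreach ($cases as $input => $expected) {
    $actual = stripListed($input);
    echo ($actual === $expected ? 'PASS' : 'FAIL'), ": '$input' => '$actual'", PHP_EOL;
}
```

Each time the list of characters or words changes, a new row in `$cases` records the intended behaviour before the function is touched.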