Solved

find in php array words with tribe repeating characters?

Posted on 2014-03-14
7
580 Views
Last Modified: 2014-05-29
Hi experts,

I have a problem where I not even know how to start. I guess some regex can solve this but I am out of my knowledge in this case.

I want to clean up a search database table. My system captured lots of words which users typed which dont make sense and I want to delete them.

I read the db table into an php array and now I would like to simple check the words if there are any characters which repeat more than twice behind each other. And if this is the case delete the word.

For example:

Experts Exchange
Exxperts Exchange
Eexxperts Exchange

would pass the test.... but

eeexperts Exchange
exxxxxxxxperts
experts exchangeeeeeeeee

would be cut...

Important is for me any variable with 3 direct repeating characters only. for example "hihihihi" would not be a problem but "hhhhhiiiiii" would be.

I would be happy if somebody has some short php code for me to filter these words.

Thanks in advance,
Oliver
0
Comment
Question by:Oliver2000
  • 3
  • 2
  • 2
7 Comments
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39930720
Maybe something like this...
http://www.iconoun.com/demo/temp_oliver2000.php

See https://xkcd.com/1171/

<?php // demo/temp_oliver2000.php
error_reporting(E_ALL);
echo '<pre>';

// FROM THE POST AT EE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28389176.html

$data = array
( 'Experts Exchange'
, 'Exxperts Exchange'
, 'Eexxperts Exchange'

, 'eeexperts Exchange'
, 'exxxxxxxxperts'
, 'experts exchangeeeeeeeee'

, 'hihihihi'
, 'hhhhhiiiiii'
)
;

foreach ($data as $str)
{
    echo PHP_EOL;
    echo $str;
    echo ' ';
    echo replicates($str);
}

function replicates($str, $n=3)
{
    // OPTIMISTIC
    $ret = 'Pass';

    // GET ONE INSTANCE OF EACH CHARACTER IN THE STRING
    $arr = str_split($str);
    $arr = array_unique($arr);

    // ITERATE OVER THE UNIQUE CHARACTERS
    foreach ($arr as $chr)
    {
        // CREATE THE N-LENGTH REPLICATION SUBSTRING
        $chrs = str_repeat($chr, $n);

        // LOOK FOR THIS SUBSTRING
        if (stripos($str, $chrs) !== FALSE)
        {
            $ret = 'Fail';
            break;
        }
    }
    return $ret;
}

Open in new window

HTH, ~Ray
0
 
LVL 31

Expert Comment

by:Frosty555
ID: 39930722
A regular expression like this maybe:

(\w)\1{2,}

Open in new window


Use that with the preg_match() function to check if a string matches the pattern.

(\w) means any word character (number, letter, underscore)
The \1{2,} means whatever character was found in the first part, repeated 2 or more times.

Try playing with it here:
http://www.phpliveregex.com/

I made a permalink to my example. You can see how the output of the last three calls to preg_match() on the right hand side of the screen show results (meaning it found a match), but the first three do not (didn't match).
http://www.phpliveregex.com/p/4cZ
0
 
LVL 31

Accepted Solution

by:
Frosty555 earned 500 total points
ID: 39930725
Here's a rough untested code sample:

<?php
$subject = "expeeerts exxxxchange";
$pattern = '!(\w)\1{2,}!';
preg_match($pattern, substr($subject,3), $matches);
print_r($matches);

if( count($matches) > 0 ) {
    // found a match
}

?>

Open in new window

0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39930729
That's a good solution.  I added it to the script with the test cases - looks perfect!

<?php // demo/temp_oliver2000.php
error_reporting(E_ALL);
echo '<pre>';

// FROM THE POST AT EE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28389176.html

$data = array
( 'Experts Exchange'
, 'Exxperts Exchange'
, 'Eexxperts Exchange'

, 'eeexperts Exchange'
, 'exxxxxxxxperts'
, 'experts exchangeeeeeeeee'

, 'hihihihi'
, 'hhhhhiiiiii'
)
;

foreach ($data as $str)
{
    echo PHP_EOL;
    echo $str;
    echo ' ';
    echo replicates($str);
}

function replicates($str, $n=3)
{
    // OPTIMISTIC
    $ret = 'Pass';

    // GET ONE INSTANCE OF EACH CHARACTER IN THE STRING
    $arr = str_split($str);
    $arr = array_unique($arr);

    // ITERATE OVER THE UNIQUE CHARACTERS
    foreach ($arr as $chr)
    {
        // CREATE THE N-LENGTH REPLICATION SUBSTRING
        $chrs = str_repeat($chr, $n);

        // LOOK FOR THIS SUBSTRING
        if (stripos($str, $chrs) !== FALSE)
        {
            $ret = 'Fail';
            break;
        }
    }
    return $ret;
}

// FROSTY555
$rgx
= '#'          // REGEX DELIMITER
. '(\w)'       // GROUP OF ANY WORD CHARACTER
. '\1'         // BACK-REFERENCE TO GROUP 1
. '{2,}'       // REPEATED TWO OR MORE TIMES
. '#'          // REGEX DELIMITER
;

foreach ($data as $str)
{
    echo PHP_EOL;
    echo $str;
    if (preg_match($rgx, $str)) echo ' FAIL';
}

Open in new window

0
 

Author Comment

by:Oliver2000
ID: 39943997
Hi guys,

Sorry for the big delay. Actually both of your solutions seems to work. I am a little bit confused in this case to who i should give in such a case the points and best solution.
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39944123
That's easy: Give them to Frosty!  I have enough points to orbit Saturn.
0
 

Author Closing Comment

by:Oliver2000
ID: 40099284
Sorry for the huge delay. But finally I come back to still give the points as I should. Both solutions worked great how ever as suggested I give the points to Frosty555 for his solution. Thank two both of you guys for helping me out.
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Although it can be difficult to imagine, someday your child will have a career of his or her own. He or she will likely start a family, buy a home and start having their own children. So, while being a kid is still extremely important, it’s also …
Get to know the ins and outs of building a web-based ERP system for your enterprise. Development timeline, technology, and costs outlined.
The viewer will learn how to look for a specific file type in a local or remote server directory using PHP.
The viewer will learn how to create a basic form using some HTML5 and PHP for later processing. Set up your basic HTML file. Open your form tag and set the method and action attributes.: (CODE) Set up your first few inputs one for the name and …

948 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

23 Experts available now in Live!

Get 1:1 Help Now