Solved

Find words with triple repeating characters in a PHP array?

Posted on 2014-03-14
Last Modified: 2014-05-29
Hi experts,

I have a problem and I don't even know where to start. I guess a regex could solve it, but that is beyond my knowledge.

I want to clean up a search database table. My system captured lots of words typed by users that don't make sense, and I want to delete them.

I read the DB table into a PHP array, and now I would like to simply check each word for any character that repeats more than twice in a row. If that is the case, the word should be deleted.

For example:

Experts Exchange
Exxperts Exchange
Eexxperts Exchange

would pass the test, but

eeexperts Exchange
exxxxxxxxperts
experts exchangeeeeeeeee

would be cut...

What matters to me is only the same character repeated three or more times directly in a row. For example, "hihihihi" would not be a problem, but "hhhhhiiiiii" would be.

I would be happy if somebody has some short PHP code for me to filter these words.

Thanks in advance,
Oliver
Question by:Oliver2000
LVL 109

Expert Comment

by:Ray Paseur
ID: 39930720
Maybe something like this...
http://www.iconoun.com/demo/temp_oliver2000.php

See https://xkcd.com/1171/

<?php // demo/temp_oliver2000.php
error_reporting(E_ALL);
echo '<pre>';

// FROM THE POST AT EE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28389176.html

$data = array
( 'Experts Exchange'
, 'Exxperts Exchange'
, 'Eexxperts Exchange'

, 'eeexperts Exchange'
, 'exxxxxxxxperts'
, 'experts exchangeeeeeeeee'

, 'hihihihi'
, 'hhhhhiiiiii'
)
;

foreach ($data as $str)
{
    echo PHP_EOL;
    echo $str;
    echo ' ';
    echo replicates($str);
}

function replicates($str, $n=3)
{
    // OPTIMISTIC
    $ret = 'Pass';

    // GET ONE INSTANCE OF EACH CHARACTER IN THE STRING
    $arr = str_split($str);
    $arr = array_unique($arr);

    // ITERATE OVER THE UNIQUE CHARACTERS
    foreach ($arr as $chr)
    {
        // CREATE THE N-LENGTH REPLICATION SUBSTRING
        $chrs = str_repeat($chr, $n);

        // LOOK FOR THIS SUBSTRING
        if (stripos($str, $chrs) !== FALSE)
        {
            $ret = 'Fail';
            break;
        }
    }
    return $ret;
}

HTH, ~Ray
 
LVL 31

Expert Comment

by:Frosty555
ID: 39930722
A regular expression like this maybe:

(\w)\1{2,}

Use that with the preg_match() function to check if a string matches the pattern.

(\w) captures any single word character (letter, digit, or underscore).
\1{2,} means the captured character must be repeated 2 or more additional times, so the pattern as a whole matches 3 or more of the same character in a row.

Try playing with it here:
http://www.phpliveregex.com/

I made a permalink to my example. On the right-hand side of the screen, the last three calls to preg_match() show results (meaning a match was found), while the first three do not (no match).
http://www.phpliveregex.com/p/4cZ
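
In case the permalink is not available, here is a minimal sketch of the same check run against the strings from the question. The helper name hasTripleRepeat() is made up for illustration and is not part of the thread; the pattern is the one above wrapped in "/" delimiters.

```php
<?php
// Hypothetical helper (name chosen for this sketch only): returns true when
// any word character occurs three or more times in a row.
function hasTripleRepeat(string $word): bool
{
    // (\w) captures one character; \1{2,} requires at least two more copies of it.
    return preg_match('/(\w)\1{2,}/', $word) === 1;
}

// Sample strings taken from the question.
$samples = [
    'Experts Exchange',
    'Exxperts Exchange',
    'Eexxperts Exchange',
    'eeexperts Exchange',
    'exxxxxxxxperts',
    'experts exchangeeeeeeeee',
];

foreach ($samples as $sample) {
    echo $sample, ' => ', hasTripleRepeat($sample) ? 'cut' : 'keep', PHP_EOL;
}
```

The first three strings should come out as "keep" and the last three as "cut", which is the split the question asks for.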
 
LVL 31

Accepted Solution

by:
Frosty555 earned 500 total points
ID: 39930725
Here's a rough untested code sample:

<?php
$subject = "expeeerts exxxxchange";
$pattern = '!(\w)\1{2,}!';
preg_match($pattern, $subject, $matches); // search the whole string; substr($subject, 3) would skip the first three characters
print_r($matches);

if( count($matches) > 0 ) {
    // found a match
}

?>
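
Since the question is about cleaning a whole array rather than testing one string, the same pattern could also be applied in one pass with preg_grep(). This is only a sketch: the $words array stands in for the rows read from the search table, and the variable names $clean and $junk are assumptions made for this example.

```php
<?php
// Assumed sample data standing in for the words read from the database table.
$words = [
    'Experts Exchange',
    'Exxperts Exchange',
    'Eexxperts Exchange',
    'eeexperts Exchange',
    'exxxxxxxxperts',
    'experts exchangeeeeeeeee',
    'hihihihi',
    'hhhhhiiiiii',
];

$pattern = '!(\w)\1{2,}!';

// PREG_GREP_INVERT returns the entries that do NOT match the pattern,
// i.e. the words worth keeping. array_values() renumbers the keys.
$clean = array_values(preg_grep($pattern, $words, PREG_GREP_INVERT));

// Without the flag, preg_grep() returns the entries that DO match,
// i.e. the candidates for deletion.
$junk = array_values(preg_grep($pattern, $words));

print_r($clean);
print_r($junk);
```

preg_grep() preserves the original array keys, so dropping array_values() would keep the keys available if they are needed to find the matching database rows.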


 
LVL 109

Expert Comment

by:Ray Paseur
ID: 39930729
That's a good solution.  I added it to the script with the test cases - looks perfect!

<?php // demo/temp_oliver2000.php
error_reporting(E_ALL);
echo '<pre>';

// FROM THE POST AT EE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28389176.html

$data = array
( 'Experts Exchange'
, 'Exxperts Exchange'
, 'Eexxperts Exchange'

, 'eeexperts Exchange'
, 'exxxxxxxxperts'
, 'experts exchangeeeeeeeee'

, 'hihihihi'
, 'hhhhhiiiiii'
)
;

foreach ($data as $str)
{
    echo PHP_EOL;
    echo $str;
    echo ' ';
    echo replicates($str);
}

function replicates($str, $n=3)
{
    // OPTIMISTIC
    $ret = 'Pass';

    // GET ONE INSTANCE OF EACH CHARACTER IN THE STRING
    $arr = str_split($str);
    $arr = array_unique($arr);

    // ITERATE OVER THE UNIQUE CHARACTERS
    foreach ($arr as $chr)
    {
        // CREATE THE N-LENGTH REPLICATION SUBSTRING
        $chrs = str_repeat($chr, $n);

        // LOOK FOR THIS SUBSTRING
        if (stripos($str, $chrs) !== FALSE)
        {
            $ret = 'Fail';
            break;
        }
    }
    return $ret;
}

// FROSTY555
$rgx
= '#'          // REGEX DELIMITER
. '(\w)'       // GROUP OF ANY WORD CHARACTER
. '\1'         // BACK-REFERENCE TO GROUP 1
. '{2,}'       // REPEATED TWO OR MORE TIMES
. '#'          // REGEX DELIMITER
;

foreach ($data as $str)
{
    echo PHP_EOL;
    echo $str;
    if (preg_match($rgx, $str)) echo ' FAIL';
}
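
One difference between the two approaches is worth noting: replicates() uses stripos(), which ignores case, while the regex as written is case-sensitive. They agree on every test case above, but they can disagree on mixed-case input. The sketch below uses a made-up string ('Eeexperts Exchange', not from the thread) and shows that adding the i modifier is one way to make the regex behave like the stripos() version.

```php
<?php
// Made-up test string for this sketch: three e's in a row, but in mixed case.
$str = 'Eeexperts Exchange';

$rgxSensitive   = '#(\w)\1{2,}#';   // the pattern as posted
$rgxInsensitive = '#(\w)\1{2,}#i';  // same pattern with the case-insensitive modifier

// Case-sensitive: "E" followed by "ee" is not three identical characters.
$sensitiveHit = preg_match($rgxSensitive, $str) === 1;

// Case-insensitive: the back-reference \1 also matches the other letter case.
$insensitiveHit = preg_match($rgxInsensitive, $str) === 1;

// What the stripos() check inside replicates() effectively tests for "e".
$striposHit = stripos($str, str_repeat('e', 3)) !== false;

var_dump($sensitiveHit, $insensitiveHit, $striposHit);
```

Which behavior is right depends on whether "Eee" should count as a triple repeat for the cleanup.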

 

Author Comment

by:Oliver2000
ID: 39943997
Hi guys,

Sorry for the big delay. Actually, both of your solutions seem to work. I am a little confused about who should get the points and the best-solution mark in a case like this.
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 39944123
That's easy: Give them to Frosty!  I have enough points to orbit Saturn.
 

Author Closing Comment

by:Oliver2000
ID: 40099284
Sorry for the huge delay, but I have finally come back to award the points as I should. Both solutions worked great; however, as suggested, I am giving the points to Frosty555 for his solution. Thanks to both of you for helping me out.