Solved

find in php array words with tribe repeating characters?

Posted on 2014-03-14
7
610 Views
Last Modified: 2014-05-29
Hi experts,

I have a problem where I not even know how to start. I guess some regex can solve this but I am out of my knowledge in this case.

I want to clean up a search database table. My system captured lots of words which users typed which dont make sense and I want to delete them.

I read the db table into an php array and now I would like to simple check the words if there are any characters which repeat more than twice behind each other. And if this is the case delete the word.

For example:

Experts Exchange
Exxperts Exchange
Eexxperts Exchange

would pass the test.... but

eeexperts Exchange
exxxxxxxxperts
experts exchangeeeeeeeee

would be cut...

Important is for me any variable with 3 direct repeating characters only. for example "hihihihi" would not be a problem but "hhhhhiiiiii" would be.

I would be happy if somebody has some short php code for me to filter these words.

Thanks in advance,
Oliver
0
Comment
Question by:Oliver2000
  • 3
  • 2
  • 2
7 Comments
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 39930720
Maybe something like this...
http://www.iconoun.com/demo/temp_oliver2000.php

See https://xkcd.com/1171/

<?php // demo/temp_oliver2000.php
error_reporting(E_ALL);
echo '<pre>';

// FROM THE POST AT EE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28389176.html

$data = array
( 'Experts Exchange'
, 'Exxperts Exchange'
, 'Eexxperts Exchange'

, 'eeexperts Exchange'
, 'exxxxxxxxperts'
, 'experts exchangeeeeeeeee'

, 'hihihihi'
, 'hhhhhiiiiii'
)
;

foreach ($data as $str)
{
    echo PHP_EOL;
    echo $str;
    echo ' ';
    echo replicates($str);
}

function replicates($str, $n=3)
{
    // OPTIMISTIC
    $ret = 'Pass';

    // GET ONE INSTANCE OF EACH CHARACTER IN THE STRING
    $arr = str_split($str);
    $arr = array_unique($arr);

    // ITERATE OVER THE UNIQUE CHARACTERS
    foreach ($arr as $chr)
    {
        // CREATE THE N-LENGTH REPLICATION SUBSTRING
        $chrs = str_repeat($chr, $n);

        // LOOK FOR THIS SUBSTRING
        if (stripos($str, $chrs) !== FALSE)
        {
            $ret = 'Fail';
            break;
        }
    }
    return $ret;
}

Open in new window

HTH, ~Ray
0
 
LVL 31

Expert Comment

by:Frosty555
ID: 39930722
A regular expression like this maybe:

(\w)\1{2,}

Open in new window


Use that with the preg_match() function to check if a string matches the pattern.

(\w) means any word character (number, letter, underscore)
The \1{2,} means whatever character was found in the first part, repeated 2 or more times.

Try playing with it here:
http://www.phpliveregex.com/

I made a permalink to my example. You can see how the output of the last three calls to preg_match() on the right hand side of the screen show results (meaning it found a match), but the first three do not (didn't match).
http://www.phpliveregex.com/p/4cZ
0
 
LVL 31

Accepted Solution

by:
Frosty555 earned 500 total points
ID: 39930725
Here's a rough untested code sample:

<?php
$subject = "expeeerts exxxxchange";
$pattern = '!(\w)\1{2,}!';
preg_match($pattern, substr($subject,3), $matches);
print_r($matches);

if( count($matches) > 0 ) {
    // found a match
}

?>

Open in new window

0
PeopleSoft Has Never Been Easier

PeopleSoft Adoption Made Smooth & Simple!

On-The-Job Training Is made Intuitive & Easy With WalkMe's On-Screen Guidance Tool.  Claim Your Free WalkMe Account Now

 
LVL 110

Expert Comment

by:Ray Paseur
ID: 39930729
That's a good solution.  I added it to the script with the test cases - looks perfect!

<?php // demo/temp_oliver2000.php
error_reporting(E_ALL);
echo '<pre>';

// FROM THE POST AT EE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28389176.html

$data = array
( 'Experts Exchange'
, 'Exxperts Exchange'
, 'Eexxperts Exchange'

, 'eeexperts Exchange'
, 'exxxxxxxxperts'
, 'experts exchangeeeeeeeee'

, 'hihihihi'
, 'hhhhhiiiiii'
)
;

foreach ($data as $str)
{
    echo PHP_EOL;
    echo $str;
    echo ' ';
    echo replicates($str);
}

function replicates($str, $n=3)
{
    // OPTIMISTIC
    $ret = 'Pass';

    // GET ONE INSTANCE OF EACH CHARACTER IN THE STRING
    $arr = str_split($str);
    $arr = array_unique($arr);

    // ITERATE OVER THE UNIQUE CHARACTERS
    foreach ($arr as $chr)
    {
        // CREATE THE N-LENGTH REPLICATION SUBSTRING
        $chrs = str_repeat($chr, $n);

        // LOOK FOR THIS SUBSTRING
        if (stripos($str, $chrs) !== FALSE)
        {
            $ret = 'Fail';
            break;
        }
    }
    return $ret;
}

// FROSTY555
$rgx
= '#'          // REGEX DELIMITER
. '(\w)'       // GROUP OF ANY WORD CHARACTER
. '\1'         // BACK-REFERENCE TO GROUP 1
. '{2,}'       // REPEATED TWO OR MORE TIMES
. '#'          // REGEX DELIMITER
;

foreach ($data as $str)
{
    echo PHP_EOL;
    echo $str;
    if (preg_match($rgx, $str)) echo ' FAIL';
}

Open in new window

0
 

Author Comment

by:Oliver2000
ID: 39943997
Hi guys,

Sorry for the big delay. Actually both of your solutions seems to work. I am a little bit confused in this case to who i should give in such a case the points and best solution.
0
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 39944123
That's easy: Give them to Frosty!  I have enough points to orbit Saturn.
0
 

Author Closing Comment

by:Oliver2000
ID: 40099284
Sorry for the huge delay. But finally I come back to still give the points as I should. Both solutions worked great how ever as suggested I give the points to Frosty555 for his solution. Thank two both of you guys for helping me out.
0

Featured Post

PeopleSoft Has Never Been Easier

PeopleSoft Adoption Made Smooth & Simple!

On-The-Job Training Is made Intuitive & Easy With WalkMe's On-Screen Guidance Tool.  Claim Your Free WalkMe Account Now

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Because your company can’t afford for you to make SEO mistakes, you’ll want to ensure you’re taking the right steps each and every time you post a new piece of content. This list of optimization do’s and don’ts can help you become an SEO wizard.
This article will inform Clients about common and important expectations from the freelancers (Experts) who are looking at your Gig.
This tutorial will teach you the core code needed to finalize the addition of a watermark to your image. The viewer will use a small PHP class to learn and create a watermark.
The viewer will learn how to create a basic form using some HTML5 and PHP for later processing. Set up your basic HTML file. Open your form tag and set the method and action attributes.: (CODE) Set up your first few inputs one for the name and …

730 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question