Solved

PHP: preg_match pattern (or other strategy) needed

Posted on 2012-03-21
Last Modified: 2012-03-21
I want to count the number of "words" that are longer than MaxChars in length,
where a "word" is any run of alphanumerics delimited by punctuation, whitespace, or special characters (i.e., non-alphanumerics); \n, CRLF, etc. count as whitespace.
I'm not fussy about > vs. >= for the MaxChars comparison (i.e., off by 1); use whichever the algorithm lays out most easily/efficiently.

$NumberOfLargeWordsResult =    some_pattern_call( $InputTextString, $MaxChars);

EXAMPLE VALUES
If MaxChars = 5, the following would be the results for the test strings (these examples assume > is used for the comparison, but a >= solution is fine if easier).
Icing on the cake would be if foreign/extended character sets could count as alphanumerics, but I'm happy with a basic algorithm.



Result      InputTextString
------------------------------------------------
    2          a ab abc abcd abcde abcdef abcdefg
    1          JNwhCSChhrNrGXH
    3          HicEEBrWq4CGEizFnxr;the+5GsjSqBtEojSOB\n PknsRLmydhYKHVKA
    0          123

I'm hoping it can be an efficient preg pattern match, but am open to any implementation suggestions.

MANY THANKS!
Question by:willsherwood

Assisted Solution

by:Ray Paseur
Ray Paseur earned 250 total points
ID: 37749616
See http://www.laprbass.com/RAY_temp_willsherwood.php
<?php // RAY_temp_willsherwood.php
error_reporting(E_ALL);
echo "<pre>";


// TEST DATA FROM THE POST AT EE
$arr = array
(   '2' =>         'a ab abc abcd abcde abcdef abcdefg'
,   '1' =>         'JNwhCSChhrNrGXH'
,   '3' =>         'HicEEBrWq4CGEizFnxr;the+5GsjSqBtEojSOB\n PknsRLmydhYKHVKA'
,   '0' =>         '123'
)
;

// A FUNCTION TO EXTRACT / COUNT LONG WORDS
function findLongWords($str, $len=5, $ret='COUNT')
{
    // STORE VOCABULARY HERE
    $lon = array();

    // BREAK THE STRING AND TEST THE WORDS
    // (INSIDE A CHARACTER CLASS \b MEANS BACKSPACE, NOT A WORD BOUNDARY, SO IT IS OMITTED)
    $wds = preg_split('/[[:punct:]\s]+/', $str);
    foreach ($wds as $sub)
    {
        // IS LENGTH GREATER THAN THE LIMIT?
        if (strlen($sub) > $len)
        {
            $lon[] = $sub;
        }
    }

    // RETURN COUNT OR VOCABULARY
    if (strtoupper(substr($ret,0,1)) == 'V') return $lon;
    return count($lon);
}

// TEST THE FUNCTION
foreach ($arr as $num => $txt)
{
    $cnt = findLongWords($txt);
    echo PHP_EOL . "FINDING $cnt EXPECTING $num WITH $txt";
}
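
As an editorial aside (not part of the original answer): the question defines a word as a run of alphanumerics, so one possible variant splits on anything that is not a letter or digit rather than on punctuation and whitespace. The sketch below assumes ASCII input, and the function name countLongWordsBySplit is hypothetical.

```php
<?php
// Hypothetical helper (not from the original answer): count runs of ASCII
// letters/digits that are longer than $len characters.
function countLongWordsBySplit($str, $len = 5)
{
    // Split on one or more non-alphanumeric characters; drop empty pieces.
    $words = preg_split('/[^a-z\d]+/i', $str, -1, PREG_SPLIT_NO_EMPTY);

    $count = 0;
    foreach ($words as $word) {
        // strlen() counts bytes, which equals characters for ASCII input.
        if (strlen($word) > $len) {
            $count++;
        }
    }
    return $count;
}

echo countLongWordsBySplit('a ab abc abcd abcde abcdef abcdefg'), PHP_EOL; // prints 2
```

Unlike [:punct:], the negated class also treats control characters and any other non-alphanumeric byte as a separator, which is closer to the definition in the question.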


Accepted Solution

by:Terry Woods
Terry Woods earned 250 total points
ID: 37749747
You should be able to use a preg_match_all like this:

$maxChars = 5;
$test_values = array ('a ab abc abcd abcde abcdef abcdefg',
                      'JNwhCSChhrNrGXH',
                      'HicEEBrWq4CGEizFnxr;the+5GsjSqBtEojSOB\n PknsRLmydhYKHVKA',
                      '123');

foreach ($test_values as $value) {
  print "value: $value<br>\n";
  print preg_match_all("/\b[a-z\d]{".($maxChars+1).",}/i", $value, $matches)."\n";
}


Result:
value: a ab abc abcd abcde abcdef abcdefg<br>
2
value: JNwhCSChhrNrGXH<br>
1
value: HicEEBrWq4CGEizFnxr;the+5GsjSqBtEojSOB\n PknsRLmydhYKHVKA<br>
3
value: 123<br>
0
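
The call above can be wrapped into the function shape the question asked for, and extended toward the "icing on the cake" case of non-ASCII letters. This is a sketch rather than part of the accepted answer: the name countLongWords and the $unicode flag are hypothetical, and the Unicode branch assumes the input is valid UTF-8 and that PCRE has Unicode property support.

```php
<?php
// Hypothetical wrapper (not from the accepted answer).
function countLongWords($text, $maxChars, $unicode = false)
{
    // "Longer than $maxChars" means at least $maxChars + 1 characters.
    $min = (int) $maxChars + 1;

    if ($unicode) {
        // \p{L} = any letter, \p{N} = any number; /u treats the subject as UTF-8.
        $pattern = '/[\p{L}\p{N}]{' . $min . ',}/u';
    } else {
        // ASCII letters and digits only, case-insensitive.
        $pattern = '/[a-z\d]{' . $min . ',}/i';
    }

    // The greedy quantifier consumes each whole run, so every long word
    // is counted exactly once.
    $n = preg_match_all($pattern, $text, $matches);
    return $n === false ? 0 : $n;
}

echo countLongWords('a ab abc abcd abcde abcdef abcdefg', 5), PHP_EOL; // prints 2
```

The sketch drops the leading \b on purpose. Because \b treats the underscore as a word character, a pattern anchored with \b would not count the second part of a string such as foo_barbazqux, whereas the question treats every non-alphanumeric character as a separator.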

Author Closing Comment

by:willsherwood
ID: 37749986
excellent, thanks all!