rgb192 asked:

Get a phone number from text: two test cases I would like to work

This replaces words with numbers:
foreach (array('one' => 1, 'two' => 2, 'three' => 3, 'four' => 4, 'five' => 5, 'six' => 6, 'seven' => 7, 'eight' => 8, 'nine' => 9, 'zero' => 0) as $text => $number) {
    $phone_content_raw = str_replace(' ' . $text . ' ', $number, $phone_content_raw);
    $phone_content_raw = str_replace($text, $number, $phone_content_raw);
}

But I think that line 48 of the script below
(discard non-numeric characters) will undo that added code:
<?php // RAY_validate_phone_numbers.php
error_reporting(E_ALL);

// A FUNCTION TO VALIDATE A PHONE NUMBER AND RETURN A NORMALIZED STRING
// MAN PAGE: http://discuss.fogcreek.com/joelonsoftware3/default.asp?cmd=show&ixPost=102667&ixReplies=15
function strtophone($phone, $format=FALSE, $dlm='-')
{
    // HANDLE INPUT LIKE 1-800-BIG-DOGS
    $phone = strtoupper($phone);
    if (preg_match('/[A-Z]/', $phone))
    {
        $phone = str_replace('A', '2', $phone);
        $phone = str_replace('B', '2', $phone);
        $phone = str_replace('C', '2', $phone);

        $phone = str_replace('D', '3', $phone);
        $phone = str_replace('E', '3', $phone);
        $phone = str_replace('F', '3', $phone);

        $phone = str_replace('G', '4', $phone);
        $phone = str_replace('H', '4', $phone);
        $phone = str_replace('I', '4', $phone);

        $phone = str_replace('J', '5', $phone);
        $phone = str_replace('K', '5', $phone);
        $phone = str_replace('L', '5', $phone);

        $phone = str_replace('M', '6', $phone);
        $phone = str_replace('N', '6', $phone);
        $phone = str_replace('O', '6', $phone);

        $phone = str_replace('P', '7', $phone);
        $phone = str_replace('Q', '7', $phone);
        $phone = str_replace('R', '7', $phone);
        $phone = str_replace('S', '7', $phone);

        $phone = str_replace('T', '8', $phone);
        $phone = str_replace('U', '8', $phone);
        $phone = str_replace('V', '8', $phone);

        $phone = str_replace('W', '9', $phone);
        $phone = str_replace('X', '9', $phone);
        $phone = str_replace('Y', '9', $phone);
        $phone = str_replace('Z', '9', $phone);
    }

    // DISCARD NON-NUMERIC CHARACTERS
    $phone = preg_replace('/[^0-9]/', '', $phone);

    // DISCARD A LEADING '1' FROM NUMBERS ENTERED LIKE 1-800-555-1212
    if (substr($phone,0,1) == '1') $phone = substr($phone,1);

    // IF LESS THAN TEN DIGITS, IT IS INVALID
    if (strlen($phone) < 10) return FALSE;

    // IF IT STARTS WITH '0' OR '1' IT IS INVALID, SECOND DIGIT CANNOT BE '9' (YET)
    if (substr($phone,0,1) == '0') return FALSE;
    if (substr($phone,0,1) == '1') return FALSE;
    if (substr($phone,1,1) == '9') return FALSE;

    // ADD OTHER TESTS HERE AS MAY BE NEEDED

    // IF NOT FORMATTED
    if (!$format) return $phone;

    // ISOLATE THE COMPONENTS OF THE PHONE NUMBER
    $ac = substr($phone,0,3); // AREA
    $ex = substr($phone,3,3); // EXCHANGE
    $nm = substr($phone,6,4); // NUMBER
    $xt = substr($phone,10);  // EXTENSION

    // STANDARDIZE THE PRINTABLE FORMAT OF THE PHONE NUMBER LIKE 212-555-1212-1234
    $formatted_phone = $ac . $dlm . $ex . $dlm . $nm;
    if ($xt != '') $formatted_phone .= $dlm . $xt;
    return $formatted_phone;
}

echo '<h1>Output that works</h1>';
echo '<br/>1-800-5551212: '. strtophone("1-800-5551212");
echo '<br/>866-Big-Dogs: '. strtophone("866-Big-Dogs");
echo '<br/>202-537-7560: '. strtophone("202-537-7560");
echo '<br/>703-356-5300 x2048: '. strtophone("703-356-5300 x2048");
echo '<br/>(212) 555-1212: '. strtophone("(212) 555-1212");
echo '<br/>1 + (212) 555-1212: '. strtophone("1 + (212) 555-1212");
echo '<br/>2345678901: '. strtophone("2345678901");
echo '<br/>12345678901: '. strtophone("12345678901");

echo '<h1>output that I want to work</h1>';

echo '<br/>1 (292) 226-7000: '. strtophone("1 (292) 226-7000");
echo '<br/>two 345678901: '. strtophone("two 345678901");

output:

Output that works

1-800-5551212: 8005551212
866-Big-Dogs: 8662443647
202-537-7560: 2025377560
703-356-5300 x2048: 703356530092048
(212) 555-1212: 2125551212
1 + (212) 555-1212: 2125551212
2345678901: 2345678901
12345678901: 2345678901
output that I want to work

1 (292) 226-7000:
two 345678901:




So the additional test data that I would like to work is:
1 (292) 226-7000
two 345678901
ASKER CERTIFIED SOLUTION
hielo

This solution is only available to members of Experts Exchange.
rgb192 (ASKER)

Thanks for answering the first part of the question.

Now I want code sample 1 to be added to code sample 2,
or some other method to allow
two 345678901
>> or some other method to allow
>> two 345678901
The function I posted is supposed to replace the function you posted. It will convert "two 345678901" into "2345678901".
Is this really what you want?  I am at a loss to understand the use case in the test data.  Consider this:

"I need you to call two numbers for me.  First, please call 202-296-3131. Then call two 02-296-3132."

This is something that is written by nobody ever :-/
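On the conversion of "two 345678901" into "2345678901" discussed above: the order of the steps is what keeps the non-numeric cleanup from undoing the word replacement. A minimal sketch, assuming the spelled-out digits are replaced before anything is stripped (the helper name `words_then_strip` is made up for this illustration and is not the hidden certified solution):

```php
<?php
// Hypothetical helper (not from the thread): replace spelled-out digits
// first, and only then discard the non-numeric characters.
function words_then_strip(string $raw): string
{
    $words = ['ZERO', 'ONE', 'TWO', 'THREE', 'FOUR', 'FIVE', 'SIX', 'SEVEN', 'EIGHT', 'NINE'];
    $s = strtoupper($raw);

    // The array index doubles as the digit: 'ZERO' => 0, 'ONE' => 1, ...
    foreach ($words as $digit => $word) {
        $s = str_replace($word, (string) $digit, $s);
    }

    // By now "TWO" is already "2", so stripping non-digits keeps it.
    return preg_replace('/\D/', '', $s);
}

echo words_then_strip('two 345678901'), "\n"; // 2345678901
```

One caveat with this approach: `str_replace` matches substrings, so a word such as "phone" contains "ONE" and would contribute a stray 1.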
rgb192 (ASKER)

I added a last line (the echo) to the function you posted:
<?php
function strtophone($phone, $format=FALSE, $dlm='-')
{
    // HANDLE INPUT LIKE 1-800-BIG-DOGS
    $phone = strtoupper($phone);

    foreach (array('ZERO','ONE','TWO','THREE','FOUR','FIVE','SIX','SEVEN','EIGHT','NINE') as $number => $text)
    {
        $phone = str_replace($text, $number, $phone);
    }
    // http://php.net/manual/en/function.strtr.php
    $phone = strtr($phone,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','22233344455566677778889999');

    // DISCARD NON-NUMERIC CHARACTERS
    $phone = preg_replace('#\D#', '', $phone);

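    // DISCARD A LEADING '1' FROM NUMBERS ENTERED LIKE 1-800-555-1212
    // (substr() IS SAFE WHEN NO DIGITS ARE LEFT; $phone[0] ON AN EMPTY STRING RAISES A NOTICE OR WARNING)
    if (substr($phone,0,1) == '1') $phone = substr($phone,1);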

    // IF LESS THAN TEN DIGITS, IT IS INVALID
    if (strlen($phone) < 10) return FALSE;

    // IF IT STARTS WITH '0' OR '1' IT IS INVALID
    if ( $phone[0] == '0') return FALSE;
    if ( $phone[0] == '1') return FALSE;

    // SECOND DIGIT CANNOT BE '9' (YET) http://www.nanpa.com/area_codes/index.html
    if ( $phone[1] == '9') return FALSE;

    // ADD OTHER TESTS HERE AS MAY BE NEEDED

    // IF NOT FORMATTED
    if (!$format) return $phone;

    // ISOLATE THE COMPONENTS OF THE PHONE NUMBER
    $ac = substr($phone,0,3); // AREA
    $ex = substr($phone,3,3); // EXCHANGE
    $nm = substr($phone,6,4); // NUMBER
    $xt = substr($phone,10);  // EXTENSION

    // STANDARDIZE THE PRINTABLE FORMAT OF THE PHONE NUMBER LIKE 212-555-1212-1234
    $formatted_phone = $ac . $dlm . $ex . $dlm . $nm;
    if ($xt != '') $formatted_phone .= $dlm . $xt;
    return $formatted_phone;
}



echo strtophone("2many words 3 four you two words 456565");

expected output:
2342456565
output:
262699673734968296737456565


202-296-3131. Then call two 02-296-3132
I was told that this Verizon phone number does not exist.
Looks like a data normalization problem.  You might want to set up an array of test data so you can automate the testing process.  In the array keys you would have the input values.  In the array values you would have the expected output.  Then you could do something like this:

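// $testdata MAPS EACH INPUT TO ITS EXPECTED OUTPUT; testFunction() IS THE FUNCTION UNDER TEST
foreach ($testdata as $key => $value)
{
    $new = testFunction($key);
    if ($new != $value) echo "FAIL: $key BECAME $new INSTEAD OF $value" . PHP_EOL;
}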

More on test-driven development here:
https://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_7830-A-Quick-Tour-of-Test-Driven-Development.html
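The test-data idea above could be fleshed out roughly as follows. This is only a sketch: `run_phone_tests` and the `$digitsOnly` stand-in are hypothetical names invented here, and the real `strtophone()` would be passed in place of the stand-in.

```php
<?php
// Hypothetical harness (not from the thread): $normalizer is the function under test.
function run_phone_tests(callable $normalizer, array $testdata): array
{
    $failures = [];
    foreach ($testdata as $input => $expected) {
        // PHP turns purely numeric string keys into integers, so cast back.
        $input  = (string) $input;
        $actual = $normalizer($input);
        if ($actual !== $expected) {
            $failures[] = sprintf(
                'FAIL: %s BECAME %s INSTEAD OF %s',
                $input,
                var_export($actual, true),
                var_export($expected, true)
            );
        }
    }
    return $failures;
}

// Stand-in for the real function, for the demo only: keep the digits, drop the rest.
$digitsOnly = function (string $s): string {
    return preg_replace('/\D/', '', $s);
};

// Keys are the inputs, values are the expected outputs.
$testdata = [
    '202-537-7560'   => '2025377560',
    '(212) 555-1212' => '2125551212',
    'two 345678901'  => '2345678901', // expected to fail with the stand-in
];

$failures = run_phone_tests($digitsOnly, $testdata);
foreach ($failures as $line) {
    echo $line, PHP_EOL;
}
// FAIL: two 345678901 BECAME '345678901' INSTEAD OF '2345678901'
```

Using `var_export` in the message makes a `FALSE` return visible, which a plain `echo` would print as an empty string.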
>>echo strtophone("2many words 3 four you two words 456565");
>>expected output:
>>2342456565
Given "1-800-BIG-DOGS", the function replaces every letter in that string with its corresponding numeric value. The same logic applies to "2many words 3 four you two words 456565". The only difference is that before it begins replacing every letter with its corresponding digit, it looks for "one", "two", etc. and replaces each with the corresponding digit.

So it is correctly changing your test case/string to:
2MANY WORDS 3 4 YOU 2 WORDS 456565

THEN it begins substituting every remaining letter with the digit.  Thus, M=>6, A=>2, etc.

Thus, the function works correctly. The problem is that your expected result for that test case is unrealistic.
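To see the stages described above one at a time, a small tracing sketch can help. It assumes the same three steps as the posted function (number words, then keypad letters, then stripping); `trace_stages` is a hypothetical name used only for this illustration.

```php
<?php
// Hypothetical tracing helper: returns the string as it looks after each stage.
function trace_stages(string $raw): array
{
    $words = ['ZERO', 'ONE', 'TWO', 'THREE', 'FOUR', 'FIVE', 'SIX', 'SEVEN', 'EIGHT', 'NINE'];

    // Stage 1: upper-case, then replace spelled-out digits.
    $stage1 = strtoupper($raw);
    foreach ($words as $digit => $word) {
        $stage1 = str_replace($word, (string) $digit, $stage1);
    }

    // Stage 2: every remaining letter becomes its telephone keypad digit.
    $stage2 = strtr($stage1, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', '22233344455566677778889999');

    // Stage 3: discard whatever is still not a digit (spaces, punctuation).
    $stage3 = preg_replace('/\D/', '', $stage2);

    return [$stage1, $stage2, $stage3];
}

foreach (trace_stages('2many words 3 four you two words 456565') as $i => $s) {
    echo 'stage ', $i + 1, ': ', $s, "\n";
}
// stage 1: 2MANY WORDS 3 4 YOU 2 WORDS 456565
// stage 2: 26269 96737 3 4 968 2 96737 456565
// stage 3: 262699673734968296737456565
```

Stage 2 is where "MANY", "WORDS" and "YOU" turn into digits, which is why the result is so much longer than the expected 2342456565.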

You might argue that "1-800-BIG-DOGS" has the pattern "X-XXX-XXX-XXXX", but there is no guarantee that users will type a hyphen (or some other separator), so you can't really look for a pattern of "X-XXX-XXX-XXXX". Additionally, it is possible that the phone number is correct but the pattern is different, for example:
"1 800 U NEED US"
Not sure where the OP lives, but USA phone numbers are almost always written with digits only, except for a few "cutesy" marketing things like 1-800-Big-Dogs.  So maybe the correct solution is to remove support for letters and use only digits.  I would guess that the ratio of all numbers to letters+numbers is 100:1 in current media and common parlance.

In any case we will know how to advise after we see the test data set.  And until the test data is created, it's all a bit speculative.
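If letter support were dropped as suggested above, the function might shrink to something like the sketch below. It keeps the validity rules from the original script (leading 1 discarded, at least ten digits, no leading 0 or 1, second digit not 9); `digits_only_phone` is a hypothetical name, and whether these rules fit the real data is an open question until the test data set exists.

```php
<?php
// Hypothetical digits-only variant: letters are ignored, never translated.
function digits_only_phone(string $raw)
{
    // Keep only the digits that were actually typed.
    $digits = preg_replace('/\D/', '', $raw);

    // Discard a leading '1' from numbers entered like 1-800-555-1212.
    if (substr($digits, 0, 1) === '1') {
        $digits = substr($digits, 1);
    }

    // Fewer than ten digits is invalid.
    if (strlen($digits) < 10) {
        return false;
    }

    // Same checks as the original script: no leading 0 or 1, second digit not 9.
    if ($digits[0] === '0' || $digits[0] === '1' || $digits[1] === '9') {
        return false;
    }

    return $digits;
}

var_dump(digits_only_phone('1-800-5551212'));      // string(10) "8005551212"
var_dump(digits_only_phone('703-356-5300 x2048')); // string(14) "70335653002048"
var_dump(digits_only_phone('866-Big-Dogs'));       // bool(false)
```

A side effect of this design is that the "x" in an extension no longer turns into a 9, at the cost of rejecting vanity numbers such as 866-Big-Dogs.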
rgb192 (ASKER)

This was confusing for me, and it took time to think about the question I asked and the expected output.
Thanks.

I have a related question:

only want the first telephone number in $text
https://www.experts-exchange.com/questions/28013549/only-want-the-first-phone-number-in-text.html