Solved

Get a phone number from text: two test cases I would like to work

Posted on 2013-01-26
Last Modified: 2013-01-29
This replaces number words with digits:
foreach (array('one' => 1, 'two' => 2, 'three' => 3, 'four' => 4, 'five' => 5, 'six' => 6, 'seven' => 7, 'eight' => 8, 'nine' => 9, 'zero' => 0) as $text => $number) {
    $phone_content_raw = str_replace(' ' . $text . ' ', $number, $phone_content_raw);
    $phone_content_raw = str_replace($text, $number, $phone_content_raw);
}

But I think that line 48 of the script below, which discards non-numeric characters, will undo that added code:
<?php // RAY_validate_phone_numbers.php
error_reporting(E_ALL);

// A FUNCTION TO VALIDATE A PHONE NUMBER AND RETURN A NORMALIZED STRING
// MAN PAGE: http://discuss.fogcreek.com/joelonsoftware3/default.asp?cmd=show&ixPost=102667&ixReplies=15
function strtophone($phone, $format=FALSE, $dlm='-')
{
    // HANDLE INPUT LIKE 1-800-BIG-DOGS
    $phone = strtoupper($phone);
    if (preg_match('/[A-Z]/', $phone))
    {
        $phone = str_replace('A', '2', $phone);
        $phone = str_replace('B', '2', $phone);
        $phone = str_replace('C', '2', $phone);

        $phone = str_replace('D', '3', $phone);
        $phone = str_replace('E', '3', $phone);
        $phone = str_replace('F', '3', $phone);

        $phone = str_replace('G', '4', $phone);
        $phone = str_replace('H', '4', $phone);
        $phone = str_replace('I', '4', $phone);

        $phone = str_replace('J', '5', $phone);
        $phone = str_replace('K', '5', $phone);
        $phone = str_replace('L', '5', $phone);

        $phone = str_replace('M', '6', $phone);
        $phone = str_replace('N', '6', $phone);
        $phone = str_replace('O', '6', $phone);

        $phone = str_replace('P', '7', $phone);
        $phone = str_replace('Q', '7', $phone);
        $phone = str_replace('R', '7', $phone);
        $phone = str_replace('S', '7', $phone);

        $phone = str_replace('T', '8', $phone);
        $phone = str_replace('U', '8', $phone);
        $phone = str_replace('V', '8', $phone);

        $phone = str_replace('W', '9', $phone);
        $phone = str_replace('X', '9', $phone);
        $phone = str_replace('Y', '9', $phone);
        $phone = str_replace('Z', '9', $phone);
    }

    // DISCARD NON-NUMERIC CHARACTERS
    $phone = preg_replace('/[^0-9]/', '', $phone);

    // DISCARD A LEADING '1' FROM NUMBERS ENTERED LIKE 1-800-555-1212
    if (substr($phone,0,1) == '1') $phone = substr($phone,1);

    // IF LESS THAN TEN DIGITS, IT IS INVALID
    if (strlen($phone) < 10) return FALSE;

    // IF IT STARTS WITH '0' OR '1' IT IS INVALID, SECOND DIGIT CANNOT BE '9' (YET)
    if (substr($phone,0,1) == '0') return FALSE;
    if (substr($phone,0,1) == '1') return FALSE;
    if (substr($phone,1,1) == '9') return FALSE;

    // ADD OTHER TESTS HERE AS MAY BE NEEDED

    // IF NOT FORMATTED
    if (!$format) return $phone;

    // ISOLATE THE COMPONENTS OF THE PHONE NUMBER
    $ac = substr($phone,0,3); // AREA
    $ex = substr($phone,3,3); // EXCHANGE
    $nm = substr($phone,6,4); // NUMBER
    $xt = substr($phone,10);  // EXTENSION

    // STANDARDIZE THE PRINTABLE FORMAT OF THE PHONE NUMBER LIKE 212-555-1212-1234
    $formatted_phone = $ac . $dlm . $ex . $dlm . $nm;
    if ($xt != '') $formatted_phone .= $dlm . $xt;
    return $formatted_phone;
}

echo '<h1>Output that works</h1>';
echo '<br/>1-800-5551212: '. strtophone("1-800-5551212");
echo '<br/>866-Big-Dogs: '. strtophone("866-Big-Dogs");
echo '<br/>202-537-7560: '. strtophone("202-537-7560");
echo '<br/>703-356-5300 x2048: '. strtophone("703-356-5300 x2048");
echo '<br/>(212) 555-1212: '. strtophone("(212) 555-1212");
echo '<br/>1 + (212) 555-1212: '. strtophone("1 + (212) 555-1212");
echo '<br/>2345678901: '. strtophone("2345678901");
echo '<br/>12345678901: '. strtophone("12345678901");

echo '<h1>output that I want to work</h1>';

echo '<br/>1 (292) 226-7000: '. strtophone("1 (292) 226-7000");
echo '<br/>two 345678901: '. strtophone("two 345678901");

output:

Output that works

1-800-5551212: 8005551212
866-Big-Dogs: 8662443647
202-537-7560: 2025377560
703-356-5300 x2048: 703356530092048
(212) 555-1212: 2125551212
1 + (212) 555-1212: 2125551212
2345678901: 2345678901
12345678901: 2345678901
output that I want to work

1 (292) 226-7000:
two 345678901:




So the additional test data that I would like to work is:
1 (292) 226-7000
two 345678901
Question by:rgb192
LVL 82

Accepted Solution

by:
hielo earned 500 total points
ID: 38823295
The reason 1 (292) 226-7000 doesn't work is that, according to:
http://www.nanpa.com/area_codes/index.html

'9' is not allowed as the middle digit of an area code (scroll down to the table and read the 'N9X' row).  If you really want to allow it, then comment out:
if ( $phone[1] == '9') return FALSE;


function strtophone($phone, $format=FALSE, $dlm='-')
{
    // HANDLE INPUT LIKE 1-800-BIG-DOGS
    $phone = strtoupper($phone);

    // CONVERT NUMBER WORDS FIRST; THE ARRAY INDEX IS THE DIGIT
    foreach (array('ZERO','ONE','TWO','THREE','FOUR','FIVE','SIX','SEVEN','EIGHT','NINE') as $number => $text)
    {
        $phone = str_replace($text, $number, $phone);
    }
    // http://php.net/manual/en/function.strtr.php
    $phone = strtr($phone,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','22233344455566677778889999');

    // DISCARD NON-NUMERIC CHARACTERS
    $phone = preg_replace('#\D#', '', $phone);

    // DISCARD A LEADING '1' FROM NUMBERS ENTERED LIKE 1-800-555-1212
    if ( $phone[0] == '1') $phone = substr($phone,1);

    // IF LESS THAN TEN DIGITS, IT IS INVALID
    if (strlen($phone) < 10) return FALSE;

    // IF IT STARTS WITH '0' OR '1' IT IS INVALID
    if ( $phone[0] == '0') return FALSE;
    if ( $phone[0] == '1') return FALSE;

    // SECOND DIGIT CANNOT BE '9' (YET) http://www.nanpa.com/area_codes/index.html
    if ( $phone[1] == '9') return FALSE;

    // ADD OTHER TESTS HERE AS MAY BE NEEDED

    // IF NOT FORMATTED
    if (!$format) return $phone;

    // ISOLATE THE COMPONENTS OF THE PHONE NUMBER
    $ac = substr($phone,0,3); // AREA
    $ex = substr($phone,3,3); // EXCHANGE
    $nm = substr($phone,6,4); // NUMBER
    $xt = substr($phone,10);  // EXTENSION

    // STANDARDIZE THE PRINTABLE FORMAT OF THE PHONE NUMBER LIKE 212-555-1212-1234
    $formatted_phone = $ac . $dlm . $ex . $dlm . $nm;
    if ($xt != '') $formatted_phone .= $dlm . $xt;
    return $formatted_phone;
}
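
As a quick check of the two requested test cases, here is a minimal sketch that restates the function above in compact form.  The name strtophone_sketch() and the $allow_n9x flag are assumptions made for illustration and are not part of the posted code; the flag stands in for commenting out the N9X check.

```php
<?php
// Compact restatement of the replacement function, for illustration only.
// strtophone_sketch() and $allow_n9x are hypothetical names.
function strtophone_sketch($phone, $allow_n9x = false)
{
    $phone = strtoupper($phone);

    // Number words first, so "TWO" becomes 2 instead of being
    // translated letter by letter.
    $words = array('ZERO','ONE','TWO','THREE','FOUR','FIVE','SIX','SEVEN','EIGHT','NINE');
    foreach ($words as $digit => $word) {
        $phone = str_replace($word, (string) $digit, $phone);
    }

    // Then keypad letters, then drop everything that is not a digit.
    $phone = strtr($phone, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', '22233344455566677778889999');
    $phone = preg_replace('/\D/', '', $phone);

    // Drop one leading 1, then apply the same validity checks.
    if (substr($phone, 0, 1) === '1') $phone = substr($phone, 1);
    if (strlen($phone) < 10) return false;
    if ($phone[0] === '0' || $phone[0] === '1') return false;
    if (!$allow_n9x && $phone[1] === '9') return false;

    return $phone;
}

var_dump(strtophone_sketch('two 345678901'));          // string(10) "2345678901"
var_dump(strtophone_sketch('1 (292) 226-7000'));       // bool(false)
var_dump(strtophone_sketch('1 (292) 226-7000', true)); // string(10) "2922267000"
```

In the original function, "two" was translated letter by letter (T=8, W=9, O=6), so the input became 896345678901 and was rejected by the same second-digit rule.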


Author Comment

by:rgb192
ID: 38823383
Thanks for answering the first part of the question.

Now I want code sample 1 to be added to code sample 2, or some other method to allow:
two 345678901
LVL 82

Expert Comment

by:hielo
ID: 38823541
>> or some other method to allow
>> two 345678901
The function I posted is supposed to replace the function you posted.  It will convert
"two 345678901" into "2345678901"
LVL 108

Expert Comment

by:Ray Paseur
ID: 38824456
Is this really what you want?  I am at a loss to understand the use case in the test data.  Consider this:

"I need you to call two numbers for me.  First, please call 202-296-3131. Then call two 02-296-3132."

This is something that is written by nobody ever :-/

Author Comment

by:rgb192
ID: 38824898
<?php
function strtophone($phone, $format=FALSE, $dlm='-')
{
    // HANDLE INPUT LIKE 1-800-BIG-DOGS
    $phone = strtoupper($phone);

  foreach (array('ZERO','ONE','TWO','THREE','FOUR','FIVE','SIX','SEVEN','EIGHT','NINE') as $number => $text)
  {
    $phone = str_replace($text, $number, $phone);
  }
  //http://php.net/manual/en/function.strtr.php
  $phone=strtr($phone,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','22233344455566677778889999');

    // DISCARD NON-NUMERIC CHARACTERS
    $phone = preg_replace('#\D#', '', $phone);

    // DISCARD A LEADING '1' FROM NUMBERS ENTERED LIKE 1-800-555-1212
    if ( $phone[0] == '1') $phone = substr($phone,1);

    // IF LESS THAN TEN DIGITS, IT IS INVALID
    if (strlen($phone) < 10) return FALSE;

    // IF IT STARTS WITH '0' OR '1' IT IS INVALID
    if ( $phone[0] == '0') return FALSE;
    if ( $phone[0] == '1') return FALSE;

  //SECOND DIGIT CANNOT BE '9' (YET) http://www.nanpa.com/area_codes/index.html
    if ( $phone[1] == '9') return FALSE;

    // ADD OTHER TESTS HERE AS MAY BE NEEDED

    // IF NOT FORMATTED
    if (!$format) return $phone;

    // ISOLATE THE COMPONENTS OF THE PHONE NUMBER
    $ac = substr($phone,0,3); // AREA
    $ex = substr($phone,3,3); // EXCHANGE
    $nm = substr($phone,6,4); // NUMBER
    $xt = substr($phone,10);  // EXTENSION

    // STANDARDIZE THE PRINTABLE FORMAT OF THE PHONE NUMBER LIKE 212-555-1212-1234
    $formatted_phone = $ac . $dlm . $ex . $dlm . $nm;
    if ($xt != '') $formatted_phone .= $dlm . $xt;
    return $formatted_phone;
}



echo strtophone("2many words 3 four you two words 456565"); 

I added the last line (the echo call) as a test.
expected output:
2342456565
output:
262699673734968296737456565


>> 202-296-3131. Then call two 02-296-3132
I was told that this Verizon phone number does not exist.
LVL 108

Expert Comment

by:Ray Paseur
ID: 38825014
Looks like a data normalization problem.  You might want to set up an array of test data so you can automate the testing process.  In the array keys you would have the input values.  In the array values you would have the expected output.  Then you could do something like this:

foreach ($testdata as $key => $value)
{
    $new = testFunction($key);
    if ($new != $value) echo "FAIL: $key BECAME $new INSTEAD OF $value";
}
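
A fuller, runnable version of that loop might look like the sketch below.  The body of testFunction() here is only a stand-in assumed for the example (it strips non-digits and one leading 1); swap in the real function under test.  The inputs and expected values are taken from the question.

```php
<?php
// Stand-in for the function under test (hypothetical; replace with strtophone()).
// It only strips non-digits and one leading 1, so the number-word case fails.
function testFunction($input)
{
    $digits = preg_replace('/\D/', '', $input);
    if (substr($digits, 0, 1) === '1') $digits = substr($digits, 1);
    return $digits;
}

// Keys are the input values, values are the expected output.
$testdata = array(
    '1-800-5551212'  => '8005551212',
    '(212) 555-1212' => '2125551212',
    '202-537-7560'   => '2025377560',
    'two 345678901'  => '2345678901',
);

// Collect failures instead of echoing immediately, so the run ends
// with a single summary.
$failures = array();
foreach ($testdata as $key => $value) {
    $new = testFunction($key);
    if ($new !== $value) {
        $failures[] = "FAIL: $key BECAME $new INSTEAD OF $value";
    }
}

echo $failures ? implode(PHP_EOL, $failures) : 'ALL TESTS PASSED';
echo PHP_EOL;
// Prints: FAIL: two 345678901 BECAME 345678901 INSTEAD OF 2345678901
```

One caveat with this layout: PHP converts array keys that look like decimal integers into integers, so an all-digit input such as 2345678901 would not stay a string key.  A list of input/expected pairs avoids that if it matters.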

More on test-driven development here:
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_7830-A-Quick-Tour-of-Test-Driven-Development.html
LVL 82

Expert Comment

by:hielo
ID: 38825100
>>echo strtophone("2many words 3 four you two words 456565");
>>expected output:
>>2342456565
Given "1-800-BIG-DOGS", the function replaces every letter in that string with its corresponding numeric value.  The same logic applies to "2many words 3 four you two words 456565".  The only difference is that before it begins replacing every letter with its corresponding digit, it looks for "one", "two", etc. and replaces each with the corresponding digit.

So it is correctly changing your test case/string to:
2MANY WORDS 3 4 YOU 2 WORDS 456565

THEN it begins substituting every remaining letter with the digit.  Thus, M=>6, A=>2, etc.

Thus, the function works correctly.  The problem is that your expected result for that test case is unrealistic.
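
The steps above can be traced with a short sketch that applies the same passes one at a time.  The variable names are assumptions made for the trace; the replacement calls are the ones used in the function.

```php
<?php
// Trace of the substitution passes on the test string.
$phone = strtoupper('2many words 3 four you two words 456565');

// Pass 1: number words become digits (the array index is the digit).
$words = array('ZERO','ONE','TWO','THREE','FOUR','FIVE','SIX','SEVEN','EIGHT','NINE');
foreach ($words as $digit => $word) {
    $phone = str_replace($word, (string) $digit, $phone);
}
$after_words = $phone;
echo $after_words, PHP_EOL;   // 2MANY WORDS 3 4 YOU 2 WORDS 456565

// Pass 2: every remaining letter becomes its keypad digit.
$after_letters = strtr($after_words, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', '22233344455566677778889999');
echo $after_letters, PHP_EOL; // 26269 96737 3 4 968 2 96737 456565

// Pass 3: non-digits (here, the spaces) are discarded.
$digits = preg_replace('/\D/', '', $after_letters);
echo $digits, PHP_EOL;        // 262699673734968296737456565
```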

You might argue that "1-800-BIG-DOGS" has the pattern "X-XXX-XXX-XXXX", but there is no guarantee that users will type a hyphen (or some other separator), so you can't really look for that pattern.  Additionally, a phone number may be correct yet grouped differently, e.g.:
"1 800 U NEED US"
LVL 108

Expert Comment

by:Ray Paseur
ID: 38825137
Not sure where the OP lives, but USA phone numbers are almost always written with digits only, except for a few "cutesy" marketing things like 1-800-Big-Dogs.  So maybe the correct solution is to remove support for letters and use only digits.  I would guess that the ratio of all numbers to letters+numbers is 100:1 in current media and common parlance.
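
If letter support were dropped, a digits-only variant could be sketched roughly as follows.  The function name strtophone_digits_only() is an assumption for illustration; the validity checks are carried over from the function discussed in this thread.

```php
<?php
// Hypothetical digits-only variant: letters are discarded, not translated.
function strtophone_digits_only($phone)
{
    // Keep digits only; words and keypad letters simply disappear.
    $phone = preg_replace('/\D/', '', $phone);

    // Drop one leading 1 from numbers entered like 1-800-555-1212.
    if (substr($phone, 0, 1) === '1') $phone = substr($phone, 1);

    // Same validity checks as before.
    if (strlen($phone) < 10) return false;
    if ($phone[0] === '0' || $phone[0] === '1') return false;
    if ($phone[1] === '9') return false;

    return $phone;
}

var_dump(strtophone_digits_only('(212) 555-1212'));     // string(10) "2125551212"
var_dump(strtophone_digits_only('866-Big-Dogs'));       // bool(false), only 866 survives
var_dump(strtophone_digits_only('703-356-5300 x2048')); // string(14) "70335653002048"
```

A side effect worth noting: the "x" in "x2048" is no longer turned into a 9, so the extension comes through as 2048 rather than 92048.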

In any case we will know how to advise after we see the test data set.  And until the test data is created, it's all a bit speculative.

Author Closing Comment

by:rgb192
ID: 38833853
This was confusing for me, and it took time to think about the question I asked and the expected output.
Thanks.

I have a related question: I only want the first telephone number in $text
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28013549.html