Solved

First name pregmatch

Posted on 2016-10-30
11
42 Views
Last Modified: 2016-10-30
I am trying to only accept the following special characters into the name field:

' - ´

the first two work fine but when trying to use the third for example, josé, it fails validation.

 if (!preg_match("/^[a-zA-Z '-´]*$/",$_POST['first_name'])) {

Open in new window

0
Comment
Question by:Black Sulfur
  • 5
  • 4
  • 2
11 Comments
 
LVL 50

Assisted Solution

by:Rgonzo1971
Rgonzo1971 earned 50 total points
ID: 41865854
Hi

If you want accented chars then try
if (!preg_match("/^[a-zA-ZÀ-ÿ '-]*$/",$_POST['first_name'])) {

Open in new window

Regards
0
 
LVL 50

Expert Comment

by:Rgonzo1971
ID: 41865858
or maybe

if (!preg_match("/^[\p{L} '-]*$/",$_POST['first_name'])) {

Open in new window

0
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 41865860
The correct answer to this question depends on the character set encoding.  What character set are you using?
0
NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

 
LVL 1

Author Comment

by:Black Sulfur
ID: 41865879
utf-8
0
 
LVL 1

Author Comment

by:Black Sulfur
ID: 41865886
Hmm, Ray is right. I have my database set as utf-8 general and when I inserted josé into the database I get josé
0
 
LVL 110

Accepted Solution

by:
Ray Paseur earned 450 total points
ID: 41865887
Please see: https://iconoun.com/demo/temp_blacksulfur.php

This shows how to set up the regex, and how to test it with good data visualization.  Check the man page references in the top comments for more information.
<?php // demo/temp_blacksulfur.php
/**
 * https://www.experts-exchange.com/questions/28979848/First-name-pregmatch.html
 *
 * https://www.experts-exchange.com/articles/7830/A-Quick-Tour-of-Test-Driven-Development.html
 * https://www.experts-exchange.com/articles/11880/Unicode-and-Character-Collisions.html
 * http://www.regular-expressions.info/unicode.html
 */
ini_set('display_errors', TRUE);
error_reporting(E_ALL);

// SET UP PHP AND BROWSER TO USE UTF-8
mb_internal_encoding('utf-8');
mb_regex_encoding('utf-8');
mb_http_output('utf-8');
echo '<meta charset="utf-8" />';
echo '<pre>';

// SOME DATA VISUALIZATION TOOLS
Class Letter
{
    public function __construct($chr)
    {
        $this->chr = $chr;
        $this->hex = array();
        $bytes     = $this->usplit($chr);
        foreach ($bytes as $byte)
        {
            $this->hex = array_merge($this->hex, $this->gethex($byte));
        }
        return $this;
    }

    public function usplit ($chr)
    {
        $len = strlen($chr);
        while ($len) {
            $arr[] = substr($chr, 0, 1);
            $chr   = substr($chr, 1, $len);
            $len   = strlen($chr);
        }
        return $arr;
    }

    public function gethex($chr)
    {
        // GET THE HEX NIBBLE VALUES IN AN ARRAY
        $ret = str_split(implode(NULL, unpack('H*', $chr)));
        return $ret;
    }
}

Class Hexdump
{
    public function __construct($str)
    {
        $this->str = $str;
        $this->arr = $this->mb_str_split($str);
        $this->len = mb_strlen($str);
        foreach ($this->arr as $uchr)
        {
            $this->dat[] = new Letter($uchr);
        }
        return $this;
    }

    public function mb_str_split($ustr)
    {
        return preg_split('/(?<!^)(?!$)/u', $ustr);
    }

    public function render($br = PHP_EOL)
    {
        foreach ($this->dat as $poz => $chr)
        {
            echo $br;
            echo str_pad($poz, 4, ' ', STR_PAD_LEFT);
            echo ' ';
            echo $chr->chr;
            echo "\t";
            echo implode(null, $chr->hex);
        }
        echo $br;
    }
}

// A REGEX TO TEST WITH
$rgx
= '#'             // REGEX DELIMITER
. '^'             // AT START STRING
. '['             // START CHARACTER CLASS
. '\p{L}'         // ANY LETTER
. " '-"           // BLANK, APOSTROPHE, DASH
. ']'             // ENDOF CHARACTER CLASS
. '*'             // ONE OR MORE
. '$'             // AT ENDOF STRING
. '#'             // REGEX DELIMITER
. 'u'             // FLAG: ALLOW UNICODE
. 'i'             // FLAG: CASE-INSENSITIVE
;
echo PHP_EOL . htmlentities($rgx);

// ADD TEST CASES AS NEEDED HERE
$tests = array
( 'josé'
, 'Ray'
, 'R&aacute;y'
, '42'
, ' '
, 'J.R.R.'
, '----'
, 'BlackSulfur'
)
;

// MAKE THE TESTS
foreach ($tests as $test)
{
    echo PHP_EOL;
    $out = new HexDump($test);
    $out->render();
    echo $test;
    if (preg_match($rgx, $test)) echo ' MATCH';
}

Open in new window

Outputs
#^[\p{L} '-]*$#ui

   0 j	6a
   1 o	6f
   2 s	73
   3 é	c3a9
josé MATCH

   0 R	52
   1 a	61
   2 y	79
Ray MATCH

   0 R	52
   1 &	26
   2 a	61
   3 a	61
   4 c	63
   5 u	75
   6 t	74
   7 e	65
   8 ;	3b
   9 y	79
Ráy

   0 4	34
   1 2	32
42

   0  	20
  MATCH

   0 J	4a
   1 .	2e
   2 R	52
   3 .	2e
   4 R	52
   5 .	2e
J.R.R.

   0 -	2d
   1 -	2d
   2 -	2d
   3 -	2d
---- MATCH

   0 B	42
   1 l	6c
   2 a	61
   3 c	63
   4 k	6b
   5 S	53
   6 u	75
   7 l	6c
   8 f	66
   9 u	75
  10 r	72
BlackSulfur MATCH

Open in new window

0
 
LVL 1

Author Comment

by:Black Sulfur
ID: 41865908
Thanks, Ray

This is a hectic answer for my level of knowledge! Does one of these:

= '#'             // REGEX DELIMITER
. '^'             // AT START STRING
. '['             // START CHARACTER CLASS
. '\p{L}'         // ANY LETTER
. " '-"           // BLANK, APOSTROPHE, DASH
. ']'             // ENDOF CHARACTER CLASS
. '*'             // ONE OR MORE
. '$'             // AT ENDOF STRING
. '#'             // REGEX DELIMITER
. 'u'             // FLAG: ALLOW UNICODE
. 'i'             // FLAG: CASE-INSENSITIVE

Open in new window

allow for the ´being inserted into the database correctly or is it the below? It's a bit over my head...

mb_internal_encoding('utf-8');
mb_regex_encoding('utf-8');
mb_http_output('utf-8');

Open in new window

0
 
LVL 110

Assisted Solution

by:Ray Paseur
Ray Paseur earned 450 total points
ID: 41865929
The character encoding for the database is independent and separate from the character encoding for PHP or HTML.  Have a look at this article, and check the part about Character Sets in MySQL.
https://www.experts-exchange.com/articles/11880/Unicode-and-Character-Collisions.html
0
 
LVL 1

Author Comment

by:Black Sulfur
ID: 41865947
Aha, I get it now.

I used :

if (!preg_match("#^[\p{L} '-]*$#ui",$_POST['first_name'])) {

Open in new window


And for my database connection I used:

$link = new mysqli($server_name, $db_username, $db_password, $db_dbname);
$link->set_charset("utf8mb4");

Open in new window


All seems to work now!  :)
1
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 41865963
Bingo!

The way I wrote the REGEX used PHP string concatenation to build up the REGEX string one piece at a time.  I like to write REGEX and other complicated strings that way because it lets me add comments, and it's easy to change when testing.

Best of luck with your project, ~Ray
0
 
LVL 1

Author Comment

by:Black Sulfur
ID: 41865999
Makes sense to do that, I think I should try it!
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article discusses how to create an extensible mechanism for linked drop downs.
This article shows the steps required to install WordPress on Azure. Web Apps, Mobile Apps, API Apps, or Functions, in Azure all these run in an App Service plan. WordPress is no exception and requires an App Service Plan and Database to install
The viewer will learn how to look for a specific file type in a local or remote server directory using PHP.
This tutorial will teach you the core code needed to finalize the addition of a watermark to your image. The viewer will use a small PHP class to learn and create a watermark.

713 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question