Solved

First name pregmatch

Posted on 2016-10-30
Last Modified: 2016-10-30
I am trying to only accept the following special characters into the name field:

' - ´

The first two work fine, but when I try to use the third, for example in josé, it fails validation.

 if (!preg_match("/^[a-zA-Z '-´]*$/",$_POST['first_name'])) {

Question by:Black Sulfur
11 Comments
 
LVL 51

Assisted Solution

by:Rgonzo1971
Rgonzo1971 earned 50 total points
ID: 41865854
Hi

If you want accented chars then try
if (!preg_match("/^[a-zA-ZÀ-ÿ '-]*$/u",$_POST['first_name'])) {


Regards
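A side note on why the dash sits at the end of the character class in the pattern above: between two characters a dash defines a range, so `'-´` in the original pattern means every character from the apostrophe up to the acute accent, and that range includes the digits. The following is a minimal sketch, assuming the file is saved as UTF-8; the variable names are made up for illustration.

```php
<?php
// Dash between two characters: a range from ' (U+0027) to ´ (U+00B4).
$range   = "/^[a-zA-Z '-´]*$/u";
// Dash last in the class: a literal dash.
$literal = "/^[a-zA-Z '-]*$/u";

var_dump(preg_match($range, '42'));             // int(1): digits fall inside the range
var_dump(preg_match($literal, '42'));           // int(0)
var_dump(preg_match($literal, "O'Neil-Smith")); // int(1)
```

So the original pattern was accepting more than intended, while still not accepting é.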
 
LVL 51

Expert Comment

by:Rgonzo1971
ID: 41865858
or maybe

if (!preg_match("/^[\p{L} '-]*$/u",$_POST['first_name'])) {
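
The `u` modifier matters for patterns like this one. Here is a small sketch of what it changes, assuming the script itself is saved as UTF-8 (the `$name` variable is only for illustration):

```php
<?php
// Without /u, PCRE walks the subject byte by byte; with /u it walks UTF-8 characters.
$name = 'josé'; // assumes this file is saved as UTF-8

var_dump(preg_match_all('/./', $name));   // int(5): é counts as two bytes
var_dump(preg_match_all('/./u', $name));  // int(4): é counts as one character

// With /u, \p{L} sees é as a single letter.
var_dump(preg_match("/^[\p{L} '-]*$/u", $name)); // int(1)
```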

 
LVL 110

Expert Comment

by:Ray Paseur
ID: 41865860
The correct answer to this question depends on the character set encoding.  What character set are you using?
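
To see why the encoding matters, here is a rough sketch comparing the same name in two encodings. It assumes the mbstring extension is loaded and the file is saved as UTF-8; the variable names are illustrative only.

```php
<?php
$utf8   = 'josé';                                            // assumes UTF-8 source
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8'); // same name, one byte for é

echo bin2hex($utf8), PHP_EOL;    // 6a6f73c3a9
echo bin2hex($latin1), PHP_EOL;  // 6a6f73e9

// A /u pattern rejects input that is not valid UTF-8: preg_match() returns false.
var_dump(preg_match('/^\p{L}*$/u', $latin1));        // bool(false)
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR); // bool(true)
```

Note that `!preg_match(...)` treats that `false` the same as "no match", so badly encoded input fails validation rather than slipping through.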
 
LVL 1

Author Comment

by:Black Sulfur
ID: 41865879
utf-8
 
LVL 1

Author Comment

by:Black Sulfur
ID: 41865886
Hmm, Ray is right. I have my database set as utf-8 general and when I inserted josé into the database I get josé
 
LVL 110

Accepted Solution

by:
Ray Paseur earned 450 total points
ID: 41865887
Please see: https://iconoun.com/demo/temp_blacksulfur.php

This shows how to set up the regex, and how to test it with good data visualization.  Check the man page references in the top comments for more information.
<?php // demo/temp_blacksulfur.php
/**
 * https://www.experts-exchange.com/questions/28979848/First-name-pregmatch.html
 *
 * https://www.experts-exchange.com/articles/7830/A-Quick-Tour-of-Test-Driven-Development.html
 * https://www.experts-exchange.com/articles/11880/Unicode-and-Character-Collisions.html
 * http://www.regular-expressions.info/unicode.html
 */
ini_set('display_errors', TRUE);
error_reporting(E_ALL);

// SET UP PHP AND BROWSER TO USE UTF-8
mb_internal_encoding('utf-8');
mb_regex_encoding('utf-8');
mb_http_output('utf-8');
echo '<meta charset="utf-8" />';
echo '<pre>';

// SOME DATA VISUALIZATION TOOLS
class Letter
{
    public $chr;
    public $hex = array();

    public function __construct($chr)
    {
        $this->chr = $chr;
        $bytes     = $this->usplit($chr);
        foreach ($bytes as $byte)
        {
            $this->hex = array_merge($this->hex, $this->gethex($byte));
        }
    }

    // SPLIT ONE (POSSIBLY MULTI-BYTE) CHARACTER INTO ITS BYTES
    public function usplit ($chr)
    {
        $arr = array();
        $len = strlen($chr);
        while ($len) {
            $arr[] = substr($chr, 0, 1);
            $chr   = substr($chr, 1, $len);
            $len   = strlen($chr);
        }
        return $arr;
    }

    public function gethex($chr)
    {
        // GET THE HEX NIBBLE VALUES IN AN ARRAY
        $ret = str_split(implode('', unpack('H*', $chr)));
        return $ret;
    }
}

class Hexdump
{
    public $str;
    public $arr;
    public $len;
    public $dat = array();

    public function __construct($str)
    {
        $this->str = $str;
        $this->arr = $this->mb_str_split($str);
        $this->len = mb_strlen($str);
        foreach ($this->arr as $uchr)
        {
            $this->dat[] = new Letter($uchr);
        }
    }

    public function mb_str_split($ustr)
    {
        return preg_split('/(?<!^)(?!$)/u', $ustr);
    }

    public function render($br = PHP_EOL)
    {
        foreach ($this->dat as $poz => $chr)
        {
            echo $br;
            echo str_pad($poz, 4, ' ', STR_PAD_LEFT);
            echo ' ';
            echo $chr->chr;
            echo "\t";
            echo implode('', $chr->hex);
        }
        echo $br;
    }
}

// A REGEX TO TEST WITH
$rgx
= '#'             // REGEX DELIMITER
. '^'             // AT START STRING
. '['             // START CHARACTER CLASS
. '\p{L}'         // ANY LETTER
. " '-"           // BLANK, APOSTROPHE, DASH
. ']'             // ENDOF CHARACTER CLASS
. '*'             // ZERO OR MORE
. '$'             // AT ENDOF STRING
. '#'             // REGEX DELIMITER
. 'u'             // FLAG: TREAT PATTERN AND SUBJECT AS UTF-8
. 'i'             // FLAG: CASE-INSENSITIVE
;
echo PHP_EOL . htmlentities($rgx);

// ADD TEST CASES AS NEEDED HERE
$tests = array
( 'josé'
, 'Ray'
, 'R&aacute;y'
, '42'
, ' '
, 'J.R.R.'
, '----'
, 'BlackSulfur'
)
;

// MAKE THE TESTS
foreach ($tests as $test)
{
    echo PHP_EOL;
    $out = new Hexdump($test);
    $out->render();
    echo $test;
    if (preg_match($rgx, $test)) echo ' MATCH';
}


Outputs
#^[\p{L} '-]*$#ui

   0 j	6a
   1 o	6f
   2 s	73
   3 é	c3a9
josé MATCH

   0 R	52
   1 a	61
   2 y	79
Ray MATCH

   0 R	52
   1 &	26
   2 a	61
   3 a	61
   4 c	63
   5 u	75
   6 t	74
   7 e	65
   8 ;	3b
   9 y	79
Ráy

   0 4	34
   1 2	32
42

   0  	20
  MATCH

   0 J	4a
   1 .	2e
   2 R	52
   3 .	2e
   4 R	52
   5 .	2e
J.R.R.

   0 -	2d
   1 -	2d
   2 -	2d
   3 -	2d
---- MATCH

   0 B	42
   1 l	6c
   2 a	61
   3 c	63
   4 k	6b
   5 S	53
   6 u	75
   7 l	6c
   8 f	66
   9 u	75
  10 r	72
BlackSulfur MATCH
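
In the output above, ' ' and '----' both report MATCH, because `*` also accepts zero letters. If that is not wanted, one possible tightening is to require a letter in the first position. This is only a sketch: `is_valid_first_name()` is a hypothetical helper, not part of the demo script, and it assumes UTF-8 input.

```php
<?php
// Hypothetical helper, not part of the demo script.
function is_valid_first_name($name)
{
    // First character must be a letter; the rest may be letters, blanks,
    // apostrophes or dashes. \z anchors at the very end of the string.
    return preg_match('/^\p{L}[\p{L} \'-]*\z/u', $name) === 1;
}

foreach (array('josé', "O'Brien", 'Jean-Luc', ' ', '----', '') as $name)
{
    echo var_export($name, true), ' ';
    echo is_valid_first_name($name) ? 'MATCH' : 'NO MATCH', PHP_EOL;
}
```

The first three names match; the blank, the dashes and the empty string do not.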

 
LVL 1

Author Comment

by:Black Sulfur
ID: 41865908
Thanks, Ray

This is a hectic answer for my level of knowledge! Does one of these:

= '#'             // REGEX DELIMITER
. '^'             // AT START STRING
. '['             // START CHARACTER CLASS
. '\p{L}'         // ANY LETTER
. " '-"           // BLANK, APOSTROPHE, DASH
. ']'             // ENDOF CHARACTER CLASS
. '*'             // ZERO OR MORE
. '$'             // AT ENDOF STRING
. '#'             // REGEX DELIMITER
. 'u'             // FLAG: ALLOW UNICODE
. 'i'             // FLAG: CASE-INSENSITIVE


allow for the ´ being inserted into the database correctly, or is it the code below? It's a bit over my head...

mb_internal_encoding('utf-8');
mb_regex_encoding('utf-8');
mb_http_output('utf-8');

 
LVL 110

Assisted Solution

by:Ray Paseur
Ray Paseur earned 450 total points
ID: 41865929
The character encoding for the database is independent and separate from the character encoding for PHP or HTML.  Have a look at this article, and check the part about Character Sets in MySQL.
https://www.experts-exchange.com/articles/11880/Unicode-and-Character-Collisions.html
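
As a rough illustration of what happens when the two sides disagree, the sketch below simulates UTF-8 bytes being read as Latin-1 somewhere along the way. It does not touch MySQL at all; it only assumes the mbstring extension and a UTF-8 source file, and the variable names are made up.

```php
<?php
$sent = 'josé'; // UTF-8 bytes sent by PHP (assumes UTF-8 source)

// What a receiver ends up with if it decodes those bytes as Latin-1
// and then stores the result as UTF-8.
$stored = mb_convert_encoding($sent, 'UTF-8', 'ISO-8859-1');
echo $stored, PHP_EOL; // josÃ©

// Undoing the wrong step recovers the original text.
echo mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8'), PHP_EOL; // josé
```

The regex cannot prevent this; the connection character set has to agree with the encoding of the data PHP sends.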
 
LVL 1

Author Comment

by:Black Sulfur
ID: 41865947
Aha, I get it now.

I used:

if (!preg_match("#^[\p{L} '-]*$#ui",$_POST['first_name'])) {



And for my database connection I used:

$link = new mysqli($server_name, $db_username, $db_password, $db_dbname);
$link->set_charset("utf8mb4");



All seems to work now!  :)
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 41865963
Bingo!

The way I wrote the REGEX used PHP string concatenation to build up the REGEX string one piece at a time.  I like to write REGEX and other complicated strings that way because it lets me add comments, and it's easy to change when testing.

Best of luck with your project, ~Ray
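
For comparison, another way to get comments into a regex is the PCRE `x` (extended) modifier, which ignores whitespace outside character classes and treats `#` to end of line as a comment. This is an alternative to the concatenation approach described above, not what the demo script does; the sketch assumes a UTF-8 source file, and `$pattern` is just an illustrative name.

```php
<?php
// Nowdoc keeps the pattern literal, so no quote or backslash escaping is needed.
// The comments must not contain the delimiter character.
$pattern = <<<'RGX'
/
  ^              # at start of string
  [\p{L} '-]*    # zero or more letters, blanks, apostrophes, dashes
  $              # at end of string
/xu
RGX;

var_dump(preg_match($pattern, 'josé'));   // int(1)
var_dump(preg_match($pattern, 'J.R.R.')); // int(0)
```

The blank inside the square brackets is still significant, because `x` leaves character classes alone.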
 
LVL 1

Author Comment

by:Black Sulfur
ID: 41865999
Makes sense to do that, I think I should try it!