Solved

First name pregmatch

Posted on 2016-10-30
Last Modified: 2016-10-30
I am trying to only accept the following special characters into the name field:

' - ´

The first two work fine, but when I try the third, for example josé, it fails validation.

 if (!preg_match("/^[a-zA-Z '-´]*$/",$_POST['first_name'])) {


Question by:Black Sulfur
 
LVL 48

Assisted Solution

by:Rgonzo1971
Rgonzo1971 earned 50 total points
ID: 41865854
Hi

If you want accented chars then try
if (!preg_match("/^[a-zA-ZÀ-ÿ '-]*$/",$_POST['first_name'])) {


Regards
 
LVL 48

Expert Comment

by:Rgonzo1971
ID: 41865858
or maybe

if (!preg_match("/^[\p{L} '-]*$/",$_POST['first_name'])) {

 
LVL 108

Expert Comment

by:Ray Paseur
ID: 41865860
The correct answer to this question depends on the character set encoding.  What character set are you using?
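To make the encoding point concrete, here is a minimal sketch, assuming the form data arrives as UTF-8. It uses explicit byte escapes so the encoding of the source file itself does not matter, and the variable names are illustrative only. Without the `u` flag, PCRE compares single bytes, so the two-byte é (C3 A9) cannot match a class built from ASCII letters. With the `u` flag, `\p{L}` matches whole letters.

```php
<?php
// "josé" spelled out as UTF-8 bytes: é is the two-byte sequence C3 A9.
$name = "jos\xC3\xA9";

// The acute accent ´ is also two bytes in UTF-8 (C2 B4). Without the u flag
// the class below is read byte by byte, so '-\xC2 becomes a byte RANGE
// from 0x27 to 0xC2, followed by the single byte 0xB4.
$bytePattern = "/^[a-zA-Z '-\xC2\xB4]*\$/";

// With the u flag, pattern and subject are treated as UTF-8 characters.
$utf8Pattern = "/^[\\p{L} '-]*\$/u";

var_dump(strlen($name));                        // int(5): strlen counts bytes
var_dump(preg_match($bytePattern, $name));      // int(0): byte C3 is outside the range
var_dump(preg_match($bytePattern, '42'));       // int(1): digits fall inside 0x27-0xC2
var_dump(preg_match($utf8Pattern, $name));      // int(1)
var_dump(preg_match($utf8Pattern, "jos\xE9"));  // bool(false): Latin-1 bytes are not valid UTF-8
```

Note the side effect on the third line: the accidental byte range also lets digits and most punctuation through, which is probably not what a name field wants.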
 

Author Comment

by:Black Sulfur
ID: 41865879
utf-8
 

Author Comment

by:Black Sulfur
ID: 41865886
Hmm, Ray is right. I have my database set as utf-8 general and when I inserted josé into the database I get josé
 
LVL 108

Accepted Solution

by:Ray Paseur
Ray Paseur earned 450 total points
ID: 41865887
Please see: https://iconoun.com/demo/temp_blacksulfur.php

This shows how to set up the regex, and how to test it with good data visualization.  Check the man page references in the top comments for more information.
<?php // demo/temp_blacksulfur.php
/**
 * https://www.experts-exchange.com/questions/28979848/First-name-pregmatch.html
 *
 * https://www.experts-exchange.com/articles/7830/A-Quick-Tour-of-Test-Driven-Development.html
 * https://www.experts-exchange.com/articles/11880/Unicode-and-Character-Collisions.html
 * http://www.regular-expressions.info/unicode.html
 */
ini_set('display_errors', TRUE);
error_reporting(E_ALL);

// SET UP PHP AND BROWSER TO USE UTF-8
mb_internal_encoding('utf-8');
mb_regex_encoding('utf-8');
mb_http_output('utf-8');
echo '<meta charset="utf-8" />';
echo '<pre>';

// SOME DATA VISUALIZATION TOOLS
class Letter
{
    public $chr;            // THE CHARACTER
    public $hex = array();  // ITS HEX NIBBLES

    public function __construct($chr)
    {
        $this->chr = $chr;
        $this->hex = array();
        $bytes     = $this->usplit($chr);
        foreach ($bytes as $byte)
        {
            $this->hex = array_merge($this->hex, $this->gethex($byte));
        }
        return $this;
    }

    public function usplit ($chr)
    {
        $arr = array();  // SO AN EMPTY STRING RETURNS AN EMPTY ARRAY
        $len = strlen($chr);
        while ($len) {
            $arr[] = substr($chr, 0, 1);
            $chr   = substr($chr, 1, $len);
            $len   = strlen($chr);
        }
        return $arr;
    }

    public function gethex($chr)
    {
        // GET THE HEX NIBBLE VALUES IN AN ARRAY
        $ret = str_split(implode('', unpack('H*', $chr)));
        return $ret;
    }
}

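class Hexdump
{
    public $str;            // THE ORIGINAL STRING
    public $arr = array();  // ITS UTF-8 CHARACTERS
    public $len = 0;        // LENGTH IN CHARACTERS
    public $dat = array();  // ONE Letter OBJECT PER CHARACTER
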
    public function __construct($str)
    {
        $this->str = $str;
        $this->arr = $this->mb_str_split($str);
        $this->len = mb_strlen($str);
        foreach ($this->arr as $uchr)
        {
            $this->dat[] = new Letter($uchr);
        }
        return $this;
    }

    public function mb_str_split($ustr)
    {
        return preg_split('/(?<!^)(?!$)/u', $ustr);
    }

    public function render($br = PHP_EOL)
    {
        foreach ($this->dat as $poz => $chr)
        {
            echo $br;
            echo str_pad($poz, 4, ' ', STR_PAD_LEFT);
            echo ' ';
            echo $chr->chr;
            echo "\t";
            echo implode('', $chr->hex);
        }
        echo $br;
    }
}

// A REGEX TO TEST WITH
$rgx
= '#'             // REGEX DELIMITER
. '^'             // AT START STRING
. '['             // START CHARACTER CLASS
. '\p{L}'         // ANY LETTER
. " '-"           // BLANK, APOSTROPHE, DASH
. ']'             // ENDOF CHARACTER CLASS
. '*'             // ZERO OR MORE
. '$'             // AT ENDOF STRING
. '#'             // REGEX DELIMITER
. 'u'             // FLAG: ALLOW UNICODE
. 'i'             // FLAG: CASE-INSENSITIVE
;
echo PHP_EOL . htmlentities($rgx);

// ADD TEST CASES AS NEEDED HERE
$tests = array
( 'josé'
, 'Ray'
, 'R&aacute;y'
, '42'
, ' '
, 'J.R.R.'
, '----'
, 'BlackSulfur'
)
;

// MAKE THE TESTS
foreach ($tests as $test)
{
    echo PHP_EOL;
    $out = new Hexdump($test);
    $out->render();
    echo $test;
    if (preg_match($rgx, $test)) echo ' MATCH';
}


Outputs
#^[\p{L} '-]*$#ui

   0 j	6a
   1 o	6f
   2 s	73
   3 é	c3a9
josé MATCH

   0 R	52
   1 a	61
   2 y	79
Ray MATCH

   0 R	52
   1 &	26
   2 a	61
   3 a	61
   4 c	63
   5 u	75
   6 t	74
   7 e	65
   8 ;	3b
   9 y	79
Ráy

   0 4	34
   1 2	32
42

   0  	20
  MATCH

   0 J	4a
   1 .	2e
   2 R	52
   3 .	2e
   4 R	52
   5 .	2e
J.R.R.

   0 -	2d
   1 -	2d
   2 -	2d
   3 -	2d
---- MATCH

   0 B	42
   1 l	6c
   2 a	61
   3 c	63
   4 k	6b
   5 S	53
   6 u	75
   7 l	6c
   8 f	66
   9 u	75
  10 r	72
BlackSulfur MATCH

 

Author Comment

by:Black Sulfur
ID: 41865908
Thanks, Ray

This is a hectic answer for my level of knowledge! Does one of these:

= '#'             // REGEX DELIMITER
. '^'             // AT START STRING
. '['             // START CHARACTER CLASS
. '\p{L}'         // ANY LETTER
. " '-"           // BLANK, APOSTROPHE, DASH
. ']'             // ENDOF CHARACTER CLASS
. '*'             // ZERO OR MORE
. '$'             // AT ENDOF STRING
. '#'             // REGEX DELIMITER
. 'u'             // FLAG: ALLOW UNICODE
. 'i'             // FLAG: CASE-INSENSITIVE


allow for the ´ being inserted into the database correctly, or is it the below? It's a bit over my head...

mb_internal_encoding('utf-8');
mb_regex_encoding('utf-8');
mb_http_output('utf-8');

 
LVL 108

Assisted Solution

by:Ray Paseur
Ray Paseur earned 450 total points
ID: 41865929
The character encoding for the database is independent and separate from the character encoding for PHP or HTML.  Have a look at this article, and check the part about Character Sets in MySQL.
https://www.experts-exchange.com/articles/11880/Unicode-and-Character-Collisions.html
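As a rough illustration of why the layers have to agree, the sketch below shows how correct UTF-8 bytes turn into a garbled form when some layer treats them as single-byte characters and encodes them again. It assumes the mbstring extension is available, and ISO-8859-1 is only a stand-in for whatever single-byte charset a misconfigured connection might assume. It does not touch a database; it only mimics the mismatch.

```php
<?php
// "josé" as UTF-8 bytes; é is the two-byte sequence C3 A9.
$utf8 = "jos\xC3\xA9";

// Simulate a layer that wrongly believes the bytes are ISO-8859-1
// and converts them to UTF-8 a second time.
$garbled = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), PHP_EOL;     // 6a6f73c3a9
echo bin2hex($garbled), PHP_EOL;  // 6a6f73c383c2a9

// One character (é) has become two (Ã and ©).
echo mb_strlen($utf8, 'UTF-8'), ' vs ', mb_strlen($garbled, 'UTF-8'), PHP_EOL; // 4 vs 5

// Reversing the wrong conversion recovers the original bytes.
$restored = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');
var_dump($restored === $utf8);    // bool(true)
```

The regex can accept the name correctly and the stored value can still come out wrong, because the PHP script, the database connection, and the table each carry their own character set setting.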
 

Author Comment

by:Black Sulfur
ID: 41865947
Aha, I get it now.

I used:

if (!preg_match("#^[\p{L} '-]*$#ui",$_POST['first_name'])) {



And for my database connection I used:

$link = new mysqli($server_name, $db_username, $db_password, $db_dbname);
$link->set_charset("utf8mb4");



All seems to work now!  :)
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 41865963
Bingo!

The way I wrote the REGEX used PHP string concatenation to build up the REGEX string one piece at a time.  I like to write REGEX and other complicated strings that way because it lets me add comments, and it's easy to change when testing.

Best of luck with your project, ~Ray
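Following the same build-it-by-concatenation style, here is a sketch of how the check might be wrapped in a reusable function. The function name `is_valid_first_name()` is hypothetical, and requiring a leading letter is an assumption that does not come from this thread. It is added so that inputs such as a lone blank or `----`, which a `*`-only pattern accepts, are rejected.

```php
<?php
// HYPOTHETICAL HELPER, NOT PART OF THE ORIGINAL DEMO SCRIPT
function is_valid_first_name($name)
{
    $rgx
    = '#'             // REGEX DELIMITER
    . '^'             // AT START STRING
    . '\p{L}'         // ASSUMPTION: FIRST CHARACTER MUST BE A LETTER
    . '['             // START CHARACTER CLASS
    . '\p{L}'         // ANY LETTER
    . " '-"           // BLANK, APOSTROPHE, DASH
    . ']'             // ENDOF CHARACTER CLASS
    . '*'             // ZERO OR MORE
    . '$'             // AT ENDOF STRING
    . '#'             // REGEX DELIMITER
    . 'u'             // FLAG: TREAT PATTERN AND SUBJECT AS UTF-8
    ;

    // preg_match() RETURNS 1 ON MATCH, 0 ON NO MATCH, FALSE ON ERROR
    // (FOR EXAMPLE WHEN THE INPUT IS NOT VALID UTF-8)
    return preg_match($rgx, $name) === 1;
}

var_dump(is_valid_first_name("jos\xC3\xA9"));  // bool(true)
var_dump(is_valid_first_name("O'Brien"));      // bool(true)
var_dump(is_valid_first_name('----'));         // bool(false)
var_dump(is_valid_first_name(''));             // bool(false)
```

Comparing with `=== 1` means an encoding error from `preg_match()` is treated as a failed validation instead of slipping through.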
 

Author Comment

by:Black Sulfur
ID: 41865999
Makes sense to do that, I think I should try it!