Solved

First name pregmatch

Posted on 2016-10-30
11
38 Views
Last Modified: 2016-10-30
I am trying to only accept the following special characters into the name field:

' - ´

the first two work fine but when trying to use the third for example, josé, it fails validation.

 if (!preg_match("/^[a-zA-Z '-´]*$/",$_POST['first_name'])) {

Open in new window

0
Comment
Question by:Black Sulfur
  • 5
  • 4
  • 2
11 Comments
 
LVL 49

Assisted Solution

by:Rgonzo1971
Rgonzo1971 earned 50 total points
ID: 41865854
Hi

If you want accented chars then try
if (!preg_match("/^[a-zA-ZÀ-ÿ '-]*$/",$_POST['first_name'])) {

Open in new window

Regards
0
 
LVL 49

Expert Comment

by:Rgonzo1971
ID: 41865858
or maybe

if (!preg_match("/^[\p{L} '-]*$/",$_POST['first_name'])) {

Open in new window

0
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 41865860
The correct answer to this question depends on the character set encoding.  What character set are you using?
0
What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

 

Author Comment

by:Black Sulfur
ID: 41865879
utf-8
0
 

Author Comment

by:Black Sulfur
ID: 41865886
Hmm, Ray is right. I have my database set as utf-8 general and when I inserted josé into the database I get josé
0
 
LVL 109

Accepted Solution

by:
Ray Paseur earned 450 total points
ID: 41865887
Please see: https://iconoun.com/demo/temp_blacksulfur.php

This shows how to set up the regex, and how to test it with good data visualization.  Check the man page references in the top comments for more information.
<?php // demo/temp_blacksulfur.php
/**
 * https://www.experts-exchange.com/questions/28979848/First-name-pregmatch.html
 *
 * https://www.experts-exchange.com/articles/7830/A-Quick-Tour-of-Test-Driven-Development.html
 * https://www.experts-exchange.com/articles/11880/Unicode-and-Character-Collisions.html
 * http://www.regular-expressions.info/unicode.html
 */
ini_set('display_errors', TRUE);
error_reporting(E_ALL);

// SET UP PHP AND BROWSER TO USE UTF-8
mb_internal_encoding('utf-8');
mb_regex_encoding('utf-8');
mb_http_output('utf-8');
echo '<meta charset="utf-8" />';
echo '<pre>';

// SOME DATA VISUALIZATION TOOLS
Class Letter
{
    public function __construct($chr)
    {
        $this->chr = $chr;
        $this->hex = array();
        $bytes     = $this->usplit($chr);
        foreach ($bytes as $byte)
        {
            $this->hex = array_merge($this->hex, $this->gethex($byte));
        }
        return $this;
    }

    public function usplit ($chr)
    {
        $len = strlen($chr);
        while ($len) {
            $arr[] = substr($chr, 0, 1);
            $chr   = substr($chr, 1, $len);
            $len   = strlen($chr);
        }
        return $arr;
    }

    public function gethex($chr)
    {
        // GET THE HEX NIBBLE VALUES IN AN ARRAY
        $ret = str_split(implode(NULL, unpack('H*', $chr)));
        return $ret;
    }
}

Class Hexdump
{
    public function __construct($str)
    {
        $this->str = $str;
        $this->arr = $this->mb_str_split($str);
        $this->len = mb_strlen($str);
        foreach ($this->arr as $uchr)
        {
            $this->dat[] = new Letter($uchr);
        }
        return $this;
    }

    public function mb_str_split($ustr)
    {
        return preg_split('/(?<!^)(?!$)/u', $ustr);
    }

    public function render($br = PHP_EOL)
    {
        foreach ($this->dat as $poz => $chr)
        {
            echo $br;
            echo str_pad($poz, 4, ' ', STR_PAD_LEFT);
            echo ' ';
            echo $chr->chr;
            echo "\t";
            echo implode(null, $chr->hex);
        }
        echo $br;
    }
}

// A REGEX TO TEST WITH
$rgx
= '#'             // REGEX DELIMITER
. '^'             // AT START STRING
. '['             // START CHARACTER CLASS
. '\p{L}'         // ANY LETTER
. " '-"           // BLANK, APOSTROPHE, DASH
. ']'             // ENDOF CHARACTER CLASS
. '*'             // ONE OR MORE
. '$'             // AT ENDOF STRING
. '#'             // REGEX DELIMITER
. 'u'             // FLAG: ALLOW UNICODE
. 'i'             // FLAG: CASE-INSENSITIVE
;
echo PHP_EOL . htmlentities($rgx);

// ADD TEST CASES AS NEEDED HERE
$tests = array
( 'josé'
, 'Ray'
, 'R&aacute;y'
, '42'
, ' '
, 'J.R.R.'
, '----'
, 'BlackSulfur'
)
;

// MAKE THE TESTS
foreach ($tests as $test)
{
    echo PHP_EOL;
    $out = new HexDump($test);
    $out->render();
    echo $test;
    if (preg_match($rgx, $test)) echo ' MATCH';
}

Open in new window

Outputs
#^[\p{L} '-]*$#ui

   0 j	6a
   1 o	6f
   2 s	73
   3 é	c3a9
josé MATCH

   0 R	52
   1 a	61
   2 y	79
Ray MATCH

   0 R	52
   1 &	26
   2 a	61
   3 a	61
   4 c	63
   5 u	75
   6 t	74
   7 e	65
   8 ;	3b
   9 y	79
Ráy

   0 4	34
   1 2	32
42

   0  	20
  MATCH

   0 J	4a
   1 .	2e
   2 R	52
   3 .	2e
   4 R	52
   5 .	2e
J.R.R.

   0 -	2d
   1 -	2d
   2 -	2d
   3 -	2d
---- MATCH

   0 B	42
   1 l	6c
   2 a	61
   3 c	63
   4 k	6b
   5 S	53
   6 u	75
   7 l	6c
   8 f	66
   9 u	75
  10 r	72
BlackSulfur MATCH

Open in new window

0
 

Author Comment

by:Black Sulfur
ID: 41865908
Thanks, Ray

This is a hectic answer for my level of knowledge! Does one of these:

= '#'             // REGEX DELIMITER
. '^'             // AT START STRING
. '['             // START CHARACTER CLASS
. '\p{L}'         // ANY LETTER
. " '-"           // BLANK, APOSTROPHE, DASH
. ']'             // ENDOF CHARACTER CLASS
. '*'             // ONE OR MORE
. '$'             // AT ENDOF STRING
. '#'             // REGEX DELIMITER
. 'u'             // FLAG: ALLOW UNICODE
. 'i'             // FLAG: CASE-INSENSITIVE

Open in new window

allow for the ´being inserted into the database correctly or is it the below? It's a bit over my head...

mb_internal_encoding('utf-8');
mb_regex_encoding('utf-8');
mb_http_output('utf-8');

Open in new window

0
 
LVL 109

Assisted Solution

by:Ray Paseur
Ray Paseur earned 450 total points
ID: 41865929
The character encoding for the database is independent and separate from the character encoding for PHP or HTML.  Have a look at this article, and check the part about Character Sets in MySQL.
https://www.experts-exchange.com/articles/11880/Unicode-and-Character-Collisions.html
0
 

Author Comment

by:Black Sulfur
ID: 41865947
Aha, I get it now.

I used :

if (!preg_match("#^[\p{L} '-]*$#ui",$_POST['first_name'])) {

Open in new window


And for my database connection I used:

$link = new mysqli($server_name, $db_username, $db_password, $db_dbname);
$link->set_charset("utf8mb4");

Open in new window


All seems to work now!  :)
1
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 41865963
Bingo!

The way I wrote the REGEX used PHP string concatenation to build up the REGEX string one piece at a time.  I like to write REGEX and other complicated strings that way because it lets me add comments, and it's easy to change when testing.

Best of luck with your project, ~Ray
0
 

Author Comment

by:Black Sulfur
ID: 41865999
Makes sense to do that, I think I should try it!
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
mysql update statement 3 22
Attach to file (img) a unique id 8 24
Cookie not unsetting 7 18
INDEX does not make a difference, why? 10 44
Foreword (July, 2015) Since I first wrote this article, years ago, a great many more people have begun using the internet.  They are coming online from every part of the globe, learning, reading, shopping and spending money at an ever-increasing ra…
Developers of all skill levels should learn to use current best practices when developing websites. However many developers, new and old, fall into the trap of using deprecated features because this is what so many tutorials and books tell them to u…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
The viewer will learn how to create a basic form using some HTML5 and PHP for later processing. Set up your basic HTML file. Open your form tag and set the method and action attributes.: (CODE) Set up your first few inputs one for the name and …

786 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question