Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

PHP function to remove special characters

Posted on 2006-11-28
8
Medium Priority
?
17,165 Views
Last Modified: 2011-08-18
Hello,

I would like a php function to remove all "special characters" but leave all letters.

The problem is that I need to support European languages, many which use letters beyond the 26 that English uses.


These should be REMOVED

< > ( ) ! # $ % ^ & = + ~ ` * " ' ¡ ¤ ¢ £ ¥ ¦ § ¨ © ª «  ¬ ­ ® ™ ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ × ÷

But these should NOT be removed

à á â ã ä å é ê ë ì í î ï ñ ò ó ô õ ö ù ú û ü ý ÿ


There are many more that SHOULD be removed and many more that SHOULD NOT be removed, but the idea is if it can be used any any European language as part of a word I want to keep it.  If not, I want to get rid of it.

Any ideas?

Thanks!
0
Comment
Question by:hankknight
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +1
8 Comments
 
LVL 11

Assisted Solution

by:ch2
ch2 earned 400 total points
ID: 18030404
You can use a preg_replace with an array with all chars you want to remove.
0
 
LVL 29

Expert Comment

by:TeRReF
ID: 18030465
How about:
<?php

$s = '< > ( ) ! # $ % ^ & = + ~ ` * " \' ¡ ¤ ¢ £ ¥ ¦ § ¨ © ª «  ¬ ­ ® . ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ × ÷ à á â ã ä å é ê ë ì í î ï ñ ò ó ô õ ö ù ú û ü ý ÿ';
$s = preg_replace('/[<>()!#$%\^&=+~`*"\'¡¤¢£¥¦§¨©ª«¬­®\.¯°±²³´µ¶·¸¹º»¼½¾¿×÷]/', '', $s);
print($s);

?>
0
 
LVL 16

Author Comment

by:hankknight
ID: 18030667
Thank you both for your ideas.

The problem is that both of you "blacklisted" certain characters.

That is a problem because certain things can fall through the cracks and be missed.

Theses characters for example were missed in the examples:
             ? | {

The only characters that I want to allow are found in the ISO 8859-1 character set in these ranges:
     48 through 57
     65 through 90
     97 through 122
     192 through 246

See:
http://www.utoronto.ca/webdocs/HTMLdocs/NewHTML/iso_table.html
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 29

Assisted Solution

by:TeRReF
TeRReF earned 600 total points
ID: 18030732
Then turn it around and delete all chars that are not in the whitelist:
$s = preg_replace('/[^a-z0-9]/is', '', $s);

Add the allowed chars to this list:
a-z0-9
0
 
LVL 11

Assisted Solution

by:ch2
ch2 earned 400 total points
ID: 18030749
the array:

$ar = array(chr(48), chr(49), chr(50), chr(51), chr(52), chr(53), chr(54), chr(55), chr(56), chr(57), chr(65), chr(66), chr(67), chr(68), chr(69), chr(70), chr(71), chr(72), chr(73), chr(74), chr(75), chr(76), chr(77), chr(78), chr(79), chr(80), chr(81), chr(82), chr(83), chr(84), chr(85), chr(86), chr(87), chr(88), chr(89), chr(90), chr(87), chr(88), chr(89), chr(90), chr(91), chr(92), chr(93), chr(94), chr(95), chr(96), chr(97), chr(98), chr(99), chr(100), chr(101), chr(102), chr(103), chr(104), chr(105), chr(106), chr(107), chr(108), chr(109), chr(110), chr(111), chr(112), chr(113), chr(114), chr(115), chr(116), chr(117), chr(118), chr(119), chr(120), chr(121), chr(122), chr(192), chr(193), chr(194), chr(195), chr(196), chr(197), chr(198), chr(199), chr(200), chr(201), chr(202), chr(203), chr(204), chr(205), chr(206), chr(207), chr(208), chr(209), chr(210), chr(211), chr(212), chr(213), chr(214), chr(215), chr(216), chr(217), chr(218), chr(219), chr(220), chr(221), chr(222), chr(223), chr(224), chr(225), chr(226), chr(227), chr(228), chr(229), chr(230), chr(231), chr(232), chr(233), chr(234), chr(235), chr(236), chr(237), chr(238), chr(239), chr(240), chr(241), chr(242), chr(243), chr(244), chr(245), chr(246));
0
 
LVL 16

Author Comment

by:hankknight
ID: 18030842
Would this work?

$s = preg_replace('/[^a-z0-9À-öø-ÿ]/is', '', $s);

0
 
LVL 29

Assisted Solution

by:TeRReF
TeRReF earned 600 total points
ID: 18030870
Yes, it will, but I guess you want to keep spaces as well right? just add it to the list...
0
 
LVL 48

Accepted Solution

by:
hernst42 earned 1000 total points
ID: 18035349
If you know the characters you can convert the values to hex and use the hex-values in the regular expression:
/*
  48 through 57
  65 through 90
  97 through 122
 192 through 246
*/
echo preg_replace('/[^\x30-\x39\x41-\x5a\x61-\x7a\xc0-\xf6]/', '', $text) ."\n";

If you want to keep additional chars just add the exvalues to the list. (it's basicly the same as hankknight wrote but might be easyier to edit and readable

0

Featured Post

[Webinar] Lessons on Recovering from Petya

Skyport is working hard to help customers recover from recent attacks, like the Petya worm. This work has brought to light some important lessons. New malware attacks like this can take down your entire environment. Learn from others mistakes on how to prevent Petya like worms.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article discusses four methods for overlaying images in a container on a web page
Many old projects have bad code, but the budget doesn't exist to rewrite the codebase. You can update this code to be safer by introducing contemporary input validation, sanitation, and safer database queries.
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
This tutorial will teach you the core code needed to finalize the addition of a watermark to your image. The viewer will use a small PHP class to learn and create a watermark.

670 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question