Solved

What is this odd "ý" character?

Posted on 2009-06-28
Last Modified: 2012-06-21
In a text field in a database, a "ý" character has shown up in my data.  I want to strip or replace all such characters from my data prior to insertion, but I'm not clear what the "ý" represents... is it UTF-16 or some other form of encoding?

How do I determine what the "ý" represents, and what's the best way to strip it from my database, either before insertion or afterwards?

Currently, the field is defined as "latin1_swedish_ci", though that is just the default.
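
One way to find out what the character actually is: look at its code point and at the bytes it turns into under the candidate encodings. A minimal sketch follows, written in Python purely for illustration (the thread itself concerns PHP and MySQL); the variable names are made up for this example.

```python
import unicodedata

# The character from the question, written as an escape so the example
# does not depend on the encoding of this source file.
ch = "\u00fd"

# Unicode code point and official character name
code_point = f"U+{ord(ch):04X}"
name = unicodedata.name(ch)
print(code_point, name)

# The same character as bytes under three candidate encodings
encoded = {enc: ch.encode(enc).hex() for enc in ("latin-1", "utf-8", "utf-16-be")}
for enc, hex_bytes in encoded.items():
    print(enc, hex_bytes)
```

In Latin-1 the character is the single byte 0xFD, while in UTF-8 it is the two bytes 0xC3 0xBD. A lone 0xFD byte is never valid UTF-8, which is one plausible reason a consumer expecting UTF-8 would reject the data.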
Question by:kirin0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24733850
>is a UTF-16 or some other form of encoding?  
yes

the character usually is an accented regular character (for example é), and it's the front-end application's encoding (aka the web page's encoding) that determines whether it displays "correctly" or not.
 

Author Comment

by:kirin0
ID: 24736704
Thanks angellll -- but that doesn't help me move towards a solution.  The character is making my XML crash... what's the best way to strip it from the stream?  I'm working in PHP to generate the XML BTW and that works fine but my browser is failing to load the result into the DOM.  My preference would be to simply strip the characters before they get into the database.
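
If stripping really is the goal, the operation itself is simple once the text has been decoded correctly. The sketch below is in Python for illustration only, and both function names are hypothetical. Note that it throws away every non-ASCII character, including legitimate accented letters, so it is a blunt tool rather than a fix for the underlying encoding mismatch.

```python
def strip_non_ascii(text: str) -> str:
    """Drop every character outside the 7-bit ASCII range."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def replace_non_ascii(text: str, placeholder: str = "?") -> str:
    """Replace every non-ASCII character with a visible placeholder."""
    return "".join(c if ord(c) < 128 else placeholder for c in text)


sample = "na\u00fdve value"          # contains one "ý"
print(strip_non_ascii(sample))
print(replace_non_ascii(sample))
```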
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24737239
>The character is making my XML crash
put this as first line in your xml:
<?xml version="1.0" encoding="UTF-16" ?>
I use this, for example:
<?xml version="1.0" encoding="ISO-8859-1" ?>

which should not make your XML "crash" any longer.

for MySQL connection when reading AND writing the data, you should read up here:
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html

in php:
http://be2.php.net/manual/en/function.mysql-set-charset.php
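
To illustrate why the declaration matters, here is a small sketch using Python's standard XML parser (an assumption for illustration; the poster's browser and PHP stack may behave differently in detail). The same Latin-1 bytes fail to parse when no encoding is declared, because the parser then assumes UTF-8, and parse cleanly once the declaration matches the real encoding of the bytes.

```python
import xml.etree.ElementTree as ET

# An element whose text is "ý", encoded as Latin-1 (a single 0xFD byte)
raw = "<name>\u00fd</name>".encode("latin-1")

# Without a declaration the parser assumes UTF-8 and rejects the 0xFD byte
try:
    ET.fromstring(raw)
    parsed_without_declaration = True
except ET.ParseError as err:
    parsed_without_declaration = False
    print("parse failed:", err)

# With a declaration that matches the actual bytes, parsing succeeds
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + raw
root = ET.fromstring(declared)
print(repr(root.text))
```

The declaration only helps when it states the encoding the bytes are really in; declaring an encoding that does not match simply moves the problem.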
 
LVL 22

Accepted Solution

by:
NovaDenizen earned 250 total points
ID: 24737994
angel:  Are you really sure about that? It is a Very Bad Idea to guess at an encoding.  You should find out for certain the actual encoding of the source document.  If it is a valid document, then there should be nothing to figure out.  The encoding should be obvious.

> The character is making my XML crash.
What does this mean?  XML is a format specification, so there is no meaningful way that it could crash.  Is your XML parsing library crashing or reporting an error?  Have you written quick-and-dirty XML parsing code instead of using a real XML library?  Is your code not checking the error and crashing?  Is your code catching the error and reporting it correctly?  Are you using the wrong default encoding?  Does the source XML document not report its encoding correctly?
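
One way to answer those questions concretely is to check the raw bytes before they reach the parser. The sketch below (Python for illustration; `first_invalid_utf8` is a hypothetical helper name) reports the offset and value of the first byte that is not valid UTF-8, assuming UTF-8 is what the consumer expects.

```python
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, byte value) of the first invalid UTF-8 byte, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index where decoding failed
        return err.start, data[err.start]
    return None


latin1_bytes = "ab\u00fdcd".encode("latin-1")   # "ý" stored as the byte 0xFD
utf8_bytes = "ab\u00fdcd".encode("utf-8")       # "ý" stored as 0xC3 0xBD

print(first_invalid_utf8(latin1_bytes))
print(first_invalid_utf8(utf8_bytes))
```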
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24738069
>It is a Very Bad Idea to guess at an encoding.
of course, you are right.

I wrote:
>put this as first line in your xml:
when I actually wanted to write:
>put something like this as first line in your xml, with the character encoding as needed.

and the real problem might not be on the database side itself, but in the application/web form the users use to enter the data, which may be missing the "set names" stuff and hence producing implicit character code conversions, resulting in those "funny" characters.

the "solution" has to be an end-to-end consistent use of the character set encoding ...
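
A small sketch of what such an implicit conversion does (Python for illustration; the sample word and variable names are assumptions): text written as UTF-8 but read back as Latin-1 turns each accented letter into two "funny" characters, while using the same encoding on both sides round-trips cleanly.

```python
original = "caf\u00e9"                  # "café"

# The writer encodes the text as UTF-8 ...
stored = original.encode("utf-8")

# ... but a reader that assumes Latin-1 sees two characters for the "é"
misread = stored.decode("latin-1")
print(misread)

# Using the same encoding on both ends preserves the text
consistent = stored.decode("utf-8")
print(consistent == original)
```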