Solved

What is this odd "ý" character?

Posted on 2009-06-28
Last Modified: 2012-06-21
In a text field in a database, a "ý" character has shown up in my data. I want to strip or replace all such characters before insertion, but I'm not clear what the "ý" represents. Is it UTF-16 or some other form of encoding?

How do I determine what the "ý" represents, and what's the best way to strip it from my database, either before insertion or afterwards?

Currently, the field is defined as "latin1_swedish_ci", but only because that is the default.
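One way to find out what the "ý" represents is to look at its Unicode code point and at the bytes it becomes under a few common encodings. The sketch below uses Python purely for illustration (the thread itself is about PHP and MySQL); the helper name `describe` is hypothetical and not part of any library.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Return the code point, Unicode name and byte forms of one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        # Hex dump of the bytes this character becomes in each encoding.
        "latin-1": ch.encode("latin-1", errors="replace").hex(),
        "utf-8": ch.encode("utf-8").hex(),
        "utf-16-le": ch.encode("utf-16-le").hex(),
    }

info = describe("\u00fd")  # the "ý" from the question
for key, value in info.items():
    print(f"{key:>10}: {value}")
```

The character itself is U+00FD (LATIN SMALL LETTER Y WITH ACUTE). It is a single byte 0xFD in Latin-1 and two bytes 0xC3 0xBD in UTF-8, so comparing these against the bytes actually stored in the column is a reasonable first check.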
Question by:kirin0
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24733850
>is a UTF-16 or some other form of encoding?  
yes

the character is usually an accented regular character (for example é), and it's the front-end application's encoding (i.e. the web page's encoding) that determines whether it displays "correctly" or not.
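To illustrate the point that the same stored bytes display differently depending on the encoding the front end assumes, here is a small Python sketch (Python is used only for illustration; `is_valid_utf8` is a hypothetical helper). It takes the é mentioned above, plus the single byte 0xFD, which is "ý" in Latin-1.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

raw = "\u00e9".encode("utf-8")      # "é" written out as UTF-8: two bytes
as_utf8 = raw.decode("utf-8")       # read back with the right encoding
as_latin1 = raw.decode("latin-1")   # read back with the wrong encoding

single = b"\xfd"                    # one byte with the high bit set
y_acute = single.decode("latin-1")  # Latin-1 maps 0xFD to "ý"

print(repr(as_utf8), repr(as_latin1), repr(y_acute), is_valid_utf8(single))
```

Read as UTF-8 the two bytes are one character; read as Latin-1 they become two unrelated characters. The lone byte 0xFD is a valid Latin-1 character but is not valid UTF-8 on its own.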

Author Comment

by:kirin0
ID: 24736704
Thanks angellll, but that doesn't help me move toward a solution. The character is making my XML crash. What's the best way to strip it from the stream? I'm generating the XML in PHP, BTW, and the generation itself works fine, but my browser fails to load the result into the DOM. My preference would be to simply strip the characters before they get into the database.
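For reference, stripping before insertion could look roughly like the sketch below. It is written in Python for illustration rather than PHP, and both function names are hypothetical. Note that stripping is lossy: if the "ý" stands in for a real accented character, that character is discarded rather than repaired.

```python
def strip_char(text: str, unwanted: str = "\u00fd") -> str:
    """Remove every occurrence of one specific character (default: ý)."""
    return text.replace(unwanted, "")

def strip_non_ascii(text: str) -> str:
    """Drop every character outside the ASCII range. Very lossy."""
    return text.encode("ascii", errors="ignore").decode("ascii")

print(strip_char("caf\u00fd bar"))
print(strip_non_ascii("na\u00efve \u00fd x"))
```

The first function only touches the one character; the second removes all accented letters as well, which is rarely what you want for user-entered text.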
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24737239
>The character is making my XML crash
put this as first line in your xml:
<?xml version="1.0" encoding="UTF-16" ?>
I use this, for example:
<?xml version="1.0" encoding="ISO-8859-1" ?>

which should not make your XML "crash" any longer.
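The declaration only helps if it matches the bytes that actually follow it. As a rough sketch of that idea, in Python rather than PHP and using the standard-library ElementTree module, the snippet below serializes an element containing "ý" as ISO-8859-1 and parses it back. The element name `field` is made up for the example.

```python
import xml.etree.ElementTree as ET

root = ET.Element("field")
root.text = "\u00fd"  # the "ý" character

# Serialize as ISO-8859-1. For an encoding other than US-ASCII or UTF-8,
# ElementTree emits an XML declaration naming that encoding.
data = ET.tostring(root, encoding="ISO-8859-1")
print(data)

# Because the declaration and the bytes agree, the parser recovers "ý".
parsed = ET.fromstring(data)
print(repr(parsed.text))
```

Here the serializer writes both the declaration and the bytes, so they cannot disagree. When the XML is assembled by hand from database strings, keeping the two in sync is the author's job.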

for MySQL connection when reading AND writing the data, you should read up here:
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html

in php:
http://be2.php.net/manual/en/function.mysql-set-charset.php
LVL 22

Accepted Solution

by:
NovaDenizen earned 250 total points
ID: 24737994
angel:  Are you really sure about that? It is a Very Bad Idea to guess at an encoding.  You should find out for certain the actual encoding of the source document.  If it is a valid document, then there should be nothing to figure out.  The encoding should be obvious.

> The character is making my XML crash.
What does this mean?  XML is a format specification, so there is no meaningful way that it could crash.  Is your XML parsing library crashing or reporting an error?  Have you written quick-and-dirty XML parsing code instead of using a real XML library?  Is your code not checking the error and crashing?  Is your code catching the error and reporting it correctly?  Are you using the wrong default encoding?  Does the source XML document not report its encoding correctly?
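As an illustration of "catching the error and reporting it", here is a minimal Python sketch using the standard-library parser (the asker's stack is PHP and a browser DOM, so this is only an analogy). The function name `try_parse` and both sample documents are made up for the example.

```python
import xml.etree.ElementTree as ET

def try_parse(data: bytes):
    """Parse XML bytes; return (root, None) or (None, error message)."""
    try:
        return ET.fromstring(data), None
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        return None, f"XML parse error at line {line}, column {column}: {err}"

# No declaration, so the parser assumes UTF-8, and a lone 0xFD is invalid.
bad = b"<field>\xfd</field>"
# Same payload, but the declaration says how to read the byte.
good = b'<?xml version="1.0" encoding="ISO-8859-1"?><field>\xfd</field>'

for doc in (bad, good):
    root, error = try_parse(doc)
    print(error if error else repr(root.text))
```

The first document is rejected with a position that points at the offending byte; the second parses cleanly. An error message like this answers most of the questions above faster than guessing.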
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24738069
>It is a Very Bad Idea to guess at an encoding.
of course, you are right.

I wrote:
>put this as first line in your xml:
when I actually wanted to write:
>put something like this as first line in your xml, with the character encoding as needed.

and the real problem might not be on the database side itself, but in the application/web form the users use to enter the data: if it is missing the "set names" step, implicit character set conversions take place, resulting in those "funny" characters.

the "solution" has to be an end-to-end consistent use of the character set encoding ...
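A toy model of what goes wrong when one link in the chain assumes a different character set is sketched below in Python. It is only an illustration of an implicit conversion, not a description of MySQL's actual internals; the functions `store` and `fetch` and their parameters are hypothetical.

```python
def store(text: str, client_charset: str, connection_charset: str,
          column_charset: str = "utf-8") -> bytes:
    """Simulate a client sending text over a connection into a column."""
    wire = text.encode(client_charset)  # bytes the client really sends
    # The server interprets those bytes using the connection charset...
    understood = wire.decode(connection_charset, errors="replace")
    # ...and converts what it understood into the column's charset.
    return understood.encode(column_charset, errors="replace")

def fetch(stored: bytes, column_charset: str = "utf-8") -> str:
    """Read the stored bytes back using the column's charset."""
    return stored.decode(column_charset)

consistent = fetch(store("\u00e9", "utf-8", "utf-8"))
mismatched = fetch(store("\u00e9", "utf-8", "latin-1"))
print(repr(consistent), repr(mismatched))
```

With every stage agreeing, "é" round-trips unchanged. When the connection is assumed to be Latin-1 while the client sends UTF-8, each byte is converted separately and the stored value comes back as two "funny" characters.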