Solved

What is this odd "ý" character?

Posted on 2009-06-28
Last Modified: 2012-06-21
In a text field in a database, a "ý" character has shown up in my data. I want to strip or replace all such characters before insertion, but I'm not clear what the "ý" represents. Is it UTF-16 or some other form of encoding?

How do I determine what the "ý" represents, and what's the best way to strip it from my database, either before or after insertion?

Currently, the field uses the "latin1_swedish_ci" collation, though only because that is the default.
Question by:kirin0
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24733850
>is a UTF-16 or some other form of encoding?  
yes

the character is usually an accented regular character (for example é), and it is the front-end application's encoding (i.e. the web page's encoding) that determines whether it displays "correctly" or not.
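To see what the "ý" actually is, it helps to look at its code point and at the bytes it becomes under different encodings. The following is a minimal Python sketch, not part of the original thread (the asker works in PHP); the helper name `describe` is hypothetical. It assumes the character really is U+00FD and shows that its single Latin-1 byte is not valid UTF-8.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Report the code point, Unicode name, and byte forms of one character.

    Assumes the character exists in Latin-1; other characters would raise
    UnicodeEncodeError on the latin-1 line.
    """
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "latin1_bytes": ch.encode("latin-1"),
        "utf8_bytes": ch.encode("utf-8"),
    }

info = describe("ý")
print(info)

# The lone Latin-1 byte 0xFD cannot be decoded as UTF-8.
try:
    b"\xfd".decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print("0xFD valid as UTF-8:", valid_utf8)
```

If the stored data is Latin-1 but the consumer assumes UTF-8, this mismatch alone could explain a parse failure; whether that is what happened here depends on details the thread does not give.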

Author Comment

by:kirin0
ID: 24736704
Thanks angelIII, but that doesn't help me move towards a solution. The character is making my XML crash. What's the best way to strip it from the stream? I'm generating the XML in PHP, and that works fine, but my browser fails to load the result into the DOM. My preference would be to simply strip the characters before they get into the database.
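If stripping really is the goal, one common approach is to decompose accented characters and drop anything that is not ASCII. This is a lossy Python sketch rather than a recommendation (the later replies argue for fixing the encoding instead); the function name `to_ascii` is hypothetical, and it assumes the text has already been decoded correctly into a Unicode string.

```python
import unicodedata

def to_ascii(text: str) -> str:
    """Replace accented letters with their base letter and drop other non-ASCII.

    NFKD splits "ý" into "y" plus a combining acute accent; encoding to ASCII
    with errors="ignore" then discards the accent. Characters with no ASCII
    decomposition are removed entirely, so this loses information.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(to_ascii("naïve ý"))
```

A PHP equivalent would need different functions; this only illustrates the idea.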
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24737239
>The character is making my XML crash
put this as first line in your xml:
<?xml version="1.0" encoding="UTF-16" ?>
I use this, for example:
<?xml version="1.0" encoding="ISO-8859-1" ?>

which should not make your XML "crash" any longer.
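The effect of the declaration can be checked with any conforming XML parser. Below is a small Python sketch using the standard library's `xml.etree.ElementTree`; the `<note>` payload is a made-up example, and it assumes the data holds the single Latin-1 byte 0xFD for "ý". The declaration only helps when it matches the bytes actually present.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: one Latin-1 byte (0xFD) for "ý", no declaration.
raw = b"<note>\xfd</note>"

# Without a declaration the parser assumes UTF-8, where 0xFD is invalid.
try:
    ET.fromstring(raw)
    parsed_without_declaration = True
except ET.ParseError:
    parsed_without_declaration = False

# With a declaration that matches the real byte encoding, parsing succeeds.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + raw
text = ET.fromstring(declared).text

print(parsed_without_declaration, text)
```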

for MySQL connection when reading AND writing the data, you should read up here:
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html

in php:
http://be2.php.net/manual/en/function.mysql-set-charset.php
LVL 22

Accepted Solution

by:NovaDenizen (earned 250 total points)
ID: 24737994
angel:  Are you really sure about that? It is a Very Bad Idea to guess at an encoding.  You should find out for certain the actual encoding of the source document.  If it is a valid document, then there should be nothing to figure out.  The encoding should be obvious.

> The character is making my XML crash.
What does this mean?  XML is a format specification, so there is no meaningful way that it could crash.  Is your XML parsing library crashing or reporting an error?  Have you written quick-and-dirty XML parsing code instead of using a real XML library?  Is your code not checking the error and crashing?  Is your code catching the error and reporting it correctly?  Are you using the wrong default encoding?  Does the source XML document not report its encoding correctly?
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24738069
>It is a Very Bad Idea to guess at an encoding.
of course, you are right.

I wrote:
>put this as first line in your xml:
when I actually wanted to write:
>put something like this as first line in your xml, with the character encoding as needed.

and the real problem might not be on the database side itself, but in the application/web form the users use to enter the data: if it is missing the "set names" step, implicit character set conversions happen, resulting in those "funny" characters.

the "solution" has to be an end-to-end consistent use of the character set encoding ...
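The "funny" characters produced by an inconsistent chain can be reproduced in a few lines. This Python sketch is illustrative only; the sample string is made up, and it assumes one layer writes UTF-8 while another reads the same bytes as Latin-1.

```python
original = "café ý"

# One layer (say, the web form) stores the text as UTF-8 bytes.
stored = original.encode("utf-8")

# Another layer reads those bytes as Latin-1: each two-byte UTF-8
# sequence turns into two unrelated characters.
misread = stored.decode("latin-1")

# Reading with the same encoding that was used for writing is lossless.
consistent = stored.decode("utf-8")

print(repr(misread), repr(consistent))
```

The same reasoning applies to every hop: form, PHP connection, database column, and XML output all need to agree on one encoding.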