Solved

What is this odd "ý" character?

Posted on 2009-06-28
5
1,099 Views
Last Modified: 2012-06-21
A "ý" character has shown up in my data in a text field in a database.  I want to strip or replace all such characters before insertion, but I'm not clear what the "ý" represents... is it UTF-16 or some other form of encoding?

How do I determine what the "ý" represents, and what's the best way to strip it from my database, either before insertion or afterwards?

Currently, the field uses the "latin1_swedish_ci" collation, though that is just the default.
Question by:kirin0
  • 3
5 Comments
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24733850
>is a UTF-16 or some other form of encoding?  
yes

the character is usually an accented regular character (for example é), and it's the front-end application's encoding (i.e. the web page's encoding) that determines whether it displays "correctly" or not.
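
To see what the "ý" actually is, it helps to look at its code point and at the bytes it turns into under a few encodings. The following is a minimal sketch in Python rather than the PHP used in this thread; the helper names `describe` and `is_valid_utf8` are made up for illustration.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return the code point, Unicode name, and byte forms of one character."""
    info = {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
    }
    for enc in ("latin-1", "utf-8", "utf-16-le"):
        try:
            info[enc] = ch.encode(enc).hex()
        except UnicodeEncodeError:
            info[enc] = None  # the character does not exist in this encoding
    return info


def is_valid_utf8(raw: bytes) -> bool:
    """Report whether the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # Written as an escape so the source file's own encoding does not matter.
    print(describe("\u00fd"))
    print(is_valid_utf8(b"\xfd"), is_valid_utf8(b"\xc3\xbd"))
```

The "ý" is U+00FD. In Latin-1 it is the single byte 0xFD, which is not valid UTF-8 on its own; in UTF-8 the same character is the two bytes 0xC3 0xBD. Comparing the bytes actually stored in the database against these values is one way to tell which encoding the data arrived in.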
 

Author Comment

by:kirin0
ID: 24736704
Thanks angellll -- but that doesn't help me move towards a solution.  The character is making my XML crash... what's the best way to strip it from the stream?  I'm generating the XML in PHP, BTW, and that works fine, but my browser fails to load the result into the DOM.  My preference would be to simply strip the characters before they get into the database.
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24737239
>The character is making my XML crash
put this as first line in your xml:
<?xml version="1.0" encoding="UTF-16" ?>
I use this, for example:
<?xml version="1.0" encoding="ISO-8859-1" ?>

which should not make your XML "crash" any longer.
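
The declaration only helps if it names the encoding the bytes are really written in. The sketch below, in Python rather than PHP, builds a tiny document with one declared encoding and a possibly different actual encoding, then tries to parse it. `make_doc` and `parses` are hypothetical helpers, and the `<row>` element is invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def make_doc(value: str, declared: str, actual: str) -> bytes:
    """Build a one-element XML document that declares one encoding
    and is written out as bytes in another (possibly the same) encoding."""
    text = f'<?xml version="1.0" encoding="{declared}"?><row>{escape(value)}</row>'
    return text.encode(actual)


def parses(doc: bytes) -> bool:
    """Return True if the XML parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    value = "\u00fd"
    # Declaration and bytes agree: the parser accepts the document.
    print(parses(make_doc(value, "ISO-8859-1", "iso-8859-1")))
    # Declared UTF-8, but the bytes are Latin-1: the lone 0xFD byte is rejected.
    print(parses(make_doc(value, "UTF-8", "iso-8859-1")))
```

The opposite mismatch (UTF-8 bytes declared as ISO-8859-1) does not fail; it parses quietly and yields two wrong characters in place of the "ý", which is how "funny" characters usually appear.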

for the MySQL connection, both when reading AND writing the data, you should read up here:
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html

in php:
http://be2.php.net/manual/en/function.mysql-set-charset.php
 
LVL 22

Accepted Solution

by:
NovaDenizen earned 250 total points
ID: 24737994
angel:  Are you really sure about that? It is a Very Bad Idea to guess at an encoding.  You should find out for certain the actual encoding of the source document.  If it is a valid document, then there should be nothing to figure out.  The encoding should be obvious.

> The character is making my XML crash.
What does this mean?  XML is a format specification, so there is no meaningful way that it could crash.  Is your XML parsing library crashing or reporting an error?  Have you written quick-and-dirty XML parsing code instead of using a real XML library?  Is your code not checking the error and crashing?  Is your code catching the error and reporting it correctly?  Are you using the wrong default encoding?  Does the source XML document not report its encoding correctly?
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24738069
>It is a Very Bad Idea to guess at an encoding.
of course, you are right.

I wrote:
>put this as first line in your xml:
when I actually wanted to write:
>put something like this as first line in your xml, with the character encoding as needed.

and the real problem might not be on the database side itself, but in the application/web form that users use to enter the data: if it omits the "SET NAMES" statement, implicit character set conversions take place, resulting in those "funny" characters.

the "solution" has to be an end-to-end consistent use of the character set encoding ...
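
One way to picture "end-to-end consistent" is to decode incoming bytes exactly once with the encoding the sender is known to use, work on text in between, and encode exactly once on the way out. The sketch below is in Python rather than PHP; the function names and the choice of Latin-1 input and UTF-8 output are assumptions for illustration, not something stated in this thread. It also shows the stripping the asker wanted, applied to text rather than to raw bytes.

```python
def to_text(raw: bytes, source_encoding: str) -> str:
    """Decode incoming bytes once, strictly, with the known source encoding.
    A wrong encoding raises UnicodeDecodeError instead of producing garbage."""
    return raw.decode(source_encoding, errors="strict")


def strip_chars(text: str, unwanted: str = "\u00fd") -> str:
    """Remove every occurrence of the unwanted characters from the text."""
    return text.translate({ord(c): None for c in unwanted})


def to_bytes(text: str, target_encoding: str = "utf-8") -> bytes:
    """Encode once on output, in the encoding the consumer is told to expect."""
    return text.encode(target_encoding)


if __name__ == "__main__":
    raw = b"caf\xe9\xfdbar"           # Latin-1 bytes containing a stray 0xFD
    text = to_text(raw, "latin-1")    # decode with the real source encoding
    clean = strip_chars(text)         # drop the unwanted character
    print(to_bytes(clean))            # re-encode consistently for storage/output
```

Stripping after decoding matters: removing the byte 0xFD from raw data would corrupt nothing in Latin-1, but the same approach applied to UTF-8 or UTF-16 data could cut a multi-byte character in half.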
