Greek Characters convert to '?' s on export from SQL using VB.net
We have developed a web app hosted on Azure, which copies the records in a SQL View to a MySQL Db for a Wordpress Website.
All is working fine, but it is a scientific website and some of the records contain greek letters. These characters are not being carried over correctly, and display as question marks on the website.
On investigation, I have read that others had success with a similar issue by forcing conversion of the field containing the greek characters to UTF-8 with a SQL command on the My SQL destination Db, others found converting it to Latin1 resolved their issue. However, I had success with neither of these. I also discovered that the greek characters appear to be converted to question marks before being sent to the MYSQL: When outputting the field values with console.writeline() in the vb code, the values expressed display '?'s instead of greek characters before they are exported to the MySQL Db.
So I'm wondering if there is any known process which I can put those fields through to ensure the greek characters are kept and remain untranslated to '?'s in the export functionality.
Incase it helps, following is a sample of the code used to write the values from the SQL view to the MySQL table:
Dim DBAAW As New MasterDBEntities '(this is the SQL Db) 'first find the Antibody Record by its HCConstructNumber Dim WARecord = (From wq In DBAAW.WP_Whole_Antibodies_on_Sale Where wq.HCConstructNumber = HCC Select wq).FirstOrDefault() If (WARecord IsNot Nothing) Then With WARecord SyncHelp.AddNewMeta(postID, "wpcf-specificitystatement", .SpecificityStatement) Console.WriteLine("Synonyms: " & .Synonyms) SyncHelp.AddNewMeta(postID, "wpcf-synonyms", .Synonyms) End With End If
'And this is what SyncHelp.AddNewMeta does to update the post's record in MySQL:
Public Shared Sub AddNewMeta(p1d As Long, metakey As String, metaval As String) Dim PMMta As New absolute_fix1Entities '(the MySQL Db) Dim xdel As Integer Dim meta = (From s In PMMta.antibody_postmeta Where s.post_id = p1d And s.meta_key = metakey Select s).FirstOrDefault() If meta IsNot Nothing Then meta.meta_value = Mval End IfEnd Sub
Besides database collation and codepage you also have a collation setting in making the connection to the db, there is a PHP default charset, then you have the web servers encoding default, which must match what you specify in htmls head section, too. And last not least you can also specify the input encoding via accept-charset attribute.
So I count 6 places where you might have a mismatch that leads to such conversion errors. A ? simple denotes a character not convertable. If you have all these things at UTF-8 or Latin-1, anything holding all charsyou need, the parts of the puzzle snap into each other. I'd go for UTF-8, as it is a norm in the web.
Unfortunately there's no one size fits all solution to such problems, so you better look at all these places and fix where there is a wrong setting. If you already have data, that's another part of the problem, because simply setting another charset doesn't change data to it, just like changing the label on a jar doesn't change content.
On top of that there isn't only a database server default charset, each database can have its own setting and each table and each field. In regard of data you might need to live with a data loss, to try and make the transition correctly, first step must be a backup of the data in it's current state, no matter how wrong it might be, if you can see greek in PHPMyAdmin or Workbench, there's hope for the data you have.
Some things to look at:
PHP.ini: default_charset = "utf-8";
Some PHP functions have parameters for a specific encoding: htmlspecialchars($str, ENT_NOQUOTES, "UTF-8")
http headers sent: header('Content-Type: text/html; charset=utf-8');
HTML head: <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
HTML forms: <form ... accept-charset="utf-8">
MySQL Connection: mysql_set_charset()
and the MySQL database itself, start with http://archive.oreilly.com/pub/post/turning_mysql_data_in_latin1_t.html