bill201
asked on
how can i change string encoding with sql query in access
hi
i have in a table a row that i want to encoding to utf-8 (for example...) how can i do it?
the table name is table1
and the row name is Song
i have in a table a row that i want to encoding to utf-8 (for example...) how can i do it?
the table name is table1
and the row name is Song
ASKER
i need a solution how to encode for a query result in access and not to export to a text file....
ASKER
do anyone have a solution
Hi,
here is a function you can copy to a standard module in VBA:
You can use this in SQL like this:
Returns the column "Song" and the converted column.
Cheers,
Christian
here is a function you can copy to a standard module in VBA:
'---------------------------------------------------------------------------------------
' Function : fnConvert
' Author : Christian Coppes
' Date : 15.06.2012
' Purpose : Converts a string into the chosen code
' (Find all possible "strCode" strings in
' HKEY_CLASSES_ROOT\MIME\Database\Charset)
' Parameters : strInput: The string to convert
' strCode: Name of the code like "utf-8" or "ascii"
' Returns : String: Converted string
'---------------------------------------------------------------------------------------
'
Public Function fnConvert(strInput As String, strCode As String) As String
Dim strm As ADODB.Stream
Set strm = New ADODB.Stream
With strm
.Open
.Type = adTypeText
.Charset = strCode
.WriteText strInput, adWriteChar
.Position = 0
fnConvert = .ReadText
.Close
End With
Set strm = Nothing
End Function
You can use this in SQL like this:
SELECT Song, fnConvert(Song, "utf-8") AS ConvertedSong FROM table1
Returns the column "Song" and the converted column.
Cheers,
Christian
Is the data stored in the SQL DB as the NVarchar datatype? If it is, you shouldn't need to do any conversion.
ASKER
thanks for your comments
[b]jimpen[/b] what is a NVarchar datatype ?
[b]Bitsqueezer[/b] i try to copy this code but i get a compile error on "Dim strm As ADODB.Stream"
"compile error user defined type not defined (i have access 2010 do i need to select some "differences" in the tools option?)
[b]jimpen[/b] what is a NVarchar datatype ?
[b]Bitsqueezer[/b] i try to copy this code but i get a compile error on "Dim strm As ADODB.Stream"
"compile error user defined type not defined (i have access 2010 do i need to select some "differences" in the tools option?)
Hi,
yes, but not "differences", you need to look into "references" and activate the Microsoft ADO 2.8 library.
Nvarchar datatype only exists on SQL Server, not in Access. It is a Unicode datatype where any character consists of two bytes.
Cheers,
Christian
yes, but not "differences", you need to look into "references" and activate the Microsoft ADO 2.8 library.
Nvarchar datatype only exists on SQL Server, not in Access. It is a Unicode datatype where any character consists of two bytes.
Cheers,
Christian
ASKER
thanks a lot Christian
i try this code but after that all the field is still in gibberish is there some way to find out what out of encode is is ant to what encode i have to convert them
have some one any ideo what i can to do?
i try this code but after that all the field is still in gibberish is there some way to find out what out of encode is is ant to what encode i have to convert them
have some one any ideo what i can to do?
Hi,
maybe you should write an example of what you have before and after - and what all this should be good for. As jimpmen wrote above, normally there is no need to encode or decode anything, all letters will be saved in the way you inserted them (Access uses at least since A2007 Unicode as default for text fields). If you have a German "äöü" and encode it to ASCII you will get "aou" by this function. So this would surely destroy the original text.
The function does what you requested in your first posting.
Cheers,
Christian
maybe you should write an example of what you have before and after - and what all this should be good for. As jimpmen wrote above, normally there is no need to encode or decode anything, all letters will be saved in the way you inserted them (Access uses at least since A2007 Unicode as default for text fields). If you have a German "äöü" and encode it to ASCII you will get "aou" by this function. So this would surely destroy the original text.
The function does what you requested in your first posting.
Cheers,
Christian
jimpen what is a NVarchar datatype ?
In SQL Server there are two major data types for text data: Char/Varchar and NChar/NVarchar. The varchar is an ASCII character based standard. The NVarchar is a UTF-8 based standard that has 2 positions.
The first position will hold the ASCII character, the second holds the UTF-8 character. If the data is inserted into the SQL DB as NVarchar, then when Access sees them, it will depend on your Windows Regional Settiing for which character to display. If the data was inserted a Cyrillic and your computer is set to United States standards then it will show the ASCII character equivalents. I don't know what would happen if inserted in the DB as Cyrillic and trying to have a Hangul (Korean) display.
ASKER
it's hebrew character and i attach screen picture before and after
beforeandafter.png
beforeandafter.png
Is the before and after both gibberish?
If that is the case, first check the data type in the field in table is set to NVarchar. If it isn't then I suspect that is your problem. The ASCII table 128+ is where you get umlauts and such live. When the UTF-8 data is inserted into an ASCII varchar field SQL strips off the UTF-8 codes and stores the ASCII equivalent.
So when you translate it back it won't won't make any sense, because it's trying to convert the ASCII character in the UTF-8 codec. Even if you change the data type now, all prior data will still probably be gibberish.
I don't have access to a SQL server, right now, to test possible solutions.
If that is the case, first check the data type in the field in table is set to NVarchar. If it isn't then I suspect that is your problem. The ASCII table 128+ is where you get umlauts and such live. When the UTF-8 data is inserted into an ASCII varchar field SQL strips off the UTF-8 codes and stores the ASCII equivalent.
So when you translate it back it won't won't make any sense, because it's trying to convert the ASCII character in the UTF-8 codec. Even if you change the data type now, all prior data will still probably be gibberish.
I don't have access to a SQL server, right now, to test possible solutions.
ASKER
but the data is stored in access and not in a sql server (do ns access support nvarchar?)
ASKER
in some other form i get an advice to use some code that encode right the character.
this is the code
Public Function FixText(Source As String)
Dim i As Integer, tmp As String, c As String * 1
For i = 1 To Len(Source)
c = Mid(Source, i, 1)
If AscW(c) >= 224 And AscW(c) <= 250 Then c = ChrW(AscW(c) + 1264)
tmp = tmp & c
Next
FixText = tmp
End Function
this is the code
Public Function FixText(Source As String)
Dim i As Integer, tmp As String, c As String * 1
For i = 1 To Len(Source)
c = Mid(Source, i, 1)
If AscW(c) >= 224 And AscW(c) <= 250 Then c = ChrW(AscW(c) + 1264)
tmp = tmp & c
Next
FixText = tmp
End Function
Hello bill201,
Your original data is encoded in something called a “Windows code page”, used before Unicode was introduced to handle non-Latin characters in eight bits. Please compare the Windows-1252 page, often mistaken for “ANSI”, with Windows-1255, the Hebrew code page. In your sample, the “before” section shows the Latin diacritical in positions occupied by the Hebrew characters in the “after” section.
According to Wikipedia, there is a good relation to Unicode, so that your solution to add an offset for all characters in the block — mapping them to the Unicode Hebrew block — should work.
Alef, Windows-1255 code point 224, hex E0, represented as “à” in ANSI, becomes Unicode U+5D0, decimal 1488. The offset is 1488-224=1264.
Another solution is to work on a computer set to use the Hebrew code page, and treat the data as non-Unicode, but that seems more complicated.
I hope this explains.
(°v°)
Your original data is encoded in something called a “Windows code page”, used before Unicode was introduced to handle non-Latin characters in eight bits. Please compare the Windows-1252 page, often mistaken for “ANSI”, with Windows-1255, the Hebrew code page. In your sample, the “before” section shows the Latin diacritical in positions occupied by the Hebrew characters in the “after” section.
According to Wikipedia, there is a good relation to Unicode, so that your solution to add an offset for all characters in the block — mapping them to the Unicode Hebrew block — should work.
Alef, Windows-1255 code point 224, hex E0, represented as “à” in ANSI, becomes Unicode U+5D0, decimal 1488. The offset is 1488-224=1264.
Another solution is to work on a computer set to use the Hebrew code page, and treat the data as non-Unicode, but that seems more complicated.
I hope this explains.
(°v°)
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
thanks alot harfang for your very trainer explanation. I have learned very much...
ASKER
An excellent and comprehensive answer
Thanks for the kind words and success with your data transformations! — (°v°)
https://www.experts-exchange.com/questions/20273639/Write-File-in-UTF-8-Format.html
try this might helps you..