Link to home
Start Free TrialLog in
Avatar of bumpert
bumpert

asked on

Invalid character data was found

Hi, i'm trying to select data from my postgresql database with a java application that use the lastest jdbc drivers from postgre developper. I have frenc character into some text column and when i tried to retrieve it (my ResultSet . getString("columnName")) i got this error "Invalid character data was found".

I think there something to do with the charset but i tried to put it at UTF-8 or CP1252 without result....

someone can help... i'm stuck here in my project and couldn't make anything else before fixing this problem

thx
Avatar of Giant2
Giant2

check if the version of JDBC you use support the charset you are using on the DB.
Avatar of earth man2
What version of java are you using ?
Avatar of bumpert

ASKER

i didn't find anything about the which charset is supported or not by this jdbc...

i use java 1.4.2_05

thx
Try passing the charset in the connection string. e.g.

jdbc:postgresql://localhost:5432/testdb?charSet=LATIN1
Avatar of bumpert

ASKER

i tried passing the charset in the connection string and tried to pass it by the Properties parameter without result
Please post exact stack trace
>>i tried to put it at UTF-8

That should be

jdbc:postgresql://localhost:5432/testdb?charSet=UNICODE

Make sure you haven't got data in there that's invalid (it may be nothing to do with accented characters)
Avatar of bumpert

ASKER

i tried with unicode without success...

i know which row cause this problem, cause i display the row that did this in my error message. But the problem is that it seems to be ok...
how can i know if it have an invalid data?

i'm trying to update data where i know it can have problem
didn't seems to be the french character... but it's very bugging me
>> cause i display the row that did this in my error message

How?

Try doing

select encode(badcolumn, 'hex') from tablex where id = n

then post the result.
Avatar of bumpert

ASKER

humm no it's seems to be french character the problem

it could be type by a user that have a french keyboard layout or type in windows with the ascii code, so how can i be able to select it from java with the jdbc?

here is the error message with the stack trace
--------


Invalid character data was found.  This is most likely caused by stored data containing characters that are invalid for the character set the database was created in.  The most common example of this is storing 8bit data in a SQL_ASCII database.


org.postgresql.jdbc2.AbstractJdbc2ResultSet.getString(AbstractJdbc2ResultSet.java:1830)
org.postgresql.jdbc2.AbstractJdbc2ResultSet.getString(AbstractJdbc2ResultSet.java:2134)
JavaTableParser.actionPerformed(JavaTableParser.java:185)
javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1786)
javax.swing.AbstractButton$ForwardActionEvents.actionPerformed(AbstractButton.java:1839)
javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:420)
javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:258)
javax.swing.plaf.basic.BasicButtonListener.mouseReleased(BasicButtonListener.java:245)
java.awt.Component.processMouseEvent(Component.java:5100)
java.awt.Component.processEvent(Component.java:4897)
java.awt.Container.processEvent(Container.java:1569)
java.awt.Component.dispatchEventImpl(Component.java:3615)
java.awt.Container.dispatchEventImpl(Container.java:1627)
java.awt.Component.dispatchEvent(Component.java:3477)
java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:3483)
java.awt.LightweightDispatcher.processMouseEvent(Container.java:3198)
java.awt.LightweightDispatcher.dispatchEvent(Container.java:3128)
java.awt.Container.dispatchEventImpl(Container.java:1613)
java.awt.Window.dispatchEventImpl(Window.java:1606)
java.awt.Component.dispatchEvent(Component.java:3477)
java.awt.EventQueue.dispatchEvent(EventQueue.java:456)
java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:201)
java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:151)
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:145)
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:137)
java.awt.EventDispatchThread.run(EventDispatchThread.java:100)

Avatar of bumpert

ASKER

ok i'll try the encode
Avatar of bumpert

ASKER

didn't work in java with jdbc and in postgresql...
it's not a valide function



ERROR:  Function encode(character varying, "unknown") does not exist
        Unable to identify a function that satisfies the given argument types
        You may need to add explicit typecasts
>>didn't work in java with jdbc and in postgresql...

well i meant to execute that at the db prompt, not from Java
Avatar of bumpert

ASKER

didn't work in portgresql prompt too

it's the db prompt error message i pasted you
Avatar of bumpert

ASKER

dont have cast for hex type and didn't able to create one
Strange - it's certainly here, and i'd *guess* has been around for a while:

http://www.postgresql.com/docs/7.3/interactive/functions-string.html
Avatar of bumpert

ASKER

ok it work if i supply a text into '' but didn't work if i put the column name without the '' into the function like:
select encode('txt', 'hex') from fdsdb_java where typeforme != '' and txt like '60 degr%'; this work but return the hex of the text txt and not of the text into the colum name txt

and this

select encode(txt, 'hex') from fdsdb_java where typeforme != '' and txt like '60 degr%';

give the error message i posted before

thx for help
Avatar of bumpert

ASKER

to fix the problem, not really fix, but i need to do something quick, i need to update each row that have a french character (row that jdbc give me error, i presume it's all row with french character), with a non french... so where i have é i put e and where i have à i put a

but i'm still curios on a real solution, could be very usefull to know and could solve my problem really quickly instead of this poor "solution"

thx
Hmm. Probably could come up with a kludge but what does this return (if anything)?:

select pg_client_encoding() from fdsdb_java
sounds like the problem may be with the data in your db, not with how you are accessing it.
That's precisely why i'm asking bumpert to dump it as hex
well pardon me for offering a suggestion to what the cause of the problem was!!
anyways see if your hex dump confirms my suspisions
what platform are you using ?
did you use initdb URDB -E LATIN9
output of psql \l command ?
if you are using windows console CMD.EXE did you set codepage correctly ?
eg C:> CHCP 1252
>>well pardon me for offering a suggestion to what the cause of the problem was!!

Well it would help if it were a new one ;-)

>>
Make sure you haven't got data in there that's invalid (it may be nothing to do with accented characters)
>>

http://www.postgresql.org/docs/7.3/interactive/multibyte.html
PostgreSQL JDBC driver does not support MULE_INTERNAL, LATIN6, LATIN8, and LATIN10.
What version of postgresql are you using?

latest postgresql.jar jdbc driver requires code transformation functions which appeared only recently.

look at Encoding.java

when making postgresql from source make sure --enable-nls is specified E.G.
./configure --prefix=/usr --with-perl --with-python --with-openssl --with-gnu-ld --enable-nls
make
make install
you may need to set client session encoding parameter to eliminate bad characters in the future.
SET CLIENT_ENCODING = 'WIN1250';
Check the character set of your Database. What is it?
Avatar of bumpert

ASKER

>Hmm. Probably could come up with a kludge but what does this return (if anything)?:
>select pg_client_encoding() from fdsdb_java

SQL_ASCII

thx
Avatar of bumpert

ASKER

>  sounds like the problem may be with the data in your db, not with how you are accessing it.

it's what i think now by seeing your post, but the problem is if i acces it from embperl in webpage i have no problem.
And the only data that seems to cause problem are french character, entered by a user with a french keyboard layout or with the ascii code (ex.: ALT + 0233 for é)

thx for help
Then you should set it to something else. UNICODE or LATIN1
Avatar of bumpert

ASKER

>what platform are you using ?

Gentoo Linux

>did you use initdb URDB -E LATIN9

Don't know, it's not me that manage the server, but i don't think so..

>output of psql \l command ?

return my database, with SQL_ASCII

>if you are using windows console CMD.EXE did you set codepage correctly ?
>eg C:> CHCP 1252

I use putty to access my server, but the problem it seems to be only when i tried to retrieve data with jdbc, cause it's working in psql (in putty) and working with embperl in our webpage

thx for help
I can't find details on SQL_ASCII character set, but i'd guess it doesn't support French characters. See my last comment
Avatar of bumpert

ASKER

>What version of postgresql are you using?

7.3.4

>latest postgresql.jar jdbc driver requires code transformation functions which appeared only recently.

i use the lastest from cvs

>look at Encoding.java

i'll check but don't really know what to check

>when making postgresql from source make sure --enable-nls is specified E.G.
>./configure --prefix=/usr --with-perl --with-python --with-openssl --with-gnu-ld --enable-nls
>make
>make install

ok would be good for the future, but i can't reinstall postgre for now

thx for help
Avatar of bumpert

ASKER

> you may need to set client session encoding parameter to eliminate bad characters in the future.
>SET CLIENT_ENCODING = 'WIN1250';

ok thx
Avatar of bumpert

ASKER

> Check the character set of your Database. What is it?

SQL_ASCII
Avatar of bumpert

ASKER

> Then you should set it to something else. UNICODE or LATIN1

the database? i could'nt cause i'm not the only user that acces this database.... i hate trying doing things in company where your are limited by too much things.... at home i could do what i want

thx
Avatar of bumpert

ASKER

> I can't find details on SQL_ASCII character set, but i'd guess it doesn't support French characters. See my last comment

i think it support, cause my problem was the way i saved my data and now i change my way and it's ok i can se french caracter without problem...

but i need to fix my others table that are wrong
>>cause my problem was the way i saved my data

Can you explain this?
Avatar of bumpert

ASKER

i was adding text into an applet, send the data to a cgi that write the data to a text file and run a java application on the server to retrieve the data and save it into the db.. very complicated but we didn't want to open the 5432 port from outside and i was starting programming :0

thx for help
Hmm, but that doesn't account for the problem really ...
Avatar of bumpert

ASKER

i know, it's weird cause if i send the data to the cgi and the cgi save it themself i have no problem
The Java that's *writing* to the db may have the wrong encoding set on the driver
Avatar of bumpert

ASKER

yes it's why i was thinking, so i changed  the way my applet save and now the french caracter is saving well

but i still have problem to retrieve the data that was saved in that poor style

thx for help (sorry for late answer, i wasn't at work for 4 days)
>>but i still have problem to retrieve the data that was saved in that poor style

Try reading using the same driver setting as that with which it was saved
Avatar of bumpert

ASKER

it was the first thing i tried
thx
Post the result of

System.out.println(new sun.misc.HexDumpEncoder().encode(resultSet.getBytes("bad_field")));
If you can work out the character set used you should be able to use convert function to update table column(s).

convert(string  using conversion_name)
Avatar of bumpert

ASKER

0000: 53 6F 75 64 75 72 65 20   64 27 61 6E 67 6C 65 20  Soudure d'angle
0010: 64 65 20 35 6D 6D 20 73   75 72 20 31 32 30 20 64  de 5mm sur 120 d
0020: 65 67 72 E9 73
That output (the method is slightly buggy) is perfect output for French stored in ISO-8859-1. So if you go to the trouble of doing your own decoding, you'll get

Soudure d'angle de 5mm sur 120 degrés
ASKER CERTIFIED SOLUTION
Avatar of CEHJ
CEHJ
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of bumpert

ASKER

> String s = new String(resultSet.getBytes("bad_field"), "ISO8859-1");


this solved my problem, please post to this thread and i'll give you others points: https://www.experts-exchange.com/questions/21151155/Invalid-character-data-was-found.html


thx a lot
8-)