Solved

Invalid character data was found

Posted on 2004-09-30
53
400 Views
Last Modified: 2012-06-21
Hi, i'm trying to select data from my postgresql database with a java application that use the lastest jdbc drivers from postgre developper. I have frenc character into some text column and when i tried to retrieve it (my ResultSet . getString("columnName")) i got this error "Invalid character data was found".

I think there something to do with the charset but i tried to put it at UTF-8 or CP1252 without result....

someone can help... i'm stuck here in my project and couldn't make anything else before fixing this problem

thx
0
Comment
Question by:bumpert
  • 24
  • 19
  • 6
  • +2
53 Comments
 
LVL 12

Expert Comment

by:Giant2
ID: 12191723
check if the version of JDBC you use support the charset you are using on the DB.
0
 
LVL 22

Expert Comment

by:earth man2
ID: 12192139
What version of java are you using ?
0
 

Author Comment

by:bumpert
ID: 12192205
i didn't find anything about the which charset is supported or not by this jdbc...

i use java 1.4.2_05

thx
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12193023
Try passing the charset in the connection string. e.g.

jdbc:postgresql://localhost:5432/testdb?charSet=LATIN1
0
 

Author Comment

by:bumpert
ID: 12193104
i tried passing the charset in the connection string and tried to pass it by the Properties parameter without result
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12193212
Please post exact stack trace
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12193245
>>i tried to put it at UTF-8

That should be

jdbc:postgresql://localhost:5432/testdb?charSet=UNICODE

Make sure you haven't got data in there that's invalid (it may be nothing to do with accented characters)
0
 

Author Comment

by:bumpert
ID: 12193558
i tried with unicode without success...

i know which row cause this problem, cause i display the row that did this in my error message. But the problem is that it seems to be ok...
how can i know if it have an invalid data?

i'm trying to update data where i know it can have problem
didn't seems to be the french character... but it's very bugging me
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12193700
>> cause i display the row that did this in my error message

How?

Try doing

select encode(badcolumn, 'hex') from tablex where id = n

then post the result.
0
 

Author Comment

by:bumpert
ID: 12193790
humm no it's seems to be french character the problem

it could be type by a user that have a french keyboard layout or type in windows with the ascii code, so how can i be able to select it from java with the jdbc?

here is the error message with the stack trace
--------


Invalid character data was found.  This is most likely caused by stored data containing characters that are invalid for the character set the database was created in.  The most common example of this is storing 8bit data in a SQL_ASCII database.


org.postgresql.jdbc2.AbstractJdbc2ResultSet.getString(AbstractJdbc2ResultSet.java:1830)
org.postgresql.jdbc2.AbstractJdbc2ResultSet.getString(AbstractJdbc2ResultSet.java:2134)
JavaTableParser.actionPerformed(JavaTableParser.java:185)
javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1786)
javax.swing.AbstractButton$ForwardActionEvents.actionPerformed(AbstractButton.java:1839)
javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:420)
javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:258)
javax.swing.plaf.basic.BasicButtonListener.mouseReleased(BasicButtonListener.java:245)
java.awt.Component.processMouseEvent(Component.java:5100)
java.awt.Component.processEvent(Component.java:4897)
java.awt.Container.processEvent(Container.java:1569)
java.awt.Component.dispatchEventImpl(Component.java:3615)
java.awt.Container.dispatchEventImpl(Container.java:1627)
java.awt.Component.dispatchEvent(Component.java:3477)
java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:3483)
java.awt.LightweightDispatcher.processMouseEvent(Container.java:3198)
java.awt.LightweightDispatcher.dispatchEvent(Container.java:3128)
java.awt.Container.dispatchEventImpl(Container.java:1613)
java.awt.Window.dispatchEventImpl(Window.java:1606)
java.awt.Component.dispatchEvent(Component.java:3477)
java.awt.EventQueue.dispatchEvent(EventQueue.java:456)
java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:201)
java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:151)
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:145)
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:137)
java.awt.EventDispatchThread.run(EventDispatchThread.java:100)

0
 

Author Comment

by:bumpert
ID: 12193813
ok i'll try the encode
0
 

Author Comment

by:bumpert
ID: 12193846
didn't work in java with jdbc and in postgresql...
it's not a valide function



ERROR:  Function encode(character varying, "unknown") does not exist
        Unable to identify a function that satisfies the given argument types
        You may need to add explicit typecasts
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12193914
>>didn't work in java with jdbc and in postgresql...

well i meant to execute that at the db prompt, not from Java
0
 

Author Comment

by:bumpert
ID: 12193930
didn't work in portgresql prompt too

it's the db prompt error message i pasted you
0
 

Author Comment

by:bumpert
ID: 12193964
dont have cast for hex type and didn't able to create one
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12194010
Strange - it's certainly here, and i'd *guess* has been around for a while:

http://www.postgresql.com/docs/7.3/interactive/functions-string.html
0
 

Author Comment

by:bumpert
ID: 12194212
ok it work if i supply a text into '' but didn't work if i put the column name without the '' into the function like:
select encode('txt', 'hex') from fdsdb_java where typeforme != '' and txt like '60 degr%'; this work but return the hex of the text txt and not of the text into the colum name txt

and this

select encode(txt, 'hex') from fdsdb_java where typeforme != '' and txt like '60 degr%';

give the error message i posted before

thx for help
0
 

Author Comment

by:bumpert
ID: 12194358
to fix the problem, not really fix, but i need to do something quick, i need to update each row that have a french character (row that jdbc give me error, i presume it's all row with french character), with a non french... so where i have é i put e and where i have à i put a

but i'm still curios on a real solution, could be very usefull to know and could solve my problem really quickly instead of this poor "solution"

thx
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12194501
Hmm. Probably could come up with a kludge but what does this return (if anything)?:

select pg_client_encoding() from fdsdb_java
0
 
LVL 92

Expert Comment

by:objects
ID: 12194770
sounds like the problem may be with the data in your db, not with how you are accessing it.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12194817
That's precisely why i'm asking bumpert to dump it as hex
0
 
LVL 92

Expert Comment

by:objects
ID: 12194909
well pardon me for offering a suggestion to what the cause of the problem was!!
anyways see if your hex dump confirms my suspisions
0
 
LVL 22

Expert Comment

by:earth man2
ID: 12194964
what platform are you using ?
did you use initdb URDB -E LATIN9
output of psql \l command ?
if you are using windows console CMD.EXE did you set codepage correctly ?
eg C:> CHCP 1252
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12195042
>>well pardon me for offering a suggestion to what the cause of the problem was!!

Well it would help if it were a new one ;-)

>>
Make sure you haven't got data in there that's invalid (it may be nothing to do with accented characters)
>>

0
 
LVL 22

Expert Comment

by:earth man2
ID: 12195074
http://www.postgresql.org/docs/7.3/interactive/multibyte.html
PostgreSQL JDBC driver does not support MULE_INTERNAL, LATIN6, LATIN8, and LATIN10.
0
 
LVL 22

Expert Comment

by:earth man2
ID: 12195401
What version of postgresql are you using?

latest postgresql.jar jdbc driver requires code transformation functions which appeared only recently.

look at Encoding.java

when making postgresql from source make sure --enable-nls is specified E.G.
./configure --prefix=/usr --with-perl --with-python --with-openssl --with-gnu-ld --enable-nls
make
make install
0
What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

 
LVL 22

Expert Comment

by:earth man2
ID: 12195868
you may need to set client session encoding parameter to eliminate bad characters in the future.
SET CLIENT_ENCODING = 'WIN1250';
0
 
LVL 12

Expert Comment

by:Giant2
ID: 12197545
Check the character set of your Database. What is it?
0
 

Author Comment

by:bumpert
ID: 12199047
>Hmm. Probably could come up with a kludge but what does this return (if anything)?:
>select pg_client_encoding() from fdsdb_java

SQL_ASCII

thx
0
 

Author Comment

by:bumpert
ID: 12199063
>  sounds like the problem may be with the data in your db, not with how you are accessing it.

it's what i think now by seeing your post, but the problem is if i acces it from embperl in webpage i have no problem.
And the only data that seems to cause problem are french character, entered by a user with a french keyboard layout or with the ascii code (ex.: ALT + 0233 for é)

thx for help
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12199069
Then you should set it to something else. UNICODE or LATIN1
0
 

Author Comment

by:bumpert
ID: 12199089
>what platform are you using ?

Gentoo Linux

>did you use initdb URDB -E LATIN9

Don't know, it's not me that manage the server, but i don't think so..

>output of psql \l command ?

return my database, with SQL_ASCII

>if you are using windows console CMD.EXE did you set codepage correctly ?
>eg C:> CHCP 1252

I use putty to access my server, but the problem it seems to be only when i tried to retrieve data with jdbc, cause it's working in psql (in putty) and working with embperl in our webpage

thx for help
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12199111
I can't find details on SQL_ASCII character set, but i'd guess it doesn't support French characters. See my last comment
0
 

Author Comment

by:bumpert
ID: 12199121
>What version of postgresql are you using?

7.3.4

>latest postgresql.jar jdbc driver requires code transformation functions which appeared only recently.

i use the lastest from cvs

>look at Encoding.java

i'll check but don't really know what to check

>when making postgresql from source make sure --enable-nls is specified E.G.
>./configure --prefix=/usr --with-perl --with-python --with-openssl --with-gnu-ld --enable-nls
>make
>make install

ok would be good for the future, but i can't reinstall postgre for now

thx for help
0
 

Author Comment

by:bumpert
ID: 12199126
> you may need to set client session encoding parameter to eliminate bad characters in the future.
>SET CLIENT_ENCODING = 'WIN1250';

ok thx
0
 

Author Comment

by:bumpert
ID: 12199131
> Check the character set of your Database. What is it?

SQL_ASCII
0
 

Author Comment

by:bumpert
ID: 12199148
> Then you should set it to something else. UNICODE or LATIN1

the database? i could'nt cause i'm not the only user that acces this database.... i hate trying doing things in company where your are limited by too much things.... at home i could do what i want

thx
0
 

Author Comment

by:bumpert
ID: 12199280
> I can't find details on SQL_ASCII character set, but i'd guess it doesn't support French characters. See my last comment

i think it support, cause my problem was the way i saved my data and now i change my way and it's ok i can se french caracter without problem...

but i need to fix my others table that are wrong
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12203277
>>cause my problem was the way i saved my data

Can you explain this?
0
 

Author Comment

by:bumpert
ID: 12203492
i was adding text into an applet, send the data to a cgi that write the data to a text file and run a java application on the server to retrieve the data and save it into the db.. very complicated but we didn't want to open the 5432 port from outside and i was starting programming :0

thx for help
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12203666
Hmm, but that doesn't account for the problem really ...
0
 

Author Comment

by:bumpert
ID: 12203701
i know, it's weird cause if i send the data to the cgi and the cgi save it themself i have no problem
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12203763
The Java that's *writing* to the db may have the wrong encoding set on the driver
0
 

Author Comment

by:bumpert
ID: 12227112
yes it's why i was thinking, so i changed  the way my applet save and now the french caracter is saving well

but i still have problem to retrieve the data that was saved in that poor style

thx for help (sorry for late answer, i wasn't at work for 4 days)
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12229497
>>but i still have problem to retrieve the data that was saved in that poor style

Try reading using the same driver setting as that with which it was saved
0
 

Author Comment

by:bumpert
ID: 12230114
it was the first thing i tried
thx
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12230149
Post the result of

System.out.println(new sun.misc.HexDumpEncoder().encode(resultSet.getBytes("bad_field")));
0
 
LVL 22

Expert Comment

by:earth man2
ID: 12230834
If you can work out the character set used you should be able to use convert function to update table column(s).

convert(string  using conversion_name)
0
 

Author Comment

by:bumpert
ID: 12230865
0000: 53 6F 75 64 75 72 65 20   64 27 61 6E 67 6C 65 20  Soudure d'angle
0010: 64 65 20 35 6D 6D 20 73   75 72 20 31 32 30 20 64  de 5mm sur 120 d
0020: 65 67 72 E9 73
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12231048
That output (the method is slightly buggy) is perfect output for French stored in ISO-8859-1. So if you go to the trouble of doing your own decoding, you'll get

Soudure d'angle de 5mm sur 120 degrés
0
 
LVL 86

Accepted Solution

by:
CEHJ earned 125 total points
ID: 12231154
>>if you go to the trouble of doing your own decoding

and you can do that thus:


String s = new String(resultSet.getBytes("bad_field"), "ISO8859-1");
0
 

Author Comment

by:bumpert
ID: 12231316
> String s = new String(resultSet.getBytes("bad_field"), "ISO8859-1");


this solved my problem, please post to this thread and i'll give you others points: http://www.experts-exchange.com/Databases/PostgreSQL/Q_21151155.html


thx a lot
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12231396
8-)
0

Featured Post

Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

Join & Write a Comment

Java functions are among the best things for programmers to work with as Java sites can be very easy to read and prepare. Java especially simplifies many processes in the coding industry as it helps integrate many forms of technology and different d…
Basic understanding on "OO- Object Orientation" is needed for designing a logical solution to solve a problem. Basic OOAD is a prerequisite for a coder to ensure that they follow the basic design of OO. This would help developers to understand the b…
Viewers learn about the “for” loop and how it works in Java. By comparing it to the while loop learned before, viewers can make the transition easily. You will learn about the formatting of the for loop as we write a program that prints even numbers…
This theoretical tutorial explains exceptions, reasons for exceptions, different categories of exception and exception hierarchy.

757 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

22 Experts available now in Live!

Get 1:1 Help Now