drnick
asked on
Java-SQL-Connection to Microsoft Access - Unicode Problem
Hello everybody.
I need to use Java to access a microsoft access (2003) database using the java odbc bridge.
Everything works fine so far, i can access the table that i need to and extract the stuff that i need,
just one problem:
if i read a memo that contains unicode characters, these will be replaced with "?".
This has obviously something to do with an automatic text conversation to ansi or whatever.
I need a work around for that.
Most probable, this will work with supplying some extra properties when connecting to the
database.
I need a small snipplet of code showing me what to do.
Thanks in advance,
doc nick.
I need to use Java to access a microsoft access (2003) database using the java odbc bridge.
Everything works fine so far, i can access the table that i need to and extract the stuff that i need,
just one problem:
if i read a memo that contains unicode characters, these will be replaced with "?".
This has obviously something to do with an automatic text conversation to ansi or whatever.
I need a work around for that.
Most probable, this will work with supplying some extra properties when connecting to the
database.
I need a small snipplet of code showing me what to do.
Thanks in advance,
doc nick.
following discusses the enhancements made to the bridge for using a charset other than default
http://java.sun.com/j2se/1.4.2/docs/guide/jdbc/bridge.html
Also ensure you are using a font that is capable of sidplaying the characters (otherwise you'll get a ? displayed)
http://java.sun.com/j2se/1.4.2/docs/guide/jdbc/bridge.html
Also ensure you are using a font that is capable of sidplaying the characters (otherwise you'll get a ? displayed)
ASKER
Hello,
first of all: thanks for your quick responeses.
i've tried both of your ideas (the properties and the data-source approach of mr. objects), and both
unfortunately don't work.
I have java 1.5.0_06-b05, so i'm up-to-date, i'm using access 2003, and the table tbl_strokes contains
a mapping between single unicode characters and unicode memory, containing some information of the
characters. the memo at some point contains asian characters.
still i get "?" for all of these.
I can determine that this is definitely not a display problem, the integer value of the characters is the same
as the integer value of '?'. also i have all the fonts i need installed.
what do i do wrong?
here is my code:
/**
* This class encapsulates the database access.
*
* @author Thomas Weise
*/
public final class Database
{
/**
* The database connection.
*/
private static Connection CONNECTION ;
/**
* The strokes-statement.
*/
private static PreparedStatement STROKES_STATEMENT ;
/**
* The database driver.
*/
private static final String DB_DRIVER =
"sun.jdbc.odbc.JdbcOdbcDri ver";
/**
* The database name.
*/
private static final String DB_NAME =
"jdbc:odbc:japanese";
/**
* The sql statement for strokes.
*/
private static final String SQL_STROKES =
"select strokes from tbl_strokes where ascw(character)=?";
/**
* This internal constructor stops you from instantiating the database.
*/
private Database()
{
Typesafe.do_not_call();
}
/**
* Initialize the database encapsulation.
* @throws ClassNotFoundException If the database driver class is not
* available.
* @throws SQLException If we could not connect to the database.
* @throws NamingException When something strange happens.
*/
public static final void init ()
throws ClassNotFoundException, SQLException, NamingException
{
Properties l_p;
Class.forName(DB_DRIVER);
l_p = new Properties();
l_p.put("charSet", "UTF-8");
CONNECTION = DriverManager.getConnectio n(DB_NAME, l_p);
STROKES_STATEMENT = CONNECTION.prepareStatemen t(
SQL_STROKES,
ResultSet.TYPE_FORWARD_ONL Y,
ResultSet.CONCUR_READ_ONLY );
STROKES_STATEMENT.setEscap eProcessin g(false);
STROKES_STATEMENT.setMaxRo ws(1);
//STROKES_STATEMENT.setFet chSize(1);
//STROKES_STATEMENT.setMax FieldSize( 40 * 1024);
Runtime.getRuntime().addSh utdownHook (new Thread()
{
@Override
public final void run()
{
Database.dispose();
}
});
}
/**
* Obtain the strokes for a given character.
* @param p_char The character to obtain the strokes of.
* @return The string containing the instructions on how to paint the
* character, or <code>null</code> if none could be found.
*/
public static final String get_strokes (final char p_char)
{
ResultSet l_rs;
try
{
STROKES_STATEMENT.setInt(1 , p_char);
l_rs = STROKES_STATEMENT.executeQ uery();
if(l_rs.next())
{
return l_rs.getString(1);
}
}
catch(Throwable l_t)
{
l_t.printStackTrace();
}
return null;
}
/**
* Dispose the database connectivity encapsulation.
*/
static final void dispose ()
{
try
{
STROKES_STATEMENT.close();
}
catch(Throwable l_t)
{
//
}
finally
{
STROKES_STATEMENT = null;
}
try
{
CONNECTION.close();
}
catch(Throwable l_t)
{
//
}
finally
{
CONNECTION = null;
}
}
}
first of all: thanks for your quick responeses.
i've tried both of your ideas (the properties and the data-source approach of mr. objects), and both
unfortunately don't work.
I have java 1.5.0_06-b05, so i'm up-to-date, i'm using access 2003, and the table tbl_strokes contains
a mapping between single unicode characters and unicode memory, containing some information of the
characters. the memo at some point contains asian characters.
still i get "?" for all of these.
I can determine that this is definitely not a display problem, the integer value of the characters is the same
as the integer value of '?'. also i have all the fonts i need installed.
what do i do wrong?
here is my code:
/**
* This class encapsulates the database access.
*
* @author Thomas Weise
*/
public final class Database
{
/**
* The database connection.
*/
private static Connection CONNECTION ;
/**
* The strokes-statement.
*/
private static PreparedStatement STROKES_STATEMENT ;
/**
* The database driver.
*/
private static final String DB_DRIVER =
"sun.jdbc.odbc.JdbcOdbcDri
/**
* The database name.
*/
private static final String DB_NAME =
"jdbc:odbc:japanese";
/**
* The sql statement for strokes.
*/
private static final String SQL_STROKES =
"select strokes from tbl_strokes where ascw(character)=?";
/**
* This internal constructor stops you from instantiating the database.
*/
private Database()
{
Typesafe.do_not_call();
}
/**
* Initialize the database encapsulation.
* @throws ClassNotFoundException If the database driver class is not
* available.
* @throws SQLException If we could not connect to the database.
* @throws NamingException When something strange happens.
*/
public static final void init ()
throws ClassNotFoundException, SQLException, NamingException
{
Properties l_p;
Class.forName(DB_DRIVER);
l_p = new Properties();
l_p.put("charSet", "UTF-8");
CONNECTION = DriverManager.getConnectio
STROKES_STATEMENT = CONNECTION.prepareStatemen
SQL_STROKES,
ResultSet.TYPE_FORWARD_ONL
ResultSet.CONCUR_READ_ONLY
STROKES_STATEMENT.setEscap
STROKES_STATEMENT.setMaxRo
//STROKES_STATEMENT.setFet
//STROKES_STATEMENT.setMax
Runtime.getRuntime().addSh
{
@Override
public final void run()
{
Database.dispose();
}
});
}
/**
* Obtain the strokes for a given character.
* @param p_char The character to obtain the strokes of.
* @return The string containing the instructions on how to paint the
* character, or <code>null</code> if none could be found.
*/
public static final String get_strokes (final char p_char)
{
ResultSet l_rs;
try
{
STROKES_STATEMENT.setInt(1
l_rs = STROKES_STATEMENT.executeQ
if(l_rs.next())
{
return l_rs.getString(1);
}
}
catch(Throwable l_t)
{
l_t.printStackTrace();
}
return null;
}
/**
* Dispose the database connectivity encapsulation.
*/
static final void dispose ()
{
try
{
STROKES_STATEMENT.close();
}
catch(Throwable l_t)
{
//
}
finally
{
STROKES_STATEMENT = null;
}
try
{
CONNECTION.close();
}
catch(Throwable l_t)
{
//
}
finally
{
CONNECTION = null;
}
}
}
how are you displaying them.
A ? typically means that the font did not have a glyph to display that character.
Check the actual value of the char to see if its being recieved ok.
A ? typically means that the font did not have a glyph to display that character.
Check the actual value of the char to see if its being recieved ok.
ASKER
no, this is not the case.
i can display asian characters correctly, if i try to display an int-to-char-cast, for example, they come correct.
plus i can watch them in the eclipse debugger, where they are displayed correct.
i can invert the characters i receive from the database to integers, and those characters i get from the
database have the same integer value as '?'.
This method would work even if i had no asian fonts or support installed at all.
the problem must be between the init method and the method get_strokes, i think.
i have checked in access if i can make a sql inside access that obtains the correct values and it works.
exactly the same query (select strokes from tbl_strokes where ascw(character)=XXX) works fine there.
(like it should work by definition).
have you an idea? all i use is the code above, maybe you can try it yourself.
the only relevant table in my access 2003 database is tbl_strokes, having only
two relevant columns: character = text(1) and strokes = memo.
The character is something, type for example 'A' into it.
in the memo there is some random text, including one unicode character,
like the chinese 一 (U+4E00, 0xD2BB).
the database is registered as odbc - system-dns under the name "chinese"
get_strokes only returns always ? instad of 一 at all places.
drnick
i can display asian characters correctly, if i try to display an int-to-char-cast, for example, they come correct.
plus i can watch them in the eclipse debugger, where they are displayed correct.
i can invert the characters i receive from the database to integers, and those characters i get from the
database have the same integer value as '?'.
This method would work even if i had no asian fonts or support installed at all.
the problem must be between the init method and the method get_strokes, i think.
i have checked in access if i can make a sql inside access that obtains the correct values and it works.
exactly the same query (select strokes from tbl_strokes where ascw(character)=XXX) works fine there.
(like it should work by definition).
have you an idea? all i use is the code above, maybe you can try it yourself.
the only relevant table in my access 2003 database is tbl_strokes, having only
two relevant columns: character = text(1) and strokes = memo.
The character is something, type for example 'A' into it.
in the memo there is some random text, including one unicode character,
like the chinese 一 (U+4E00, 0xD2BB).
the database is registered as odbc - system-dns under the name "chinese"
get_strokes only returns always ? instad of 一 at all places.
drnick
ASKER
I've found out that the problem lies in the java-odbc bridge. this thing cannot handle unicode properly for some reason.
There seems to be no possible way to work around this, so i had to make up another solution.
I've written a visual basic program exporting all data from the access database and imported it into a hsqldb.
Now i can work properly.
There seems to be no possible way to work around this, so i had to make up another solution.
I've written a visual basic program exporting all data from the access database and imported it into a hsqldb.
Now i can work properly.
Glad you got it working. Are you sure about the Unicode thing? References ..?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
props.put ("charSet", "UTF8");
Connection c = DriverManager.getConnectio