02.13.2003 at 10:47AM PST, ID: 20512969
character encoding (unicode to utf-8) conversion problem
Tags: java, encoding
I have run into a problem that I can't seem to find a solution to.

My users are copying and pasting from MS Word. My DB is Oracle, with its encoding set to "UTF-8".

Oracle's thin driver automatically converts strings to the DB's default character set.

When Java tries to encode Unicode to UTF-8 and runs into an unknown character (typically a character in the high-ASCII range), it substitutes '?' or some other weird character.

How do I prevent this?

I tried different encodings using a simple driver like:
class UnicodeConversionTest
{
    public static void main(String[] args)
    {
        try {
            String str = new String("`test3`");
            String utfStr = new String(str.getBytes("UTF-8"), "UTF-8");
            System.out.println("Converted:" + str + " to:" + utfStr);
        } catch (Exception e) {
            e.printStackTrace(System.out);
        }
    }
}

But that didn't work.  Then I tried a more elaborate conversion:
import sun.io.CharToByteConverter;
import sun.io.ByteToCharConverter;

public class UnicodeTest {
 public UnicodeTest() {
 }

 public static void main(String[] args) {

   UnicodeTest unicodeTest1 = new UnicodeTest();

   try {
     ByteToCharConverter fromUnicode = ByteToCharConverter.getConverter("US-ASCII");
     char[] subChars = { ' ' };
     fromUnicode.setSubstitutionMode(true);
     fromUnicode.setSubstitutionChars(subChars);
     String originalStr = new String("test3");
     char[] convertedChars = fromUnicode.convertAll(originalStr.getBytes());
     String convertedStr = new String(convertedChars);
     //String convertedStr = new String(originalStr.getBytes("US-ASCII"), "US-ASCII");
     System.out.println("String:" + originalStr + " converted to:" + convertedStr);
   } catch (Exception e) {
     e.printStackTrace(System.out);
   }
 }
}

I tried a variation of the second code snippet that inserts into the DB - just to see the results and it was a no go.

I don't want '?' replacing the unknown chars. I would rather strip them or replace them with ' ', but I haven't been able to get that to work (using the second bit of code).
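One way to get a space instead of '?' is the public java.nio.charset API (available since Java 1.4) rather than the internal sun.io converters. The following is a minimal sketch, not a confirmed fix for this thread: the class and method names are made up for illustration, and ISO-8859-1 is only an assumed target charset.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, not part of the original post.
class SpaceSubstitution {

    // Encodes text in the named charset, writing a space for every character
    // the charset cannot represent, then decodes the bytes back to a String.
    static String replaceUnmappable(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(new byte[] { (byte) ' ' });
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
            return charset.decode(bytes).toString();
        } catch (CharacterCodingException e) {
            // Not expected while both error actions are REPLACE.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Curly double quotes (U+201C, U+201D) do not exist in ISO-8859-1.
        String pasted = "\u201Ctest3\u201D";
        System.out.println("[" + replaceUnmappable(pasted, "ISO-8859-1") + "]"); // prints "[ test3 ]"
    }
}
```

Note that UTF-8 itself can represent every Unicode character, so this substitution only has an effect when the target charset is narrower than Unicode.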

Any ideas on what I am doing wrong?

Thanx,
CJ
Question Stats
Zone: Programming
Question Asked By: cheekycj
Solution Provided By: orangehead911
Participating Experts: 7
Solution Grade: A
Views: 749
02.13.2003 at 10:51AM PST, ID: 7944022

Rank: Genius

>>System.out.println("Converted:" + str + " to:" + utfStr);

You're talking about '?' getting printed out unexpectedly by the above code, I take it?
 
02.13.2003 at 11:15AM PST, ID: 7944214
yes.

This works:
convertedStr = convertedStr.replace('\ufffd', ' ');

But I was hoping for a solution that wouldn't require me to replace the chars manually.
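For context, '\ufffd' is the Unicode replacement character, which Java's decoders emit when bytes are not valid in the charset being decoded. A small sketch of how it can appear and then be swapped for a space, as in the line above; the sample bytes are an assumption (a Windows-1252 curly apostrophe, 0x92), not data from this thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original post.
class ReplacementCharDemo {

    // Decodes raw bytes as UTF-8; malformed sequences become U+FFFD.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "it's" with a Windows-1252 curly apostrophe (0x92), which is not valid UTF-8 on its own.
        byte[] raw = { 'i', 't', (byte) 0x92, 's' };
        String decoded = decodeAsUtf8(raw);

        // Index of the replacement character inside the decoded string.
        System.out.println(decoded.indexOf('\uFFFD')); // prints 2

        // The manual cleanup from the post above.
        System.out.println("[" + decoded.replace('\uFFFD', ' ') + "]"); // prints "[it s]"
    }
}
```

Once a character has been decoded to U+FFFD, the original byte is gone, so the cleanup can only hide the damage, not recover the apostrophe.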

CJ
 
02.13.2003 at 11:16AM PST, ID: 7944220

Rank: Wizard

It sounds to me like the problem you're having is that the output device can't handle the odd characters! I have seen the same result in the past.

Have you tried a round trip, meaning inserting from Java and then selecting from Java, showing the final result in a JTextField?

I am pretty sure that the driver works just fine, and that a call to Statement.execute or Statement.executeUpdate would encode your strings correctly.

\t
Accepted Solution
 
02.13.2003 at 11:21AM PST, ID: 7944255
When I use the following (for testing purposes):

      String insertSql = "insert into unicode_test (string_id, string_value) values (?,?)";
      PreparedStatement ps = conn.prepareStatement(insertSql);
      ps.setInt(1, maxID);
      ps.setString(2, convertedStr);

      int rowsInserted = ps.executeUpdate();

The DB gets an inverted '?' stored as the character.

So the conversion that the JDBC driver is doing uses a weird character, and I don't want that character to be stored or displayed in my tools or site.

Retrieving the value from the DB and displaying it returns the string with the '?' or inverted '?'
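The thread mentions both an inverted '?' and '\ufffd'; these are different code points (U+00BF and U+FFFD), and a console or tool may render either one loosely. A rough diagnostic sketch that prints the code points of a string, so the value read back from the DB can be inspected without relying on how it is displayed. The class name and sample string are illustrative only.

```java
// Hypothetical diagnostic helper, not part of the original post.
class CodePointDump {

    // Returns the code points of s as space-separated "U+XXXX" tokens.
    static String dump(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a value fetched with ResultSet.getString(...).
        String fromDb = "a\u00BF\uFFFD";
        System.out.println(dump(fromDb)); // prints "U+0061 U+00BF U+FFFD"
    }
}
```

Running the same dump on the string just before ps.setString(...) and again after selecting it back would show at which step the character changes.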

CJ
 
02.13.2003 at 11:24AM PST, ID: 7944282

Rank: Genius

>>yes.

You can't print Unicode characters to the console. Just show them in a GUI component and you'll probably find they're OK.

Another (rough) way to test:

String s = "\u20AC";
System.out.println(s.getBytes("UTF8").length);

You'll see that the length > 1
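The same check can be extended to a few more characters to see how many bytes each takes in UTF-8. This is only an illustrative sketch; the class name is made up, and StandardCharsets (Java 7+) is used to avoid the checked exception from the string-named overload.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original post.
class Utf8Lengths {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));      // prints 1 (ASCII)
        System.out.println(utf8Length("\u00E9")); // prints 2 (e with acute accent)
        System.out.println(utf8Length("\u20AC")); // prints 3 (euro sign)
        System.out.println(utf8Length("\u2019")); // prints 3 (right single quotation mark)
    }
}
```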

 
02.13.2003 at 11:27AM PST, ID: 7944300

Rank: Genius

OK - I'm not getting updates from this thread. You're ahead of me!
 
02.13.2003 at 11:33AM PST, ID: 7944353
:-)

It's not the console I am worried about; it is the data being stored in the DB, which has this ugly character that other languages querying that data are displaying.

CJ
 
02.13.2003 at 11:33AM PST, ID: 7944357

Rank: Wizard

Why are you converting the string before you send it to the DB? The driver should be handling all necessary conversion!

\t
 
02.13.2003 at 11:39AM PST, ID: 7944406
The driver's conversion is resulting in the string being stored with those inverted '?' or '\ufffd' chars in the DB.  I don't want those in the DB.  Since the jdbc driver's conversion is doing this, I want to convert before the driver does it, so any unsupported chars are not replaced with ugly characters in the DB.
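If the offending characters turn out to be Word's "smart" punctuation (an assumption; the thread does not say which characters are involved), one pre-conversion option is to map them to plain ASCII look-alikes before handing the string to the driver. A minimal sketch with a hypothetical helper name and a deliberately short mapping table:

```java
// Hypothetical helper, not part of the original post.
class WordPunctuation {

    // Replaces common typographic punctuation with ASCII equivalents
    // and leaves every other character untouched.
    static String toAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\u2018': // left single quotation mark
                case '\u2019': // right single quotation mark
                    out.append('\'');
                    break;
                case '\u201C': // left double quotation mark
                case '\u201D': // right double quotation mark
                    out.append('"');
                    break;
                case '\u2013': // en dash
                case '\u2014': // em dash
                    out.append('-');
                    break;
                case '\u2026': // horizontal ellipsis
                    out.append("...");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String pasted = "\u201Ctest3\u201D \u2013 it\u2019s\u2026";
        System.out.println(toAscii(pasted)); // prints: "test3" - it's...
    }
}
```

This keeps the text readable instead of blanking the characters, at the cost of maintaining the mapping by hand.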

CJ
 
02.13.2003 at 11:41AM PST, ID: 7944423

Rank: Genius

That's what I was about to ask too.

>>that other languages querying that data are displaying.

They're all displaying the same thing are they?
 
02.13.2003 at 11:48AM PST, ID: 7944481
Well, the other languages querying the DB (besides Java) are Perl, ColdFusion and C.

They display the inverted '?' b/c that is what they get back from the DB.

ColdFusion and Perl also insert into the DB. ColdFusion is having the same problem as Java. Perl reads an environment variable called 'NLS_LANG', and setting its encoding to 'WE8ISO8859P1' fixes the issue there.

The OCI Or