Go Premium for a chance to win a PS4. Enter to Win

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1926
  • Last Modified:

Reading chinese characters using POI

Hi,

I have a file(.xls) which contains some chinese characters. Can you tell me
whether POI can read the chinese characters from the cells.

I do not wanted solution using "sun.jdbc.odbc.JdbcOdbcDriver" because my first 2 row has some information and depending on header starts from row 3 or 4.

and using sun.jdbc.odbc.JdbcOdbcDriver select query will be difficult

Therefore POI solution will be highly required.

Please Guide

Thanks
0
tia_kamakshi
Asked:
tia_kamakshi
  • 13
  • 11
  • 7
1 Solution
 
CEHJCommented:
afaik Unicode is supported
0
 
objectsCommented:
POI has had some issues handling chinese characters correctly, but I think most of them have been fixed. So make sure you are usong the latest version.

You will also need to correctly set the encoding used when writing each cell

0
 
tia_kamakshiAuthor Commented:
Please give me some example code to read excel file to correct encoding.

What latest version I should use for POI? What version you recommend.

I am getting excel file from client. I can open excel file and I am able to read chinese charaters from excel file

Please suggest with example

Many Thanks

0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
tia_kamakshiAuthor Commented:
Thanks for your reply.

In this example we are reading plain text in an excel file

Which I am alreadng.

I need to read chinese characater

Please help me in reading chinese from POI

All chinese character is displaying as ????

Please Guide
0
 
objectsCommented:
Did you set the cell encoding?

 cell.setEncoding(HSSFCell.ENCODING_UTF_16);

And how are you displaying the cell text? Make sure you are using a font that supports chinese
0
 
CEHJCommented:
If you post your file, we can take a look
0
 
tia_kamakshiAuthor Commented:
Please find sample excel file attached. Please see product sheet only

Many Thanks
VLookupSample.xls
0
 
CEHJCommented:
Using this : http://technojeeves.com/joomla/index.php/free/113-read-excel-spreadsheet-with-java  i was able to read it. You can see the result below:
chin.png
0
 
tia_kamakshiAuthor Commented:
Hi,

I have also ran the code in Netbean6.5. My output was comming as ????

Then I again my jar file with the same code in command prompt, there also I get the output as ?????

I am using POI version 3.7. what POI version you have used?

Please find attached my output attached and librairies used

Please suggest
screenshot.PNG
librariesScreenshot.PNG
0
 
tia_kamakshiAuthor Commented:
With addition to above comments.

As suggested by objects to use

 cell.setEncoding

I don't know why my cell do not have method setEncoding

My HSSFCell do not have method setEncoding

Error says cannot find the symbol setEncoding

HSSFCell cell = (HSSFCell) list.get(j);
cell.setEncoding(HSSFCell.ENCODING_UTF_16);

Currently I am displaying text as
System.out.print(cell.getRichStringCellValue());

It is displaying output on Netbean6.5 output window.

Also I tried on command prompt.


I have the same issue

Please suggest
0
 
CEHJCommented:
>>Then I again my jar file with the same code in command prompt, there also I get the output as ?????

You're VERY unlikely ever to see Chinese characters in a Windows command prompt. There are several hoops to jump through before that can happen

>>I have also ran the code in Netbean6.5. My output was comming as ????

You need to ensure that Netbeans is using a font that supports Chinese
0
 
objectsCommented:
> System.out.print(cell.getRichStringCellValue());

you cannot do that to display chinese
instead save it to a file and display file with application that supports chinese
0
 
tia_kamakshiAuthor Commented:
Hi,

I have saved the text in file. Still no joy.

In output file I am still getting chinese character as ??????

Please see my log file attached with code I am using

Please guide

Many Thanks
package com.pbms.exceltest.businesslogic;

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFCellStyle;
import org.apache.poi.hssf.usermodel.HSSFDataFormat;
import org.apache.poi.hssf.usermodel.HSSFFont;
import org.apache.poi.hssf.usermodel.HSSFRichTextString;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.util.CellRangeAddress;

/* ====================================================================
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
==================================================================== */
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import java.util.Iterator;
import org.apache.log4j.Logger;

/**
 * File for HSSF testing/examples
 *
 * THIS IS NOT THE MAIN HSSF FILE!! This is a utility for testing functionality.
 * It does contain sample API usage that may be educational to regular API
 * users.
 *
 * @see #main
 * @author Andrew Oliver (acoliver at apache dot org)
 * @author Charles Johnson
 */
public class HssfDumper {
    Logger mL = Logger.getLogger(HssfDumper.class);
    /**
     * creates an {@link HSSFWorkbook} the specified OS filename.
     */
    private static HSSFWorkbook readFile(String filename)
        throws IOException {
        return new HSSFWorkbook(new FileInputStream(filename));
    }

    public void processExcel(String file) {
//        if (file.length < 1) {
//            System.err.println("java HssfDumper <file to dump>");
//
//            return;
//        }

        ExcelTestBaseBL baseBL = new ExcelTestBaseBL();

        String fileName = file;

        try {
            HSSFWorkbook wb = HssfDumper.readFile(fileName);

            for (int k = 0; k < wb.getNumberOfSheets(); k++) {
                HSSFSheet sheet = wb.getSheetAt(k);
                int numRows = sheet.getPhysicalNumberOfRows();
                System.out.printf("Sheet %d '%s' has %d row(s)\n", k,
                    wb.getSheetName(k), numRows);

               // mL.info("Sheet %d '%s' has %d row(s)\n"+wb.getSheetName(k)+""+numRows );

                for (Iterator<Row> rows = sheet.rowIterator(); rows.hasNext();) {
                    HSSFRow row = (HSSFRow) rows.next();

                    if (row == null) {
                        continue;
                    }

                    int numCells = row.getPhysicalNumberOfCells();
                    System.out.printf("Row %d has %d cell(s)\n",
                        row.getRowNum(), numCells);

                    for (Iterator<Cell> cells = row.cellIterator();
                            cells.hasNext();) {
                        HSSFCell cell = (HSSFCell) cells.next();
                        String value = null;

                        switch (cell.getCellType()) {
                            case HSSFCell.CELL_TYPE_FORMULA:
                                value = "FORMULA value=" +
                                    cell.getCellFormula();

                                break;

                            case HSSFCell.CELL_TYPE_NUMERIC:
                                value = "NUMERIC value=" +
                                    cell.getNumericCellValue();

                                break;

                            case HSSFCell.CELL_TYPE_STRING:
                                value = "STRING value=" +
                                    cell.getStringCellValue();

                                break;

                            default:
                        }

                        System.out.printf("Cell (%d,%d) %s\n",
                            cell.getRowIndex(), cell.getColumnIndex(), value);

                        mL.info(value);
                        baseBL.saveLog(value);
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Open in new window

// Saving log to file in attached (Log.txt)

public void saveLog(String text)
            throws Exception {
        if (text != null) {
            try {
                String fileName = "E:\\SMEC\\Logs\\ExcelTest\\log.txt";
                FileWriter fstream = new FileWriter(fileName, true);
                BufferedWriter out = new BufferedWriter(fstream);
                out.write(text);
                String newLine = System.getProperty("line.separator");
                out.write(newLine);
                out.close();
                fstream.close();
            } catch (Exception ex) {
                mL.error("Unable to print text: " + text, ex);
                System.out.print("Unable to print " + text);
                System.exit(0);
            }
        }
    }

Open in new window

log.txt
0
 
CEHJCommented:
You don't specify any encoding for output, so it will use that returned by

System.out.println(System.getProperty("file.encoding"));

(which probably doesn't support Chinese)



0
 
CEHJCommented:
I adapted the code to use an optional utf-8-encoded output file

http://technojeeves.com/joomla/index.php/free/113-read-excel-spreadsheet-with-java

Attached is the result:




xls-out.txt
0
 
tia_kamakshiAuthor Commented:
I have also used the same code I have posted above

What I have changed is I have modified main method to method processExcel() and then called it from my main method

And then I am prinitng it to my text file using method saveLog(value);

I have also pasted my savelog method above which is printing text to file

Pasting my this code below again

I can see in the link you have sent that PrintWriter is using UTF-8 as follows

out = new PrintWriter(new OutputStreamWriter(
                            new FileOutputStream(args[1]), "UTF-8"));


Do I need to modify to my method saveLog(String text) to reveive and write in UTF-8 encode only

Please see my code and advice

Many Thanks
// Saving log to file in attached (Log.txt)

public void saveLog(String text)
            throws Exception {
        if (text != null) {
            try {
                String fileName = "E:\\SMEC\\Logs\\ExcelTest\\log.txt";
                FileWriter fstream = new FileWriter(fileName, true);
                BufferedWriter out = new BufferedWriter(fstream);
                out.write(text);
                String newLine = System.getProperty("line.separator");
                out.write(newLine);
                out.close();
                fstream.close();
            } catch (Exception ex) {
                mL.error("Unable to print text: " + text, ex);
                System.out.print("Unable to print " + text);
                System.exit(0);
            }
        }
    }

Open in new window

0
 
CEHJCommented:
>>Do I need to modify to my method saveLog(String text) to reveive and write in UTF-8 encode only

Yes - or another encoding that allows Chinese, but since utf-8 is universal, i'd stick with that
0
 
CEHJCommented:
Incidentally, in case you don't realise, you can just use log4j (encoding set to utf-8) to write to a file instead of using that highly inefficient saveLog method
0
 
tia_kamakshiAuthor Commented:
Hi CEHJ,

These are my test methods. In my project I need to create output as html file after extracting these values.

Therefore I cannot use log4j

Can you please help me in modifying method saveLog()

Then I may need to change the code at line 123 (baseBL.saveLog(value);) in my main code

Please help me in fixing this problem

Many Thanks
0
 
CEHJCommented:
Why don't you simply use the code at the link i posted?
0
 
objectsCommented:
> Why don't you simply use the code at the link i posted?

Thats just a copy of what I had already posted earlier

for your saveLog() method try this:


public void saveLog(String text)
            throws Exception {
        if (text != null) {
            try {
                String fileName = "E:\\SMEC\\Logs\\ExcelTest\\log.txt";
                Writer fstream = new OutputStreamWriter(new FileOutputStream(fileName, true));
                BufferedWriter out = new BufferedWriter(fstream);
                out.write(text);
                String newLine = System.getProperty("line.separator");
                out.write(newLine);
                out.close();
                fstream.close();
            } catch (Exception ex) {
                mL.error("Unable to print text: " + text, ex);
                System.out.print("Unable to print " + text);
                System.exit(0);
            }
        }
    }
0
 
tia_kamakshiAuthor Commented:
Thanks for your replies

I am using the same code you have posted in the URL.

You can see my code is exactly same what you have in same URL

Modifying the method "saveLog(String text)" also has not helped

Please guide
0
 
CEHJCommented:
The last code posted in this question is wrong btw. As i mentioned above, you need to use UTF-8 in your Writer
0
 
tia_kamakshiAuthor Commented:
Can you please help me in fixing the code which saves text in my file

public void saveLog(String text)
            throws Exception {
        if (text != null) {
            try {
                String fileName = "E:\\SMEC\\Logs\\ExcelTest\\log.txt";
                Writer fstream = new OutputStreamWriter(new FileOutputStream(fileName, true));
                BufferedWriter out = new BufferedWriter(fstream);
                out.write(text);
                String newLine = System.getProperty("line.separator");
                out.write(newLine);
                out.close();
                fstream.close();
            } catch (Exception ex) {
                mL.error("Unable to print text: " + text, ex);
                System.out.print("Unable to print " + text);
                System.exit(0);
            }
        }
    }
0
 
CEHJCommented:
I'm wondering why you're not using ready-made code that works and does what you want? (The code i posted)

As i mentioned, you need to specify UTF-8 encoding


Writer fstream = new OutputStreamWriter(new FileOutputStream(fileName, true), "UTF-8");

Open in new window

0
 
objectsCommented:
> Can you please help me in fixing the code which saves text in my file

good to see your using what I suggested above, all you need is to now specify the correct encoding, for example if it is UTF-8 then you would use:

public void saveLog(String text)
            throws Exception {
        if (text != null) {
            try {
                String fileName = "E:\\SMEC\\Logs\\ExcelTest\\log.txt";
                Writer fstream = new OutputStreamWriter(new FileOutputStream(fileName, true), "UTF-8");
                BufferedWriter out = new BufferedWriter(fstream);
                out.write(text);
                String newLine = System.getProperty("line.separator");
                out.write(newLine);
                out.close();
                fstream.close();
            } catch (Exception ex) {
                mL.error("Unable to print text: " + text, ex);
                System.out.print("Unable to print " + text);
                System.exit(0);
            }
        }
    }


0
 
CEHJCommented:
>>... for example if it is UTF-8 then you would use: ...

There's

a. no need to repeat code already posted
b. no need to reinforce the use of bad practice
0
 
objectsCommented:
Grow up. I didn't repeat anything, I was just expanding on my earlier suggestion
0
 
tia_kamakshiAuthor Commented:
Many Thanks it worked great now
0
 
CEHJCommented:
:)
0

Featured Post

Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

  • 13
  • 11
  • 7
Tackle projects and never again get stuck behind a technical roadblock.
Join Now