Solved

Display Arabic content

Posted on 2012-03-20
Last Modified: 2012-06-27
Hi, we have to collect Arabic text from a web GUI, store it in an Oracle 11g database, and display it back on another page. We are able to collect and store the data. However, when we read the same text back from the database, the JSP pages display some weird characters. Our JSPs have the charset set to UTF-8. Could you please let us know if we need to take care of anything else while reading the data from Oracle or in the JSP?
Question by:damarasa
10 Comments
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37745984
You need to configure your database to handle UTF-8 properly. I'm no Oracle expert, but I guess the following (Oracle) variables are used for that:
  ORA_NLS  ORA_NLS32  NLS_LANG
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 37747433
> collect the arabic text from web gui
You will also need to be sure that the web gui uses the correct character encoding and direction.  This article gives some of the background on character set encodings.
http://www.joelonsoftware.com/articles/Unicode.html

Another possible solution would be to base64_encode() the information for transport and storage.  This script worked for another question here at EE.
http://www.laprbass.com/RAY_temp_markjulie.php
<?php // RAY_temp_markjulie.php
error_reporting(E_ALL);

// CONNECTION AND SELECTION VARIABLES FOR THE DATABASE
$db_host = "localhost"; // PROBABLY THIS IS OK
$db_name = "??";        // GET THESE FROM YOUR DBA / HOSTING PROVIDER
$db_user = "??";
$db_word = "??";

// REMOVE THIS FOR YOUR TESTS
require_once('RAY_live_data.php');

// OPEN A CONNECTION TO THE DATA BASE SERVER
// MAN PAGE: http://php.net/manual/en/function.mysql-connect.php
if (!$dbcx = mysql_connect("$db_host", "$db_user", "$db_word"))
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>NO DB CONNECTION: ";
    echo "<br/> $errmsg <br/>";
}

// SELECT THE MYSQL DATA BASE
// MAN PAGE: http://php.net/manual/en/function.mysql-select-db.php
if (!$db_sel = mysql_select_db($db_name, $dbcx))
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>NO DB SELECTION: ";
    echo "<br/> $errmsg <br/>";
    die('NO DATA BASE');
}
// IF WE GOT THIS FAR WE CAN DO QUERIES


// CREATE A TABLE TO TEST WITH
$sql = "CREATE TEMPORARY TABLE my_table
( _key  INT  NOT NULL AUTO_INCREMENT
, thing TEXT NOT NULL
, PRIMARY KEY(_key)
)
"
;
$res = mysql_query($sql);

// IF mysql_query() RETURNS FALSE, GET THE ERROR REASONS
if (!$res)
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>QUERY FAIL: ";
    echo "<br/>$sql <br/>";
    die($errmsg);
}


// GET SOME ARABIC, ENCODE IT, AND PUT IT INTO THE DATA BASE
$arabic = file_get_contents('http://www.atoalif.com/ar/');
$arabic = base64_encode($arabic);
$sql = "INSERT INTO my_table (thing) VALUES ('$arabic')";
$res = mysql_query($sql);

// IF mysql_query() RETURNS FALSE, GET THE ERROR REASONS
if (!$res)
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>QUERY FAIL: ";
    echo "<br/>$sql <br/>";
    die($errmsg);
}

// GET THE AUTO_INCREMENT ID OF THE RECORD JUST INSERTED
// MAN PAGE: http://php.net/manual/en/function.mysql-insert-id.php
$_key  = mysql_insert_id();


// MAKING A SELECT QUERY AND TESTING THE RESULTS
$sql = "SELECT * FROM my_table WHERE _key = $_key LIMIT 1";
$res = mysql_query($sql);

// IF mysql_query() RETURNS FALSE, GET THE ERROR REASONS
if (!$res)
{
    $errmsg = mysql_errno() . ' ' . mysql_error();
    echo "<br/>QUERY FAIL: ";
    echo "<br/>$sql <br/>";
    die($errmsg);
}


// GET THE RESULTS SET AND DECODE IT
while ($row = mysql_fetch_assoc($res))
{
    echo base64_decode($row["thing"]);
}

 
LVL 51

Expert Comment

by:ahoffmann
ID: 37750286
maybe a bit off-topic:
> Another possible solution would be to base64_encode() ...
Please test with proper encoding and decoding functions on both ends of the transport (i.e. browser and PHP) first; have fun with the string €uro ;-)
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 37750563
The €uro happens to be at position 128 in the ANSI character set.  This is hex 80.  The presence of the high-order bit is also used by Unicode.  Fortunately life is full of second chances!
0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37751830
> ..  This is hex 80.  
have fun
hex 80 is the euro sign only in some Microsoft code pages (e.g. Windows-1252); it should be hex 20ac (the Unicode code point U+20AC) or hex e282ac (its UTF-8 encoding)
conclusion: don't expect proper decoding unless you know 101% which charset was used for encoding
@Ray_Paseur, no offence, just a reminder that encoding and decoding with different assumptions on both ends of the traffic ends in a nightmare, sometimes, somehow

@damarasa, things get more exciting with arabic (or hindi, or ...)  charsets ;-)
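
The byte values quoted in this exchange can be checked with a few lines of Java. This is a minimal sketch using only the standard library; the class name `EuroBytes` and the helper `hex` are made up for illustration, and it assumes the JVM ships the `windows-1252` charset (common, but not one of the charsets Java guarantees).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EuroBytes {
    // Format a byte array as space-separated uppercase hex, e.g. "E2 82 AC".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The euro sign, written as an escape so the source file's own encoding does not matter.
        String euro = "\u20AC";
        // Assumption: this JVM provides windows-1252 (it is optional in the Java spec).
        Charset cp1252 = Charset.forName("windows-1252");

        System.out.println("code point:   U+" + String.format("%04X", euro.codePointAt(0)));
        System.out.println("windows-1252: " + hex(euro.getBytes(cp1252)));
        System.out.println("UTF-8:        " + hex(euro.getBytes(StandardCharsets.UTF_8)));

        // Decoding the three UTF-8 bytes with the wrong charset yields three unrelated characters.
        String wrong = new String(euro.getBytes(StandardCharsets.UTF_8), cp1252);
        System.out.println("UTF-8 bytes read as windows-1252: " + wrong.length() + " chars");
    }
}
```

The same character is one byte (80) in Windows-1252 and three bytes (E2 82 AC) in UTF-8, which is why both ends have to agree on the charset.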

Author Comment

by:damarasa
ID: 37752429
Thanks for the comments/suggestions so far.  I am trying to provide some more details to get more help from all around:
(i) we have Oracle 11g DB created with AL32UTF8 character set.

(ii) Our HTMLs/JSPs are using utf-8 encoding
<%@page pageEncoding="UTF-8" contentType="text/html; charset=utf-8" %>

(iii) Arabic is entered in the HTML form and the same string is stored in the DB (we tried both VARCHAR and NVARCHAR) - it looks garbled in Toad.

(iv) when the same data is retrieved and displayed on the page - again it looks garbled.

(v) interestingly, when the Arabic content is stored into the DB after explicit re-encoding, we can read it back and it is displayed on the page properly - Arabic characters are displayed.
Insert into table column values(new String(formData.getBytes("UTF-8"), "8859_1"))

(vi) similarly, when the data is written into the DB without any re-encoding, we tried re-encoding the data after it is retrieved from the DB - then the page also displays correct Arabic letters.
string = new String(resultSet.getString(column).getBytes("UTF-8"), "8859_1");

So, please let us know where we can improve to avoid explicit encoding of data while writing OR while reading  and still can display arabic content correctly.
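
One plausible reading of why the re-encoding in (v) and (vi) helps is that some layer on the output path writes characters as ISO-8859-1 while the page is declared as UTF-8. The sketch below only illustrates that hypothesis with the standard library; the class and method names (`ReencodeWorkaround`, `wrap`, `writeAsLatin1`) are invented for the example, and the thread does not confirm where the ISO-8859-1 step actually sits.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReencodeWorkaround {
    // The workaround from (v) and (vi): encode as UTF-8, then reinterpret each byte as one ISO-8859-1 character.
    static String wrap(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // What an output layer does if it writes characters as ISO-8859-1 (an assumption, not confirmed by the thread).
    static byte[] writeAsLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // A short Arabic word, written as escapes so the source file's encoding does not matter.
        String arabic = "\u0645\u0631\u062D\u0628\u0627";

        byte[] expected = arabic.getBytes(StandardCharsets.UTF_8); // what a UTF-8 page needs on the wire
        byte[] direct = writeAsLatin1(arabic);                      // Arabic has no ISO-8859-1 mapping
        byte[] viaWorkaround = writeAsLatin1(wrap(arabic));

        System.out.println("direct write matches UTF-8:     " + Arrays.equals(direct, expected));
        System.out.println("workaround write matches UTF-8: " + Arrays.equals(viaWorkaround, expected));
        System.out.println("direct write bytes as text:     " + new String(direct, StandardCharsets.US_ASCII));
    }
}
```

Under that assumption the direct write degrades every Arabic character to a question mark, while the wrapped string happens to produce the right UTF-8 bytes. That suggests the workaround masks a charset mismatch rather than fixing it.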

Please guide us to any link that has comprehensive example.

Thanks a bunch in advance,
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37752679
(iii) well, most tools have problems with special charsets (unfortunately "special" is anything outside plain old 7-bit US-ASCII)
  luckily, it doesn't matter what the charset in the DB is

if your DB is used for storing text and giving it back to be displayed elsewhere, the charset is not that important
things are different if you want to use your DB for proper selections and sorting

up to now I understand your question to mean that you just have problems with displaying data (in the browser). This should simply work if your application (PHP or whatever) uses the same charset when writing to and reading from the DB; then you need to ensure that the data you get from the browser is encoded the way you expect it
in short: use the same charset to send data to the browser as you used when receiving data
 

Accepted Solution

by:damarasa
ID: 37787156
Sorry for a little silence from my side. My problem is solved.
Basically, my goal is to enable Arabic support in my existing web application built using Struts 1.2. In other words, we should display Arabic labels, allow users to key in Arabic characters, store the Arabic strings entered into the forms in the Oracle DB, retrieve and display the Arabic stored in the DB, let it be modified and stored again, and so on.

With basic changes like
(i) charset setting at the DB (NLS_CHARACTERSET=AL32UTF8)
(ii) page encoding in every JSP changed to UTF-8, 'dir=rtl' applied in the HTML body tag, and character encoding for the Request and Response objects set to UTF-8
(iii) running the JVM with -Dfile.encoding=utf8
etc., we were able to see Arabic content flowing between pages smoothly. However, when we attempted to store it in the DB and retrieve it, the content was garbled.

On further probing, what we realized is 'Struts is the culprit'.
If the data passed through any action servlet, its encoding got changed. The solution we found is to set up a FILTER that explicitly sets the character encoding of the request and response objects to UTF-8.
Pls. let me know in case of questions.
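
The filter idea can be sketched without the servlet API on the classpath. This is a rough simulation, not the poster's actual filter: `FakeRequest`, `Handler`, `filterThenHandle` and `formBody` are hypothetical stand-ins, and the ISO-8859-1 default mirrors what servlet containers of that era use when a request names no charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class EncodingFilterSketch {
    // Hypothetical stand-in for a servlet request: it keeps the raw form body and
    // decodes it with whatever encoding is set at the moment a parameter is read.
    static class FakeRequest {
        private final String rawBody;
        private String characterEncoding = "ISO-8859-1"; // assumed container default

        FakeRequest(String rawBody) { this.rawBody = rawBody; }

        void setCharacterEncoding(String encoding) { this.characterEncoding = encoding; }

        String getParameter(String name) {
            for (String pair : rawBody.split("&")) {
                String[] kv = pair.split("=", 2);
                if (kv.length == 2 && kv[0].equals(name)) {
                    try {
                        return URLDecoder.decode(kv[1], characterEncoding);
                    } catch (UnsupportedEncodingException e) {
                        throw new IllegalArgumentException(e);
                    }
                }
            }
            return null;
        }
    }

    // Hypothetical stand-in for the rest of the chain (for example the action servlet).
    interface Handler {
        String handle(FakeRequest request);
    }

    // The filter idea: force UTF-8 before anything downstream reads a parameter.
    static String filterThenHandle(FakeRequest request, Handler next) {
        request.setCharacterEncoding("UTF-8");
        return next.handle(request);
    }

    // Build the body a UTF-8 page would submit for a single form field.
    static String formBody(String name, String value) {
        try {
            return name + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0631\u062D\u0628\u0627";
        String body = formBody("name", arabic);
        Handler action = request -> request.getParameter("name");

        String unfiltered = action.handle(new FakeRequest(body));
        String filtered = filterThenHandle(new FakeRequest(body), action);

        System.out.println("without filter intact: " + arabic.equals(unfiltered));
        System.out.println("with filter intact:    " + arabic.equals(filtered));
    }
}
```

In a real Struts 1.x application the equivalent would presumably be a class implementing `javax.servlet.Filter` that calls `request.setCharacterEncoding("UTF-8")` and `response.setCharacterEncoding("UTF-8")` before `chain.doFilter(...)`, mapped to all URLs in `web.xml`. The request call only takes effect if it runs before any parameter is read.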
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37787215
nice finding, congrats
 

Author Closing Comment

by:damarasa
ID: 37818615
Though I posted the question to the forum I kept on probing for solution and finally I got it. I posted the solution as my comment - hence accepted my own comments.

Additional Info:
As a rule of thumb, if the web application is built using Struts 1.x, one must set up a servlet filter to explicitly set the character encoding of the Request and Response objects.
This is not necessary when the web app is built with plain JSP/Servlet technologies.