Problem with German characters passed as request

I am working on a German site where users are allowed to enter German characters in the text fields.

There is a form where the user enters a name and contents.
These values are then stored in the DB and displayed on a page that has a "Delete" link for each row.
When the user clicks the Delete link, the name is passed to a servlet as a request parameter (in the URL).
The servlet gets the search name from the request and deletes the row from the DB.

The problem occurs when the name field consists entirely of German characters.
Suppose the user saved the values with
name : < Ö¬ßÖ?Üo¦â»Æ+ >

When this value is passed as a request

 <a href="controller?action=deleteRow&searchName=<%= sName %>&param1=<%=param1%>" onclick="return confirmDelete()"> Delete</a><br />

and the servlet gets it using request.getParameter(), it is retrieved as

< Ö¬ßÖ?Üo¦â»Æ >  //the last + sign is missing

This problem occurs when the name consists entirely of German characters. If there is even one English character in the name, it works perfectly fine.
Please tell me how to resolve this.
Asked by: thomas908
 
objects Commented:
and try:

java.net.URLEncoder.encode(sName, "UTF8")
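To illustrate what the explicit charset argument does, here is a minimal sketch that uses only java.net.URLEncoder. The class name, the helper encodeParam and the sample value are made up for this example; "UTF-8" is the canonical name of the charset that the "UTF8" alias above refers to.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodeGermanParam {

    // Hypothetical helper: percent-encodes a query-string value as UTF-8 bytes.
    public static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Sample name with an o-umlaut and a sharp s, written as Unicode
        // escapes so the encoding of the source file cannot interfere.
        String sName = "Gr\u00F6\u00DFe";
        System.out.println("controller?action=deleteRow&searchName=" + encodeParam(sName));
        // prints controller?action=deleteRow&searchName=Gr%C3%B6%C3%9Fe
    }
}
```

Each German character becomes two percent-encoded bytes, because that is how UTF-8 represents it.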

Holger101497 Commented:
> Even if there is just one english character in the name, it works perfectly fine.

This is surprising :-) Have you tested this with more than one browser? :-)

MOST probably (>90%), this is a problem with your URL. Put a debug statement into your code to see if you get the correct name in the code at all (just a little general troubleshooting help :-)

In general, you HAVE TO escape all URL parameters (which could contain anything other than alphanumerics) like this:
action=deleteRow&searchName=<%=java.net.URLEncoder.encode(sName)%>

This will turn every "illegal character" (like German special characters, but also spaces and &-signs) into the "well known" %20-thingies...

Just imagine what happens if your sName is "Tom&Jerry" - your URL now turns into "....&searchName=Tom&Jerry&param1=..."

which will OF COURSE give you a search name of "Tom" and a strange empty parameter called Jerry :-)
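The Tom&Jerry case can be reproduced with a small sketch. The parseQuery method below is only a simplified stand-in for what a servlet container does with a query string (an assumption for illustration, not any container's actual code), and the value "x" for param1 is made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class TomAndJerry {

    // Simplified stand-in for query-string parsing: split on '&',
    // split each pair on the first '=', then URL-decode both halves.
    public static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        try {
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                String key = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                params.put(URLDecoder.decode(key, "UTF-8"),
                           URLDecoder.decode(value, "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
        return params;
    }

    // Percent-encodes one parameter value.
    public static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sName = "Tom&Jerry";
        String raw = "action=deleteRow&searchName=" + sName + "&param1=x";
        String escaped = "action=deleteRow&searchName=" + encode(sName) + "&param1=x";

        System.out.println(parseQuery(raw));
        // prints {action=deleteRow, searchName=Tom, Jerry=, param1=x}
        System.out.println(parseQuery(escaped));
        // prints {action=deleteRow, searchName=Tom&Jerry, param1=x}
    }
}
```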

Try it, let me know...

CEHJ Commented:
What encoding have you set for the page containing the form?
 
objects Commented:
try using utf8 for your pages

Holger101497 Commented:
> try using utf8 for your pages

well, he could, but that's not strictly "required" - the default encoding (ISO-8859-1) works quite fine for Germany :-)
Also, just changing that won't help because you could still have stuff like "&" or quotation marks in the data that will "break the link" if not urlencoded

P.S.:
< Ö¬ßÖ?Üo¦â»Æ >  //the last + sign is missing

Yes, that's exactly "part of the problem" I described. I'm really surprised everything else works, but probably the browser does a lot of work for you based on "smart guessing". However, if it finds a "+" in the URL, it doesn't do anything to it because a "+" is the official (or unofficial?) "encoding" for a space. Therefore, the browser doesn't touch it, but the servlet "decodes" it and turns it into a space....
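A short sketch of the "+" behaviour, using java.net.URLDecoder as a stand-in for the decoding step on the server side (that a given container decodes exactly this way is an assumption). The class name and the sample value "A+" are made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class PlusSign {

    // Decodes a form-urlencoded value; a literal '+' is read as a space.
    public static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encodes a value; a literal '+' is written as %2B.
    public static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unencoded: the plus sign is lost on decoding.
        System.out.println("[" + decode("A+") + "]");          // prints [A ]
        // Encoded first: the plus sign survives the round trip.
        System.out.println(encode("A+"));                      // prints A%2B
        System.out.println("[" + decode(encode("A+")) + "]");  // prints [A+]
    }
}
```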

> Even if there is just one english character in the name, it works perfectly fine.

Are you REALLY sure of that? Even if the string contains a "+"-sign??? Or a "&"? Or a quotation mark??? (that would break your HTML!)

Cheers,

Holger

thomas908 (Author) Commented:
>>In general, you HAVE TO escape all URL parameters (which could contain anything other than alphanumerics) like this:
action=deleteRow&searchName=<%=java.net.URLEncoder.encode(sName)%>

I did that and now I get < ?????o???+ > from the request

CEHJ Commented:
>>and now I get

How are you viewing this? If at the console, that normally won't support higher 'ascii' characters

objects Commented:
You'll need to use a font that supports the characters you are using to be able to see what they are.

Holger101497 Commented:
See? You get the "+" sign ok now :-D

Well... I don't know exactly what you're doing, character encoding seems to be a problem. Some people always say "use UTF-8" if there are any problems. That often works, but really isn't usually required and can actually cause problems depending on the encoding your files really have!

Which encoding do you use for your page(s)?
What's the default Locale on the server?

What's the page source for that input field now? Is this an encoding problem or a decoding problem?

thomas908 (Author) Commented:
>>Are you REALLY sure of that? Even if the string contains a "+"-sign??? Or a "&"? Or a quotation mark???
No, you are right. This does not always work. A couple of times when I had put in some English characters it worked, but that was just a coincidence.

thomas908 (Author) Commented:
>> try using utf8 for your pages

I have this at the top of my JSP page
<%@ page language="java" contentType="text/html; charset=UTF-8" %>

thomas908 (Author) Commented:
Do I need to set the encoding in the servlet as well, where I am using request.getParameter() to get the name from the request?

thomas908 (Author) Commented:
I am viewing the names in the browser (they display perfectly fine). When I click on the Delete link corresponding to each name (the name is sent as a request parameter to a servlet, as specified above), the servlet calls request.getParameter() to get the name. But it doesn't get the correct name (as mentioned in the discussion above).

Instead of

<a href="controller?action=deleteRow&searchName=<%=java.net.URLEncoder.encode(sName)%>&param1=<%=param1%>" onclick="return confirmDelete()"> Delete</a><br />

Should I make it a form and submit it using JavaScript (with POST)?

objects Commented:
By the way, why are you using the name for the lookup? Wouldn't the primary key be a better choice? You wouldn't have any encoding issues then, either.

thomas908 (Author) Commented:
>>java.net.URLEncoder.encode(sName, "UTF8")
That works great. Thanks a lot objects.

Thanks everyone for pitching in.

Holger101497 Commented:
pheeeeeew....

So I'm gone for 30 minutes, the question is closed and I get 20% of the points although I had basically already answered it... I'm REALLY not here for the points (what can I buy for them?), but this is the reason I don't do EE frequently any more - you really need to be here "full-time" or not at all :-(((
I would've had to answer within 10 minutes...

btw.:
This:
> Which encoding do you use for your page(s)?
> What's the default Locale on the server?

really was the solution. If the two are different, you can get into trouble - which can mostly be solved, but if the two were the same, it wouldn't be there in the first place. If your page didn't specify UTF-8-encoding, everything would've worked the way I had initially suggested... (and if they remain different, you might run into further trouble when using some libraries which don't handle this perfectly - sometimes they can't because they have no idea they're being called from a page with a different encoding...). I still recommend  using the same encoding...
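The mismatch can be shown in isolation with a small round trip. This sketch assumes the encoding side uses UTF-8 and the decoding side uses ISO-8859-1; which charset a particular server actually applies to query strings depends on its configuration and is not stated in this thread. The class and method names are made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class CharsetMismatch {

    // Encodes with one charset and decodes with another.
    public static String roundTrip(String value, String encodeCharset, String decodeCharset) {
        try {
            String encoded = URLEncoder.encode(value, encodeCharset);
            return URLDecoder.decode(encoded, decodeCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String u = "\u00FC"; // u-umlaut, written as a Unicode escape

        // Same charset on both sides: the character survives.
        System.out.println(roundTrip(u, "UTF-8", "UTF-8").equals(u));       // prints true

        // Different charsets: the two UTF-8 bytes are read as two
        // separate ISO-8859-1 characters (U+00C3 and U+00BC).
        System.out.println(roundTrip(u, "UTF-8", "ISO-8859-1").equals(u));  // prints false
        System.out.println(roundTrip(u, "UTF-8", "ISO-8859-1").length());   // prints 2
    }
}
```

With the same charset on both sides the value comes back unchanged, which is why keeping the page encoding and the server-side decoding consistent avoids the problem.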

*sigh*

Well, glad everything works now anyways...

Holger (no, not a "newbie expert")

thomas908 (Author) Commented:
Thanks Holger for your inputs
Question has a verified solution.