Solved

Why does this compile?

Posted on 2003-11-27
Last Modified: 2012-06-27
Well, I know why it does; I was just wondering at what point, and in what part of the Java API, this gets converted into proper code?

Is it at compilation?  Or execution?

Hehehe...now THIS is unmaintainable code ;-)

File:  a.java-----------------------------------------
/*
\u002a\u002f\u0070\u0075\u0062\u006c\u0069\u0063
\u0020\u0063\u006c\u0061\u0073\u0073\u0020\u0061
\u007b\u0070\u0075\u0062\u006c\u0069\u0063\u0020
\u0073\u0074\u0061\u0074\u0069\u0063\u0020
\u0076\u006f\u0069\u0064\u0020
\u006d\u0061\u0069\u006e\u0028
\u0053\u0074\u0072\u0069\u006e\u0067\u005b\u005d\u0061
\u0029\u007b\u0053\u0079\u0073\u0074\u0065\u006d\u002e
\u006f\u0075\u0074\u002e
\u0070\u0072\u0069\u006e\u0074\u006c\u006e\u0028
\u0022\u0048\u0069\u0022
\u0029\u003b\u007d\u007d\u002f\u002a
*/
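Decoding the escapes by hand shows what javac ends up compiling. The sketch below is a deliberately simplified decoder: the class and method names are made up for illustration, and it ignores the finer lexical rules (repeated 'u' characters, escaped backslashes). It is not how javac is implemented, just a way to read the file above:

```java
class EscapeDecoder {
    // Simplified: treats every backslash followed by 'u' and four hex digits
    // as an escape. Throws NumberFormatException if the digits are not hex.
    static String decode(String src) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < src.length()) {
            boolean isEscape = src.charAt(i) == '\\'
                    && i + 6 <= src.length()
                    && src.charAt(i + 1) == 'u';
            if (isEscape) {
                // Parse the four hex digits and append the character they name.
                out.append((char) Integer.parseInt(src.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(src.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // First escaped line of a.java, with the backslashes doubled so that
        // they reach the decoder as text instead of being translated by javac.
        String firstLine = "\\u002a\\u002f\\u0070\\u0075\\u0062\\u006c\\u0069\\u0063";
        System.out.println(decode(firstLine)); // prints */public
    }
}
```

So the very first escaped line closes the opening comment and starts the class declaration, and the last one reopens a comment that the final */ closes.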
Question by:TimYates
16 Comments
 
LVL 15

Assisted Solution

by:JakobA
JakobA earned 1000 total points
ID: 9832727
Because Unicode escapes (\uHHHH) are read as source by the javac compiler.
 
LVL 20

Expert Comment

by:Venabili
ID: 9832801
The unicode characters are converted as part of preparation for compilation.

Here for example is the explanation for one such character:
http://www.javaspecialists.co.za/archive/Issue050.html

But as far as I know, they are all converted at that point.
 
LVL 20

Accepted Solution

by:Venabili
Venabili earned 1000 total points
ID: 9832901
Well - according to this:
http://www.chinalinuxpub.com/doc/oreillybookself/java/langref/ch02_01.htm

The conversion is not from Unicode to something else, but from all the other encodings TO Unicode... Quoting from that page:
"
Since most operating environments do not support Unicode, Java uses a pre-processing phase to make sure that all of the characters of a program are in Unicode. This pre-processing comprises two steps:


Translate the program source into Unicode characters if it is in an encoding other than Unicode. Java defines escape sequences that allow all characters that can be represented in Unicode to be represented in other character encodings, such as ASCII or EBCDIC. The escape sequences are recognized by the compiler, even if the program is already represented in Unicode.

Divide the stream of Unicode characters into lines.

Conversion to Unicode
The first thing a Java compiler does is translate its input from the source character encoding (e.g., ASCII or EBCDIC) into Unicode. During the conversion process, Java translates escape sequences of the form \u followed by four hexadecimal digits into the Unicode characters indicated by the given hexadecimal values. These escape sequences let you represent Unicode characters in whatever character set you are using for your source code, even if it is not Unicode. For example, \u0000 is a way of representing the NUL character. "

I suppose it is a better explanation than the other one. But in any case the explanation is: in pre-processing, all the characters are turned into the same representation (no matter whether you use 'b' or the Unicode escape for it...)
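One way to check the "compilation, not execution" part is to compare an escape written in the source with the same six characters assembled at run time. A small sketch (the class and method names are hypothetical):

```java
class CompileTimeEscapes {
    // Written as an escape in the source file: javac translates it before
    // tokenizing, so the literal holds the single character 'A'.
    static String fromSource() {
        return "\u0041";
    }

    // Assembled at run time from a backslash, a 'u' and four digits.
    // Nothing at run time translates this, so it stays six characters long.
    static String fromRuntime() {
        return "\\" + "u0041";
    }

    public static void main(String[] args) {
        System.out.println(fromSource().length());    // prints 1
        System.out.println(fromRuntime().length());   // prints 6
        System.out.println(fromSource().equals("A")); // prints true
    }
}
```

The class file only ever contains the already-translated character, which fits the description of the conversion as a pre-processing step of the compiler.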
 
LVL 15

Expert Comment

by:JakobA
ID: 9832980
and so 'a.java' gets compiled to 'a.class' :-)
 
LVL 35

Author Comment

by:TimYates
ID: 9833026
But shouldn't the character sequence:

\u002b

for example, only be valid inside a String literal of a java file?
 
LVL 20

Expert Comment

by:Venabili
ID: 9833052
The preprocessing changes all the characters, not only the ones in String literals. The idea is to avoid trouble with encodings, I suppose. So it is a valid character no matter where it is.

"A Java program is a sequence of characters. These characters are represented using 16-bit numeric codes defined by the Unicode standard.[1] Unicode is a 16-bit character encoding standard that includes representations for all of the characters needed to write all major natural languages, as well as special symbols for mathematics. Unicode defines the codes 0 through 127 to be consistent with ASCII. Because of that consistency, Java programs can be written in ASCII without any need for programmers to be aware of Unicode. "

From the second link... It looks like the normal ASCII coding we use is just for programmers' convenience...
Why don't you look at the link? I'll try to find some more info about all this....
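To make that concrete with the \u002b from the question: because the escapes are translated before the compiler splits the source into tokens, they can stand in for identifiers and operators as well as characters in a String. A minimal sketch (class and method names are hypothetical):

```java
class EscapeAnywhere {
    static int answer() {
        // The escape below is the letter 'a', so this declares: int a = 40;
        int \u0061 = 40;
        // The escape below is '+', so this reads: return a + 2;
        return a \u002b 2;
    }

    public static void main(String[] args) {
        System.out.println(answer()); // prints 42
    }
}
```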
 
LVL 15

Expert Comment

by:JakobA
ID: 9833074
Nope.  It is just *also* valid there.   javac thinks in Unicode, so it is not really the Unicode literals that are translated. It is everything else that is converted to Unicode from whichever alphabet the file is written in.
 
LVL 20

Expert Comment

by:Venabili
ID: 9833096
JakobA,

Can you give us any source for why you think so?
 
LVL 15

Expert Comment

by:JakobA
ID: 9833130
Nothing better than what you yourself have already given. Guess I butted in. sorry.
 
LVL 20

Expert Comment

by:Venabili
ID: 9833141
JakobA,

No problem. It's just an interesting thing (and something that not everyone has ever thought about), so do you have any other sources?

Venabili
 
LVL 35

Author Comment

by:TimYates
ID: 9833263
> The preprocessing changes all the characters

So Java DOES have a preprocessor?

Why am I not allowed #ifdef then? ;-)

Bah!
 
LVL 20

Expert Comment

by:Venabili
ID: 9833269
A question I ask myself every time when I start writing in Java after a few days writing in C :)

It looks like the compiler does some pre-processing...

 
LVL 35

Author Comment

by:TimYates
ID: 9835595
Thanks to both of you :-)

It still seems wrong...

Either it should only do chars in String constants, or allow me #defines ;-)

Hee hee!

Back to moving house!! :-)

Tim
 
LVL 15

Expert Comment

by:JakobA
ID: 9838124
Thanks.

Isn't the #ifdef of C intended for machine-specific situations?
   If this comp has 16-bit words then do this, else do that.

In the enthusiasm of "we are making a machine-independent language" that would be left out.

Anyway, #ifdef (and particularly #define) are some of the prime 'shoot yourself in the foot' features of C. I tend to say good riddance :-)

regards JakobA
 
LVL 20

Expert Comment

by:Venabili
ID: 9838535
Absolutely agree that it seems wrong but... can we do anything? :)

Venabili
 
LVL 35

Author Comment

by:TimYates
ID: 9848838
> I tend to say good riddance :-)

Yeah, but they were SOOOO useful for building the same source up for different machines/releases, etc

#ifndef SHAREWARE
// Full release code in here
...
...
...
#endif

:-(
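For what it's worth, the usual Java stand-in for this kind of #ifndef is a static final boolean constant tested with an ordinary if. A sketch, assuming a hypothetical SHAREWARE flag and class name; the compiler is allowed to leave out the branch that the constant rules out, but unlike with the C preprocessor both branches still have to compile:

```java
class BuildFlags {
    // Hypothetical build flag: change it and recompile to switch editions.
    static final boolean SHAREWARE = false;

    static String edition() {
        if (!SHAREWARE) {
            // Full release code in here
            return "full";
        } else {
            // Restricted code for the shareware build
            return "shareware";
        }
    }

    public static void main(String[] args) {
        System.out.println(edition()); // prints full
    }
}
```

It is not a real preprocessor: the flag lives in the source, and it cannot remove whole methods, fields or imports the way #ifndef can.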