Nik, you describe an imaginary situation - is that the desired situation and you want to find a way to achieve it or do you already have this implemented and only need a way to uniquely name the files?
Sasha.
Hello experts,
Imagine the following situation:
Several machines are each running several system processes that happen to be JVMs. Each JVM runs the same software, which writes a file in a network directory somewhere. This file has to be assigned a unique name, so that only one file exists per JVM and no files get overwritten. Let us assume that when a JVM is shutting down, it will delete its corresponding file, and that even if it has crashed, there is a way to determine whether the particular file is "an orphan" without a running JVM. So forget about the reliability stuff.
The question is, how to generate the name of these files so it is 100% unique. The solution has to be 100% Pure Java. I have reached as far as putting the IP address of the machine in the name, but then I have to concatenate something else to distinguish the JVMs running on one machine. For machines with several network cards and IPs, we use one of them, as long as it won't change during runtime of any of the JVMs on that machine.
Cheers,
Nik
This question has been solved and verified by the asker.
Why don't you pass a parameter to java.exe to identify the JVM and use that parameter to generate the filename? Is it because you don't control the startup of your JVM?
I just got an idea: you could use a multicast socket to notify the other JVMs on your subnet that the filename XXX is now reserved, so the other JVMs put the name in an array!
To use multicasting you need to add an address between 224.0.0.0 and 239.255.255.255 to your server. We are using multicasting for a similar problem!
If you need info on multicasting just ask!
Not really, my proposed mechanism is only meant to notify the other JVMs that the current JVM is taking the name contained in the message sent! The problem with my solution is that with multicasting you cannot receive an ACK, because you don't know how many JVMs will receive the UDP packet!
I still think that an RMI server that provides unique IDs is the best solution.
I want to know which solution you will use, Nik!!
I do not control the starting up of the JVM.
As I mentioned above, I can use the IP of the machine to distinguish between JVMs on different machines. Now, how can I distinguish between the JVMs on the same machine? They all run my code plus some custom code.
I was hoping for a miracle, somebody giving me a 'secret' ID property of the JVM, but alas, these things are pretty much standardized.
One solution I have in mind is to use a Random generator, that does the following: generate a number, then generate a file name (using IP for example) and then look up whether any of the other file names for the same IP (if any) matches. If yes, generate another name. The only problem with this solution is again, the synchronization, since another JVM might be starting up at this moment checking for the name. The chances are pretty small for a name collision, but I want the best :-) (and meet the requirements).
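One way to sidestep the check-then-create race described above, sketched under assumptions rather than offered as a settled answer, is to let the file system arbitrate: `File.createNewFile()` reports whether this particular call created the file, so a JVM keeps trying random names until one call succeeds. The class name `UniqueFileClaim`, the `.jvm` suffix and the retry limit are hypothetical, and whether the create is truly atomic on a given network file system is something that would need to be verified.

```java
import java.io.File;
import java.io.IOException;
import java.util.Random;

/** Hypothetical helper: claims a file name by creating the file itself. */
final class UniqueFileClaim {

    /**
     * Tries random names of the form prefix-hex.jvm inside dir until one of
     * them is created by this call, then returns that file.
     */
    static File claim(File dir, String prefix, int maxAttempts) throws IOException {
        Random random = new Random();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String name = prefix + "-" + Long.toHexString(random.nextLong()) + ".jvm";
            File candidate = new File(dir, name);
            // createNewFile() returns true only if the file did not exist
            // and was created by this call; false means the name is taken.
            if (candidate.createNewFile()) {
                candidate.deleteOnExit(); // best-effort cleanup on normal shutdown
                return candidate;
            }
        }
        throw new IOException("no free name after " + maxAttempts + " attempts");
    }
}
```

A caller would pass the network directory and the machine's IP address as the prefix, for example `UniqueFileClaim.claim(sharedDir, ip, 10)`. The random part only has to make collisions rare; the create call is what decides who owns a name.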
Maybe a strategy like having two files (keys) per JVM, both unique, created with some time in between, so that another JVM checking file names at the same moment can do so while the first one is delaying the creation of its second file (key).
Did you follow? What do you think?
Cheers,
Nik
Yes, a pure Java solution. I've heard that file locking schemes don't work in some OSes, and my code will run on god knows what OS. You may be right that opening a file exclusively may be used as an independent locking scheme. Any other ideas, similar to the two-phase file generation? I'll have a look at some transactional systems and how they implement the two-phase commit.
Cheers,
Nik
OK Nik, how about that :
int RuntimeID = Runtime.getRuntime().hashC
int ThreadID = Thread.currentThread().has
take the ip (how do u get it, anyway?)
along with those two IDs, and u are almost 100% sure u got your unique ID.
I ran my code on several Netscapes, Explorers, java and applet viewers simultaneously, and I never saw both of them being identical.
>>I've heard that file locking schemes don't work in some OSes
Yes, I believe that's true...
Ok, Nik, how about this: Make a singleton class such as this:
public class Test{
private long jvmID = (long)(Math.random()*Long.
private static Test instance = null;
private Test(){}
public static Test getTest(){
if (instance==null)
instance = new Test();
return instance;
}
public static String getJVM_ID(){
return String.valueOf(Test.getTest().jvmID); // jvmID is a long, so convert it to match the String return type
}
}
and use getJVM_ID as the unique value for the JVM?
It looks like it should work... no?
Sasha.
WOW omry... great minds think alike -- I just spent the last 2 hours testing with the hashcode of Runtime. Unfortunately my tests showed it doesn't work. The hashCode returned by Runtime.hashCode includes its location in the internal memory of the JVM. What comes from my tests is that it is not very random...
Here is an example:
---- Test.java ----
import java.io.*;
public class Test{
public static void main(String [] args){
try{
InputStream in = new FileInputStream("jvmids.tx
OutputStream out = new FileOutputStream("jvmids1.
int b = 0;
while ((b = in.read()) != -1)
out.write(b);
in.close();
PrintStream pout = new PrintStream(out);
pout.println(Runtime.getRu
pout.flush();
pout.close();
new File("jvmids.txt").delete(
new File("jvmids1.txt").rename
} catch (IOException e){
e.printStackTrace();
}
}
}
---- end of Test.java ----
---- Run.java ----
import java.io.*;
public class Run{
public static void main(String [] args){
try{
for (int i=0;i<100;i++){
Runtime.getRuntime().exec(
System.out.println(i);
}
} catch (IOException e){
e.printStackTrace();
}
catch (InterruptedException e){
e.printStackTrace();
}
}
}
---- end of Run.java ----
---- Check.java ----
import java.io.*;
public class Check{
public static void main(String [] args){
try{
DataInputStream in = new DataInputStream(new FileInputStream("jvmids.tx
String [] jvmids = new String [10000];
int length = 0;
String jvmid;
while ((jvmid = in.readLine())!=null)
jvmids[length++] = jvmid;
for (int i=0;i<length;i++)
for (int j=i+1;j<length;j++)
if (jvmids[i].equals(jvmids[j
System.out.println("The JVM ID in line "+(i+1)+" equals the JVM ID in line "+(j+1));
in.close();
} catch (IOException e){
e.printStackTrace();
}
}
}
---- end of Check.java ----
Run the Run.java file first and then Check.java -- you will see many matches...
Sasha.
stupid me!
your code does the wrong test!
it proves that two virtual machines that run one after the other can return the same result for
Runtime.getRuntime().hashC
not that two JVMs running simultaneously can!
in your test, at any given moment, a maximum of two JVMs are running:
the one running Run.class, and the one running Test.class.
and what it checks is only the ID for Test.class, for this instance.
Test.class exits and Run opens a new JVM for the Test.class.
it means it does not prove me wrong!
...//..//..//..//..//..//.
one hour later, I have proven I was wrong.
this modified code proves that even for JVMs that are running at the same time,
Runtime.getRuntime().hashC
can return the same value, and that the same value actually comes from different JVMs (and it's not because of some cache magic of the OS or java.exe).
here it is (your Check.java will not work, but it's easy enough to spot dupes with the naked eye).
--- jvm_id.java
public class jvm_id{
private long jvmID = (long)(Math.random()*Long.
private static jvm_id instance = null;
private jvm_id(){}
public static jvm_id getJVMID(){
if (instance==null)
instance = new jvm_id();
return instance;
}
public static String getJVM_ID(){
return String.valueOf(getJVMID().
}
}
---- Run.java
import java.io.*;
public class Run{
public static void main(String [] args){
for (int i=0;i<100;i++)
{
new Runner(i).start();
}
}
}
class Runner extends java.lang.Thread
{
int n;
Runner(int n)
{
System.out.println("new Runner"+n);
this.n = n;
}
public void run()
{
try
{
Runtime.getRuntime().exec(
System.out.println("done with jvm #"+n);
}
catch (Exception e)
{
e.printStackTrace();
}
}
}
------ Test.java
import java.io.*;
public class Test{
public static void main(String [] args){
try{
InputStream in = new FileInputStream("jvmids.tx
OutputStream out = new FileOutputStream("jvmids1.
int b = 0;
while ((b = in.read()) != -1)
out.write(b);
in.close();
PrintStream pout = new PrintStream(out);
pout.println(args[0] + " : " + Runtime.getRuntime().hashC
pout.flush();
pout.close();
new File("jvmids.txt").delete(
new File("jvmids1.txt").rename
} catch (IOException e){
e.printStackTrace();
}
try{Thread.sleep(200);}cat
}
}
Omry, my test DOES work, it just doesn't do what you thought it does. I didn't try to prove that 2 JVMs running concurrently can return the same Runtime.getRuntime().hashC
Sasha.
the only thing u are forgetting, Sasha, is that two JVMs running at the same time are not using the same physical memory, so the actual memory address of each Runtime is different.
unfortunately, Java won't let us know that address, even indirectly through hashCode(), so it does not matter.
and btw: a whole singleton is just too much for such a simple thing.
this is simpler, and will also work fine:
class jvm_id
{
private static long id = 0;
public static long getID()
{
if(id == 0)
{
id = (long)(Math.random()*Long.
}
return id;
}
}
>> two JVMs running at the same time are not using the same physical memory
Yes, but if you KNOW the fact that the hashCode is just the internal memory location, you know that it doesn't matter :-)
I agree that the static number is simpler, but you need to use Long.MAX_VALUE-1 so that you don't overflow :-)
Sasha.
This of course doesn't prove anything, but from the documentation on Object.hashCode:
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java(TM) programming language.)
Sasha.
I agree that two JVMs can return the same hashCode() for the Runtime object.
Anyway, for the first approach 3 days ago, for something relatively random I used java.util.Random.nextLong(
Cheers,
Nik
Hmm, I thought the random number generator was implemented in native code that would allow access to a better time resolution than 55 ms on Win95/98, but it isn't. What this means is that my solution isn't random enough either: on Win95/98, 2 JVMs that get started within 55 ms of each other can easily return the same ID. The two comforting things I can think of are that the time of creation of my singleton is more random and that it takes much longer than 55 ms to start a JVM (but they can still be running concurrently). What do you do with random numbers to get a more random number? :-) You combine them somehow. What I suggest is that you use both the random number trick and the memory address of more than one object (not a singleton, just a static array of Objects that you instantiate at will). You can run some tests either by yourself or by modifying the testing code given above to check whether that is random enough, but I guess that if after creating some objects you create yet another java.util.Random and use that TOO, it should be very random.
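The combining idea can be sketched roughly as below. This is only an illustration under assumptions: the class name `MixedJvmId` and the number of probe objects are made up, `System.identityHashCode` stands in for "the memory address" (the JVM does not promise it is one), and `java.security.SecureRandom` is swapped in for a second `java.util.Random` because it does not rely on the clock alone for its seed. None of this makes a collision impossible; it only makes one less likely.

```java
import java.security.SecureRandom;

/** Hypothetical per-JVM ID that mixes several weak sources of variation. */
final class MixedJvmId {

    // Computed once, when the class is initialized, and then reused.
    private static final long ID = compute();

    private static long compute() {
        // Source 1: the wall clock (coarse on some platforms).
        long mix = System.currentTimeMillis();

        // Source 2: identity hash codes of a few freshly allocated objects.
        Object[] probes = new Object[8];
        for (int i = 0; i < probes.length; i++) {
            probes[i] = new Object();
            mix = mix * 31 + System.identityHashCode(probes[i]);
        }

        // Source 3: a generator that is not seeded from the clock alone.
        mix ^= new SecureRandom().nextLong();
        return mix;
    }

    /** Returns the same value for the whole lifetime of this JVM. */
    static long get() {
        return ID;
    }
}
```

The value would then go into the file name, for example as `Long.toHexString(MixedJvmId.get())` next to the IP address.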
As for a magic solution - the JVMs aren't aware of each other, so in theory they can be completely identical (except from outside one of them, where you can actually look at their location in the memory), which eliminates the possibility to positively uniquely identify a JVM.
Just wish that the random generator was better in Java (using time resolution in cpu ticks for example)...
Sasha.
btw Sasha, your statement:
"As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects"
is wrong.
hashCode may return the same ID for two different objects.
u may want to learn how a hashtable works internally, and u'll figure out why u are wrong.
Umm, Omry - it's not my statement, it's from the docs.
How about this idea - set up a server that would allow to connect to it and ask for an ID, the getID method would be synchronized and then it can just return 1,2,3,4 etc. Each time it's first run, your code will connect to that server and ask it for an ID - that ID will be the JVM ID.
The only problem that I can think of with that approach is that the server may not be able to handle that much load. However this is easy to solve - just divide the IPs into ranges and each range will connect to a different server (you'll set up more than 1 server), making sure each JVM gets a unique number...
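A minimal sketch of such an ID server, assuming plain sockets rather than RMI, could look like this. The class name `IdServer`, the one-long-per-connection wire format and the use of port 0 in the example are all assumptions made for illustration; the counter also restarts at 1 whenever the server restarts, which a real deployment would have to deal with.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical ID server: hands out 1, 2, 3, ... to whoever connects. */
final class IdServer implements Runnable {
    private final ServerSocket listener;
    private long lastId = 0;

    IdServer(int port) throws IOException {
        // Port 0 lets the OS pick a free port; a deployment would agree on a fixed one.
        this.listener = new ServerSocket(port);
    }

    int port() {
        return listener.getLocalPort();
    }

    /** Synchronized so that two callers can never receive the same number. */
    synchronized long nextId() {
        return ++lastId;
    }

    /** Serves one ID per connection until an I/O error occurs or stop() is called. */
    public void run() {
        try {
            while (true) {
                Socket client = listener.accept();
                try (DataOutputStream out = new DataOutputStream(client.getOutputStream())) {
                    out.writeLong(nextId());
                } finally {
                    client.close();
                }
            }
        } catch (IOException e) {
            // Listener closed or a client failed: this simple sketch just stops serving.
        }
    }

    void stop() throws IOException {
        listener.close();
    }

    /** Client side: connect, read one long, disconnect. */
    static long requestId(String host, int port) throws IOException {
        try (Socket socket = new Socket(host, port);
             DataInputStream in = new DataInputStream(socket.getInputStream())) {
            return in.readLong();
        }
    }
}
```

Each JVM would call `IdServer.requestId(host, port)` once at startup and use the returned number in its file name.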
Sasha.
Hehe, ranak - here is the source code that calculates this so called unique ID:
private static byte[] computeAddressHash() {
/** Get the local host's IP address. */
byte[] addr = (byte[]) java.security.AccessContro
new PrivilegedAction() {
public Object run() {
try {
return InetAddress.getLocalHost()
} catch (Exception e) {}
return new byte[] { 0, 0, 0, 0 };
}
});
byte[] addrHash;
final int ADDR_HASH_LENGTH = 8;
try {
/** Calculate message digest of IP address using SHA. */ MessageDigest md = MessageDigest.getInstance(
out.write(addr, 0, addr.length);
out.flush();
byte digest[] = md.digest();
int hashlength = Math.min(ADDR_HASH_LENGTH,
addrHash = new byte[hashlength];
System.arraycopy(digest, 0, addrHash, 0, hashlength);
} catch (IOException ignore) {
/* can't happen, but be deterministic anyway. */
addrHash = new byte[0];}
catch (NoSuchAlgorithmException complain) {
throw new InternalError(complain.toS
}
return addrHash;
}
What it essentially does is take the IP address of the host and apply SHA to it. I'm not sure how SHA exactly works, but since it doesn't have an access to a better random number generator - it can't give you a more random number than what I proposed (I think SHA is deterministic anyway...).
Sasha.
>>each new jvm will look for the first free port
Well, you'd better use the last free port and not the first one. Otherwise, I think it's a possible solution, but still needs to be tested, because I'm not quite sure what happens when you try to listen on a port that someone is already listening on, but I think it would throw an IOException. But overall, the solution sounds like it will work (what about the case when you aren't allowed to open a ServerSocket?).
Nik, why can't you use the server solution I gave?
Sasha.
Sasha, I do not want to use a server, as someone has to take care of starting it up on the host. Also, I do not want to use a fixed common resource (port) to resolve the reference (socket) to this server, since I do not know how many JVMs the client may start, which one first, etc. For example, in the case of a socket application for the uniqueness generator, if the port is occupied for some reason and the first free port is used, how are the other JVMs going to know about it?
My JVM can be 1.2 and higher.
ranak's suggestion java.rmi.dgc.VMID
"Create a new VMID. Each new VMID returned from this constructor is unique for all Java virtual machines under the following conditions: a) the conditions for uniqueness for objects of the class java.rmi.server.UID are satisfied, and b) an address can be obtained for this host that is unique and constant for the lifetime of this object."
led me to
http://java.sun.com/produc
"Create a pure identifier that is unique with respect to the host on which it is generated. This UID is unique under the following conditions: a) the machine takes more than one second to reboot, and b) the machine's clock is never set backward. In order to construct a UID that is globally unique, simply pair a UID with an InetAddress. "
Sasha, Omry_y, I have the feeling that this is it. What do you think? I'll give it a try on Monday... I hope the toString() for this UID will always produce something that I can put in a name of a file (e.g. a number)
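For what it's worth, a UID can be instantiated without any other RMI machinery, and pairing it with the host address as the quoted documentation suggests might look like the sketch below. The class name `UidFileName`, the `unknown-host` fallback and the character whitelist are assumptions. The string form of a UID contains colons, which are not valid in file names on every OS, so the sketch replaces anything outside a conservative set of characters.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.rmi.server.UID;

/** Hypothetical helper: builds a file name from the host address and a UID. */
final class UidFileName {

    static String newName() {
        String host;
        try {
            host = InetAddress.getLocalHost().getHostAddress();
        } catch (UnknownHostException e) {
            host = "unknown-host"; // assumed fallback; it weakens uniqueness across machines
        }
        // UID.toString() yields hex fields separated by ':'.
        String raw = host + "_" + new UID();
        // Keep letters, digits, '_', '.' and '-'; replace everything else with '_'.
        return raw.replaceAll("[^A-Za-z0-9_.-]", "_");
    }
}
```

This inherits the conditions stated in the documentation quoted above, so it is only as unique as UID itself is on the host.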
Cheers,
Nik
> I'm not quite sure what happens when you try to listen on a port that someone is already listening on, but I think it would throw an IOException
well, it's the same problem as with file locking: it may work on some systems / JVMs and not work on others (but it's better than file locking; I've used it several times)
Source code:
public UID() {
synchronized (mutex) {
if (lastCount == Short.MAX_VALUE) {
boolean done = false;
while (!done) {
time = System.currentTimeMillis()
if (time < lastTime+ONE_SECOND) {
// pause for a second to wait for time to change
try {
Thread.currentThread().sle
} catch (java.lang.InterruptedExce
continue;
} else {
lastTime = time;
lastCount = Short.MIN_VALUE;
done = true;
}
}
} else {
time = lastTime;
}
unique = hostUnique;
count = lastCount++;
}
}
Same randomness as Math.random() run on 2 JVMs simultaneously. It just uses the time, making sure that the method never returns twice in the same second.
Sasha.
> two programs running on the same computer cannot listen to the same port.
ABSOLUTELY NOT TRUE (on all kinds of Unix-es and as far as I know on Windows).
you can bind() (BSD sockets) more than one socket to the same IP / port pair (using the appropriate socket options).
but all the Java implementations that I've seen won't bind ServerSocket to some port if some other Socket already listens there.
> is only available for applications with speciel needs,
well - BSD Sockets support multiple applications (even different processes) listening on the same socket (IP address / port) (the one that will handle first incoming socket connection is chosen randomly ??).
I'm not sure what 'special needs' is :) but as I already said, none of the Java implementations that I've been working with support this behaviour.
Also, I heard a rumour that MS is planning on releasing a new TCP stack version that allows multiple uses of the same sockets.
I had another idea, one that I used to create useful timestamps for distributed messages. I used NTP (there are lots of servers), which basically means sending some UDP packets to a server. The result is a very accurate time (in microseconds), so if I ask the time host of the current intranet (there is one in almost any intranet) or a well-known host on the internet, it'll give me a good initializer for the Random. If all JVMs (running my code) on the same machine use the same server, they have a good chance of having different initializers for their Randoms. Do you think this'll work?
Note that the delay that is inevitably there is more or less predicted and removed by NTP, but some error always remains. I think this is insignificant when I use NTP to get values for initializing the Random (however, it was significant for the timestamps in the events that I produced, so I used some other techniques too, namely logical clocks).
Cheers,
Nik
>>>Each JVM runs the same software, that writes a file in a network directory somewhere. This file have to be assigned unique name, so that only one file exists per JVM and no files get overwritten.
Pardon me if I am wrong but can't you do a list of the current files and then form your unique file name somehow ?
I still don't see why using network ports is not good enough.
"Also, I do not want to use a fixed common resource (port) to resolve the reference (socket) to this server, since I do not know how many JVMs the client may start, which one first, etc. For example, in the case of a socket application for the uniqueness generator, if the port is occupied for some reason and the first free port is used, how are the other JVMs going to know about it?"
ports are virtually unlimited in that context.
u have thousands of them, and one computer can only handle so many JVMs, so using a "fixed common resource" is not a problem here.
you know enough to be sure he will not have more than 1,000 JVMs on the same machine, right?
which one starts first is not a problem.
u can state that your JVMs use ports 233000 up to 234000.
each JVM will look for the first available slot in this range, starting with 233000, and increasing if someone is listening.
a listener may be one of your JVMs, or just another program. it does not really matter.
and besides, using network communication u can find out if it's one of your JVMs (just ask through the socket u just opened who it is).
I don't see any problem with this approach.
OK, how about synchronization then? After finishing the scan, I will spend a line or two of source code calculating the next ID, and then try to allocate the next port. What if it is already occupied by that time? Just continue with the next port? Won't there be finer synchronization problems that will pop up later?
For the moment Random + init via NTP time is the runner up for the solution.
Cheers,
Nik
>>I would rather use ServerSocket based locking too
I thought that's what was meant from the beginning, you don't try to connect to ports (scanning, as Nik said - is that what he meant?) but instead try opening ServerSockets until you don't get an IOException when trying to open one... That should be pretty synchronized - you don't need any calculations, you decide on the port and open the socket in the same operation (which is done by the OS, so it should be synched, except for the BSD sockets heyhey described)...
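The open-until-it-works loop can be sketched like this, as an illustration rather than a tested cross-platform answer. The class name `PortClaim` is made up, and the range used in the usage note is hypothetical: port numbers are 16-bit, so they cannot exceed 65535, and the range quoted earlier in the thread would have to be scaled down. The sketch also assumes that the Java implementation refuses to bind a second ServerSocket to a port that is already being listened on.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical helper: the bound port number doubles as a per-host JVM ID. */
final class PortClaim {

    /**
     * Binds the first free port in [first, last] and returns the open socket.
     * The caller keeps the socket open for as long as it wants to hold the ID.
     */
    static ServerSocket claim(int first, int last) throws IOException {
        for (int port = first; port <= last; port++) {
            try {
                // Choosing and taking the port is one operation: the constructor binds it.
                return new ServerSocket(port);
            } catch (IOException busy) {
                // Port taken by another JVM or by any other program: try the next one.
            }
        }
        throw new IOException("no free port in " + first + ".." + last);
    }
}
```

A JVM would call something like `PortClaim.claim(49152, 49251)` at startup and put `getLocalPort()` of the returned socket into the file name after the IP address. Because there is no separate "check" step before the bind, the gap between scanning and allocating that Nik asked about does not arise.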
Sasha.
Just FYI
multiple processes can't bind to single port.
This is allowed in case of Multicast.
Multicast is in between uniCast and Broadcast, where all the members subscribed to this multicast group can listen at that port.
So many classes in the rmi package are not just for RMI , which is also used by CORBA , distributed computing.
For more info on Multicast and using RMI packages look at the JINI docs.
java.rmi.server.ObjID is unique in VM.
It is to be used in association with java.rmi.dgc.VMID to identify any object uniquely across JVMs
java.rmi.server.UID is unique id on a host.
heyhey ,
as you said, some flavors of UNIX allow multiple processes to bind to a single port. If I recall correctly, instead of INADDR_ANY they can use the SO_REUSEPORT flag (similar to SO_REUSEADDR), but the caveat is that all users must specify the flag. I think that Solaris 2.6 does too.
ranak is right that this is mostly used along with Multicast and is not possible unless all processes use the SO_REUSEPORT option.
using Multicast, multiple sockets (processes) can bind to a single port.
You can refer to any Unix network programming book, such as Richard Stevens' book.
Multiple sockets are not allowed to bind to the same port for security reasons.
They can do that if all sockets set the SO_REUSEADDR option and other options.
This was provided to support Multicast.
Ref:
http://www.cs.ucl.ac.uk/st
http://www.lcg.org/sock-fa
http://foldoc.doc.ic.ac.uk
ranak, VMID uses UID, and you saw the source code of its constructor. It uses random, the way I use it. This means the granularity is around several milliseconds, and if two JVMs are started within 1 ms, they'll get the same IDs.
heyhey, this is exactly why I am reluctant to use sockets. Different OSes, different sockets. Anyway, I'll try this too.
Cheers,
Nik
In NT/2000 there is a thing called high-resolution clocks. It is an API that can deliver callbacks at microsecond resolution, depending on the speed of the machine. Anyway, Java doesn't use that. The JMF does, when the native support is installed.
The UID() can be instantiated from anywhere I think, what do you mean with the 'server' stuff, Sasha? If I would use UID, I would only instantiate it, without any other RMI shtuff.
Cheers,
Nik
Sasha, isn't the constructor of UID public? This means I can use it directly and explicitly. I do not really care whether the RMI server will use it internally to generate its UIDs; the problem is that this UID, under some circumstances, will not be unique between two JVMs running on the same machine. I want this to happen as rarely as possible.
Remember, I do not want a centralized server, that will give me ids, as it is a bottleneck and a single point of failure.
Cheers,
nik
It has been a good discussion. I close it now. Since I already had in mind all the random related stuff I won't be grading these ideas, and I believe omry_y has passed on the idea about the server sockets that is perfectly usable with Java.
Thank you all for the ideas, I hope you learned new things too.
ranak, I am rejecting your answer, as it turns out UID is used by VMID, and UID uses random stuff again.
Cheers,
Nik
by: diakov, posted on 2000-07-14 at 06:26:00, ID: 3370246
Note that using a random generator does not give 100% uniqueness between the JVMs. I imagine it is allowed to use a constant number (for example one) of small shared helper files to store some information on the network drive along with the unique files.
Cheers,
Nik