Solved

Linux Java RMI connection refused to host

Posted on 2011-03-10
Last Modified: 2012-08-13
We have developed a Java program with a client/server setup built on Java RMI. We are trying to deploy it in a Linux environment on Amazon Web Services EC2. The client and server connect when both run on a single machine, but fail to connect when run on separate machines, where we get the following error:

java.rmi.ConnectException: Connection refused to host: LOCAL_IP_CLIENT; nested exception is:
      java.net.ConnectException: Connection refused
Question by:aseisman
94 Comments
 
LVL 86

Expert Comment

by:CEHJ
ID: 35098195
Are they on the same network or different ones?
0
 

Author Comment

by:aseisman
ID: 35098222
They are on the same network and should have all communication open between them (i.e., they can ping each other and have file shares set up).
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35098285
Can you telnet to it using the RMI port?
0
 

Author Comment

by:aseisman
ID: 35098337
This is what I tried:

telnet 10.110.9.85:1099
telnet: 10.110.9.85:1099: Name or service not known
10.110.9.85:1099: Unknown host

it hangs for a second before giving the error
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35098618

When you start the RMI registry and then start the server, does it connect
to the registry without any errors?
Perhaps you could put a System.out printout after the start of the
server object and its connection to the registry.
0
 

Author Comment

by:aseisman
ID: 35098636
When I start the client on the same machine as the server is running, the client is able to connect without a problem.

This error is the same error I get when running the client on the same machine as the server when the server is NOT running.

In addition, we had this software running on a Windows network and are trying to get it to run on Linux.

The above makes me think that it has something to do with system-level settings, but we have been going at this for days and are really not sure how to proceed.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35098637
That should be
telnet 10.110.9.85 1099
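
The same reachability check can be scripted from Java, which is handy on hosts without telnet. The following is a minimal sketch, not part of the original thread; the class name PortProbe and the 2-second timeout are assumptions chosen for illustration. It only shows whether a TCP connection to the registry port can be opened. It says nothing about the remote objects themselves, which by default are exported on a different, anonymous port.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {

    /** Returns true if a TCP connection to host:port can be opened within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "connection refused", timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Address and port taken from the thread; pass your own as arguments.
        String host = args.length > 0 ? args[0] : "10.110.9.85";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1099;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

Run it from the client machine against the server address, then from the server against the client address, to test both directions.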

0
 

Author Comment

by:aseisman
ID: 35098660
When I attempt to telnet that way I get:

Trying 10.110.9.85...
Connected to 10.110.9.85.
Escape character is '^]'.

and the only way to get out of it is to ctrl+c .

I get the same result when I am on the machine running the server and try to telnet to itself.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35098680
That's good. So you don't have a network problem, by the looks of it.
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35098690
It looks like in this last telnet attempt it was not using port 1099

Perhaps you have some kind of firewall and you need to open port 1099?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35098697
Can you please post the full stack trace?
0
 

Author Comment

by:aseisman
ID: 35098701
This is exactly what I typed (left out that first line before):

telnet 10.110.9.85 1099
Trying 10.110.9.85...
Connected to 10.110.9.85.
Escape character is '^]'.
0
 

Author Comment

by:aseisman
ID: 35098773
bash run.sh
^\2011-03-10 19:47:28
Full thread dump OpenJDK 64-Bit Server VM (19.0-b06 mixed mode):

"Low Memory Detector" daemon prio=10 tid=0x00007f1b38099800 nid=0x31b9 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x00007f1b38097000 nid=0x31b8 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x00007f1b38094000 nid=0x31b7 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f1b38092000 nid=0x31b6 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f1b38073800 nid=0x31b5 in Object.wait() [0x00007f1b3da28000]
   java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <0x00000000eb2b0c60> (a java.lang.ref.ReferenceQueue$Lock)
      at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
      - locked <0x00000000eb2b0c60> (a java.lang.ref.ReferenceQueue$Lock)
      at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
      at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x00007f1b38071800 nid=0x31b4 in Object.wait() [0x00007f1b3db29000]
   java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <0x00000000eb2b0b38> (a java.lang.ref.Reference$Lock)
      at java.lang.Object.wait(Object.java:502)
      at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
      - locked <0x00000000eb2b0b38> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f1b38007000 nid=0x31b0 runnable [0x00007f1b3f331000]
   java.lang.Thread.State: RUNNABLE
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1447)
      - locked <0x00000000eb2d8d88> (a java.util.HashMap)
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1445)
      - locked <0x00000000eb2d9e88> (a java.util.HashMap)
      at java.net.URLClassLoader.defineClass(URLClassLoader.java:237)
      at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
      at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
      at java.security.AccessController.doPrivileged(Native Method)
      at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
      - locked <0x00000000eb2d9ce8> (a sun.misc.Launcher$AppClassLoader)
      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
      - locked <0x00000000eb2d9ce8> (a sun.misc.Launcher$AppClassLoader)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:266)

"VM Thread" prio=10 tid=0x00007f1b3806b800 nid=0x31b3 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f1b38012000 nid=0x31b1 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f1b38014000 nid=0x31b2 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f1b3809c000 nid=0x31ba waiting on condition

JNI global references: 859

Heap
 PSYoungGen      total 298688K, used 5121K [0x00000000eb2b0000, 0x0000000100000000, 0x0000000100000000)
  eden space 256064K, 2% used [0x00000000eb2b0000,0x00000000eb7b0528,0x00000000facc0000)
  from space 42624K, 0% used [0x00000000fd660000,0x00000000fd660000,0x0000000100000000)
  to   space 42624K, 0% used [0x00000000facc0000,0x00000000facc0000,0x00000000fd660000)
 PSOldGen        total 682688K, used 0K [0x00000000c1800000, 0x00000000eb2b0000, 0x00000000eb2b0000)
  object space 682688K, 0% used [0x00000000c1800000,0x00000000c1800000,0x00000000eb2b0000)
 PSPermGen       total 21248K, used 2535K [0x00000000b7000000, 0x00000000b84c0000, 0x00000000c1800000)
  object space 21248K, 11% used [0x00000000b7000000,0x00000000b7279fc0,0x00000000b84c0000)

FileServer exception: 10.110.9.85/RerunScheduleServer
java.rmi.ConnectException: Connection refused to host: 10.110.198.30; nested exception is:
      java.net.ConnectException: Connection refused
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35098822
>>Connection refused to host: 10.110.198.30

That's a different host than the one you tested. THAT is the one you should be testing
0
 

Author Comment

by:aseisman
ID: 35098880
That is the IP of the client machine. We tested it to the server machine. Am I interpreting it wrong to think that it is saying the client (that IP address) is being refused by the server (the address we tested)?
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35098890

They had a similar problem in
http://forum.springsource.org/showthread.php?t=33711
This is the solution, perhaps it is worth looking into it:


figured it out
Well, since I didn't have time to wait for a reply, I went ahead and banged my head against it for a little while and figured out what to do.

After re-reading the links from thosmas' posts, I realized that it was the hosts file on the server that needed to be fixed.

If your hosts file has the first line as:
Code:

127.0.0.1    localhost  {hostname}

you must remove the {hostname} from that first line and add another line like:
Code:

{actual ip address}   {hostname}

That worked for me.
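
Why the hosts file can matter for RMI: unless java.rmi.server.hostname is set, the address a server embeds in its exported stubs is by default derived from what the JVM resolves for the local host, so a host name mapped to a loopback address can leave remote clients trying to connect to themselves. The sketch below is an illustration added here, not code from the linked post; LocalAddressCheck and advertisesLoopback are hypothetical names. It prints what the JVM resolves for the local host and flags a loopback result.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LocalAddressCheck {

    /** True if the address is a loopback address such as 127.0.0.1. */
    static boolean advertisesLoopback(InetAddress address) {
        return address.isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // What the JVM resolves for this machine (driven by /etc/hosts and DNS).
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("Host name:        " + local.getHostName());
        System.out.println("Resolved address: " + local.getHostAddress());
        if (advertisesLoopback(local)) {
            System.out.println("Warning: the local host name resolves to a loopback address.");
        }
        // Explicit override, if one was passed with -Djava.rmi.server.hostname=...
        System.out.println("java.rmi.server.hostname = "
                + System.getProperty("java.rmi.server.hostname"));
    }
}
```

Running this on the server shows at a glance whether the hosts file is steering the JVM toward an address that other machines cannot reach.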
0
 

Author Comment

by:aseisman
ID: 35098929
What should the hostname be? Currently the hosts file is:

#
# hosts         This file describes a number of hostname-to-address
#               mappings for the TCP/IP subsystem.  It is mostly
#               used at boot time, when no name servers are running.
#               On small systems, this file can be used instead of a
#               "named" name server.
# Syntax:
#    
# IP-Address  Full-Qualified-Hostname  Short-Hostname
10.110.9.85 localhost
~                      
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35099001
Don't alter any hosts files.

>>Am I interpreting it wrong to think that it is saying the client (that IP address) is being refused by the server (the address we tested)?

You could well be right. Try testing the other way around.
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35099002
Well, it looks like you have it OK, as you don't have 127.0.0.1 as the first line.
The fully qualified hostname should be computer_name.domain.com instead of localhost,
but I'm not sure that would make a difference.
 
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35099057
How do you do the binding in the server-side code? Like this:

Registry registry = LocateRegistry.getRegistry(1099);
registry.rebind("name", object_id);

0
 
LVL 47

Expert Comment

by:for_yan
ID: 35099269
Do you refer to "localhost" in the binding operations?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35099337
If you find problems testing the other way around, you need to start looking into 'firewalling' or filtering issues for your distro. You might try
man hosts.allow
man hosts.deny

0
 

Author Comment

by:aseisman
ID: 35099456
I am trying to investigate the firewall issue because I have long thought that it was possibly the cause. The server machine is running a distribution of SLES and the other is Fedora.

When I go into YAST on the SLES machine and attempt to configure the firewall it says that "another firewall is running". If you have any suggestion of how to proceed I would appreciate it. Otherwise, will revert back when I have exhausted this avenue.
0
 
LVL 92

Expert Comment

by:objects
ID: 35099557
check your iptables

sudo iptables -L
0
 

Author Comment

by:aseisman
ID: 35099604
I previously attempted to open this port in the IP tables to solve this issue, my current IP tables on the server machine read:

Chain INPUT (policy ACCEPT)
target     prot opt source               destination        
ACCEPT     tcp  --  anywhere             anywhere            tcp spt:rmiregistry state ESTABLISHED

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination        

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination        

The exact same is the case for the client machine.
0
 
LVL 92

Expert Comment

by:objects
ID: 35099631
> I previously attempted to open this port in the IP tables to solve this issue

there was no need to do that
you should undo the change you made
0
 

Author Comment

by:aseisman
ID: 35099648
Does that mean firewall is not the issue based on the IP tables?
0
 
LVL 92

Expert Comment

by:objects
ID: 35099659
what's the answer to for_yan's question above?
0
 

Author Comment

by:aseisman
ID: 35099675
We do not refer to localhost in the binding operations.
0
 
LVL 92

Expert Comment

by:objects
ID: 35099929
can you post how you are starting up the registry
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35099966
It's nothing to do with naming - it knows exactly what host to hit and it isn't being allowed

It's probably a BAD idea to start messing with iptables directly: you could get yourself into a mess if you don't know what you're doing. First check which higher-level tools manage iptables for your distro.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35099982
0
 

Author Comment

by:aseisman
ID: 35100059
I am only able to work on these servers from a terminal session. Could the firewall issue be on both machines or just the server side? The server side is running SLES, not Fedora.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35100077
Looks like you should focus on the server.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35100097
0
 
LVL 92

Expert Comment

by:objects
ID: 35100111
> Could the firewall issue be on both machine or just the server side?

there doesn't appear to be any firewall issue

still need to see how the registry is being started up on the server
0
 

Author Comment

by:aseisman
ID: 35100125
When I go to configure the firewall in YAST it says that the firewall is disabled. It also says that there is another firewall running (but not what it is) and that turning ON the firewall in YAST could cause problems unless I have turned off the other firewall.

Maybe it is possible that it thinks that there is another firewall because of the single entry I put into the IP tables. If that is the case, then it doesn't seem like a firewall issue.
0
 

Author Comment

by:aseisman
ID: 35100141
In order to get at the part of the code that does the registering I am going to need to consult with a colleague who is gone for the day. I will talk to him tomorrow and post what I have at that point. Thank you for all of your help so far. If you have any thoughts beyond the registry code, I will continue to try to troubleshoot today as well.
0
 
LVL 92

Expert Comment

by:objects
ID: 35100370
another thing you could do is use something like lsof or netstat to check what the server is listening on
0
 

Author Comment

by:aseisman
ID: 35100424
netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        1      0 localhost:34477         localhost:58238         CLOSE_WAIT  
tcp        0      0 localhost:ssh           ool-6039f6ab.stat:35043 ESTABLISHED
tcp        0      0 localhost:58238         localhost:34477         FIN_WAIT2  
tcp        0      0 localhost:57259         localhost:59374         ESTABLISHED
tcp        0    160 localhost:ssh           ool-6039f6ab.stat:45241 ESTABLISHED
tcp        0      0 localhost:ssh           ool-6039f6ab.stat:42805 ESTABLISHED
tcp        0      0 localhost:52402         localhost:40750         FIN_WAIT2  
tcp        1      0 localhost:40750         localhost:52402         CLOSE_WAIT  
tcp        0     23 localhost:59374         localhost:57259         ESTABLISHED
Active UNIX domain sockets (w/o servers)
Proto RefCnt Flags       Type       State         I-Node Path
unix  13     [ ]         DGRAM                    2603   /dev/log
unix  2      [ ]         DGRAM                    1421   @/org/kernel/udev/udevd
unix  2      [ ]         DGRAM                    2698   @/org/freedesktop/hal/udev_event
unix  2      [ ]         DGRAM                    187662
unix  3      [ ]         STREAM     CONNECTED     186972
unix  3      [ ]         STREAM     CONNECTED     186971
unix  3      [ ]         STREAM     CONNECTED     186688
unix  3      [ ]         STREAM     CONNECTED     186687
unix  3      [ ]         STREAM     CONNECTED     177657
unix  3      [ ]         STREAM     CONNECTED     177656
unix  2      [ ]         DGRAM                    121114
unix  2      [ ]         DGRAM                    120979
unix  3      [ ]         STREAM     CONNECTED     120896
unix  3      [ ]         STREAM     CONNECTED     120895
unix  2      [ ]         DGRAM                    14045  
unix  3      [ ]         STREAM     CONNECTED     14044  
unix  3      [ ]         STREAM     CONNECTED     14043  
unix  3      [ ]         STREAM     CONNECTED     14042  
unix  3      [ ]         STREAM     CONNECTED     14041  
unix  3      [ ]         STREAM     CONNECTED     14040  
unix  3      [ ]         STREAM     CONNECTED     14039  
unix  3      [ ]         STREAM     CONNECTED     14038  
unix  3      [ ]         STREAM     CONNECTED     14037  
unix  3      [ ]         STREAM     CONNECTED     14036  
unix  3      [ ]         STREAM     CONNECTED     14035  
unix  3      [ ]         STREAM     CONNECTED     14034  
unix  3      [ ]         STREAM     CONNECTED     14033  
unix  3      [ ]         STREAM     CONNECTED     14032  
unix  3      [ ]         STREAM     CONNECTED     14031  
unix  3      [ ]         STREAM     CONNECTED     14030  
unix  3      [ ]         STREAM     CONNECTED     14029  
unix  3      [ ]         STREAM     CONNECTED     14028  
unix  3      [ ]         STREAM     CONNECTED     14027  
unix  3      [ ]         STREAM     CONNECTED     14026  
unix  3      [ ]         STREAM     CONNECTED     14025  
unix  3      [ ]         STREAM     CONNECTED     14024  
unix  3      [ ]         STREAM     CONNECTED     14023  
unix  3      [ ]         STREAM     CONNECTED     14022  
unix  3      [ ]         STREAM     CONNECTED     14021  
unix  3      [ ]         STREAM     CONNECTED     14020  
unix  3      [ ]         STREAM     CONNECTED     14019  
unix  3      [ ]         STREAM     CONNECTED     14018  
unix  3      [ ]         STREAM     CONNECTED     14017  
unix  3      [ ]         STREAM     CONNECTED     14016  
unix  3      [ ]         STREAM     CONNECTED     14015  
unix  3      [ ]         STREAM     CONNECTED     14014  
unix  3      [ ]         STREAM     CONNECTED     14013  
unix  3      [ ]         STREAM     CONNECTED     14012  
unix  3      [ ]         STREAM     CONNECTED     14011  
unix  3      [ ]         STREAM     CONNECTED     14010  
unix  3      [ ]         STREAM     CONNECTED     14009  
unix  3      [ ]         STREAM     CONNECTED     14008  
unix  3      [ ]         STREAM     CONNECTED     14007  
unix  3      [ ]         STREAM     CONNECTED     14006  
unix  3      [ ]         STREAM     CONNECTED     14005  
unix  3      [ ]         STREAM     CONNECTED     14004  
unix  3      [ ]         STREAM     CONNECTED     14003  
unix  3      [ ]         STREAM     CONNECTED     14002  
unix  3      [ ]         STREAM     CONNECTED     14001  
unix  3      [ ]         STREAM     CONNECTED     14000  
unix  3      [ ]         STREAM     CONNECTED     13999  
unix  3      [ ]         STREAM     CONNECTED     13998  
unix  3      [ ]         STREAM     CONNECTED     13997  
unix  3      [ ]         STREAM     CONNECTED     13996  
unix  3      [ ]         STREAM     CONNECTED     13995  
unix  3      [ ]         STREAM     CONNECTED     13994  
unix  3      [ ]         STREAM     CONNECTED     13993  
unix  3      [ ]         STREAM     CONNECTED     13992  
unix  3      [ ]         STREAM     CONNECTED     13991  
unix  3      [ ]         STREAM     CONNECTED     13990  
unix  3      [ ]         STREAM     CONNECTED     13989  
unix  3      [ ]         STREAM     CONNECTED     13988  
unix  3      [ ]         STREAM     CONNECTED     13987  
unix  2      [ ]         DGRAM                    8461  
unix  2      [ ]         DGRAM                    7722  
unix  2      [ ]         DGRAM                    4114  
unix  2      [ ]         DGRAM                    4074  
unix  2      [ ]         DGRAM                    4068  
unix  3      [ ]         STREAM     CONNECTED     4067  
unix  3      [ ]         STREAM     CONNECTED     4066  
unix  2      [ ]         DGRAM                    3826  
unix  2      [ ]         DGRAM                    3397  
unix  3      [ ]         STREAM     CONNECTED     2693   @/var/run/hald/dbus-IlQmQqKenD
unix  3      [ ]         STREAM     CONNECTED     2691  
unix  3      [ ]         STREAM     CONNECTED     2671   /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     2670  
unix  3      [ ]         STREAM     CONNECTED     2656   /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     2655  
unix  3      [ ]         STREAM     CONNECTED     2578  
unix  3      [ ]         STREAM     CONNECTED     2577  
0
 
LVL 92

Expert Comment

by:objects
ID: 35100498
do

netstat -a | grep LISTEN
0
 

Author Comment

by:aseisman
ID: 35100535
netstat -a | grep LISTEN
tcp        0      0 *:ssh                   *:*                     LISTEN      
tcp        0      0 domU-12-31-39-16-4:smtp *:*                     LISTEN      
tcp        0      0 *:46811                 *:*                     LISTEN      
tcp        0      0 *:nfs                   *:*                     LISTEN      
tcp        0      0 *:60454                 *:*                     LISTEN      
tcp        0      0 *:mysql                 *:*                     LISTEN      
tcp        0      0 *:rmiregistry           *:*                     LISTEN      
tcp        0      0 *:59374                 *:*                     LISTEN      
tcp        0      0 *:57231                 *:*                     LISTEN      
tcp        0      0 *:sunrpc                *:*                     LISTEN      
tcp        0      0 *:www-http              *:*                     LISTEN      
tcp        0      0 *:ftp                   *:*                     LISTEN      
unix  2      [ ACC ]     STREAM     LISTENING     8092   private/local
unix  2      [ ACC ]     STREAM     LISTENING     8096   private/virtual
unix  2      [ ACC ]     STREAM     LISTENING     8100   private/lmtp
unix  2      [ ACC ]     STREAM     LISTENING     8104   private/anvil
unix  2      [ ACC ]     STREAM     LISTENING     8108   private/scache
unix  2      [ ACC ]     STREAM     LISTENING     8112   private/maildrop
unix  2      [ ACC ]     STREAM     LISTENING     8116   private/cyrus
unix  2      [ ACC ]     STREAM     LISTENING     8120   private/uucp
unix  2      [ ACC ]     STREAM     LISTENING     8124   private/ifmail
unix  2      [ ACC ]     STREAM     LISTENING     8128   private/bsmtp
unix  2      [ ACC ]     STREAM     LISTENING     8132   private/procmail
unix  2      [ ACC ]     STREAM     LISTENING     8136   private/retry
unix  2      [ ACC ]     STREAM     LISTENING     8140   private/proxywrite
unix  2      [ ACC ]     STREAM     LISTENING     7577   /var/run/nscd/socket
unix  2      [ ACC ]     STREAM     LISTENING     4211   /var/lib/mysql/mysql.sock
unix  2      [ ACC ]     STREAM     LISTENING     2672   @/var/run/hald/dbus-IlQmQqKenD
unix  2      [ ACC ]     STREAM     LISTENING     2574   /var/run/dbus/system_bus_socket
unix  2      [ ACC ]     STREAM     LISTENING     4075   /var/run/audispd_events
unix  2      [ ACC ]     STREAM     LISTENING     4102   /var/run/rpcbind.sock
unix  2      [ ACC ]     STREAM     LISTENING     8037   public/cleanup
unix  2      [ ACC ]     STREAM     LISTENING     8044   private/rewrite
unix  2      [ ACC ]     STREAM     LISTENING     8048   private/bounce
unix  2      [ ACC ]     STREAM     LISTENING     2654   @/var/run/hald/dbus-oHzqlhrxCP
unix  2      [ ACC ]     STREAM     LISTENING     8052   private/defer
unix  2      [ ACC ]     STREAM     LISTENING     8056   private/trace
unix  2      [ ACC ]     STREAM     LISTENING     8060   private/verify
unix  2      [ ACC ]     STREAM     LISTENING     8064   public/flush
unix  2      [ ACC ]     STREAM     LISTENING     8068   private/proxymap
unix  2      [ ACC ]     STREAM     LISTENING     8072   private/smtp
unix  2      [ ACC ]     STREAM     LISTENING     8076   private/relay
unix  2      [ ACC ]     STREAM     LISTENING     8080   public/showq
unix  2      [ ACC ]     STREAM     LISTENING     8084   private/error
unix  2      [ ACC ]     STREAM     LISTENING     8088   private/discard
0
 
LVL 92

Expert Comment

by:objects
ID: 35100720
> tcp        0      0 *:rmiregistry           *:*                     LISTEN      

next check if you can connect to the server on that port
0
 

Author Comment

by:aseisman
ID: 35100763
What port? "rmiregistry"? I think that is just the default port of 1099 and I think that is how I am trying to connect now when getting the original error.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35100776
If you're using netstat, you want to do
netstat -pant

0
 

Author Comment

by:aseisman
ID: 35100881
Using netstat -pant I think this is my server listening:

tcp        0      0 0.0.0.0:1099            0.0.0.0:*               LISTEN      25478/java  
0
 
LVL 92

Expert Comment

by:objects
ID: 35100894
> What port? "rmiregistry"?

netstat -an | grep LISTEN
0

 

Author Comment

by:aseisman
ID: 35100910
The port is 1099, how do you want me to try to connect to this port?
0
 
LVL 92

Expert Comment

by:objects
ID: 35100911
so (from the client) try to telnet to the server on port 1099
0
 

Author Comment

by:aseisman
ID: 35100956
telnet 10.110.9.85 1099
Trying 10.110.9.85...
Connected to 10.110.9.85.
0
 
LVL 92

Expert Comment

by:objects
ID: 35101053
now do you run the client from that same box?
if so do you use the same ip to connect?
0
 

Author Comment

by:aseisman
ID: 35101074
Yes and Yes (and it doesn't work).
0
 
LVL 92

Expert Comment

by:objects
ID: 35101183
can you post the full stack trace for the exception
0
 

Author Comment

by:aseisman
ID: 35101194
That has already been posted above.
0
 
LVL 92

Expert Comment

by:objects
ID: 35101232
Need to see the full stack trace

> java.rmi.ConnectException: Connection refused to host: 10.110.198.30; nested exception is:

and the exception posted was trying to connect to a different ip
Need to see the full stack trace when the client tries to connect to server (10.110.9.85)
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35101233
Can you ping from server to client?
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35101250
Was that dump from the server?
And what is the stack trace as seen from the client?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35101283
You need to ensure that the RMI sockets get bound to routable addresses, which in your case would be your private IP addresses. At the moment, at least one of them isn't
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35101305
Try the following (using the correct address of course)
java -Djava.rmi.server.hostname=10.110.9.85 YourApp

0
 

Author Comment

by:aseisman
ID: 35102066
I can ping between the servers.

The stack trace was from the client side

Please be more specific about how to ensure that my RMI sockets get bound to a routable address and which one is not (is it definitely my program that is not binding correctly)

The Java -D... did not work
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35102115
If the stack trace is from the client side why would the client request rmi connection to itself,
as you mentioned that 10.110.198.30 is the client ip address

>>java.rmi.ConnectException: Connection refused to host: 10.110.198.30; nested exception is:
      java.net.ConnectException: Connection refused

Do you have the client-side code when it establishes the connection?

0
 

Author Comment

by:aseisman
ID: 35102140
I interpret that statement to mean that the connection to the server has been refused to the host where the host is the client computer. Am I interpreting this incorrectly?
0
 
LVL 92

Expert Comment

by:objects
ID: 35102161
> The Java -D... did not work

no it wouldn't make a difference. That property has nothing to do with making the connection
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35102168
Well, I happen to have an analogous situation: I am simply running an app which tries to connect to
an abandoned RMI server, and it reports the server's IP address in that message.

What is your code where you look up the registry on the client? Maybe you are looking on localhost?
0
 
LVL 92

Expert Comment

by:objects
ID: 35102215
> The stack trace was from the client side

you've just posted the exception, need to see the full stack trace

also what security policies do you have in place, both client and server?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35103057
Try the following: take a backup of any /etc/hosts files that have preceding comments then delete the preceding comments. You might need to restart the box(es)
0
 
LVL 92

Expert Comment

by:objects
ID: 35103215
When for_yan suggested making changes to the hosts file the following was posted

CEHJ> Don't alter any hosts files.

go figure
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35107196
>>go figure

I was talking about making edits to the functional contents of dns. My last suggestion is regarding removing crud from the file that doesn't alter dns
0
 
LVL 92

Expert Comment

by:objects
ID: 35107218
of course you were :-D

not sure why you'd need to restart the boxes if you haven't altered the DNS
0
 

Author Comment

by:aseisman
ID: 35109130
Some new information:

This morning I cloned the server machine and now have two identical machines running with different DNS names and IP addresses.

I will refer to the original machine as "OLD" and the cloned one as "NEW".

I can start the server and client on the OLD machine and it works.

When I then try to start the client on the NEW machine it does not work.

I then tried to start the server on the NEW machine with the OLD configuration (ie it points to the OLD server for the RMI binding). This works!

I then tried to connect the client on the NEW machine with the OLD configuration, and it works!!!

(this is all of course wrong).

If I close the server on the OLD machine, the client on the NEW one does not disconnect.

If I close the server on the NEW machine (supposedly ... but not actually) bound to the OLD machine's IP, the NEW client disconnects.

One more variation: if I change the config on the NEW machine so that the server points to itself and the client points to the OLD IP, then it does not work.

What I take from all this is that no matter what the configs do, it is pointing to localhost, however, there is some kind of validation that the two configs must be pointing to the same "name / ip"

I have attached the code for how the server binds.

The "getBindServer()" function just grabs an IP or DNS name out of a config file.

try {
    RerunScheduleInterface rsi = new RerunScheduleImpl("RerunScheduleServer");
    server = ConfigManager.getInstance().getBindServer() + "/RerunScheduleServer";
    Naming.rebind(server, rsi);
    ServerUtils.write(ConfigManager.getInstance().getStatusLogFile(), true, "----------------------------------------");
    String nonFinishedFile = ConfigManager.getInstance().getNotFinishedLogFile();
    Vector<SimBean> newSims = ServerUtils.fileToSimBean(nonFinishedFile);
    ServerUtils.write(nonFinishedFile, false, "");
    rsi.addScheduledSim(newSims);
    ServerMonitorThread.getInstance(rsi).start();
    System.out.println(rsi.viewPendingSim().size());
} catch (Exception e) {
    System.out.println("Error :  FileServer: " + server + "\n" + e.getMessage());
}

0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35109183
What does

System.out.println(server)

actually print?
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35109268
Sorry, I'd still return to yesterday's situation with one server and one client and check
the code on the client. The fact that it complains that it cannot connect to the IP address of the client itself seems weird.
As I wrote, I have a similar situation, and it complains that it cannot connect to the server and gives the server's IP address in the error message. That is understandable and logical.
Why should it complain that it cannot connect to the client's IP address? Does the client code try to locate the registry on the server?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35109324
Also, as a general point:

>>System.out.println("Error :  FileServer: " + server + "\n" + e.getMessage());

it's always a good idea to do e.printStackTrace();
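
To illustrate the difference, here is a minimal sketch. StackTraceDemo and its methods are made-up names, and the exception is constructed by hand to resemble the one in this thread rather than produced by a real failed connection. getMessage() gives only the message text, while printStackTrace() also shows which call in the code triggered the failure.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.rmi.ConnectException;

class StackTraceDemo {
    // Renders everything printStackTrace() would print into a String,
    // which is convenient when the output has to go into a log file.
    static String fullTrace(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    // Hand-built stand-in for the exception reported in this thread.
    static Exception sample() {
        return new ConnectException(
                "Connection refused to host: 10.110.198.30",
                new java.net.ConnectException("Connection refused"));
    }

    public static void main(String[] args) {
        Exception e = sample();
        System.out.println("getMessage() only:");
        System.out.println(e.getMessage());
        System.out.println();
        System.out.println("printStackTrace():");
        System.out.print(fullTrace(e));
    }
}
```

The second form includes the "at ..." frames, which is what is being asked for when someone requests a full stack trace.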
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35109342
...and in the light of that,  could you please

a. make those substitutions
b. try one scenario (preferably your goal scenario)
c. post full stack traces
0
 

Author Comment

by:aseisman
ID: 35109490
Result from System.out.println(server)

10.110.9.85/RerunScheduleServer

        try {
            name = ClientConfigManager.getInstance().getServerName() + "/RerunScheduleServer";
            RerunScheduleInterface rsi = (RerunScheduleInterface) Naming.lookup(name);
            ClientMonitorThread mt = ClientMonitorThread.getInstance(rsi);
            mt.start();
            System.out.println("connect " + name + " success.");
            System.out.println("client start...");
        } catch (Exception e) {
            System.err.println("FileServer exception: " + name);
            System.err.println(e);
        }


0
 

Author Comment

by:aseisman
ID: 35109497
The above code is the client connection code
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35109538
>>server = ConfigManager.getInstance().getBindServer() + "/RerunScheduleServer";


>>name = ClientConfigManager.getInstance().getServerName() + "/RerunScheduleServer";

Are those two values identical?
0
 

Author Comment

by:aseisman
ID: 35109566
Yes, the two values are identical
0
 
LVL 47

Expert Comment

by:for_yan
ID: 35109618
So what is the server name that it returns?

Could you try, at least for the sake of the experiment, using the simple LocateRegistry method with the server IP address in the client code
and see if it connects this way?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35109643
Can you please tell me what command you use to start the server?
0
 

Author Comment

by:aseisman
ID: 35109653
java -jar dist/RerunScheduleServer.jar to start the server

the output of the print server is above

still working on other suggestions
0
 
LVL 47

Assisted Solution

by:for_yan
for_yan earned 150 total points
ID: 35109657

This is how I'd try to do it on the client:


Registry registry = LocateRegistry.getRegistry("10.110.9.85", 1099);
RerunScheduleInterface rsi =
        (RerunScheduleInterface) registry.lookup("RerunScheduleServer");
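
For anyone who wants to try the LocateRegistry approach in isolation, here is a self-contained sketch that runs server and client sides in one JVM over loopback. Everything in it is hypothetical: PendingCounter stands in for RerunScheduleInterface, the port is picked at run time, and java.rmi.server.hostname is set to 127.0.0.1 only so the demo does not depend on how the machine resolves its own name. It is not the poster's code.

```java
import java.net.ServerSocket;
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class RegistryLookupDemo {
    // Hypothetical remote interface standing in for RerunScheduleInterface.
    public interface PendingCounter extends Remote {
        int pendingCount() throws RemoteException;
    }

    // Hypothetical implementation standing in for RerunScheduleImpl.
    public static class PendingCounterImpl implements PendingCounter {
        @Override
        public int pendingCount() {
            return 3;
        }
    }

    // Picks a port that is free at the time of the call.
    private static int freePort() throws Exception {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    private static int doRoundTrip() throws Exception {
        // Demo assumption: exported stubs advertise the loopback address.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        int port = freePort();

        // Server side: start a registry and bind the exported stub by plain name.
        Registry serverSide = LocateRegistry.createRegistry(port);
        PendingCounterImpl impl = new PendingCounterImpl();
        try {
            PendingCounter stub =
                    (PendingCounter) UnicastRemoteObject.exportObject(impl, 0);
            serverSide.rebind("RerunScheduleServer", stub);

            // Client side: explicit registry host and port, then lookup by plain name.
            Registry clientSide = LocateRegistry.getRegistry("127.0.0.1", port);
            PendingCounter remote =
                    (PendingCounter) clientSide.lookup("RerunScheduleServer");
            return remote.pendingCount();
        } finally {
            // Unexport both objects so the JVM can exit.
            try {
                UnicastRemoteObject.unexportObject(impl, true);
            } catch (NoSuchObjectException ignored) {
                // The object was never exported; nothing to clean up.
            }
            UnicastRemoteObject.unexportObject(serverSide, true);
        }
    }

    // Wraps checked exceptions so callers get a single unchecked failure.
    static int roundTrip() {
        try {
            return doRoundTrip();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("pending = " + roundTrip());
    }
}
```

With this style the registry host and port are passed separately and the lookup name is just "RerunScheduleServer", so there is no URL string to get wrong.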
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35109758
>>java -jar dist/RerunScheduleServer.jar to start the server

Can you please start it as below, attaching the last-named argument as a file. You can obscure any public-facing IP addresses as X.X.X.X but please don't do that with private ones
java -Dsun.rmi.transport.tcp.logLevel=verbose -jar dist/RerunScheduleServer.jar 2>&1 | tee /tmp/rmi.log.txt


0
 

Author Comment

by:aseisman
ID: 35109825
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint <clinit>
FINE: main: localHostKnown = true, localHost = 10.110.9.85
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport <init>
FINE: main: Version = 2, ep = [10.110.9.85:1099]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint getLocalEndpoint
FINE: main: created local endpoint for socket factory null on port 1099
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport <init>
FINE: main: Version = 2, ep = [10.110.9.85:0]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint getLocalEndpoint
FINE: main: created local endpoint for socket factory null on port 0
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport listen
FINE: main: (port 1099) create server socket
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint newServerSocket
FINER: main: creating server socket on [10.110.9.85:1099]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport listen
FINE: main: (port 0) create server socket
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint newServerSocket
FINER: main: creating server socket on [10.110.9.85:0]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport$AcceptLoop executeAcceptLoop
FINE: RMI TCP Accept-1099: listening on port 1099
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint setDefaultPort
FINE: main: default port for server socket factory null and client socket factory null set to 60713
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport$AcceptLoop executeAcceptLoop
FINE: RMI TCP Accept-0: listening on port 60713
10.110.9.85/RerunScheduleServer
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPChannel createConnection
FINE: main: create connection
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint newSocket
FINER: main: opening socket to [10.110.9.85:1099]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.proxy.RMIMasterSocketFactory createSocket
FINE: main: host: 10.110.9.85, port: 1099
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport$ConnectionHandler run0
FINE: RMI TCP Connection(1)-10.110.9.85: accepted socket from [10.110.9.85:58331]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport$ConnectionHandler run0
FINER: RMI TCP Connection(1)-10.110.9.85: (port 1099) suggesting 10.110.9.85:58331
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPChannel createConnection
FINER: main: server suggested 10.110.9.85:58331
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPChannel createConnection
FINER: main: using 10.110.9.85:60713
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport$ConnectionHandler run0
FINER: RMI TCP Connection(1)-10.110.9.85: (port 1099) client using 10.110.9.85:60713
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport handleMessages
FINE: RMI TCP Connection(1)-10.110.9.85: (port 1099) op = 80
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPChannel createConnection
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35109922
The client has connected fine. Let us know about any exceptions
0
 
LVL 92

Expert Comment

by:objects
ID: 35113036
I'd be interested in the answers to a few of for_yan's questions; they should help clear up what's going on here.
You also still haven't posted the full stack trace that was asked for a few times earlier.
0
 

Author Comment

by:aseisman
ID: 35113052
I posted this much earlier in response to the stack trace. Do you need more than this:


Full thread dump OpenJDK 64-Bit Server VM (19.0-b06 mixed mode):

"Low Memory Detector" daemon prio=10 tid=0x00007f1b38099800 nid=0x31b9 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x00007f1b38097000 nid=0x31b8 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x00007f1b38094000 nid=0x31b7 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f1b38092000 nid=0x31b6 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f1b38073800 nid=0x31b5 in Object.wait() [0x00007f1b3da28000]
   java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <0x00000000eb2b0c60> (a java.lang.ref.ReferenceQueue$Lock)
      at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
      - locked <0x00000000eb2b0c60> (a java.lang.ref.ReferenceQueue$Lock)
      at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
      at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x00007f1b38071800 nid=0x31b4 in Object.wait() [0x00007f1b3db29000]
   java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <0x00000000eb2b0b38> (a java.lang.ref.Reference$Lock)
      at java.lang.Object.wait(Object.java:502)
      at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
      - locked <0x00000000eb2b0b38> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f1b38007000 nid=0x31b0 runnable [0x00007f1b3f331000]
   java.lang.Thread.State: RUNNABLE
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1447)
      - locked <0x00000000eb2d8d88> (a java.util.HashMap)
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1445)
      - locked <0x00000000eb2d9e88> (a java.util.HashMap)
      at java.net.URLClassLoader.defineClass(URLClassLoader.java:237)
      at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
      at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
      at java.security.AccessController.doPrivileged(Native Method)
      at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
      - locked <0x00000000eb2d9ce8> (a sun.misc.Launcher$AppClassLoader)
      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
      - locked <0x00000000eb2d9ce8> (a sun.misc.Launcher$AppClassLoader)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:266)

"VM Thread" prio=10 tid=0x00007f1b3806b800 nid=0x31b3 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f1b38012000 nid=0x31b1 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f1b38014000 nid=0x31b2 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f1b3809c000 nid=0x31ba waiting on condition

JNI global references: 859

Heap
 PSYoungGen      total 298688K, used 5121K [0x00000000eb2b0000, 0x0000000100000000, 0x0000000100000000)
  eden space 256064K, 2% used [0x00000000eb2b0000,0x00000000eb7b0528,0x00000000facc0000)
  from space 42624K, 0% used [0x00000000fd660000,0x00000000fd660000,0x0000000100000000)
  to   space 42624K, 0% used [0x00000000facc0000,0x00000000facc0000,0x00000000fd660000)
 PSOldGen        total 682688K, used 0K [0x00000000c1800000, 0x00000000eb2b0000, 0x00000000eb2b0000)
  object space 682688K, 0% used [0x00000000c1800000,0x00000000c1800000,0x00000000eb2b0000)
 PSPermGen       total 21248K, used 2535K [0x00000000b7000000, 0x00000000b84c0000, 0x00000000c1800000)
  object space 21248K, 11% used [0x00000000b7000000,0x00000000b7279fc0,0x00000000b84c0000)

FileServer exception: 10.110.9.85/RerunScheduleServer
java.rmi.ConnectException: Connection refused to host: 10.110.198.30; nested exception is:
      java.net.ConnectException: Connection refused
0
 
LVL 92

Assisted Solution

by:objects
objects earned 150 total points
ID: 35113061
That's a thread dump, not a stack trace.
you can print a stack trace using:

exception.printStackTrace();

0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 200 total points
ID: 35113168
How many clients are there? If there's just one it looks like it's succeeding (see trace file) and then failing (see exception) because it's trying to connect to the wrong server.
0
 
LVL 92

Expert Comment

by:objects
ID: 35113651
reading the comments you appear to be on the wrong track
can you please post the stack trace, and respond to for_yan's queries so we can help you get to the cause of the problem and avoid you wasting more time.
0
 

Accepted Solution

by:
aseisman earned 0 total points
ID: 35184628
We were able to solve this problem by creating brand new installations of our servers, both with Fedora Core. It is unclear what exactly the problem was, but the Java code itself could not have been the problem as it now works on the new installations. Thank you to everyone for all your help troubleshooting this problem. I know this is not a very satisfying outcome, but it worked.
0
 
LVL 92

Expert Comment

by:objects
ID: 35184638
yes your java code was fine. the problem was related to registry access, not your code.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 35184680
Your problem was that one of the clients was making a connection to the wrong server
0
 

Author Closing Comment

by:aseisman
ID: 35221283
The solution did not come from the answers given; however, several people put a lot of effort into helping solve this issue. The solution is not graded an A because the exact cause of the problem was never established, even though the problem itself is fully solved.
0
