• Status: Solved

Linux Java RMI connection refused to host

We have developed a Java program with a client/server setup built on Java RMI. We are trying to deploy it in a Linux environment on Amazon Web Services EC2. The client and server can connect when both run on a single machine, but not when they run on separate machines, where we get the following error:

java.rmi.ConnectException: Connection refused to host: LOCAL_IP_CLIENT; nested exception is:
      java.net.ConnectException: Connection refused
Asked by: aseisman
4 Solutions
 
CEHJCommented:
Are they on the same network or different ones?
0
 
aseismanAuthor Commented:
They are on the same network and should have all communication open between them (i.e., they can ping each other and have file shares set up).
0
 
CEHJCommented:
Can you telnet to it using the RMI port?
0
 
aseismanAuthor Commented:
This is what I tried:

telnet 10.110.9.85:1099
telnet: 10.110.9.85:1099: Name or service not known
10.110.9.85:1099: Unknown host

it hangs for a second before giving the error
0
 
for_yanCommented:

When you start the RMI registry and then start the server, does it connect
to the registry without any errors?
Perhaps you could add a System.out printout after the server object
starts and connects to the registry.
0
 
aseismanAuthor Commented:
When I start the client on the same machine as the server is running, the client is able to connect without a problem.

This error is the same error I get when running the client on the same machine as the server when the server is NOT running.

In addition, we had this software running on a Windows network and are trying to get it to run on Linux.

The above makes me think that it has something to do with system-level settings, but we have been going at this for days and are really not sure how to proceed.
0
 
CEHJCommented:
That should be
telnet 10.110.9.85 1099
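
The same reachability check can also be done from Java, which is handy on boxes without telnet. This is only a minimal sketch: the class name `PortProbe` and the 3-second timeout are made up for illustration, and a successful connect shows only that something accepts TCP connections on that port, not that RMI calls will work.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Minimal TCP reachability check, roughly what "telnet host port" verifies. */
final class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, and unresolvable host all end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port taken from the thread; pass your own as arguments.
        String host = args.length > 0 ? args[0] : "10.110.9.85";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1099;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

Running it from the client against the server address, and again from the server against the client address, covers both directions.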

0
 
aseismanAuthor Commented:
When I attempt to telnet that way I get:

Trying 10.110.9.85...
Connected to 10.110.9.85.
Escape character is '^]'.

and the only way to get out of it is to ctrl+c .

I get the same result when I am on the machine running the server and try to telnet to itself.
0
 
CEHJCommented:
That's good. So you don't have a network problem, by the looks of it.
0
 
for_yanCommented:
It looks like in this last telnet attempt it was not using port 1099

Perhaps you have some kind of firewall and need to open port 1099?
0
 
CEHJCommented:
Can you please post the full stack trace?
0
 
aseismanAuthor Commented:
This is exactly what I typed (left out that first line before):

telnet 10.110.9.85 1099
Trying 10.110.9.85...
Connected to 10.110.9.85.
Escape character is '^]'.
0
 
aseismanAuthor Commented:
bash run.sh
^\2011-03-10 19:47:28
Full thread dump OpenJDK 64-Bit Server VM (19.0-b06 mixed mode):

"Low Memory Detector" daemon prio=10 tid=0x00007f1b38099800 nid=0x31b9 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x00007f1b38097000 nid=0x31b8 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x00007f1b38094000 nid=0x31b7 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f1b38092000 nid=0x31b6 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f1b38073800 nid=0x31b5 in Object.wait() [0x00007f1b3da28000]
   java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <0x00000000eb2b0c60> (a java.lang.ref.ReferenceQueue$Lock)
      at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
      - locked <0x00000000eb2b0c60> (a java.lang.ref.ReferenceQueue$Lock)
      at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
      at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x00007f1b38071800 nid=0x31b4 in Object.wait() [0x00007f1b3db29000]
   java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <0x00000000eb2b0b38> (a java.lang.ref.Reference$Lock)
      at java.lang.Object.wait(Object.java:502)
      at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
      - locked <0x00000000eb2b0b38> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f1b38007000 nid=0x31b0 runnable [0x00007f1b3f331000]
   java.lang.Thread.State: RUNNABLE
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1447)
      - locked <0x00000000eb2d8d88> (a java.util.HashMap)
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1445)
      - locked <0x00000000eb2d9e88> (a java.util.HashMap)
      at java.net.URLClassLoader.defineClass(URLClassLoader.java:237)
      at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
      at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
      at java.security.AccessController.doPrivileged(Native Method)
      at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
      - locked <0x00000000eb2d9ce8> (a sun.misc.Launcher$AppClassLoader)
      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
      - locked <0x00000000eb2d9ce8> (a sun.misc.Launcher$AppClassLoader)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:266)

"VM Thread" prio=10 tid=0x00007f1b3806b800 nid=0x31b3 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f1b38012000 nid=0x31b1 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f1b38014000 nid=0x31b2 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f1b3809c000 nid=0x31ba waiting on condition

JNI global references: 859

Heap
 PSYoungGen      total 298688K, used 5121K [0x00000000eb2b0000, 0x0000000100000000, 0x0000000100000000)
  eden space 256064K, 2% used [0x00000000eb2b0000,0x00000000eb7b0528,0x00000000facc0000)
  from space 42624K, 0% used [0x00000000fd660000,0x00000000fd660000,0x0000000100000000)
  to   space 42624K, 0% used [0x00000000facc0000,0x00000000facc0000,0x00000000fd660000)
 PSOldGen        total 682688K, used 0K [0x00000000c1800000, 0x00000000eb2b0000, 0x00000000eb2b0000)
  object space 682688K, 0% used [0x00000000c1800000,0x00000000c1800000,0x00000000eb2b0000)
 PSPermGen       total 21248K, used 2535K [0x00000000b7000000, 0x00000000b84c0000, 0x00000000c1800000)
  object space 21248K, 11% used [0x00000000b7000000,0x00000000b7279fc0,0x00000000b84c0000)

FileServer exception: 10.110.9.85/RerunScheduleServer
java.rmi.ConnectException: Connection refused to host: 10.110.198.30; nested exception is:
      java.net.ConnectException: Connection refused
0
 
CEHJCommented:
>>Connection refused to host: 10.110.198.30

That's a different host than the one you tested. THAT is the one you should be testing
0
 
aseismanAuthor Commented:
That is the IP of the client machine. We tested it to the server machine. Am I interpreting it wrong to think that it is saying the client (that IP address) is being refused by the server (the address we tested)?
0
 
for_yanCommented:

They had a similar problem in
http://forum.springsource.org/showthread.php?t=33711
This is the solution, perhaps it is worth looking into it:


figured it out
Well, since I didn't have time to wait for a reply, I went ahead and banged my head against it for a little while and figured out what to do.

After re-reading the links from thosmas' posts, I realized that it was the hosts file on the server that needed to be fixed.

If your hosts file has the first line as:
Code:

127.0.0.1    localhost  {hostname}

you must remove the {hostname} from that first line and add another line like:
Code:

{actual ip address}   {hostname}

That worked for me.
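
To see what a hosts file change like this affects, it can help to print what the JVM resolves as the local host. The following is a minimal sketch (the class name `LocalHostCheck` is made up for illustration). As far as I know, RMI falls back to this address for the stubs it hands out when `java.rmi.server.hostname` is not set, so a loopback result here would be a warning sign.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Prints what the JVM resolves as "this host". */
final class LocalHostCheck {

    /** True if the address is a loopback address such as 127.0.0.1. */
    static boolean isLoopback(InetAddress address) {
        return address.isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("host name: " + local.getHostName());
        System.out.println("address:   " + local.getHostAddress());
        if (isLoopback(local)) {
            // A remote client cannot reach the server through a loopback address.
            System.out.println("Local host resolves to loopback; check /etc/hosts.");
        }
    }
}
```

Run it on both the server and the client machine and compare the output with the private IP addresses you expect.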
0
 
aseismanAuthor Commented:
What should the hostname be? Currently the hosts file is:

#
# hosts         This file describes a number of hostname-to-address
#               mappings for the TCP/IP subsystem.  It is mostly
#               used at boot time, when no name servers are running.
#               On small systems, this file can be used instead of a
#               "named" name server.
# Syntax:
#    
# IP-Address  Full-Qualified-Hostname  Short-Hostname
10.110.9.85 localhost
~                      
0
 
CEHJCommented:
Don't alter any hosts files.

>>Am I interpreting it wrong to think that it is saying the client (that IP address) is being refused by the server (the address we tested)?

You could well be right. Try testing the other way around.
0
 
for_yanCommented:
Well, it looks like you have it OK, as you don't have 127.0.0.1 as the first line.
The fully qualified hostname should be computer_name.domain.com instead of localhost,
but I'm not sure that would make a difference.
 
0
 
for_yanCommented:
How do you do the binding in the server-side code? Like this:

Registry registry = LocateRegistry.getRegistry(1099);
registry.rebind("name", object_id);
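
For comparison, here is a self-contained round trip through an in-process registry. It is only a sketch under stated assumptions: `Greeter`, `GreeterImpl` and `BindSketch` are hypothetical stand-ins for the application's own classes, and the registry is created inside the same JVM with `LocateRegistry.createRegistry` rather than started separately with the `rmiregistry` tool.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

/** Hypothetical remote interface, standing in for the application's own. */
interface Greeter extends Remote {
    String greet(String name) throws RemoteException;
}

/** Hypothetical implementation used only for this sketch. */
final class GreeterImpl implements Greeter {
    @Override
    public String greet(String name) {
        return "hello " + name;
    }
}

final class BindSketch {

    /** Creates a registry on the port, binds a Greeter, looks it up, calls it, cleans up. */
    static String roundTrip(int port) throws Exception {
        Registry registry = LocateRegistry.createRegistry(port);
        GreeterImpl impl = new GreeterImpl();
        // Export on an anonymous port (0); the stub records the host and port to call back on.
        Greeter stub = (Greeter) UnicastRemoteObject.exportObject(impl, 0);
        try {
            registry.rebind("Greeter", stub);
            // Client side: locate the registry by host and port, then look up by name.
            Registry remote = LocateRegistry.getRegistry("localhost", port);
            Greeter found = (Greeter) remote.lookup("Greeter");
            return found.greet("rmi");
        } finally {
            // Unexport so the JVM can exit.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        // 1099 is the default registry port; it must be free for createRegistry to succeed.
        System.out.println(roundTrip(1099));
    }
}
```

Note that two connections are involved: one to the registry for the lookup, and a second one to whatever host and port the returned stub carries.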

0
 
for_yanCommented:
Do you refer to "localhost" in the binding operations?
0
 
CEHJCommented:
If you find problems testing the other way around, you need to start looking into 'firewalling' or filtering issues for your distro. You might try
man hosts.allow
man hosts.deny

0
 
aseismanAuthor Commented:
I am trying to investigate the firewall issue because I have long thought that it was possibly the cause. The server machine is running a distribution of SLES and the other is Fedora.

When I go into YAST on the SLES machine and attempt to configure the firewall it says that "another firewall is running". If you have any suggestion of how to proceed I would appreciate it. Otherwise, will revert back when I have exhausted this avenue.
0
 
objectsCommented:
check your iptables

sudo iptables -L
0
 
aseismanAuthor Commented:
I previously attempted to open this port in the IP tables to solve this issue, my current IP tables on the server machine read:

Chain INPUT (policy ACCEPT)
target     prot opt source               destination        
ACCEPT     tcp  --  anywhere             anywhere            tcp spt:rmiregistry state ESTABLISHED

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination        

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination        

The exact same is the case for the client machine.
0
 
objectsCommented:
> I previously attempted to open this port in the IP tables to solve this issue

there was no need to do that
you should undo the change you made
0
 
aseismanAuthor Commented:
Does that mean firewall is not the issue based on the IP tables?
0
 
objectsCommented:
what's the answer to for_yan's question above?
0
 
aseismanAuthor Commented:
We do not refer to localhost in the binding operations.
0
 
objectsCommented:
can you post how you are starting up the registry
0
 
CEHJCommented:
It's nothing to do with naming - it knows exactly what host to hit and it isn't being allowed

It's probably a BAD idea to start messing with iptables directly - you could get yourself into a mess if you don't know what you're doing. Check what higher-level tools are using iptables for your distro first.
0
 
aseismanAuthor Commented:
I am only able to work on these servers from a terminal session. Could the firewall issue be on both machines or just the server side? The server side is running SLES, not Fedora.
0
 
CEHJCommented:
Looks like you should focus on the server
0
 
objectsCommented:
> Could the firewall issue be on both machine or just the server side?

there doesn't appear to be any firewall issue

still need to see how the registry is being started up on the server
0
 
aseismanAuthor Commented:
When I go to configure the firewall in YAST it says that the firewall is disabled. It also says that there is another firewall running (but not what it is) and that turning ON the firewall in YAST could cause problems unless I have turned off the other firewall.

Maybe it is possible that it thinks that there is another firewall because of the single entry I put into the IP tables. If that is the case, then it doesn't seem like a firewall issue.
0
 
aseismanAuthor Commented:
In order to get at the part of the code that does the registering I am going to need to consult with a colleague who is gone for the day. I will talk to him tomorrow and post what I have at that point. Thank you for all of your help so far. If you have any thoughts beyond the registry code, I will continue to troubleshoot today as well.
0
 
objectsCommented:
another thing you could do is use something like lsof or netstat to check what the server is listening on
0
 
aseismanAuthor Commented:
netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        1      0 localhost:34477         localhost:58238         CLOSE_WAIT  
tcp        0      0 localhost:ssh           ool-6039f6ab.stat:35043 ESTABLISHED
tcp        0      0 localhost:58238         localhost:34477         FIN_WAIT2  
tcp        0      0 localhost:57259         localhost:59374         ESTABLISHED
tcp        0    160 localhost:ssh           ool-6039f6ab.stat:45241 ESTABLISHED
tcp        0      0 localhost:ssh           ool-6039f6ab.stat:42805 ESTABLISHED
tcp        0      0 localhost:52402         localhost:40750         FIN_WAIT2  
tcp        1      0 localhost:40750         localhost:52402         CLOSE_WAIT  
tcp        0     23 localhost:59374         localhost:57259         ESTABLISHED
Active UNIX domain sockets (w/o servers)
Proto RefCnt Flags       Type       State         I-Node Path
unix  13     [ ]         DGRAM                    2603   /dev/log
unix  2      [ ]         DGRAM                    1421   @/org/kernel/udev/udevd
unix  2      [ ]         DGRAM                    2698   @/org/freedesktop/hal/udev_event
unix  2      [ ]         DGRAM                    187662
unix  3      [ ]         STREAM     CONNECTED     186972
unix  3      [ ]         STREAM     CONNECTED     186971
unix  3      [ ]         STREAM     CONNECTED     186688
unix  3      [ ]         STREAM     CONNECTED     186687
unix  3      [ ]         STREAM     CONNECTED     177657
unix  3      [ ]         STREAM     CONNECTED     177656
unix  2      [ ]         DGRAM                    121114
unix  2      [ ]         DGRAM                    120979
unix  3      [ ]         STREAM     CONNECTED     120896
unix  3      [ ]         STREAM     CONNECTED     120895
unix  2      [ ]         DGRAM                    14045  
unix  3      [ ]         STREAM     CONNECTED     14044  
unix  3      [ ]         STREAM     CONNECTED     14043  
unix  3      [ ]         STREAM     CONNECTED     14042  
unix  3      [ ]         STREAM     CONNECTED     14041  
unix  3      [ ]         STREAM     CONNECTED     14040  
unix  3      [ ]         STREAM     CONNECTED     14039  
unix  3      [ ]         STREAM     CONNECTED     14038  
unix  3      [ ]         STREAM     CONNECTED     14037  
unix  3      [ ]         STREAM     CONNECTED     14036  
unix  3      [ ]         STREAM     CONNECTED     14035  
unix  3      [ ]         STREAM     CONNECTED     14034  
unix  3      [ ]         STREAM     CONNECTED     14033  
unix  3      [ ]         STREAM     CONNECTED     14032  
unix  3      [ ]         STREAM     CONNECTED     14031  
unix  3      [ ]         STREAM     CONNECTED     14030  
unix  3      [ ]         STREAM     CONNECTED     14029  
unix  3      [ ]         STREAM     CONNECTED     14028  
unix  3      [ ]         STREAM     CONNECTED     14027  
unix  3      [ ]         STREAM     CONNECTED     14026  
unix  3      [ ]         STREAM     CONNECTED     14025  
unix  3      [ ]         STREAM     CONNECTED     14024  
unix  3      [ ]         STREAM     CONNECTED     14023  
unix  3      [ ]         STREAM     CONNECTED     14022  
unix  3      [ ]         STREAM     CONNECTED     14021  
unix  3      [ ]         STREAM     CONNECTED     14020  
unix  3      [ ]         STREAM     CONNECTED     14019  
unix  3      [ ]         STREAM     CONNECTED     14018  
unix  3      [ ]         STREAM     CONNECTED     14017  
unix  3      [ ]         STREAM     CONNECTED     14016  
unix  3      [ ]         STREAM     CONNECTED     14015  
unix  3      [ ]         STREAM     CONNECTED     14014  
unix  3      [ ]         STREAM     CONNECTED     14013  
unix  3      [ ]         STREAM     CONNECTED     14012  
unix  3      [ ]         STREAM     CONNECTED     14011  
unix  3      [ ]         STREAM     CONNECTED     14010  
unix  3      [ ]         STREAM     CONNECTED     14009  
unix  3      [ ]         STREAM     CONNECTED     14008  
unix  3      [ ]         STREAM     CONNECTED     14007  
unix  3      [ ]         STREAM     CONNECTED     14006  
unix  3      [ ]         STREAM     CONNECTED     14005  
unix  3      [ ]         STREAM     CONNECTED     14004  
unix  3      [ ]         STREAM     CONNECTED     14003  
unix  3      [ ]         STREAM     CONNECTED     14002  
unix  3      [ ]         STREAM     CONNECTED     14001  
unix  3      [ ]         STREAM     CONNECTED     14000  
unix  3      [ ]         STREAM     CONNECTED     13999  
unix  3      [ ]         STREAM     CONNECTED     13998  
unix  3      [ ]         STREAM     CONNECTED     13997  
unix  3      [ ]         STREAM     CONNECTED     13996  
unix  3      [ ]         STREAM     CONNECTED     13995  
unix  3      [ ]         STREAM     CONNECTED     13994  
unix  3      [ ]         STREAM     CONNECTED     13993  
unix  3      [ ]         STREAM     CONNECTED     13992  
unix  3      [ ]         STREAM     CONNECTED     13991  
unix  3      [ ]         STREAM     CONNECTED     13990  
unix  3      [ ]         STREAM     CONNECTED     13989  
unix  3      [ ]         STREAM     CONNECTED     13988  
unix  3      [ ]         STREAM     CONNECTED     13987  
unix  2      [ ]         DGRAM                    8461  
unix  2      [ ]         DGRAM                    7722  
unix  2      [ ]         DGRAM                    4114  
unix  2      [ ]         DGRAM                    4074  
unix  2      [ ]         DGRAM                    4068  
unix  3      [ ]         STREAM     CONNECTED     4067  
unix  3      [ ]         STREAM     CONNECTED     4066  
unix  2      [ ]         DGRAM                    3826  
unix  2      [ ]         DGRAM                    3397  
unix  3      [ ]         STREAM     CONNECTED     2693   @/var/run/hald/dbus-IlQmQqKenD
unix  3      [ ]         STREAM     CONNECTED     2691  
unix  3      [ ]         STREAM     CONNECTED     2671   /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     2670  
unix  3      [ ]         STREAM     CONNECTED     2656   /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     2655  
unix  3      [ ]         STREAM     CONNECTED     2578  
unix  3      [ ]         STREAM     CONNECTED     2577  
0
 
objectsCommented:
do

netstat -a | grep LISTEN
0
 
aseismanAuthor Commented:
netstat -a | grep LISTEN
tcp        0      0 *:ssh                   *:*                     LISTEN      
tcp        0      0 domU-12-31-39-16-4:smtp *:*                     LISTEN      
tcp        0      0 *:46811                 *:*                     LISTEN      
tcp        0      0 *:nfs                   *:*                     LISTEN      
tcp        0      0 *:60454                 *:*                     LISTEN      
tcp        0      0 *:mysql                 *:*                     LISTEN      
tcp        0      0 *:rmiregistry           *:*                     LISTEN      
tcp        0      0 *:59374                 *:*                     LISTEN      
tcp        0      0 *:57231                 *:*                     LISTEN      
tcp        0      0 *:sunrpc                *:*                     LISTEN      
tcp        0      0 *:www-http              *:*                     LISTEN      
tcp        0      0 *:ftp                   *:*                     LISTEN      
unix  2      [ ACC ]     STREAM     LISTENING     8092   private/local
unix  2      [ ACC ]     STREAM     LISTENING     8096   private/virtual
unix  2      [ ACC ]     STREAM     LISTENING     8100   private/lmtp
unix  2      [ ACC ]     STREAM     LISTENING     8104   private/anvil
unix  2      [ ACC ]     STREAM     LISTENING     8108   private/scache
unix  2      [ ACC ]     STREAM     LISTENING     8112   private/maildrop
unix  2      [ ACC ]     STREAM     LISTENING     8116   private/cyrus
unix  2      [ ACC ]     STREAM     LISTENING     8120   private/uucp
unix  2      [ ACC ]     STREAM     LISTENING     8124   private/ifmail
unix  2      [ ACC ]     STREAM     LISTENING     8128   private/bsmtp
unix  2      [ ACC ]     STREAM     LISTENING     8132   private/procmail
unix  2      [ ACC ]     STREAM     LISTENING     8136   private/retry
unix  2      [ ACC ]     STREAM     LISTENING     8140   private/proxywrite
unix  2      [ ACC ]     STREAM     LISTENING     7577   /var/run/nscd/socket
unix  2      [ ACC ]     STREAM     LISTENING     4211   /var/lib/mysql/mysql.sock
unix  2      [ ACC ]     STREAM     LISTENING     2672   @/var/run/hald/dbus-IlQmQqKenD
unix  2      [ ACC ]     STREAM     LISTENING     2574   /var/run/dbus/system_bus_socket
unix  2      [ ACC ]     STREAM     LISTENING     4075   /var/run/audispd_events
unix  2      [ ACC ]     STREAM     LISTENING     4102   /var/run/rpcbind.sock
unix  2      [ ACC ]     STREAM     LISTENING     8037   public/cleanup
unix  2      [ ACC ]     STREAM     LISTENING     8044   private/rewrite
unix  2      [ ACC ]     STREAM     LISTENING     8048   private/bounce
unix  2      [ ACC ]     STREAM     LISTENING     2654   @/var/run/hald/dbus-oHzqlhrxCP
unix  2      [ ACC ]     STREAM     LISTENING     8052   private/defer
unix  2      [ ACC ]     STREAM     LISTENING     8056   private/trace
unix  2      [ ACC ]     STREAM     LISTENING     8060   private/verify
unix  2      [ ACC ]     STREAM     LISTENING     8064   public/flush
unix  2      [ ACC ]     STREAM     LISTENING     8068   private/proxymap
unix  2      [ ACC ]     STREAM     LISTENING     8072   private/smtp
unix  2      [ ACC ]     STREAM     LISTENING     8076   private/relay
unix  2      [ ACC ]     STREAM     LISTENING     8080   public/showq
unix  2      [ ACC ]     STREAM     LISTENING     8084   private/error
unix  2      [ ACC ]     STREAM     LISTENING     8088   private/discard
0
 
objectsCommented:
> tcp        0      0 *:rmiregistry           *:*                     LISTEN      

next check if you can connect to the server on that port
0
 
aseismanAuthor Commented:
What port? "rmiregistry"? I think that is just the default port of 1099 and I think that is how I am trying to connect now when getting the original error.
0
 
CEHJCommented:
If you're using netstat, you want to do
netstat -pant

0
 
aseismanAuthor Commented:
Using netstat -pant I think this is my server listening:

tcp        0      0 0.0.0.0:1099            0.0.0.0:*               LISTEN      25478/java  
0
 
objectsCommented:
> What port? "rmiregistry"?

netstat -an | grep LISTEN
0
 
aseismanAuthor Commented:
The port is 1099, how do you want me to try to connect to this port?
0
 
objectsCommented:
so (from the client) try to telnet to the server on port 1099
0
 
aseismanAuthor Commented:
telnet 10.110.9.85 1099
Trying 10.110.9.85...
Connected to 10.110.9.85.
0
 
objectsCommented:
now do you run the client from that same box?
if so do you use the same ip to connect?
0
 
aseismanAuthor Commented:
Yes and Yes (and it doesn't work).
0
 
objectsCommented:
can you post the full stack trace for the exception
0
 
aseismanAuthor Commented:
That has already been posted above.
0
 
objectsCommented:
Need to see the full stack trace

> java.rmi.ConnectException: Connection refused to host: 10.110.198.30; nested exception is:

and the exception posted was trying to connect to a different ip
Need to see the full stack trace when the client tries to connect to server (10.110.9.85)
0
 
for_yanCommented:
Can you ping from server to client?
0
 
for_yanCommented:
Was that dump from the server?
And what is the stack trace as seen from the client?
0
 
CEHJCommented:
You need to ensure that the RMI sockets get bound to routable addresses, which in your case would be your private IP addresses. At the moment, at least one of them isn't
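
To check which addresses a machine actually has, and which of them are loopback versus private (site-local), something like the following can be run on each box. It is a minimal sketch; the class name `AddressLister` and the labels it prints are made up for illustration.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;

final class AddressLister {

    /** Coarse label for an address: loopback, link-local, site-local (private) or other. */
    static String classify(InetAddress address) {
        if (address.isLoopbackAddress()) return "loopback";
        if (address.isLinkLocalAddress()) return "link-local";
        if (address.isSiteLocalAddress()) return "site-local";
        return "other";
    }

    public static void main(String[] args) throws SocketException {
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            System.out.println("no network interfaces found");
            return;
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            for (InetAddress address : Collections.list(nic.getInetAddresses())) {
                System.out.println(nic.getName() + "  " + address.getHostAddress()
                        + "  " + classify(address));
            }
        }
    }
}
```

An address such as 10.110.9.85 should show up as site-local; that is the kind of address other machines on the same private network can route to.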
0
 
CEHJCommented:
Try the following (using the correct address of course)
java -Djava.rmi.server.hostname=10.110.9.85 YourApp
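
The `java.rmi.server.hostname` property controls the host name that is written into the stubs of exported remote objects, which is the address clients later connect to. One way to check what a server would advertise is to export a throwaway object and print its stub. This is a hedged sketch: `Ping`, `PingImpl` and `StubHostSketch` are hypothetical names, and the stub's `toString()` format is a JDK implementation detail that may differ between versions.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

/** Hypothetical remote interface used only for this sketch. */
interface Ping extends Remote {
    String ping() throws RemoteException;
}

/** Hypothetical implementation used only for this sketch. */
final class PingImpl implements Ping {
    @Override
    public String ping() {
        return "pong";
    }
}

final class StubHostSketch {

    /** Exports a throwaway object and returns the stub's string form. */
    static String describeStub(String advertisedHost) throws RemoteException {
        // Programmatic counterpart of -Djava.rmi.server.hostname=...;
        // it has to be set before the object is exported.
        System.setProperty("java.rmi.server.hostname", advertisedHost);
        PingImpl impl = new PingImpl();
        Remote stub = UnicastRemoteObject.exportObject(impl, 0); // 0 = anonymous port
        try {
            return stub.toString();
        } finally {
            // Unexport so the JVM can exit.
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }

    public static void main(String[] args) throws RemoteException {
        // Server address from the thread; replace with your own.
        System.out.println(describeStub("10.110.9.85"));
    }
}
```

If the printed endpoint shows an address the client cannot reach, the lookup against the registry can succeed while the first remote call fails with a connection error.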

0
 
aseismanAuthor Commented:
I can ping between the servers.

The stack trace was from the client side

Please be more specific about how to ensure that my RMI sockets get bound to a routable address and which one is not (is it definitely my program that is not binding correctly)

The Java -D... did not work
0
 
for_yanCommented:
If the stack trace is from the client side why would the client request rmi connection to itself,
as you mentioned that 10.110.198.30 is the client ip address

>>java.rmi.ConnectException: Connection refused to host: 10.110.198.30; nested exception is:
      java.net.ConnectException: Connection refused

Do you have the client-side code when it establishes the connection?

0
 
aseismanAuthor Commented:
I interpret that statement to mean that the connection to the server has been refused to the host where the host is the client computer. Am I interpreting this incorrectly?
0
 
objectsCommented:
> The Java -D... did not work

no it wouldn't make a difference. That property has nothing to do with making the connection
0
 
for_yanCommented:
Well, I happen to have an analogous situation: I am simply running an app which tries to connect to
an abandoned RMI server, and it reports the server IP address in such a message.

What is your code when you discover the registry on the client? Maybe you are looking on localhost?
0
 
objectsCommented:
> The stack trace was from the client side

you've just posted the exception, need to see the full stack trace

also what security policies do you have in place, both client and server?
0
 
CEHJCommented:
Try the following: take a backup of any /etc/hosts files that have preceding comments then delete the preceding comments. You might need to restart the box(es)
0
 
objectsCommented:
When for_yan suggested making changes to the hosts file the following was posted

CEHJ> Don't alter any hosts files.

go figure
0
 
CEHJCommented:
>>go figure

I was talking about making edits to the functional contents of DNS. My last suggestion is about removing crud from the file that doesn't alter DNS
0
 
objectsCommented:
of course you were :-D

not sure why you'd need to restart the boxes if you haven't altered the DNS
0
 
aseismanAuthor Commented:
Some new information:

This morning I cloned the server machine and now have two identical machines running with different DNS names and IP addresses.

I will refer to the original machine as "OLD" and the clone as "NEW".

I can start the server and client on the OLD machine and it works.

When I then try to start the client on the NEW machine it does not work.

I then tried to start the server on the NEW machine with the OLD configuration (i.e., it points to the OLD server for the RMI binding). This works!

I then tried to connect the client on the NEW machine with the OLD configuration, and it works!!!

(this is all of course wrong).

If I close the server on the OLD machine, the client on the NEW one does not disconnect.

If I close the server on the NEW machine (supposedly ... but not actually) bound to the OLD machine's IP, the NEW client disconnects.

One more variation: if I change the config on the NEW machine so that the server points to itself and the client points to the OLD IP, then it does not work.

What I take from all this is that no matter what the configs say, it is pointing to localhost. However, there seems to be some kind of validation that the two configs must point to the same "name / ip".

I have attached the code for how the server binds.

The "getBindServer()" function just grabs an IP or DNS name out of a config file.

        try {
            RerunScheduleInterface rsi = new RerunScheduleImpl("RerunScheduleServer");
            server = ConfigManager.getInstance().getBindServer() + "/RerunScheduleServer";
            Naming.rebind(server, rsi);
            ServerUtils.write(ConfigManager.getInstance().getStatusLogFile(), true, "----------------------------------------");
            String nonFinishedFile = ConfigManager.getInstance().getNotFinishedLogFile();
            Vector<SimBean> newSims = ServerUtils.fileToSimBean(nonFinishedFile);
            ServerUtils.write(nonFinishedFile, false, "");
            rsi.addScheduledSim(newSims);
            ServerMonitorThread.getInstance(rsi).start();
            System.out.println(rsi.viewPendingSim().size());
        } catch (Exception e) {
            System.out.println("Error :  FileServer: " + server + "\n" + e.getMessage());
        }

0
 
CEHJCommented:
What does

System.out.println(server)

actually print?
0
 
for_yanCommented:
Sorry, I'd still return to yesterday's situation with one server and one client and check
the code on the client. The fact that it complains that it cannot connect to the IP address of the client itself seems weird.
As I wrote, I have a similar situation, and there it complains that it cannot connect to the server and gives the server IP address in the error message, which is understandable and logical.
Why should it complain that it cannot connect to the client IP address? Does the client code try to locate the registry on the server?
0
 
CEHJCommented:
Also, as a general point:

>>System.out.println("Error :  FileServer: " + server + "\n" + e.getMessage());

it's always a good idea to do e.printStackTrace();
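
To illustrate the difference, here is a small sketch that builds the same kind of nested exception as in this thread and shows what each call prints. The class name `CauseChain` and the helper `rootCause` are made up for illustration; the exception is constructed by hand rather than produced by a real failed connection.

```java
final class CauseChain {

    /** Follows getCause() down to the innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the exception reported in the thread.
        Exception nested = new java.net.ConnectException("Connection refused");
        Exception outer = new java.rmi.ConnectException(
                "Connection refused to host: 10.110.198.30", nested);

        // Message only: no indication of where in the code the call failed.
        System.out.println("getMessage(): " + outer.getMessage());
        System.out.println("root cause:   " + rootCause(outer));

        // Full trace: every frame plus the "Caused by:" chain.
        outer.printStackTrace();
    }
}
```

With a real failure, the frames in the trace show whether the exception came from the registry lookup or from a later call on the stub.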
0
 
CEHJCommented:
...and in the light of that,  could you please

a. make those substitutions
b. try one scenario (preferably your goal scenario)
c. post full stack traces
0
 
aseismanAuthor Commented:
Result from System.out.println(server)

10.110.9.85/RerunScheduleServer

        try {
            name = ClientConfigManager.getInstance().getServerName() + "/RerunScheduleServer";
            RerunScheduleInterface rsi = (RerunScheduleInterface) Naming.lookup(name);
            ClientMonitorThread mt = ClientMonitorThread.getInstance(rsi);
            mt.start();
            System.out.println("connect " + name + " success.");
            System.out.println("client start...");
        } catch (Exception e) {
            System.err.println("FileServer exception: " + name);
            System.err.println(e);
        }

0
 
aseismanAuthor Commented:
The above code is the client connection code
0
 
CEHJCommented:
>>server = ConfigManager.getInstance().getBindServer() + "/RerunScheduleServer";


>>name = ClientConfigManager.getInstance().getServerName() + "/RerunScheduleServer";

Are those two values identical?
0
 
aseismanAuthor Commented:
Yes, the two values are identical
0
 
for_yanCommented:
So what is the server name that it returns?

Could you try, at least for the sake of the experiment, to
use the simple LocateRegistry method with the server IP address in the client code
and see if it connects this way?
0
 
CEHJCommented:
Can you please tell me what command you use to start the server?
0
 
aseismanAuthor Commented:
java -jar dist/RerunScheduleServer.jar to start the server

the output of the print server is above

still working on other suggestions
0
 
for_yanCommented:

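This is how I'd try to do it on the client:


Registry registry = LocateRegistry.getRegistry("10.110.9.85", 1099);
// Cast to the remote interface used in the client code above;
// the stub returned by lookup() is not an instance of the server implementation class.
RerunScheduleInterface rsi =
        (RerunScheduleInterface) registry.lookup("RerunScheduleServer");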
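
For anyone who wants to try this lookup pattern in isolation, here is a self-contained sketch that starts a registry in the same JVM, binds an object, and looks it up through LocateRegistry.getRegistry() the way a separate client would. Ping, PingImpl and RegistryLookupSketch are hypothetical stand-ins, not classes from the application in this thread, and the sketch assumes everything runs on one machine, so it pins java.rmi.server.hostname to the loopback address.

```java
import java.net.ServerSocket;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical remote interface standing in for RerunScheduleInterface.
interface Ping extends Remote {
    String ping() throws RemoteException;
}

final class RegistryLookupSketch {

    // Trivial implementation; exported explicitly in roundTrip().
    static final class PingImpl implements Ping {
        @Override
        public String ping() {
            return "pong";
        }
    }

    /** Starts a registry, binds a stub, then looks it up by host and port. */
    static String roundTrip() throws Exception {
        // Assumption: single-machine run, so advertise loopback in the stub.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        // Ask the OS for a free port instead of assuming 1099 is available.
        int port;
        try (ServerSocket probe = new ServerSocket(0)) {
            port = probe.getLocalPort();
        }

        Registry serverSide = LocateRegistry.createRegistry(port);
        PingImpl impl = new PingImpl();
        Ping stub = (Ping) UnicastRemoteObject.exportObject(impl, 0);
        try {
            serverSide.rebind("Ping", stub);

            // Client side: address the registry by IP and port, then look up
            // the bare binding name and cast to the remote interface.
            Registry clientSide = LocateRegistry.getRegistry("127.0.0.1", port);
            Ping remote = (Ping) clientSide.lookup("Ping");
            return remote.ping();
        } finally {
            // Unexport so the JVM is free to exit.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(serverSide, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip());
    }
}
```

Note that two connections are involved: one to the registry port for lookup(), and a second one to whatever host and port the returned stub carries for ping(). Either can be refused independently of the other.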
0
 
CEHJCommented:
>>java -jar dist/RerunScheduleServer.jar to start the server

Can you please start it as below, attaching the last-named argument as a file. You can obscure any public-facing IP addresses as X.X.X.X but please don't do that with private ones
java -Dsun.rmi.transport.tcp.logLevel=verbose -jar dist/RerunScheduleServer.jar 2>&1 | tee /tmp/rmi.log.txt

0
 
aseismanAuthor Commented:
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint <clinit>
FINE: main: localHostKnown = true, localHost = 10.110.9.85
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport <init>
FINE: main: Version = 2, ep = [10.110.9.85:1099]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint getLocalEndpoint
FINE: main: created local endpoint for socket factory null on port 1099
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport <init>
FINE: main: Version = 2, ep = [10.110.9.85:0]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint getLocalEndpoint
FINE: main: created local endpoint for socket factory null on port 0
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport listen
FINE: main: (port 1099) create server socket
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint newServerSocket
FINER: main: creating server socket on [10.110.9.85:1099]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport listen
FINE: main: (port 0) create server socket
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint newServerSocket
FINER: main: creating server socket on [10.110.9.85:0]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport$AcceptLoop executeAcceptLoop
FINE: RMI TCP Accept-1099: listening on port 1099
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint setDefaultPort
FINE: main: default port for server socket factory null and client socket factory null set to 60713
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport$AcceptLoop executeAcceptLoop
FINE: RMI TCP Accept-0: listening on port 60713
10.110.9.85/RerunScheduleServer
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPChannel createConnection
FINE: main: create connection
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPEndpoint newSocket
FINER: main: opening socket to [10.110.9.85:1099]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.proxy.RMIMasterSocketFactory createSocket
FINE: main: host: 10.110.9.85, port: 1099
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport$ConnectionHandler run0
FINE: RMI TCP Connection(1)-10.110.9.85: accepted socket from [10.110.9.85:58331]
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport$ConnectionHandler run0
FINER: RMI TCP Connection(1)-10.110.9.85: (port 1099) suggesting 10.110.9.85:58331
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPChannel createConnection
FINER: main: server suggested 10.110.9.85:58331
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPChannel createConnection
FINER: main: using 10.110.9.85:60713
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport$ConnectionHandler run0
FINER: RMI TCP Connection(1)-10.110.9.85: (port 1099) client using 10.110.9.85:60713
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPTransport handleMessages
FINE: RMI TCP Connection(1)-10.110.9.85: (port 1099) op = 80
Mar 11, 2011 3:49:01 PM sun.rmi.transport.tcp.TCPChannel createConnection
0
 
CEHJCommented:
The client has connected fine. Let us know about any exceptions
0
 
objectsCommented:
I'd be interested in the answers to a few of for_yan's questions; they should help clear up what's going on here.
You also still haven't posted the full stack trace that was asked for a few times earlier.
0
 
aseismanAuthor Commented:
I posted this much earlier in response to the request for a stack trace. Do you need more than this?


Full thread dump OpenJDK 64-Bit Server VM (19.0-b06 mixed mode):

"Low Memory Detector" daemon prio=10 tid=0x00007f1b38099800 nid=0x31b9 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x00007f1b38097000 nid=0x31b8 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x00007f1b38094000 nid=0x31b7 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f1b38092000 nid=0x31b6 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f1b38073800 nid=0x31b5 in Object.wait() [0x00007f1b3da28000]
   java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <0x00000000eb2b0c60> (a java.lang.ref.ReferenceQueue$Lock)
      at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
      - locked <0x00000000eb2b0c60> (a java.lang.ref.ReferenceQueue$Lock)
      at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
      at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x00007f1b38071800 nid=0x31b4 in Object.wait() [0x00007f1b3db29000]
   java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <0x00000000eb2b0b38> (a java.lang.ref.Reference$Lock)
      at java.lang.Object.wait(Object.java:502)
      at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
      - locked <0x00000000eb2b0b38> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f1b38007000 nid=0x31b0 runnable [0x00007f1b3f331000]
   java.lang.Thread.State: RUNNABLE
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1447)
      - locked <0x00000000eb2d8d88> (a java.util.HashMap)
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1445)
      - locked <0x00000000eb2d9e88> (a java.util.HashMap)
      at java.net.URLClassLoader.defineClass(URLClassLoader.java:237)
      at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
      at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
      at java.security.AccessController.doPrivileged(Native Method)
      at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
      - locked <0x00000000eb2d9ce8> (a sun.misc.Launcher$AppClassLoader)
      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
      - locked <0x00000000eb2d9ce8> (a sun.misc.Launcher$AppClassLoader)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:266)

"VM Thread" prio=10 tid=0x00007f1b3806b800 nid=0x31b3 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f1b38012000 nid=0x31b1 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f1b38014000 nid=0x31b2 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f1b3809c000 nid=0x31ba waiting on condition

JNI global references: 859

Heap
 PSYoungGen      total 298688K, used 5121K [0x00000000eb2b0000, 0x0000000100000000, 0x0000000100000000)
  eden space 256064K, 2% used [0x00000000eb2b0000,0x00000000eb7b0528,0x00000000facc0000)
  from space 42624K, 0% used [0x00000000fd660000,0x00000000fd660000,0x0000000100000000)
  to   space 42624K, 0% used [0x00000000facc0000,0x00000000facc0000,0x00000000fd660000)
 PSOldGen        total 682688K, used 0K [0x00000000c1800000, 0x00000000eb2b0000, 0x00000000eb2b0000)
  object space 682688K, 0% used [0x00000000c1800000,0x00000000c1800000,0x00000000eb2b0000)
 PSPermGen       total 21248K, used 2535K [0x00000000b7000000, 0x00000000b84c0000, 0x00000000c1800000)
  object space 21248K, 11% used [0x00000000b7000000,0x00000000b7279fc0,0x00000000b84c0000)

FileServer exception: 10.110.9.85/RerunScheduleServer
java.rmi.ConnectException: Connection refused to host: 10.110.198.30; nested exception is:
      java.net.ConnectException: Connection refused
0
 
objectsCommented:
That's a thread dump, not a stack trace.
You can print a stack trace using:

exception.printStackTrace();

0
 
CEHJCommented:
How many clients are there? If there's just one it looks like it's succeeding (see trace file) and then failing (see exception) because it's trying to connect to the wrong server.
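
One way to check which server a stub will actually call is to print the stub itself: the address it connects to is stored inside the stub and is independent of the address used to reach the registry. The sketch below is only an illustration under stated assumptions, not a diagnosis of the machines in this thread. Schedule, ScheduleImpl and StubEndpointSketch are hypothetical names, and the host is forced through the standard java.rmi.server.hostname property purely to make the effect visible. The exact toString() text is implementation-specific; OpenJDK's includes a fragment of the form endpoint:[host:port].

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical remote interface standing in for RerunScheduleInterface.
interface Schedule extends Remote {
    String status() throws RemoteException;
}

final class StubEndpointSketch {

    static final class ScheduleImpl implements Schedule {
        @Override
        public String status() {
            return "ok";
        }
    }

    /**
     * Exports an object while java.rmi.server.hostname is set to
     * advertisedHost and returns the stub's string form, which
     * includes the endpoint that clients would connect to.
     */
    static String describeStub(String advertisedHost) throws RemoteException {
        String key = "java.rmi.server.hostname";
        String previous = System.getProperty(key);
        ScheduleImpl impl = new ScheduleImpl();
        Remote stub;
        System.setProperty(key, advertisedHost);
        try {
            // The host recorded in the stub is chosen at export time.
            stub = UnicastRemoteObject.exportObject(impl, 0);
        } finally {
            // Restore the property so the sketch has no lasting side effect.
            if (previous == null) {
                System.clearProperty(key);
            } else {
                System.setProperty(key, previous);
            }
        }
        try {
            return stub.toString();
        } finally {
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }

    public static void main(String[] args) throws RemoteException {
        // Address taken from the exception text in this thread, for illustration only.
        System.out.println(describeStub("10.110.198.30"));
    }
}
```

Printing the object returned by Naming.lookup() on the client in the same way would show whether it points at 10.110.9.85 or at some other address, such as the 10.110.198.30 in the exception.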
0
 
objectsCommented:
Reading the comments, you appear to be on the wrong track.
Can you please post the stack trace and respond to for_yan's queries, so we can help you get to the cause of the problem and avoid wasting more of your time?
0
 
aseismanAuthor Commented:
We were able to solve this problem by creating brand new installations of our servers, both with Fedora Core. It is unclear exactly what the problem was, but the Java code itself could not have been the cause, as it now works on the new installations. Thank you to everyone for all your help troubleshooting this problem. I know this is not a very satisfying outcome, but it worked.
0
 
objectsCommented:
Yes, your Java code was fine. The problem was related to registry access, not your code.
0
 
CEHJCommented:
Your problem was that one of the clients was making a connection to the wrong server
0
 
aseismanAuthor Commented:
The solution did not come from the answers given; however, several people put a lot of effort into helping solve this issue. The grade is not an A because the exact cause of the problem was never established, even though the problem itself is fully solved.
0
