Solved

Remote Back-up in SunOS 4.1.3

Posted on 1998-10-16
Last Modified: 2010-04-21
I have a network consisting of a thick-net backbone (with AUI connections) connecting 6 sub-nets.  I used to be able to do a remote dump from each system in each sub-net to one of my servers, which has two 5GB Exabyte tape drives.  Recently, I tried to do a remote dump from clients on one of my sub-nets to this server and got a message stating that the dump was aborted due to a broken pipe.  No new hardware or software has been added to any of my machines, and my back-up routine has worked perfectly for over 10 years.  I can still do a remote dump from the clients on that sub-net to the sub-net's own server, and I can do a remote dump from that sub-net's server to my backup server, as I always could.  What could be wrong and how do I fix it?
Question by:ddavis42
21 Comments
 
LVL 5

Expert Comment

by:tfabian
Well, the obvious question is whether you still have network connectivity between the machines on different subnets. If not, look at your routers and/or gateways, and at any interconnection points between the subnets.

Also investigate whether any routing changes, deletions, or additions occurred. For example, if another machine was added to any of the subnets and it's misconfigured, it could be hijacking your traffic and sending it to the big bit bucket in the sky.


Good luck,
 

Author Comment

by:ddavis42
1) There is still connectivity between machines.  I can ping and remotely log into any machine on any sub-net.

2) No new machines have been added on any of the sub-nets.

Nothing has changed on any of the sub-nets except I can no longer do remote dumps.
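
Ping and remote login only exercise part of what a remote dump needs. A minimal sketch of a fuller check follows, assuming the dump is started with rsh and that rdump on the client then opens its own connection to the tape host. The host names are hypothetical, and the `ping host timeout` form is the SunOS 4.x style, which is an assumption here.

```shell
#!/bin/sh
# Sketch: check the paths a remote dump depends on, not just ping and login.
# Host names passed in are hypothetical examples.

# check_dump_path CLIENT TAPEHOST
# Prints one line per check and returns non-zero if any check fails.
check_dump_path() {
    client=$1
    tapehost=$2
    rc=0

    # 1. Basic reachability of the client from this machine.
    if ping "$client" 5 >/dev/null 2>&1; then
        echo "ok   ping $client"
    else
        echo "FAIL ping $client"
        rc=1
    fi

    # 2. rsh to the client must work without a password prompt.
    #    rsh does not pass back the remote exit status, so compare output.
    out=`rsh "$client" echo ok </dev/null 2>/dev/null`
    if [ "$out" = ok ]; then
        echo "ok   rsh $client"
    else
        echo "FAIL rsh $client"
        rc=1
    fi

    # 3. The client itself must be able to reach the tape host with rsh,
    #    because rdump running on the client makes that connection.
    out=`rsh "$client" rsh "$tapehost" echo ok </dev/null 2>/dev/null`
    if [ "$out" = ok ]; then
        echo "ok   rsh $client -> $tapehost"
    else
        echo "FAIL rsh $client -> $tapehost"
        rc=1
    fi

    return $rc
}
```

A call such as `check_dump_path clientA tapesrv` would then show whether the third hop, the one a plain login test never touches, is the one that fails.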
 
LVL 5

Expert Comment

by:tfabian
I repeat, check for installation of different gateways, firewalls, or filters in the routers. If nothing on the machines themselves has changed, that's the only possibility.

Try moving the failing machine to the subnet with your server and back it up that way, as a test to ensure that the backup process still works.

Beyond that, I'm not able to diagnose anything.
 

Author Comment

by:ddavis42
I repeat, nothing has been added.  No different gateways, firewalls, or filters.

The backup process still works: I use the same routines on the sub-net servers that I was using on my tape drive server.  I copied them to the sub-net servers, modified them to point to the correct tape drive (there is only one tape drive on each sub-net server), and initiate them the same way I did on the tape drive server.

One month the routines were working fine; the next month, when I tried to do my monthly backups, they failed.  That appears to be the only thing in my network that has changed.

(I said this was a hard one!)
 
LVL 4

Expert Comment

by:davidmwilliams
 The fact that you're getting a 'broken pipe' error means that a software error has occurred: one of two processes in your backup program that are connected by a pipe has exited while the other was still using it.
  If your backup program is a shell script, look for the | (pipe) character in it.  This character means the command on the left-hand side is feeding its output into the command on the right-hand side, which receives it as input.
  If you are getting a broken pipe, then the command on the right has exited while the command on the left was still writing to it; the error is reported to the command on the left.
  Find these commands, and then you can narrow down the problem, because you know just which part of the backup is failing.
  If you use rsh, then maybe a .rhosts file has been deleted on the remote machine, for example.
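
To see which side reports a broken pipe, here is a minimal sketch that has nothing to do with rdump itself: `yes` writes forever, `head` reads one line and exits, and the exit status of the writer is captured. Using file descriptor 3 to carry the status out is just one way to do this in a plain Bourne-style shell.

```shell
#!/bin/sh
# Demonstrate a broken pipe: the reader (head) exits after one line while
# the writer (yes) is still producing output.

# Print the exit status of the writing side of "yes | head -n 1".
# File descriptor 3 carries the status out of the pipeline.
writer_status() {
    {
        { yes 2>/dev/null; echo "$?" >&3; } | head -n 1 >/dev/null
    } 3>&1
}

status=`writer_status`
echo "writer exit status: $status"
```

Where SIGPIPE has its default action the status is typically 141 (128 plus signal 13). The point is only that the writer ends with a non-zero status because the reader went away first.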
 

Author Comment

by:ddavis42
I am using shell scripts to do my backups.  They haven't changed in 10 years (except to add new disks as the systems grew) and have always worked fine.  They are just automatically running the rdump command on the various clients, nothing special (read: no | in the command).  As far as the .rhosts files missing, they are all in place and haven't changed since December 1995.
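
For readers who have not seen such a script, the following is a rough sketch of the kind of wrapper described, not the poster's actual script. It assumes rdump is started on each client with rsh; the tape host, tape device, log directory, client, and file-system names are all hypothetical. Because rsh does not pass back the remote exit status, the sketch keeps rdump's messages in a per-client log so that a "broken pipe" can be traced to one client and file system.

```shell
#!/bin/sh
# Sketch: run one level 0 remote dump per client and keep its messages.
# All names below are hypothetical placeholders.
TAPEHOST=tapesrv
TAPEDEV=/dev/nrst0
LOGDIR=/tmp
DRY_RUN=1        # 1 = only print the command that would be run

# dump_one CLIENT FILESYSTEM
dump_one() {
    client=$1
    fs=$2
    log="$LOGDIR/rdump.$client.log"

    if [ "$DRY_RUN" = 1 ]; then
        echo "rsh $client rdump 0uf $TAPEHOST:$TAPEDEV $fs"
        return 0
    fi

    # Record when each dump started, then append everything rdump prints.
    echo "== `date` $client $fs" >>"$log"
    rsh "$client" rdump 0uf "$TAPEHOST:$TAPEDEV" "$fs" </dev/null >>"$log" 2>&1
}
```

With `DRY_RUN=1`, a call such as `dump_one clientA /usr` only prints the command, which makes it easy to review the list before a real run.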
 
LVL 5

Expert Comment

by:tfabian
If nothing on the systems has changed in three or more years (as I believe you're saying), then you've got a bigger problem than you're admitting to or asking about.

Three-year-old operating systems left unpatched are full of security holes and outdated code.

My advice at this point (and you can take it or leave it) is to completely upgrade the operating systems to current version levels, reinstall current versions of your software, and then try to reimplement your backup routines.


Stagnation is as big a problem as your failing backups.
 

Author Comment

by:ddavis42
Upgrading is something we plan on doing soon; however, that has no bearing on my current problem.  As far as security and patches, we have taken care of those issues as necessary.  These same backup routines are still working for the rest of my sub-nets.
 
LVL 1

Expert Comment

by:arthurd
Is there anything going on during backups?  If a file is updated or deleted during the backup, this may cause some problems.  



 

Author Comment

by:ddavis42
Nothing else is going on while the backups are being done.  They are done at 2:00 am, well before anyone here is at work!
 
LVL 5

Expert Comment

by:tfabian
What about the cleaning people? Might they be unplugging a router or gateway about then and plugging in their vacuum cleaner? Stranger things have happened.
 
LVL 1

Expert Comment

by:arthurd
I got this from SunSolve.  I don't know if this will help, but it may help in troubleshooting:

------------------------------------------------------

The broken pipe occurs because the rmt process that gets started on the tape host exits before signaling to the dump process on the dump host that it is finished. This could simply be a timing issue (in which case the dump actually completed - check to see if the dump is actually complete). If it is a timing issue, it means that the rmt process was finished and exited, but the dump on the dump host was too slow in catching the rmt's signal.

The rmt could be dying off for other reasons as well.  These would include having extra rmt processes or some defunct rmt process remaining.  Another cause would be a busy network where the local dump process is losing contact with the remote rmt process.  Following is a workaround (command to be issued on the 386i):
  dump 0f - <filesystem> | rsh <tapehost> "dd of=/dev/<tape device>"

This is very slow (a factor of ~10X), but it avoids the rmt process on the tape host.

--------------------------------------------------------
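
The SunSolve note mentions extra rmt processes on the tape host. A minimal sketch for checking that follows; it assumes BSD-style `ps -ax` output with the command name in the fifth column (PID TT STAT TIME COMMAND), which may not match every system. It counts only processes still listed by name, so a process shown as defunct would need a separate look.

```shell
#!/bin/sh
# Sketch: count rmt processes in BSD-style "ps -ax" output read from stdin.
# Assumes columns "PID TT STAT TIME COMMAND", so the command is field 5.
count_rmt() {
    awk 'NR > 1 && ($5 == "rmt" || $5 ~ /\/rmt$/) { n++ }
         END { print n + 0 }'
}
```

Usage would be something like `rsh <tapehost> ps -ax | count_rmt` before starting a dump; anything other than 0 at that point suggests a leftover rmt from an earlier run.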
 

Author Comment

by:ddavis42
To tfabian - Our cleaning crew goes home at 10:00 pm.
To arthurd - This sounds possible.  I will be doing my monthly backups (the ones that are causing all of these problems) next week.  When I get to the trouble sub-net I will check and see if the dump has been done, and let you (all) know how it works.  Thanks.
 
LVL 4

Expert Comment

by:davidmwilliams
 You missed what I said entirely ... the shell script need not have changed, but because you know a pipe has broken, you can look to see where the pipe is, and so that narrows down just what the problem is.
  The comment about missing .rhosts was just an example.
  Are you sure you are exercising rigour in your investigations of this problem?
 
LVL 4

Expert Comment

by:davidmwilliams
 Another comment - why on earth are you waiting until the regular time for the monthly backups to test your backup system?
  Is your data important to you?  What happens if it fails - you will have no monthly backup at all.  Why not test the situation?
 

Author Comment

by:ddavis42
I do and have been doing weekly and daily backups all along.  I am still able to do my monthly backups, just not using my normal tape server.  It takes a little longer, but they are getting done.  I appreciate your concern, but you don't know my situation (work load), and I will test the situation next week, as that is when time permits.
 
LVL 4

Expert Comment

by:davidmwilliams
 That's interesting; so your daily and weekly backups are succeeding?
  And your monthly backups are succeeding with a different tape server, and no other alterations to your system?
  Just what do you mean by 'tape server' (a tape drive, a different Unix box ...)?
 

Author Comment

by:ddavis42
My daily and weekly backups are done within each subnet, as each has a tape drive.  My monthly backups are all done from one workstation that has two 5GB tape drives.  So my daily and weekly backups are done within each subnet and the monthly is done across our backbone.
 

Author Comment

by:ddavis42
Well, I guess tfabian gets the points.  Though I don't know how it happened, apparently my routing tables got honked up.  Once I did a route -f on all of the systems involved in the problem, everything started working as before.  I would like to thank everyone else for your help and advice.
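
For anyone repeating this fix, a cautious sketch follows. It saves the routing table before flushing it, so the bad entries can still be examined afterwards, and then restores a default route. The gateway address and snapshot path are hypothetical, and re-adding a static default route is an assumption; a site that learns its routes from routed may not need that step. The `route add default gateway metric` form is the SunOS 4.x style.

```shell
#!/bin/sh
# Sketch: save the routing table, flush it, and restore the default route.
# GATEWAY and SNAPSHOT are hypothetical; DRY_RUN=1 only prints the commands.
GATEWAY=192.0.2.1
SNAPSHOT=/tmp/routes.before
DRY_RUN=1

# Run a command, or just print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

reset_routes() {
    # Keep a copy of the old table so the bad entries can be studied later.
    if [ "$DRY_RUN" = 1 ]; then
        echo "netstat -rn > $SNAPSHOT"
    else
        netstat -rn >"$SNAPSHOT"
    fi
    run route -f                         # flush all gateway entries
    run route add default "$GATEWAY" 1   # destination, gateway, metric
}
```

Running `reset_routes` with `DRY_RUN=1` prints the three commands in order, which is a safe way to review them before running anything as root.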
 
LVL 5

Accepted Solution

by:
tfabian earned 400 total points
Gee, thanks.


Routing tables are always (or should always be) the first place to look.
