Solved

the specified network name is no longer available

Posted on 2013-05-16
Last Modified: 2013-09-02
I am running redundant backups on 3 Windows 7 Pro computers located at 3 different sites over MPLS.  Call them sites 1,2 and 3.  Each computer backs up the same files from the originating computers (located at all 3 sites) in sequence; i.e. there is generally no overlap between backup jobs on any one computer and/or on the MPLS.

The backup computer at Site 2 was recently changed over from XP to Windows 7.
This ONE SITE backup computer, before Windows 7 and now with it, occasionally generates errors:
"the specified network name is no longer available"
That is, it only happens at Site 2 and I've only seen it happen on backup jobs from Site 1 (Site 3 has but one small backup job so the opportunity to see this is much more limited).

This happens in the middle of a backup and the rest of the message makes it obvious that the individual filename is known.
None of the other backup computers has this kind of error.
Another site, which has to back up the exact same files from the same source computers, has none of these errors.
The site with errors is geographically further away but the MPLS is all on fiber.

All of the computers are accessed using UNC paths with IP addresses, so no name service is involved at all; i.e. \\[ipaddress]\....

I very much doubt that there are any settings that could be affecting this.  I'm rather left with wondering about hardware - thus the recent swap-out of the computer.  Yet these errors continue to occur on occasion.  

I'm looking for good ways to narrow down the possibilities.
The hardware chain is:
Site 2
- backup computer
- local switch
- main switch
- MPLS router
MPLS fiber system
- at least one switch
Site 1
- MPLS router (which also serves the Site 3 backups without errors)
So, everything is common with Site 3 at this point.....

Not that I suspect the hardware necessarily but where else to look.
And, how to look?
I can run Wireshark on the Site 2 LAN switch ports ........
Question by:Fred Marshall
41 Comments
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39178609
I'll take some shots in the dark here:
All layer 2 switches? No routing/vlans/different subnets between sites? Maybe the arp tables got full.

Are you using CrashPlan or some other software for this sort of thing?
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39179654
There are no VLANs as such.

There are separate subnets at each site with routing to connect them.  No problems there that I can see.  All of the connectivity is working except for these transient occurrences that likely have nothing at all to do with routing.

Not using CrashPlan.  Using Carbonite *separately* so this is not part of the solution.
This is a versioning backup system based on Second Copy where each site maintains separate backups of everything for geographical dispersion of backups.

My interest is in the networking / file sharing failure itself.
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39179781
All of the switches are Layer 2 switches unless there's anything different buried in the MPLS links.
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39179941
I've had to do this for Win 7 before. It's like the cache fills up and new connections get denied for no good reason.  A reboot would always fix it temporarily, but this change made it work consistently:
http://dbastas.blogspot.com/2012/05/optimize-windows-xp-and-windows-7-for.html?m=1
0
 
LVL 27

Expert Comment

by:Steve
ID: 39180646
It may be worth checking into the known issues with SMB2 since the new version came out.

XP uses SMB (version 1) and Windows 7 uses SMB2. The two are known to cause conflicts, so it's worth considering disabling SMB2 across the board to see if it helps.

http://support.microsoft.com/kb/2696547


If that doesn't work I'd check the event log on sending and receiving PCs, and also leave some pings running so you can rule out connection issues/drops.
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39182285
aarontomosky: those regedits had already been made on all the computers - sending and receiving.

totallytonto: I found a hotfix for Windows 7 at http://support.microsoft.com/kb/2792026/en-us to get around freeze-ups - which I can't confirm happen but .. could be.  But, the same thing was happening under Windows XP on the receiving end.  So I wonder....

Disabling SMB2 is not recommended except for debugging...   ???
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39182321
Kinda scary thought, but this whole thing may be resolved when the other clients are also running win 7... or it could just make it worse. Any way you can think of testing this out while keeping a way to revert?
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39182330
Well, at this stage *most* of the computers are running Windows 7... but I can't guarantee the correlation at the moment.

"While keeping a way to revert..."?  Do you mean turning off SMB2?
I mentioned XP because it doesn't run SMB2 as far as I know....
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39182426
I meant maybe setting up a win7 box to replace an xp box while keeping the xp box around in case you have to switch back, but I don't think that sounds like an option since these are upgrades.
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39183419
Well, actually the XP boxes remain on hand.  The one backup "receiver" was changed out from an XP box to a new Windows 7 box with no improvement in this situation.  So, I don't think that going back is likely to help.
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39185262
what are you using to do the backups? robocopy?
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39186045
This is a versioning backup system based on Second Copy
0
 
LVL 27

Expert Comment

by:Steve
ID: 39187273
it's the XP machines that suffer when you have a mixture of XP & 7 using the same shares, due to SMB2 and how it accesses the files.

once you get rid of all the XP machines it should stop being an issue, but you may have to disable SMB2 until then.
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39270287
This has boiled down now to the following:

- all of the machines/sites doing backup with Second Copy are Windows 7.
- only ONE folder on an XP machine fails to back up with the reported error and this is at another site.
So, out of around 20 backup jobs repeated at 3 sites for a total of 60, there is but one job out of the 60 from one source at one remote site backup that is doing this now.
And, now it seems consistent - although I somewhat doubt that it will always fail based on past experience with it.
The same source going to local and another remote backup works fine.
The Event Viewers show nothing.
Pings work consistently.

Here is the topology:

Source Computer<>switch<>switch<>switch<>MPLS router<>MPLS router<>switch<>switch<>Backup computer.

The topology is the same for the other remote backup that works except the
MPLS router<>switch<>switch<>Backup computer
at the end are different devices.

These observations suggest that all is fine at the source and all is fine at the backup device otherwise.

And, to repeat, 19 of 20 backups at the "failing" site still work consistently.  It's just this one.
And, to repeat, the same "failing" source/folder works on the other backups.

I might add that the failing source folder is relatively large but not the largest.
0
 
LVL 27

Expert Comment

by:Steve
ID: 39270762
have you left a ping running during the backup to see if a brief network drop causes failure?
otherwise I'm not sure what else to suggest other than checking the event viewer for any errors around the time of the failure.
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39271629
Do these run on a schedule? So it may not be the folder specifically, but the time it runs? Look at what else is happening at that time.
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39272290
Yes, it runs on a schedule.  In this case it is launched at 4:30 a.m.  I don't see anything else running at the same time.  The backups are staggered in time to not overlap.
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39272457
so my question is: is something wrong with that folder, or is something happening around 4:30 that will mess up any backup happening at that time...
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39275024
If there were something wrong with the folder then why might the other 2 backups of it work fine?

There is nothing I know of happening at 4:30 that would mess it up.
And, it fails if I start it manually at any time.
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39275038
Here's a thought: move the schedule around and see if whatever runs in that timeslot fails or if the pattern changes
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39275057
Isn't that what the manually-launched backups do?
0
 
LVL 27

Expert Comment

by:Steve
ID: 39278086
I've seen something similar to this and it was caused by network issues (packet loss/delay)

Did you try leaving the constant ping running to see if any blips occurred around the time it failed?
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39278352
Is the VPN expiring and reconnecting at that time?
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39278499
There is no VPN.  Simple, bare MPLS.
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39278806
Can you run pings overnight (at least around 4:30) from each site to both other sites, and something else on the Internet somewhere? Let's see exactly which links go out.
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39278996
Yes, I have a "probe" program that will do that (ping continuously) and log "n" misses with a trace route.  This way a contiguous set of misses can be seen and the results of a trace route at that time can also be seen.
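For anyone without such a probe, here is a minimal Python sketch of the same idea: ping repeatedly and report only runs of "n" contiguous misses. It is an illustration, not the probe program mentioned above. The names `ping_once` and `miss_runs` and the default threshold of 3 are made up for this example, and `ping_once` assumes the Windows `ping` command with its `-n` (count) and `-w` (timeout in milliseconds) options.

```python
import subprocess
from typing import Iterable, List, Tuple


def ping_once(host: str, timeout_ms: int = 1000) -> bool:
    """Send one echo request with the Windows ping command.

    Assumption: English-language Windows output, where a real reply
    contains "TTL=". The exit code alone can be 0 even when a router
    answers "Destination host unreachable".
    """
    proc = subprocess.run(
        ["ping", "-n", "1", "-w", str(timeout_ms), host],
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0 and "TTL=" in proc.stdout


def miss_runs(results: Iterable[bool], min_misses: int = 3) -> List[Tuple[int, int]]:
    """Return (start_index, length) for every run of at least
    min_misses consecutive failed pings."""
    runs = []
    start = None
    count = 0
    for i, ok in enumerate(results):
        if not ok:
            if start is None:
                start = i
            count += 1
        else:
            if start is not None and count >= min_misses:
                runs.append((start, count))
            start = None
            count = 0
    # A run that is still open when the log ends also counts.
    if start is not None and count >= min_misses:
        runs.append((start, count))
    return runs
```

Collecting `ping_once("10.111.1.213")` once a second into a list during the backup window and passing that list to `miss_runs` would show whether a contiguous outage lines up with the time of the failure. Isolated single misses are ignored on purpose.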

But, again, this failure is not connected with 4:30 because it occurs *whenever* I run the backup between these two machines and on this one source folder.  Otherwise the source folder works fine with the same backup setup on 2 other backup machines.

It seems strange that there would be an outage on this one site while doing one particular backup out of 20 backups.

I have mapped the source folder so the path length is shorter.  But path length issues are normally reported as such and aren't mysterious.  And, that's not what I'm seeing here.
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39279298
Sorry, I thought you said the manual backups for this folder succeed, I see I read that wrong.
I think I may still be unclear on where the files originate and where they go:
site1 -> site2 all 20 backups work
site2 -> site3 all 20 backups work
site3 -> site1 19 backups work (the one fails no matter when it's run either on a schedule or manually)
is there a site1->site3?

so this problem folder, where does it originate?
0
 
LVL 25

Assisted Solution

by:Fred Marshall
Fred Marshall earned 0 total points
ID: 39279532
site1 -> site1 all 20 backups work
site1 -> site2 all 20 backups work
site1 -> site3 19 backups work (the one fails no matter when it's run)

There are also backups from site 2 and from site 3 but those aren't an issue.  A good number of the backups come from site 1.

I realized that the *same* filename was showing in all the errors.  So, I manually copied that file onto the backup.  So, now when the comparison is made, that file need not be copied.

I notice that manually copying the files causes a Windows 7 message about not trusting the file type (which, at least in the failing case, is *.FIM). So, I had to interact with it in order to proceed.  I can't imagine why this pops up as the Windows firewall is set to share files across the scope of all the subnets.

I manually copied all of the missing files to the backup and now the backup works (with no new files copied).  There will be new files tonight at the source so we'll see what happens.  This confirms at least that the comparison phase of the backups works.  And it proves that the file copies can work as well.
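A rough way to list which files are "missing" on a backup, sketched in Python. This only checks whether each source file exists at the same relative path under the backup root. It is not how Second Copy does its comparison, and the function name `missing_from_backup` is made up for this example.

```python
import os
from typing import List


def missing_from_backup(source_root: str, backup_root: str) -> List[str]:
    """Return relative paths of files that exist under source_root
    but have no counterpart under backup_root."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        rel_dir = os.path.relpath(dirpath, source_root)
        for name in filenames:
            # normpath turns "./name" into "name" for files at the root.
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            if not os.path.exists(os.path.join(backup_root, rel_path)):
                missing.append(rel_path)
    return sorted(missing)
```

Both roots can be UNC paths or mapped drives. Running this before and after the nightly job would show whether the same files keep being left behind.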
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39279683
I don't think I've ever seen that before... I wonder if it has to do with that "I've been downloaded from the internet" flag on files, not specifically the extension.
0
 
LVL 25

Assisted Solution

by:Fred Marshall
Fred Marshall earned 0 total points
ID: 39285278
I did two things:
I set access on the backup drive to Everyone with all controls.
I copied the "missing" files onto the backup manually.
Now it seems to be working.
I'd like to go back and limit the permissions though .. so I will and see what happens.
0
 
LVL 38

Assisted Solution

by:Aaron Tomosky
Aaron Tomosky earned 500 total points
ID: 39285476
here is what I was talking about, it's an extra data stream attached to files outside of normal permissions:
http://superuser.com/questions/38476/this-file-came-from-another-computer-how-can-i-unblock-all-the-files-in-a
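To check whether a given file carries that flag, something like the following Python sketch could be used. On NTFS the flag lives in an alternate data stream named `Zone.Identifier`, which can be opened by appending `:Zone.Identifier` to the file path. The helper names here are made up for this example, and the sketch only reads the stream. It does not unblock anything.

```python
from typing import Optional

ZONE_STREAM = "Zone.Identifier"


def zone_stream_path(path: str) -> str:
    """Build the NTFS alternate-data-stream path for a file."""
    return f"{path}:{ZONE_STREAM}"


def parse_zone_id(stream_text: str) -> Optional[int]:
    """Pull the ZoneId value out of the stream contents, which look
    like "[ZoneTransfer]" followed by "ZoneId=3"."""
    for line in stream_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "zoneid":
            try:
                return int(value.strip())
            except ValueError:
                return None
    return None


def read_zone_id(path: str) -> Optional[int]:
    """Return the file's ZoneId, or None if it has no such stream
    (or the file system is not NTFS)."""
    try:
        with open(zone_stream_path(path), "r", errors="replace") as stream:
            return parse_zone_id(stream.read())
    except OSError:
        return None
```

A result of 3 is the Internet zone, which is what triggers the "this file came from another computer" prompt. `None` means no flag was found on that file.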
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39287067
Well, the error that's generated is likely different than what I'd been seeing here.

However, I *have* seen cases where files could not be backed up and this may be the answer to that issue.  In that case the error was different.  Usually the "owner" just deletes the files altogether to get past it.  I try to have a backup setup where no errors occur to avoid follow-ups.
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39359438
I have now compared packet captures between successful backups and ones that don't work.

In the ones that fail, I see references to NetBIOS name service.
In the ones that succeed, I see NO such references.

Well, there is no name service between subnets.  All of the addressing here is with UNC.
So, that seems a key difference and must be a clue. But I have no idea why that would happen.
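For comparing two captures along those lines, here is a small Python sketch. It assumes the packets have already been exported from the capture tool as (transport, destination port) pairs. That export step, the `Packet` type, and the function names are assumptions made for this example. The port numbers are the standard ones for NetBIOS and SMB.

```python
from typing import Iterable, Set, Tuple

# (transport, destination port), e.g. ("udp", 137)
Packet = Tuple[str, int]

WELL_KNOWN = {
    ("udp", 137): "NetBIOS name service",
    ("udp", 138): "NetBIOS datagram service",
    ("tcp", 139): "NetBIOS session service (SMB over NetBIOS)",
    ("tcp", 445): "SMB over TCP (direct hosting)",
}


def services_seen(packets: Iterable[Packet]) -> Set[str]:
    """Names of the file-sharing services that appear in a capture."""
    seen = set()
    for transport, port in packets:
        key = (transport.lower(), port)
        if key in WELL_KNOWN:
            seen.add(WELL_KNOWN[key])
    return seen


def only_in_failing(ok_packets: Iterable[Packet],
                    failing_packets: Iterable[Packet]) -> Set[str]:
    """Services present in the failing capture but absent from the good one."""
    return services_seen(failing_packets) - services_seen(ok_packets)
```

If only the failing capture shows traffic to ports 137 or 139, that narrows the question to why that one session is touching NetBIOS at all when the path is addressed by IP.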
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39359454
do you use netbios names in your unc? i.e. \\server1\share2 or do you use dns in your unc like \\server1.company.com\share2
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39359686
No NetBIOS names as there is no site-to-site or subnet-to-subnet name service.
By UNC I meant:
\\[ipaddress]\.......
e.g.
\\10.111.1.213\sharedfolder
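To confirm that every job really uses an IP-based UNC path (so that no name lookup should be needed), the configured paths could be checked with a short Python sketch like this one. The function names are made up for this example, and it only inspects the path string. It does not touch the network.

```python
import ipaddress


def unc_host(path: str) -> str:
    """Return the host part of a UNC path such as \\\\host\\share\\folder."""
    if not path.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {path!r}")
    host = path[2:].split("\\", 1)[0]
    if not host:
        raise ValueError(f"UNC path has no host: {path!r}")
    return host


def host_is_ip_literal(host: str) -> bool:
    """True if the host is a literal IPv4/IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False
```

Any job whose host comes back as a name instead of an address would be a candidate for the NetBIOS name-service traffic seen in the failing captures.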
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39359694
are all the backups still working since you manually copied that "bad" file a while back?
0
 
LVL 25

Author Comment

by:Fred Marshall
ID: 39359722
Mostly.  I have a case of the ScanSnap .pdf files that doesn't work and I'm keeping the situation just that way to be able to work on it.

The user just told me that *all* .pdf ScanSnap files do this.  That at least makes more sense than "some of them".
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 39359748
Could be they are written using a service that runs as the Network Service user or something funny. Back to my thought that it's some type of weird permissions-related thing.
0
 
LVL 25

Accepted Solution

by:Fred Marshall
Fred Marshall earned 0 total points
ID: 39447587
The problem seems to have gone away and there's no rhyme or reason attributable to that change.
0
 
LVL 25

Author Closing Comment

by:Fred Marshall
ID: 39457820
I don't think we know beyond what I was able to do to change the behavior.
0
