the specified network name is no longer available

I am running redundant backups on 3 Windows 7 Pro computers located at 3 different sites over MPLS.  Call them sites 1,2 and 3.  Each computer backs up the same files from the originating computers (located at all 3 sites) in sequence; i.e. there is generally no overlap between backup jobs on any one computer and/or on the MPLS.

The backup computer at Site 2 was recently changed over from XP to Windows 7.
This ONE SITE's backup computer, before Windows 7 and now with it, occasionally generates the error:
"the specified network name is no longer available"
That is, it only happens at Site 2 and I've only seen it happen on backup jobs from Site 1 (Site 3 has but one small backup job so the opportunity to see this is much more limited).

This happens in the middle of a backup and the rest of the message makes it obvious that the individual filename is known.
None of the other backup computers has this kind of error.
Another site, which has to back up the exact same files from the same source computers, has none of these errors.
The site with errors is geographically further away but the MPLS is all on fiber.

All of the computers are accessed using UNC paths by IP address, so no name service is involved at all; i.e. \\[ipaddress]\....

I very much doubt that there are any settings that could be affecting this.  I'm rather left with wondering about hardware - thus the recent swap-out of the computer.  Yet these errors continue to occur on occasion.  

I'm looking for good ways to narrow down the possibilities.
The hardware chain is:
Site 2
- backup computer
- local switch
- main switch
- MPLS router
MPLS fiber system
- at least one switch
Site 1
- MPLS router (which also serves the Site 3 backups without errors)
So, everything is common with Site 3 at this point.....

Not that I suspect the hardware necessarily but where else to look.
And, how to look?
I can run Wireshark on the Site 2 LAN switch ports ........

Asked by: Fred Marshall (Principal)

Fred Marshall (Principal, Author) commented:
The problem seems to have gone away and there's no rhyme or reason attributable to that change.

Aaron Tomosky (SD-WAN Simplified) commented:
I'll take some shots in the dark here:
All layer 2 switches? No routing/VLANs/different subnets between sites? Maybe the ARP tables got full.

Are you using CrashPlan or some other software for this sort of thing?

Fred Marshall (Principal, Author) commented:
There are no VLANs as such.

There are separate subnets at each site with routing to connect them.  No problems there that I can see.  All of the connectivity is working except for these transient occurrences that likely have nothing at all to do with routing.

Not using CrashPlan.  Using Carbonite *separately* so this is not part of the solution.
This is a versioning backup system based on Second Copy where each site maintains separate backups of everything for geographical dispersion of backups.

My interest is in the networking / file sharing failure itself.
 
Fred Marshall (Principal, Author) commented:
All of the switches are Layer 2 switches unless there's anything different buried in the MPLS links.

Aaron Tomosky (SD-WAN Simplified) commented:
I've had to do this for Win 7 before. It's like the cache fills up and new connections get denied for no good reason.  A reboot would always fix it, but this change made it always work:
http://dbastas.blogspot.com/2012/05/optimize-windows-xp-and-windows-7-for.html?m=1

Steve commented:
It may be worth checking into the known issues with SMB2 since the new version came out.

XP uses SMB1 and Windows 7 uses SMB2. The two are known to cause conflicts, so it's worth considering disabling SMB2 across the board to see if it helps.

http://support.microsoft.com/kb/2696547


If that doesn't work I'd check the event log on sending and receiving PCs, and also leave some pings running so you can rule out connection issues/drops.

Fred Marshall (Principal, Author) commented:
aarontomosky: those regedits had already been made on all the computers - sending and receiving.

totallytonto: I found a hotfix for Windows 7 at http://support.microsoft.com/kb/2792026/en-us to get around freeze-ups, which I can't confirm are happening, but they could be.  But the same thing was happening under Windows XP on the receiving end.  So I wonder....

Disabling SMB2 is not recommended except for debugging...   ???

Aaron Tomosky (SD-WAN Simplified) commented:
Kinda scary thought, but this whole thing may be resolved when the other clients are also running win 7... or it could just make it worse. Any way you can think of testing this out while keeping a way to revert?

Fred Marshall (Principal, Author) commented:
Well, at this stage *most* of the computers are running Windows 7... but I can't guarantee the correlation at the moment.

"While keeping a way to revert..."?  Do you mean turning off SMB2?
I mentioned XP because it doesn't run SMB2 as far as I know....

Aaron Tomosky (SD-WAN Simplified) commented:
I meant maybe setting up a win7 box to replace an xp box while keeping the xp box around in case you have to switch back, but I don't think that sounds like an option since these are upgrades.

Fred Marshall (Principal, Author) commented:
Well, actually the XP boxes remain on hand.  The one backup "receiver" was changed out from an XP box to a new Windows 7 box with no improvement in this situation.  So, I don't think that going back is likely to help.

Aaron Tomosky (SD-WAN Simplified) commented:
what are you using to do the backups? robocopy?

Fred Marshall (Principal, Author) commented:
This is a versioning backup system based on Second Copy

Steve commented:
It's the XP machines that suffer when you have a mixture of XP and 7 using the same shares, due to SMB2 and how it accesses the files.

Once you get rid of all the XP machines it should stop being an issue, but you may have to disable SMB2 until then.

Fred Marshall (Principal, Author) commented:
This has boiled down now to the following:

- all of the machines/sites doing backup with Second Copy are Windows 7.
- only ONE folder on an XP machine fails to back up with the reported error, and this is at another site.
So, out of around 20 backup jobs repeated at 3 sites for a total of 60, there is but one job out of the 60 from one source at one remote site backup that is doing this now.
And, now it seems consistent - although I somewhat doubt that it will always fail based on past experience with it.
The same source going to local and another remote backup works fine.
The Event Viewers show nothing.
Pings work consistently.

Here is the topology:

Source Computer<>switch<>switch<>switch<>MPLS router<>MPLS router<>switch<>switch<>Backup computer.

The topology is the same for the other remote backup that works except the
MPLS router<>switch<>switch<>Backup computer
at the end are different devices.

These observations suggest that all is fine at the source and all is fine at the backup device otherwise.

And, to repeat, 19 of 20 backups at the "failing" site still work consistently.  It's just this one.
And, to repeat, the same "failing" source/folder works on the other backups.

I might add that the failing source folder is relatively large but not the largest.

Steve commented:
have you left a ping running during the backup to see if a brief network drop causes failure?
otherwise I'm not sure what else to suggest other than checking the event viewer for any errors around the time of the failure.

Aaron Tomosky (SD-WAN Simplified) commented:
Do these run on a schedule? So it may not be the folder specifically, but the time it runs? Look at what else is happening at that time.

Fred Marshall (Principal, Author) commented:
Yes, it runs on a schedule.  In this case it is launched at 4:30 a.m.  I don't see anything else running at the same time.  The backups are staggered in time to not overlap.

Aaron Tomosky (SD-WAN Simplified) commented:
so my question is: is something wrong with that folder, or is something happening around 4:30 that will mess up any backup happening at that time...

Fred Marshall (Principal, Author) commented:
If there were something wrong with the folder then why might the other 2 backups of it work fine?

There is nothing that I know of happening at 4:30 that would mess it up.
And, it fails if I start it manually at any time.

Aaron Tomosky (SD-WAN Simplified) commented:
Here's a thought: move the schedule around and see if whatever runs in that timeslot fails or if the pattern changes

Fred Marshall (Principal, Author) commented:
Isn't that what the manually-launched backups do?

Steve commented:
I've seen something similar to this and it was caused by network issues (packet loss/delay)

Did you try leaving the constant ping running to see if any blips occurred around the time it failed?

Aaron Tomosky (SD-WAN Simplified) commented:
Is the VPN expiring and reconnecting at that time?

Fred Marshall (Principal, Author) commented:
There is no VPN.  Simple, bare MPLS.

Aaron Tomosky (SD-WAN Simplified) commented:
Can you run pings overnight (at least around 4:30) from each site to both other sites, and to something else on the Internet somewhere? Let's see exactly which links go out.

Fred Marshall (Principal, Author) commented:
Yes, I have a "probe" program that will do that (ping continuously) and log "n" misses with a trace route.  This way a contiguous set of misses can be seen and the results of a trace route at that time can also be seen.

But, again, this failure is not connected with 4:30 because it occurs *whenever* I run the backup between these two machines and on this one source folder.  Otherwise the source folder works fine with the same backup setup on 2 other backup machines.

It seems strange that there would be an outage on this one site while doing one particular backup out of 20 backups.

I have mapped the source folder so the path length is shorter.  But path length issues are normally reported as such and aren't mysterious.  And, that's not what I'm seeing here.
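
For anyone who wants to reproduce the kind of probe described above, here is a minimal Python sketch. It is not the probe program used here; the function names, the default threshold, the log file name, and the reliance on the Windows `ping -n 1 -w` and `tracert -d` commands are all assumptions for illustration.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_ms=1000):
    """Send one echo request (Windows ping syntax assumed); True if a reply came back."""
    result = subprocess.run(
        ["ping", "-n", "1", "-w", str(timeout_ms), host],
        capture_output=True, text=True,
    )
    # Windows ping can exit 0 on "destination unreachable", so also look for a TTL.
    return result.returncode == 0 and "TTL=" in result.stdout


def miss_runs(results, threshold):
    """Return (start_index, length) for each run of consecutive misses >= threshold.

    results is an iterable of booleans, True meaning the ping was answered.
    """
    runs, start, count = [], None, 0
    for i, ok in enumerate(results):
        count = i + 1
        if ok:
            if start is not None and i - start >= threshold:
                runs.append((start, i - start))
            start = None
        elif start is None:
            start = i
    if start is not None and count - start >= threshold:
        runs.append((start, count - start))
    return runs


def monitor(host, threshold=3, interval_s=1.0, log_path="probe.log"):
    """Ping forever; after `threshold` misses in a row, log a timestamped trace route."""
    misses = 0
    while True:
        if ping_once(host):
            misses = 0
        else:
            misses += 1
            if misses == threshold:
                trace = subprocess.run(
                    ["tracert", "-d", host], capture_output=True, text=True
                ).stdout
                with open(log_path, "a") as log:
                    log.write(f"{datetime.now().isoformat()} {host}: "
                              f"{misses} consecutive misses\n{trace}\n")
        time.sleep(interval_s)
```

Running `monitor("10.111.1.213")` on the backup computer during a backup would give timestamps that can be lined up against the backup log, and `miss_runs` can be applied afterwards to any recorded list of hit/miss results.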

Aaron Tomosky (SD-WAN Simplified) commented:
Sorry, I thought you said the manual backups for this folder succeed; I see I read that wrong.
I think I may still be unclear on where the files originate and where they go:
site1 -> site2 all 20 backups work
site2 -> site3 all 20 backups work
site3 -> site1 19 backups work (the one fails no matter when it's run either on a schedule or manually)
is there a site1->site3?

so this problem folder, where does it originate?

Fred Marshall (Principal, Author) commented:
site1 -> site1 all 20 backups work
site1 -> site2 all 20 backups work
site1 -> site3 19 backups work (the one fails no matter when it's run)

There are also backups from site 2 and from site 3 but those aren't an issue.  A good number of the backups come from site 1.

I realized that the *same* filename was showing in all the errors.  So, I manually copied that file onto the backup.  So, now when the comparison is made, that file need not be copied.

I notice that manually copying the files causes a Windows 7 message about not trusting the file type (which, at least in the failing case, is *.FIM). So, I had to interact with it in order to proceed.  I can't imagine why this pops up, as the Windows firewall is set to share files across the scope of all the subnets.

I manually copied all of the missing files to the backup and now the backup works (with no new files copied).  There will be new files tonight at the source so we'll see what happens.  This confirms at least that the comparison phase of the backups works.  And it proves that the file copies can work as well.
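
The comparison step can also be approximated outside the backup tool, to list which source files are still missing on the backup before the next run. The Python sketch below is not how Second Copy works internally; the function name and the size-only comparison are assumptions made for illustration.

```python
import os


def missing_or_different(source_root, backup_root):
    """Return sorted relative paths of files under source_root that are
    absent from backup_root or whose size differs there."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(backup_root, rel)
            # Size-only check: cheap over a WAN link, but it will not catch
            # files that changed without changing length.
            if not os.path.isfile(dst) or os.path.getsize(dst) != os.path.getsize(src):
                problems.append(rel)
    return sorted(problems)
```

For example, `missing_or_different(r"\\10.111.1.213\sharedfolder", r"D:\backup\sharedfolder")` would list the files that still need copying; the `D:\backup` destination is a hypothetical path.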

Aaron Tomosky (SD-WAN Simplified) commented:
I don't think I've ever seen that before... I wonder if it has to do with that "I've been downloaded from the internet" flag on files, not specifically the extension.

Fred Marshall (Principal, Author) commented:
I did two things:
I set access on the backup drive to Everyone with all controls.
I copied the "missing" files onto the backup manually.
Now it seems to be working.
I'd like to go back and limit the permissions though .. so I will and see what happens.

Aaron Tomosky (SD-WAN Simplified) commented:
here is what I was talking about, it's an extra data stream attached to files outside of normal permissions:
http://superuser.com/questions/38476/this-file-came-from-another-computer-how-can-i-unblock-all-the-files-in-a
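
One way to check whether a given file carries that flag is to read its `Zone.Identifier` alternate data stream, which on NTFS holds a small INI-style block with a `ZoneId` value (3 means the Internet zone). The Python sketch below is only an illustration; the function names are made up, and `read_zone_id` only works on Windows with NTFS volumes.

```python
import configparser


def parse_zone_identifier(text):
    """Return the ZoneId integer from Zone.Identifier stream text, or None if absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if parser.has_option("ZoneTransfer", "ZoneId"):
        return parser.getint("ZoneTransfer", "ZoneId")
    return None


def read_zone_id(path):
    """Read the Zone.Identifier stream of a file (Windows/NTFS only).

    Returns None when the file has no such stream or the stream cannot be opened.
    """
    try:
        with open(path + ":Zone.Identifier", "r") as stream:
            return parse_zone_identifier(stream.read())
    except OSError:
        return None
```

If `read_zone_id` returns 3 for the files that refuse to copy and `None` for the ones that copy fine, that would support the idea that the flag, rather than the extension, is involved.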

Fred Marshall (Principal, Author) commented:
Well, the error that's generated is likely different than what I'd been seeing here.

However, I *have* seen cases where files could not be backed up and this may be the answer to that issue.  In that case the error was different.  Usually the "owner" just deletes the files altogether to get past it.  I try to have a backup setup where no errors occur to avoid follow-ups.

Fred Marshall (Principal, Author) commented:
I have now compared packet captures between successful backups and ones that don't work.

In the ones that fail, I see references to NetBIOS name service.
In the ones that succeed, I see NO such references.

Well, there is no name service between subnets.  All of the addressing here is with UNC.
So, that seems a key difference and must be a clue. But I have no idea why that would happen.
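
To make that comparison between captures repeatable, the packets can be tallied by the well-known file-sharing ports: UDP 137 is the NetBIOS name service, UDP 138 the datagram service, TCP 139 the NetBIOS session service, and TCP 445 is SMB directly over TCP. The Python sketch below assumes the capture has already been reduced to `(protocol, source port, destination port)` tuples, for example from a Wireshark export; that input format and the function name are assumptions.

```python
from collections import Counter

# Well-known ports used by Windows file sharing.
PORT_LABELS = {
    ("udp", 137): "NetBIOS name service",
    ("udp", 138): "NetBIOS datagram service",
    ("tcp", 139): "NetBIOS session service (SMB over NetBIOS)",
    ("tcp", 445): "SMB directly over TCP",
}


def summarize_ports(packets):
    """Count packets per file-sharing service.

    packets is an iterable of (protocol, src_port, dst_port) tuples;
    a packet is counted once, under the first matching port.
    """
    counts = Counter()
    for proto, src_port, dst_port in packets:
        for port in (src_port, dst_port):
            label = PORT_LABELS.get((proto.lower(), port))
            if label:
                counts[label] += 1
                break
    return counts
```

Running this on a failing and a succeeding capture side by side shows at a glance whether port 137 or 139 traffic appears only in the failing case, which might suggest the client is falling back to NetBIOS over TCP/IP instead of staying on port 445.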

Aaron Tomosky (SD-WAN Simplified) commented:
Do you use NetBIOS names in your UNC paths, i.e. \\server1\share2, or do you use DNS names like \\server1.company.com\share2?

Fred Marshall (Principal, Author) commented:
No NetBIOS names, as there is no site-to-site or subnet-to-subnet name service.
By UNC I meant:
\\[ipaddress]\.......
e.g.
\\10.111.1.213\sharedfolder

Aaron Tomosky (SD-WAN Simplified) commented:
are all the backups still working since you manually copied that "bad" file a while back?

Fred Marshall (Principal, Author) commented:
Mostly.  I have a case of the ScanSnap .pdf files that doesn't work and I'm keeping the situation just that way to be able to work on it.

The user just told me that *all* .pdf ScanSnap files do this.  That at least makes more sense than "some of them".

Aaron Tomosky (SD-WAN Simplified) commented:
Could be they are written using a service that runs as the Network Service user or something funny. Back to my thought that it's some type of weird permissions-related thing.

Fred Marshall (Principal, Author) commented:
I don't think we know beyond what I was able to do to change the behavior.

Question has a verified solution.