Solved

Catastrophic Failure on Backup Exec 12.5

Posted on 2013-01-14
Last Modified: 2013-03-10
Good day, everyone. We run Backup Exec for all of our backup needs, and we have an issue: we constantly receive 0x8000ffff - Catastrophic failure errors when backing up our SQL server. The failures are random rather than consistent.

The error states:


Backup- ms7.ms.ads - AOFO: Initialization failure on: "\\ms7.ms.ads\Shadow Copy Components". Advanced Open File Option used: Microsoft Volume Shadow Copy Service (VSS).
V-79-10000-11253 - Microsoft Volume Shadow Copy Services (VSS) snapshot provider returned the error: "Catastrophic failure". To make sure that the VSS is not disabled and can be started, click Control Panel, and then click Administrative Tools. Open the Services, and start Volume Shadow Copy. Check the Windows Event Viewer for details.
 - AOFO: Initialization failure on: "\\ms7.ms.ads\System State". Advanced Open File Option used: Microsoft Volume Shadow Copy Service (VSS).
V-79-10000-11253 - Microsoft Volume Shadow Copy Services (VSS) snapshot provider returned the error: "Catastrophic failure". To make sure that the VSS is not disabled and can be started, click Control Panel, and then click Administrative Tools. Open the Services, and start Volume Shadow Copy. Check the Windows Event Viewer for details.
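
For readers wondering what the code itself says: 0x8000FFFF is the generic HRESULT E_UNEXPECTED, whose standard message text is "Catastrophic failure", so the number alone does not point at a specific cause. A minimal sketch of how an HRESULT splits into its bit fields (the function name `decode_hresult` is made up for illustration):

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into its standard bit fields."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),        # bit 31: severity (1 = failure)
        "facility": (value >> 16) & 0x7FF,   # bits 16-26: facility code
        "code": value & 0xFFFF,              # bits 0-15: error code
    }


# 0x8000FFFF is E_UNEXPECTED: a failure in the generic (null) facility.
print(decode_hresult(0x8000FFFF))
```

Facility 0 is the generic facility, which is why the details have to come from the event logs rather than from the error code.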

Whether it be backing up Full, Differential, or logs...it fails at times.

Any idea why this is happening?
Question by:mig1980
 
LVL 52

Assisted Solution

by:Manpreet SIngh Khatra
Manpreet SIngh Khatra earned 36 total points
ID: 38777948
Please ensure the VSS service is started in the Services console.

Run the command below in CMD to ensure all writers are in a Stable state:
vssadmin list writers

- Rancy
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38777956
A couple of things could be wrong.  Run through the steps here:

http://www.symantec.com/business/support/index?page=content&id=TECH42969

Jim.
 

Author Comment

by:mig1980
ID: 38788284
The VSS service is started and all writers show as stable. I had already checked that. It happens very sporadically (and not always on the same job), but it is always related to the SQL backup jobs on the same SQL server.

Could a SQL job running on the SQL server cause this issue?

In looking at the link JDettman provided, the only thing I could not find is the vsp.sys driver under System Information - Software Environment - System Drivers. Everything else checks out, but I do not understand what this is referring to: Verify the Backup to Disk folders being used are not selected for backup.

Also, Advanced Open File is not checked on any of the three backup jobs that pertain to that SQL server. I initially had it checked but unchecked it as a test before I posted this question. I'm not sure what the advantage of having it checked is.
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38788539
<<Verify the Backup to Disk folders being used are not selected for backup.>>

 It's saying that if you're backing up to disk instead of tape, make sure that the folders you're backing up to are not included in the backup selection.

Jim.
 

Author Comment

by:mig1980
ID: 38789061
Oh, gotcha. No, I am backing up to tape.

Any other ideas here?
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38792604
I would re-install the remote agent on the server making sure the open file option is checked.

If that doesn't do it, then that leaves a network issue.  Check the event logs on the backup server and the SQL server and see if there is anything going on when the backup fails.

Jim.
 

Author Comment

by:mig1980
ID: 38793943
What is the difference between having the open file option checked and not? What does that do?

Also, could a SQL job running on the SQL server at the same time as the backup cause this issue?
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38805395
Sorry haven't gotten back to this - have been a bit backed up work wise.

<<What is the difference between having the open file option checked and not? What does that do?>>

 The open file option allows the software to back up a file even if it is currently in use by an application.  Normally, if a file is locked, you can't read it, which is not good for the backup: it would need to skip the file.

 If an app was running all the time (like a web server), you'd never get a backup.

 The open file option hooks into the disk I/O and waits for a quiet point or suspends I/O to the file, then takes a snapshot of the file.  I/O is then resumed.  In that way, it can get a backup of the file at a given point in time without having the file needing to be closed.

 Symantec used to do this on its own, but Microsoft added the Volume Shadow Copy Service (VSS) to Windows, which does this as well and which BE can use.  This is how you're configured.

<<Also, could an sql job running on the sql server at the same time as the backup cause this issue?>>

 No.  VSS reaches out to components called "writers", which actually do the work of talking to something like SQL or Exchange and grabbing the data.  If anything, it would be the opposite: the writer might cause errors for database users.

<<Check the Windows Event Viewer for details.>>

 I would double check the event logs when this occurs.  There should be a clue there as to why it's failing.

 I'm guessing though because they are random you have some kind of a network issue.

 When did the problem start and what has changed in that timeframe?

Jim.
 

Author Comment

by:mig1980
ID: 38806504
The problem started after our busy time of the year began (roughly 2 months ago). It was backing up fine prior to that, and it is very random. I will take a look at the Windows Event Viewer on the SQL server and see if there is anything out of the ordinary.
 

Author Comment

by:mig1980
ID: 38844404
After taking a look at the Event Viewer on the SQL server, I don't see any errors around the time that the backup runs and the VSS catastrophic failures occur. All I see are informational events in the Application section of the Event Viewer like this:


Event Type:      Information
Event Source:      MSSQLSERVER
Event Category:      (6)
Event ID:      18265
Date:            2/1/2013
Time:            8:11:18 AM
User:            DCDOMAIN\besa
Computer:      MS7
Description:
Log was backed up. Database: SEGDB12, creation date(time): 2010/05/24(11:18:12), first LSN: 2345:445:1, last LSN: 2345:447:1, number of dump devices: 1, device information: (FILE=1, TYPE=VIRTUAL_DEVICE: {'SEGDB12_00__2a36d06e_7913_433e_8be6_d0228b9e9d7a_'}). This is an informational message only. No user action is required.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 59 47 00 00 0a 00 00 00   YG......
0008: 06 00 00 00 4d 00 53 00   ....M.S.
0010: 49 00 4e 00 37 00 00 00   ....7...
0018: 07 00 00 00 6d 00 61 00   ....m.a.
0020: 73 00 74 00 65 00 72 00   s.t.e.r.
0028: 00 00                     ..      

Are there any other ideas on what to check to get to the bottom of this?
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38845084
At a command prompt, issue the following command:

VSSADMIN LIST WRITERS

and paste the output here.

Jim.
 

Author Comment

by:mig1980
ID: 38845118
Here you go:


vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001 Microsoft Corp.

Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   Writer Instance Id: {5eeff731-cd9f-4a0d-992c-ea99adac0de2}
   State: [1] Stable
   Last error: No error

Writer name: 'MSDEWriter'
   Writer Id: {f8544ac1-0611-4fa5-b04b-f7ee00b03277}
   Writer Instance Id: {dd2b5a5c-33fc-4147-ab53-2c9d3c823ca3}
   State: [1] Stable
   Last error: No error

Writer name: 'Event Log Writer'
   Writer Id: {eee8c692-67ed-4250-8d86-390603070d00}
   Writer Instance Id: {449b6076-3334-40a7-8ff5-fbf7ca351c6d}
   State: [1] Stable
   Last error: No error

Writer name: 'Registry Writer'
   Writer Id: {afbab4a2-367d-4d15-a586-71dbb18f8485}
   Writer Instance Id: {da22869d-b16b-4b56-b47f-364d7441f88f}
   State: [1] Stable
   Last error: No error

Writer name: 'COM+ REGDB Writer'
   Writer Id: {542da469-d3e1-473c-9f4f-7847f01fc64f}
   Writer Instance Id: {71b92d71-f4bc-4828-a076-a488484f274d}
   State: [1] Stable
   Last error: No error

Writer name: 'TermServLicensing'
   Writer Id: {5382579c-98df-47a7-ac6c-98a6d7106e09}
   Writer Instance Id: {67e37aa1-4a61-488f-ad0d-0964e15210eb}
   State: [1] Stable
   Last error: No error

Writer name: 'BITS Writer'
   Writer Id: {4969d978-be47-48b0-b100-f328f07ac1e0}
   Writer Instance Id: {a16c514c-ad98-4ebc-b7eb-f8f0381c5e42}
   State: [1] Stable
   Last error: No error

Writer name: 'IIS Metabase Writer'
   Writer Id: {59b1f0cf-90ef-465f-9609-6ca8b2938366}
   State: [1] Stable
   Last error: No error

Writer name: 'WMI Writer'
   Writer Id: {a6ad56c2-b509-4e6c-bb19-49d8f43532f0}
   Writer Instance Id: {309e2d02-62b0-46fe-b21c-060cbd2952ac}
   State: [1] Stable
   Last error: No error
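
With this many writers it is easy to miss one bad entry, so the check can be scripted. Below is a rough sketch that parses text in the format shown above and lists any writer that is not "[1] Stable" with "No error". The function names are made up for illustration, and the third entry in the sample ('Example Writer') is a made-up failing entry, not something from my server:

```python
import re

# Two entries copied from the output above, plus one HYPOTHETICAL failing
# entry ('Example Writer') added only to show what the check would catch.
SAMPLE_OUTPUT = """\
Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   State: [1] Stable
   Last error: No error

Writer name: 'MSDEWriter'
   Writer Id: {f8544ac1-0611-4fa5-b04b-f7ee00b03277}
   State: [1] Stable
   Last error: No error

Writer name: 'Example Writer'
   State: [8] Failed
   Last error: Retryable error
"""


def parse_writers(text):
    """Return one dict per writer with its name, state, and last error."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name: '(.+)'$", line)
        if match:
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy_writers(writers):
    """Names of writers that are not stable or that report an error."""
    return [
        w["name"]
        for w in writers
        if w["state"] != "[1] Stable" or w["last_error"] != "No error"
    ]


print(unhealthy_writers(parse_writers(SAMPLE_OUTPUT)))
```

To run it against a live server, the captured text of `vssadmin list writers` would be passed to `parse_writers` instead of the sample string.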
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38845191
Well that all looks fine.  Two things:

a. What's in the event logs for the OS when one of these errors occurs?

b. In Backup Exec, if you run a credentials check on the backup selection list, does it all pass?

 Also, in the meantime, let's see if we can pin this down some more.  As a test, unselect the system state from the backup selection list and see if it runs without error that way.  If it does, then at least we've narrowed it down to the one VSS writer.  Do this after b above.

Jim.
 

Author Comment

by:mig1980
ID: 38845251
a...are you referring to the System events logs on the SQL Server? If yes, nothing shows.

b...I'm not sure how to perform this?
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38845331
<<a...are you referring to the System events logs on the SQL Server? If yes, nothing shows.>>

 Yes and there should be something there.  Look in the application log.

<<b. In Backup Exec, if you run a credentials check on the backup select list, does it all pass?>>

1.  On the menu, Edit, Manage selection lists.  

2.  Choose the selection list being used by the backup that's failing.

3.  When the dialog opens, select "Resource Credentials" on the left.

4.  Click the "test all" button on the right.

 This will verify that BE has the required access to back up each of the resources.

 The Test Results column should all be "Successful".  Pay particular attention to "System State".

Jim.
 

Author Comment

by:mig1980
ID: 38845697
The Application event log only seems to be recording the SQLServer Success and Failure Audits as nothing else appears no matter how far back I go. I see a few Information logs appear after each database is finished backing up similar to the one I posted in ID: 38844404 above.

I tested the Resource Credentials and all came back successful (system state and shadow copy components stated that server credentials were used).

I ran the log backup with the System State and I received the same error. I then ran it without backing up the system state and it completed fine, with the exception of the prompt I explain below, which appears after every backup job. I also want to mention that I am backing up Full, Differential, and log, and all three jobs use the same selection list.

I also notice that if I run a backup while logged in to the server, I receive a prompt stating "IDR Full Backup Success", and it recommends that I rerun the Intelligent Disaster Recovery Preparation Wizard and select the "Copy - Disaster recovery information (.dr) files" option to back up my disaster recovery information (.dr) file. I don't think I even chose this option??
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38851071
<<I don't think I even choose this option?? >>

 IDR lets you recover a system after a disk failure and was turned on at some point.  Whether you actually want to use it or not is hard to say.

 Typically, one might not bother with it on a server that has RAID set up, as it would be a very rare occurrence to lose a volume, or if it is a virtualized server.  In those cases, it just doesn't make sense to run it.

 It may, however, be where your error is coming from.  If you didn't use the wizard to fully configure it, then that may be causing the issue.

 I'm surprised it said "success" on the IDR with system state turned off in the selection list.  Part of getting a valid IDR is making sure that the system state is backed up.

 I've never used IDR myself and also don't know your environment, so I'm hesitant to offer anything past this point.

 I would say that for a backup or two, or possibly just for testing, you might want to try turning IDR off, leaving system state turned on, and seeing if you get an error-free backup.

 If so, then turn IDR back on (assuming you want to) and run through the wizard to make sure it's configured properly.

 If after that you get an error-free backup, you're set.  If not, then we know what's causing the issue.

Jim.
 

Author Comment

by:mig1980
ID: 38852123
I went into the Properties of the backup jobs in question, but I don't see the option to turn off IDR. Any idea how to turn it off?
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38852219
The only way to turn it off I believe is to remove it.

Go to Tools>serial numbers and installations, then take out the IDR license, and complete the wizard.

Make sure you make a note of the serial # before removing it.

Jim.
 

Author Comment

by:mig1980
ID: 38852275
There is no "Serial numbers and installations" option. Attached is a screenshot of what is available under options.
BackupExecOptions.jpg
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38852430
"Install options and license keys on this media server."

Jim.
 

Author Comment

by:mig1980
ID: 38852473
So I tried that but attached are the only options I see to remove, etc. IDR is not on the list.
BackupExecOptions2.jpg
 

Author Comment

by:mig1980
ID: 38861304
So I took a look at the SQL Server error logs, and they show errors around the same time the backup jobs are running. Below are the errors:

[165] ODBC Error: 0, Unable to complete login process due to delay in opening server connection [SQLSTATE 08001]

[382] Logon to server '(local)' failed (ConnUpdateStartExecutionDate)

[382] Logon to server '(local)' failed (SaveAllSchedules)

These errors are occurring almost at every backup time. Any ideas?
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38861420
<<So I tried that but attached are the only options I see to remove, etc. IDR is not on the list. >>

 You got me on that.   As far as I know that's the only place it shows up.  Might be time for a call to Symantec.

<<So I took a look at the SQL Server error logs and it is showing errors around the same time of the backup jobs are running. Below are the errors it shows:>>

 These could be due to a lot of different things and are hard to diagnose.  Basically, it means that the software requested a connection and, because it took so long, the login to SQL failed.

 That can mean anything from a network problem, to SQL Server being overloaded, to something as simple as trying to hit the wrong server because of DNS.

 I think to make further progress, you should ignore those for the moment, call Symantec, and find out how IDR gets turned off in 12.5.  See if that clears things up.

 Maybe they'll have some input on what started all this as well.

Jim.
 

Assisted Solution

by:mig1980
mig1980 earned 0 total points
ID: 38861793
I ended up figuring out how to uninstall IDR from Backup Exec. The article that describes the steps is here: http://www.symantec.com/business/support/index?page=content&id=TECH64587

I ran a backup job manually (SQL server log backup) and it did finish without any errors. I will review my schedules for the next 24 hours and see how it goes.
 

Author Comment

by:mig1980
ID: 38864099
Well, I reviewed my backups for the SQL Server and I am still getting the Catastrophic Failure errors on Backup Exec and the errors mentioned above on the SQL Server. Any other ideas?
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38864453
<<I ended up figuring out how to uninstall IDR from Backup Exec. The article that describes the steps is here: http://www.symantec.com/business/support/index?page=content&id=TECH64587>>

 So they made it part of the program install.  That's interesting.  In the past, everything in BE has always been done through the Tools menu.

<<Well, I reviewed my backups for the SQL Server and I am still getting the Catastrophic Failure errors on Backup Exec and the errors mentioned above on the SQL Server. Any other ideas? >>

Just to double-check, do a DNS lookup from the server doing the backup to the SQL Server that has the issue.  Make sure DNS is pointing to the correct IP address.
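
If you want to repeat that check on a schedule rather than by hand, here is a minimal sketch using Python's standard `socket` module. The helper name `resolves_to` and the address 10.0.0.5 are placeholders for illustration, not values from this thread:

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """Return (matches, addresses) for the IPv4 addresses a name resolves to.

    `resolver` defaults to the system resolver; it can be swapped out so the
    logic can be exercised without touching the network.
    """
    try:
        _name, _aliases, addresses = resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses.
        return False, []
    return expected_ip in addresses, list(addresses)


# Example call from the backup server (10.0.0.5 is a placeholder address):
# ok, found = resolves_to("ms7.ms.ads", "10.0.0.5")
# print(ok, found)
```

This only confirms forward resolution from the machine it runs on; it says nothing about how long the lookup or the later SQL login takes.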

Jim.



 

Author Comment

by:mig1980
ID: 38865900
I did an nslookup and it does resolve to the correct IP address for the SQL  Server from the Backup Exec Server.
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 464 total points
ID: 38868209
Is your Windows 2003 server up to date on service packs?   There was an update to VSS on Windows 2003 server for timeouts.  Anything past SP1 should have the fixes in.

But that aside, the problem appears to be SQL Server getting bogged down rather than a problem with the writers flushing to the shadow copy.

I can't find any documentation, though, on how to increase the connection timeout for the VSS writer.  I'll see if I can find that.

 In the meantime, make sure the server is patched to at least SP1.

Jim.
 

Author Comment

by:mig1980
ID: 38868910
Yes, Windows 2003 server has Service Pack 2. So, this issue could be correlated to other SQL server jobs being performed at the same time? The reason I ask is because we have multiple jobs running at any given point in time throughout the day.
 
LVL 57
ID: 38869347
<<Yes, Windows 2003 server has Service Pack 2. So, this issue could be correlated to other SQL server jobs being performed at the same time? >>

 If they're on that server, yes.  What the error is saying is that SQL is too busy to complete the login within the allotted connection timeout.

 This causes the VSS writer to signal an error to the backup.
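
 One rough way to see whether the box is slow to accept connections around backup time is to time a plain TCP connect to the SQL port.  This is only a sketch: it measures the TCP handshake, not the SQL login itself, the function name is made up, and port 1433 is an assumption (it is the default for a default instance, but yours may differ).

```python
import socket
import time


def time_tcp_connect(host, port, timeout=5.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return None


# Example call (host name from this thread, port assumed to be 1433):
# elapsed = time_tcp_connect("ms7.ms.ads", 1433)
# print("connect failed" if elapsed is None else f"{elapsed:.3f} s")
```

 Logging that number every minute or so across a backup window would at least show whether connection delays line up with the failures.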

Jim.
 

Author Comment

by:mig1980
ID: 38869635
Is there a way to allow either a longer timeout or allow more memory for this function?
 
LVL 57
ID: 38873601
<<Is there a way to allow either a longer timeout or allow more memory for this function? >>

 I haven't found an actual setting anywhere as yet.  My fear is it will be something like an  undocumented registry key.  But I haven't had time to dig.

Jim.
 

Author Comment

by:mig1980
ID: 38910593
Following up to see if there is any further assistance in this.
 
LVL 57
ID: 38911427
I've looked, but have found nothing to date on getting the timeout changed.

One thing we should double-check, though: use SQL Server Configuration Manager to see what order the client has the protocols listed in (TCP/IP, named pipes, etc.).  Then check SQL Server.  If they are in a different order, change the client to match the server.

That ensures that the client won't waste time with SQL trying to get a connection with the right protocol.

Jim.
 

Author Comment

by:mig1980
ID: 38911486
The client being the backup server?
 
LVL 57
ID: 38911567
<<The client being the backup server? >>

 Yes.  The SQL Client configuration should match the SQL Network configuration on the SQL server.

Jim.
 

Author Comment

by:mig1980
ID: 38911653
OK, so I accessed my SQL Server and this information is only from my SQL server:

Protocols for MSSQLSERVER (under SQL Server 2005 Network Configuration) lists Shared Memory (enabled), Named Pipes (disabled), TCP/IP (enabled), VIA (disabled).

Client Protocols (under SQL Native Client Configuration) lists Shared Memory (enabled), TCP/IP (enabled), Named Pipes (enabled) , VIA (disabled).

All of these are from top to bottom. Is this what you were referring to or are you also referring to the configurations of the SQL Server Configuration Manager on the backup Exec server?
 
LVL 57
ID: 38911686
<<All of these are from top to bottom. Is this what you were referring to or are you also referring to the configurations of the SQL Server Configuration Manager on the backup Exec server? >>

  Yes.   But with what you listed and the current enabled/disabled settings, change the order on the server side, putting TCP/IP in front of Named Pipes (the order number will be 2 for TCP/IP and 3 for Named Pipes).

  Right-click a protocol and select Order to change it.  This will bring up a dialog box.  Select TCP/IP and click the up arrow.

  This change won't impact anything on the client side (the Backup Exec server) if by chance it talks to another SQL Server box using named pipes, and it won't impact anything on the server side since named pipes is disabled there anyway.

 The orders will then match between the two.
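
 To make the comparison concrete, here is a small sketch that takes the two lists exactly as posted above (top to bottom, with their enabled flags) and shows where they differ and which protocols a client could actually use, in the order it would try them.  The function names are made up for illustration.

```python
# Protocol lists as posted above, top to bottom: (name, enabled).
server = [("Shared Memory", True), ("Named Pipes", False),
          ("TCP/IP", True), ("VIA", False)]
client = [("Shared Memory", True), ("TCP/IP", True),
          ("Named Pipes", True), ("VIA", False)]


def listed_order(protocols):
    """Protocol names in the order they are listed, ignoring enabled flags."""
    return [name for name, _enabled in protocols]


def usable_in_client_order(client_protocols, server_protocols):
    """Protocols enabled on both sides, in the order the client lists them."""
    server_enabled = {name for name, enabled in server_protocols if enabled}
    return [name for name, enabled in client_protocols
            if enabled and name in server_enabled]


print(listed_order(server) == listed_order(client))   # do the lists match?
print(usable_in_client_order(client, server))
```

 Keep in mind that Shared Memory only applies to connections made on the SQL box itself, so from the backup server TCP/IP is the one that matters here.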

Jim.
 

Author Comment

by:mig1980
ID: 38911937
I notice that the order is default for MS SQL  server. I looked at another test SQL Server we have and the order is the same. Should I still change it?
 
LVL 57
ID: 38911991
<<I notice that the order is default for MS SQL  server. I looked at another test SQL Server we have and the order is the same. Should I still change it? >>

 Yes.  But normally you would want named pipes before TCPIP on a local lan.

 However, in this case, I don't know what else is configured in your environment.  Making the change this way, nothing will be impacted except what we're trying to change.

Jim.
 

Accepted Solution

by:mig1980
mig1980 earned 0 total points
ID: 38954874
Thank you all for your assistance in this matter. I feel silly but have learned something. After restarting the server, all seems to be well. I have not received the failures in a week.
 

Author Closing Comment

by:mig1980
ID: 38970868
The best solution was not presented. It was as simple as restarting the server.