Solved

Soft Write Errors - PowerVault 124T LTO-2 Veritas 10.d

Posted on 2006-07-02
Last Modified: 2008-01-09
PowerVault 124T LTO-2, Veritas 10.d
Certance tape drive
Adaptec 39160 dual port
Dell PowerEdge 2800, PERC 4e/Di RAID controller, 2 GB RAM, Server 2003
Dell drivers: Autoloader V31.0, Drive 1826
We also tried the Veritas drivers.
The Dell-provided SCSI cable is 4 meters long.
 
With compression, Veritas writes about 270 GB per tape and generates 3,500 soft write errors. We have tried multiple tape brands: Dell, Fuji and Maxell. Pre-zipped data generates far fewer soft write errors. The tape drive has been cleaned multiple times.

What do we need to fix to stop the soft write errors ?
Thanks,  Brian
Question by:martinmartin2
20 Comments
 
LVL 30

Accepted Solution

by:Duncan Meyers
Duncan Meyers earned 250 total points
Ignore them. You only need to be concerned if they become hard write errors, which indicate a physical flaw in the tape media or a drive hardware problem. A soft write error indicates a corrected error (these, incidentally, happen constantly on your hard disk drives; you just never see them because the drive firmware handles the retry and error correction). From what you post, the errors may relate to the tape drive's data compression algorithms, so a firmware update may make the errors disappear.

In short, the soft errors are not a threat to your data integrity. Ignore them unless they become hard read/write errors or the number of errors increases dramatically.
 
LVL 30

Expert Comment

by:Duncan Meyers
BTW - some manufacturers mask soft write errors unless they exceed a certain threshold in order to reduce spurious and unnecessary hardware calls...
 
LVL 22

Assisted Solution

by:dovidmichel
dovidmichel earned 250 total points
I both disagree and agree with meyersd.

Yes, since they are soft write errors, the data was correctly written to tape, and once you have verified that the data can be restored and is not corrupted, there is no need to worry about the data.

No, don't ignore the problem; contact Dell and have the problem corrected.

With some drives, such as the old 8mm drives from Exabyte, it was normal to have soft write errors, and only an increase in the number of soft errors indicated a problem. With other drives, constant soft write errors are not part of the technology. LTO is in this category: it is not normal for LTO drives to have soft write errors.

The problem does not have to be with the drive, but in this case, since you have been proactive and tried different brands of tape, chances are the source is the drive.

A soft write error is where the drive tries to write and fails; the tape is then advanced to the next block and the write is tried again. This continues until either the data is correctly written or a threshold is met, in which case it becomes a hard write error, or UDE (Unrecoverable Data Error).
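
The retry behavior described above can be sketched roughly as follows. This is only an illustration of the idea, not drive firmware: the function name `write_block`, the `attempt_write` callback and the `max_retries` value are all assumptions made for the example, and real drives use their own thresholds.

```python
def write_block(attempt_write, max_retries=16):
    """Write one block, retrying further along the tape on failure.

    attempt_write: hypothetical callable that returns True when the
                   written data verifies and False otherwise.
    max_retries:   hypothetical threshold; the real value is drive-specific.

    Returns the number of soft (corrected) write errors for this block.
    Raises IOError when the threshold is exceeded (a hard write error).
    """
    soft_errors = 0
    while not attempt_write():
        soft_errors += 1
        if soft_errors > max_retries:
            # Threshold met: the soft errors turn into a hard write error.
            raise IOError("hard write error: retry threshold exceeded")
        # Otherwise the tape advances and the same data is tried again.
    return soft_errors


# Example: the first two attempts fail, the third succeeds.
outcomes = iter([False, False, True])
print(write_block(lambda: next(outcomes)))
```

Each failed attempt costs tape and time even though the data ends up written correctly, which is why a high soft error count shows up as lost capacity and slower backups.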
 

Author Comment

by:martinmartin2
I have an open ticket with Dell. Dell has been working on this problem for 5 days. They will not replace the drive with only soft write errors. We get so many errors the backups are slow. The throughput to tape averages 281 MB/min. Our branch office, which has the same tape drive, gets 856 MB/min. The branch office can read our tapes, with soft read errors. The number of soft read errors does not equal the number of soft write errors.

I was hoping someone had seen this problem before so I can point Dell support in the right direction.
 
LVL 22

Expert Comment

by:dovidmichel
Both throughput and capacity are going to suffer as a result of the errors.

You can try an IBM tape since from what I have seen and what an IBM engineer told me, IBM LTO drives really do work better with IBM tapes.

First off, go back through your logs and compile a report containing the date of each backup, the number of MB backed up, and the number of soft write errors.

It will help your case to be able to present a documented history. Best case would be if you could show a point in time with little to no errors and then a steady increase in the number of errors as time goes on.
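
A report like that could be put together with a few lines of Python. This is a minimal sketch, assuming the figures are copied by hand from the backup job logs; the `BackupRun` type and function names are made up for the example, the row labels are placeholders rather than real dates, and the sizes and error counts simply reuse numbers reported in this thread.

```python
from dataclasses import dataclass


@dataclass
class BackupRun:
    label: str              # date of the backup (placeholder labels here)
    mb_written: float       # MB backed up
    soft_write_errors: int  # soft write errors reported for the job


def errors_per_gb(run):
    """Normalize the error count so jobs of different sizes can be compared."""
    return run.soft_write_errors / (run.mb_written / 1024)


def format_report(runs):
    lines = [f"{'backup':<20}{'MB written':>12}{'soft errors':>13}{'errors/GB':>11}"]
    for r in runs:
        lines.append(
            f"{r.label:<20}{r.mb_written:>12.0f}"
            f"{r.soft_write_errors:>13d}{errors_per_gb(r):>11.1f}"
        )
    return "\n".join(lines)


# Placeholder rows; replace with one row per job from your own logs.
runs = [
    BackupRun("original drive", 270 * 1024, 3500),
    BackupRun("replacement drive", 270 * 1024, 2800),
]
print(format_report(runs))
```

Reporting errors per GB rather than raw counts makes it easier to show a trend (or the lack of one) across jobs of different sizes.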

It might also help to base the argument not on the errors themselves but on the lost capacity and longer backup times. One more thing to remember with tech support departments: speak with the manager, and if he does not help, then his manager, and so on.
 

Author Comment

by:martinmartin2
We do not have an IBM drive; it is a Certance (once Seagate, now Quantum).
 

Author Comment

by:martinmartin2
>It will help your case to be able to present a documented history. Best case would be if you could show a point in time with little to no errors and then a steady increase in the # of errors as time goes on.

This never existed. The errors have been there from day one: the first backup, the moment we switched on the tape drive. Purgatory from the first moment.
 

Author Comment

by:martinmartin2
Dell is shipping a new autoloader and tape drive. We will see it by Thursday. I will post again on Thursday, July 6, 2006, after 6:00 pm. Thanks
 
LVL 30

Expert Comment

by:Duncan Meyers
>We get so many errors the backups are slow

That's a pretty important point. It indicates that the drive is most likely faulty.

BTW - With Dell, I've found that if you stand your ground and *insist* that a part be replaced, they'll usually come to the party pretty quickly. But you have to be prepared to argue, particularly if you're dealing with the off-shored helpdesk guys.
 

Author Comment

by:martinmartin2
The new autoloader and tape drive have been installed. The problem continues on the new equipment. The backups are slightly faster because there are fewer soft write errors. We are now getting 2,800 per 270 GB full-tape backup with compression. Strange problem. Now what?
 
LVL 22

Expert Comment

by:dovidmichel
OK, so just swapping the drive reduced the errors by 20%; that sure sounds like hardware to me.

What it really comes down to is that a soft write error happens when the drive is not able to write data to tape.

 

Author Comment

by:martinmartin2
Dell's final resolution was to replace the LTO-2 tape changer with an LTO-3 tape changer. All the errors went away with the new hardware.
 

Author Comment

by:martinmartin2
From the Dell forum (PowerVault > Tape Backup Library), thread "PowerVault 124T LTO-2 Veritas 10.d Excessive soft write errors":

Dell-Bob D Moderator


The LTO format records data in what are referred to as Codeword Pairs and Codeword Quads. A Codeword Quad includes two Codeword Pairs and is the smallest unit that can be rewritten. A Codeword Quad holds 936 bytes of data. The ECC system records 64 Codeword Quads, of which 54 contain user data and 10 contain ECC data. On average, a Codeword Quad therefore contains 790 (936 * 54/64) bytes of user data.

Whenever an error is detected in a Codeword, the Codeword Quad containing the Codeword in error will be rewritten. The LTO formats require all tracks written in parallel to be synchronized. For this reason, the Codeword Quads located on the other physical tracks have to be rewritten as well. A rewrite therefore results in a capacity loss of 6320 bytes (790 bytes * 8 tracks) on LTO2 and 12640 bytes (790 bytes * 16 tracks) on LTO3.

In practice, the LTO formats reserve approximately 5% (10 GB on LTO2) extra capacity for rewrites. You will use up this extra capacity if there are approximately 7 rewrites per MB on LTO2.
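
The figures in the explanation above can be reproduced with a short calculation. This is just a sketch of the arithmetic as stated in the post; the variable names are made up for the example, and treating 1 MB as 1,000,000 bytes in the last step is an assumption, since the post does not say how its "approximately 7" was rounded.

```python
QUAD_BYTES = 936   # bytes held by one Codeword Quad
TOTAL_QUADS = 64   # quads recorded per ECC group
DATA_QUADS = 54    # of those, quads carrying user data (10 carry ECC)

# Average user data per Codeword Quad: 936 * 54/64 = 789.75, quoted as 790.
user_bytes_exact = QUAD_BYTES * DATA_QUADS / TOTAL_QUADS
user_bytes = round(user_bytes_exact)


def loss_per_rewrite(tracks):
    """Capacity lost by one rewrite, since all parallel tracks are rewritten."""
    return user_bytes * tracks


lto2_loss = loss_per_rewrite(8)    # 8 parallel tracks on LTO2
lto3_loss = loss_per_rewrite(16)   # 16 parallel tracks on LTO3

# Rewrites per MB that would consume a 5% reserve on LTO2
# (assumption: 1 MB = 1,000,000 bytes).
rewrites_per_mb = 0.05 * 1_000_000 / lto2_loss

print(user_bytes_exact, user_bytes, lto2_loss, lto3_loss)
print(round(rewrites_per_mb, 1))
```

Under that assumption the budget works out to a little under 8 rewrites per MB, the same order of magnitude as the "approximately 7" quoted in the post.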

martinmartin2
We get 22,500 soft write errors per 100 GB of data written to a tape. How does that fit the formula?

Dell-Bob D Moderator
You are "allowed" 7 errors per MB before you exceed the 5% extra capacity the media vendors have allotted for soft errors.

100GB = 102400MB

102400MB * 7 = 716,800 errors

Although 22,500 errors seems excessive, you have used only roughly 135 MB of the 10 GB allotted for that tape (22,500 errors * 6320 bytes). If you are satisfied with your throughput numbers for the backups, I see no reason to worry about these errors at this time.
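
The moderator's numbers can be checked with a few lines of Python. This is a sketch of the same arithmetic, with made-up variable names and the assumption that MB and GB here mean binary units (1 GB = 1024 MB), as the "100GB = 102400MB" line implies.

```python
BYTES_PER_REWRITE_LTO2 = 6320   # capacity lost per rewrite on LTO2 (from the post)
ALLOWED_PER_MB = 7              # rewrites per MB before the reserve is used up
RESERVE_BYTES = 10 * 1024**3    # roughly 10 GB reserved for rewrites on LTO2

data_mb = 100 * 1024            # 100 GB written = 102400 MB
errors = 22_500                 # soft write errors reported for that 100 GB

allowed_errors = data_mb * ALLOWED_PER_MB        # error budget for 100 GB
used_bytes = errors * BYTES_PER_REWRITE_LTO2     # capacity actually consumed
used_mb = used_bytes / 1024**2
reserve_fraction = used_bytes / RESERVE_BYTES

print(allowed_errors)                    # 716800
print(f"{used_mb:.1f} MB used")
print(f"{reserve_fraction:.1%} of the reserve")
```

By this estimate the reported error count consumes only a little over 1% of the reserved capacity, so the capacity cost is small even when the error count looks alarming; the throughput cost is a separate question.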


 

Author Comment

by:martinmartin2
Please close the question and award the points. I appreciate everyone's comments.
 
LVL 87

Expert Comment

by:rindi
Hello martinmartin2,

you still there?
