How to check ZFS storage pool?

Hi
We are running an Oracle database on Sun Solaris. We recently added a new SAS RAID to this system and use ZFS on it so that we can have a partition size of 19 TB.
During a new load of data into Oracle we got an ORA-01115: IO error reading block from file ....

After this we ran zpool status -xv, which showed that there really were errors (see code).
4 files (filesize 30G) were corrupt.
The RAID does not show any errors.

Now, how can we make sure we are informed earlier, given that ZFS is supposed to be self-healing? Is there any utility to check the disks? There was no indication of a problem before we wrote data.

Thank you for any suggestions.

pool: ora5
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:
 
	NAME         STATE     READ WRITE CKSUM
	ora5         DEGRADED     0     0   178
	  c12t0d0s0  DEGRADED     0     0   178  too many errors
 
errors: Permanent errors have been detected in the following files:
...
...
...

Brian Utterback

Of course ZFS cannot find a checksum error without calculating the checksum, so it took a read or write to trigger the error. You can use the "zpool scrub" command to force ZFS to verify all the checksums in the pool and find errors proactively.

I am not sure when and where you would find the actual errors. I suspect that the /var/adm/messages file will have something in it. Also, take a look at the fault management system.
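
As a rough sketch of the suggestions above, the scrub and the fault manager queries could be wrapped in a small shell function. The function name check_pool is made up for illustration. The pool name ora5 is taken from the output in the question, and fmadm and fmdump are the standard Solaris fault management tools; which of their reports will actually contain something useful here is not known.

```shell
# Hypothetical helper (the name is an assumption, not a Solaris command):
# scrub one pool and show what ZFS and the fault manager report about it.
check_pool() {
    pool="$1"
    # Start a scrub; it runs in the background and verifies every
    # checksum of the data currently stored in the pool.
    zpool scrub "$pool"
    # Show scrub progress plus any files with permanent errors.
    zpool status -v "$pool"
    # List problems the fault manager has diagnosed.
    fmadm faulty
    # Dump the underlying error reports in full detail.
    fmdump -eV
}

# Usage (as root): check_pool ora5
```

The scrub returns immediately, so running zpool status -v ora5 again later shows how far it has progressed and what it has found.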

buchli

ASKER

Dear blu

Thank you for the hint about zpool scrub. We will try this; it seems to be a starting point. But as far as I can see, we will have to fill the whole filesystem (19 TB) with dummy data first to check it with zpool scrub. This will produce a heavy workload. But if it helps to make sure the disks are OK ... At the moment we have to leave them empty.

Of course we checked /var/adm/messages. There was no message in it.
Is there no way to check a ZFS filesystem without filling it completely with dummy data?

ASKER CERTIFIED SOLUTION

Brian Utterback

This solution is only available to members.

buchli

ASKER

Thank you very much. For now we will live with it and check the disk with scrub. We will first check whether this is a disk or a controller problem.
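
To be informed earlier, a check along the following lines could be run regularly, for example from cron. This is only a sketch: the function names are invented for illustration, and it assumes that zpool status -x prints exactly "all pools are healthy" when nothing is wrong.

```shell
# Hypothetical helper: succeeds if the given `zpool status -x` output
# is the one-line "everything is fine" message.
pools_healthy() {
    [ "$1" = "all pools are healthy" ]
}

# Hypothetical helper: query ZFS and print a report only when a pool
# needs attention. Returns 0 if healthy, 1 otherwise.
report_pool_health() {
    status=`zpool status -x 2>&1`
    if pools_healthy "$status"; then
        return 0
    fi
    # Replace these echo lines with mailx or another notifier as needed.
    echo "ZFS problem detected:"
    echo "$status"
    return 1
}

# Usage: report_pool_health
```

Because cron mails any output to the job's owner, printing only in the unhealthy case is enough to produce a notification.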