buchli (Switzerland) asked:

How to check ZFS storage pool?

Hi
We are running an Oracle database on Sun Solaris. We recently added a new SAS RAID for this database and use ZFS on it so that we can have a partition size of 19 TB.
During a new load of data into Oracle we got an ORA-01115: IO error reading block from file ....

After this we ran zpool status -xv, which showed that there really were errors (see the output below).
4 files (file size 30G) were corrupt.
The RAID itself does not show any errors.

Now, how can we make sure that the self-healing ZFS informs us earlier? Is there any utility to check the disks? We had no indication of a problem before we wrote the data.

Thank you for any suggestions.

pool: ora5
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:
 
	NAME         STATE     READ WRITE CKSUM
	ora5         DEGRADED     0     0   178
	  c12t0d0s0  DEGRADED     0     0   178  too many errors
 
errors: Permanent errors have been detected in the following files:
...
...
...

Unix OS, Databases

8/22/2022 - Mon
Brian Utterback

Of course ZFS cannot find a checksum error without actually calculating the checksum, so it took a read or write to trigger the error. You can use the "zpool scrub" command to force ZFS to check all the checksums in the pool and find errors proactively.
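
As a minimal sketch, a scrub of the pool from the question could be started and checked as follows. The pool name ora5 is taken from the status output above; the ZPOOL variable and the start_scrub function name are illustrative assumptions, added only so that the command can be swapped out when trying the script.

```shell
#!/bin/sh
# Sketch: start a scrub and show its progress.
# ZPOOL and start_scrub are hypothetical names; only "zpool scrub" and
# "zpool status -v" are real commands.
ZPOOL=${ZPOOL:-zpool}

start_scrub() {
    pool=$1
    # A scrub reads the allocated blocks in the pool and verifies their checksums.
    "$ZPOOL" scrub "$pool" || return 1
    # The "scrub:" line of the status output reports progress and results.
    "$ZPOOL" status -v "$pool"
}

# Usage (as root): start_scrub ora5
```

The scrub runs in the background, so "zpool status -v" has to be repeated later to see the final result.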

I am not sure when and where the actual errors would be reported. I suspect that the /var/adm/messages file will have something in it. Also, take a look at the fault management system.
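
A rough sketch of how these places could be checked in one go, assuming a Solaris system with the standard fault manager tools (fmadm, fmdump) and iostat. The helper name collect_fault_info and the default search pattern WARNING are assumptions made for illustration, not something taken from this system.

```shell
#!/bin/sh
# Sketch: gather fault information from the places mentioned above.
# collect_fault_info is a hypothetical helper name. The log file and the
# search pattern are parameters; their defaults are assumptions.
collect_fault_info() {
    messages=${1:-/var/adm/messages}
    pattern=${2:-WARNING}
    echo "== fmadm faulty =="
    fmadm faulty      # resources the fault manager has diagnosed as faulty
    echo "== fmdump -e =="
    fmdump -e         # error reports logged by the fault manager
    echo "== iostat -En =="
    iostat -En        # per-device soft, hard and transport error counters
    echo "== $messages =="
    grep "$pattern" "$messages" || echo "no matching lines"
}

# Usage (as root): collect_fault_info
```
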
ASKER
buchli

Dear blu

Thank you for the hint about zpool scrub. We will try this; it seems to be a starting point. But as far as I can see, we will first have to fill the whole filesystem (19 TB) with dummy data to check it with zpool scrub. This will produce a heavy workload. But if it helps to make sure the disks are OK ... At the moment we have to leave them empty.

Of course we checked /var/adm/messages. There was no message in it.
Is there no way to check a ZFS filesystem without filling it completely with dummy data?
ASKER CERTIFIED SOLUTION
Brian Utterback

ASKER
buchli

Thank you very much. For now we will live with it and check the disks with scrub. First we will find out whether this is a disk or a controller problem.
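
To be informed earlier, as asked in the original question, the pool status could also be checked at regular intervals. The following is only a sketch: the function names pools_healthy and report_pool_errors are hypothetical, and it assumes the column layout (NAME STATE READ WRITE CKSUM) shown in the status output above.

```shell
#!/bin/sh
# Sketch: helpers for a periodic pool check. Both function names are
# hypothetical; they only read "zpool status" text from standard input.

# Succeeds if the text is the healthy reply of "zpool status -x".
pools_healthy() {
    grep 'all pools are healthy' > /dev/null
}

# Prints every pool or device row whose READ, WRITE or CKSUM counter is not zero.
report_pool_errors() {
    awk '$3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ && ($3 + $4 + $5) > 0 {
        printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
    }'
}

# Example use, e.g. from a cron job:
#   zpool status -x | pools_healthy || zpool status | report_pool_errors
```

Fed with the status output from the question, report_pool_errors would print one line for ora5 and one for c12t0d0s0, each with cksum=178.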
