UNS SCRL

Access lost to iSCSI volume after a while

Dear all,
I'm having an issue with a Thecus N12000PRO and would like some advice on how to solve it.

My issue is that after some time the RAID 5 array (10 HDDs, 10K) stops responding until I restart the NAS.
I believe the error message on the screen said that the NAS failed to mount the volume, which is strange because it was mounted and working just before.
The issue appeared when I configured the storage to be used both for replication and as active internal storage for a server hosting VMs.

The VMware 6.5 internal storage is provided over iSCSI through a 10 Gb link on interface 172.16.6.5/24, connected directly to a 10 Gb card in the server.
The NAS is managed on a third interface through a 1 Gb link, on 192.168.3.30/24.

I tried adding 1 TB of SSD cache in write-back mode but it didn't help. I thought it was an IOPS issue but that doesn't seem to be the case.
The NAS is still reachable on the management interface, and when the issue occurs we can see that there is no traffic on the 10 Gb interface: the storage doesn't answer and the server freezes as a result.
The firmware version is 2.05.14.4. I could try upgrading the firmware to 2.06.02.10 but I'm not sure it would solve the issue.
Should I back up everything elsewhere and rebuild the whole configuration? Should I buy another NAS? Is this a simple configuration issue?
As you see I need advice :-)

Any help or advice on how to solve this is welcome.

Thanks for your follow-up.

Best regards

Laurent Ulrich
Afthab T

I have seen the same issue with QNAP and Buffalo NAS devices. It is mostly an issue with the NAS storage itself. I suggest using NAS iSCSI only for backup and other non-critical purposes. You can update the firmware of the device and see the result.
Member_2_231077

Sounds like disk problems: if a couple of them fall asleep the array would stop, and then they come alive again at reboot, so it works again. 10 disks is too many for RAID 5 anyway; you should use RAID 6.
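To put a number on the RAID 5 versus RAID 6 trade-off, here is a minimal sketch of the usable-capacity arithmetic. It relies only on the general rule that RAID 5 spends one disk's worth of space on parity and RAID 6 spends two. The per-disk size is not given in this thread, so `DISK_SIZE_TB = 1.0` is a placeholder assumption, and the function name is made up for illustration.

```python
def usable_capacity_tb(disk_count: int, disk_size_tb: float, parity_disks: int) -> float:
    """Usable capacity of a parity RAID set: (disks - parity disks) * disk size."""
    if disk_count <= parity_disks:
        raise ValueError("need more disks than parity disks")
    return (disk_count - parity_disks) * disk_size_tb


# Placeholder: the real disk size is not stated in the thread.
DISK_SIZE_TB = 1.0
DISK_COUNT = 10  # from the original question

raid5 = usable_capacity_tb(DISK_COUNT, DISK_SIZE_TB, parity_disks=1)
raid6 = usable_capacity_tb(DISK_COUNT, DISK_SIZE_TB, parity_disks=2)

print(f"RAID 5: {raid5:.2f} TB usable, survives 1 disk failure")
print(f"RAID 6: {raid6:.2f} TB usable, survives 2 disk failures")
```

With 10 disks, moving to RAID 6 costs one extra disk of capacity in exchange for tolerating a second disk failure.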
The Thecus N12000PRO is certified for use with VMware vSphere ESXi.

So this NAS issue is really one for the vendor, Thecus.
UNS SCRL

ASKER

Thanks for your ideas. I'll check whether the HDDs are going to sleep at some point and I'll update the firmware.
If it's still not OK then I'll back up and create a RAID 6 as suggested :-)
I'll keep you posted on the results.
Hi, I have tried a few things already; here are the actions taken and the results so far:
Changed the power management setting of the hard drives from 30 to OFF.
Storage unavailable again after a while :-(
FSC (file system check) error => reboot, successful repair (error code 1) and reboot, but the error was still present at the last boot time (I checked that it's not an old one).
Upgraded the firmware to the latest stable version and rebooted.
FSC error still there at the boot after the upgrade.
Restarted in repair mode and ran FSC twice: error code 1 the first time and error code 0 the second time.
Restarted, and the file system checking error showed up again at the last boot time :-(

At the moment the storage is available, but who knows for how long?
I forgot to say that 3.05 TB is free out of 7.27 TB, so we are far from the "red" zone.

As a course of action, should I back up, factory reset and reconfigure in RAID 6? What would you advise regarding cache mode: write-back or write-through?
N12000Pro-Startup-log.JPG
ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Was that 2 separate reboots in the screenshot, or did it take half an hour to run fsck or whatever it runs? That event may be there because the filesystem was not marked as clean, since it didn't shut down properly.
I rebooted each time, and each time I got the message in the logs.
In total I did 6 reboots and got the message every time :-(
For info, the storage was unavailable again after a grand total of 2 hours :-(
I'll try to reach out to Thecus support as you suggest.
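To give support exact timestamps for when the storage drops, a small probe could log whether the iSCSI portal still accepts TCP connections. This is only a rough sketch under some assumptions: that 172.16.6.5 is the NAS's iSCSI address, that the target listens on the standard iSCSI port 3260, and that the script runs on a machine that can reach that network. The function names are made up for illustration. A successful TCP connect only shows that the portal is listening, not that the volume is actually serving I/O.

```python
import socket
import time
from datetime import datetime


def target_reachable(host: str, port: int = 3260, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def watch(host: str, port: int = 3260, interval: float = 10.0) -> None:
    """Print a timestamped line whenever the reachability state changes."""
    last_state = None
    while True:
        state = target_reachable(host, port)
        if state != last_state:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host}:{port} {'UP' if state else 'DOWN'}")
            last_state = state
        time.sleep(interval)


if __name__ == "__main__":
    # Assumed address: taken from the iSCSI interface mentioned in the question.
    watch("172.16.6.5")
```

Running it from a second machine on the storage network (rather than the frozen server) would keep the log going through an outage, and the DOWN timestamps can then be matched against the NAS logs.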