Gribble
asked on
I have a ZFS storage box and two Xen servers that are having errors.
A recent network interruption has caused errors with my Xen servers' connection to a ZFS share over iSCSI.
If I reboot the Xen server, everything comes back up, but within 20 minutes the /var directory on the Linux containers goes read-only because of journal errors.
Here is the output of the errors from the ZFS storage box:
Jan 8 00:06:42 dzfs01 istgt[1211]: Login from iqn.1994-05.com.redhat:d8a183801935 (172.16.30.220) on iqn.dzfs01:alameda LU18 (172.16.30.221:3260,1), ISID=23d0a0000, TSIH=299, CID=0, HeaderDigest=off, DataDigest=off
Jan 8 00:06:43 dzfs01 istgt[1211]: Login from iqn.1994-05.com.redhat:d8a183801935 (172.16.30.220) on iqn.dzfs01:malden LU13 (172.16.30.221:3260,1), ISID=23d100000, TSIH=286, CID=0, HeaderDigest=off, DataDigest=off
Jan 8 00:06:44 dzfs01 istgt[1211]: istgt.c:1411:istgt_acceptor: ***ERROR*** accept error: -1
Jan 8 00:06:44 dzfs01 istgt[1211]: Login from iqn.1994-05.com.redhat:d8a183801935 (172.16.30.220) on iqn.dzfs01:dogeared01 LU14 (172.16.30.221:3260,1), ISID=23d010000, TSIH=297, CID=0, HeaderDigest=off, DataDigest=off
Jan 8 00:06:44 dzfs01 istgt[1211]: istgt.c:1411:istgt_acceptor: ***ERROR*** accept error: -1
Jan 8 00:06:44 dzfs01 istgt[1211]: istgt.c:1411:istgt_acceptor: ***ERROR*** accept error: -1
Jan 8 00:06:45 dzfs01 istgt[1211]: Login from iqn.1994-05.com.redhat:d8a183801935 (172.16.30.220) on iqn.dzfs01:timor LU5 (172.16.30.221:3260,1), ISID=23d050000, TSIH=285, CID=0, HeaderDigest=off, DataDigest=off
Jan 8 00:06:46 dzfs01 istgt[1211]: Login from iqn.1994-05.com.redhat:d8a183801935 (172.16.30.220) on iqn.dzfs01:gargoyle LU2 (172.16.30.221:3260,1), ISID=23d060000, TSIH=293, CID=0, HeaderDigest=off, DataDigest=off
Jan 8 00:06:58 dzfs01 istgt[1211]: istgt.c:1411:istgt_acceptor: ***ERROR*** accept error: -1
Jan 8 00:06:59 dzfs01 last message repeated 5 times
Jan 8 00:06:59 dzfs01 istgt[1211]: Login from iqn.1994-05.com.redhat:d8a183801935 (172.16.30.220) on iqn.dzfs01:coronado LU19 (172.16.30.221:3260,1), ISID=23d0f0000, TSIH=295, CID=0, HeaderDigest=off, DataDigest=off
Jan 8 00:06:59 dzfs01 istgt[1211]: istgt_iscsi.c:777:istgt_iscsi_write_pdu_internal: ***ERROR*** iscsi_write() failed (errno=32)
Jan 8 00:06:59 dzfs01 istgt[1211]: istgt_iscsi.c:4984:sender: ***ERROR*** iscsi_write_pdu() failed on iqn.dzfs01:coronado,t,0x0001(iqn.1994-05.com.redhat:d8a183801935,i,0x00023d0f0000)
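For anyone triaging a log like this, a quick tally of logins per target against error messages can show how often the initiator is reconnecting. The following is only a rough sketch: the `summarize` helper and both regular expressions are made up for illustration, and they assume the istgt line format seen in this thread (with the fields unbroken by stray spaces).

```python
import re
from collections import Counter

# Sample lines in the format of the istgt log in this thread.
# In practice you would read the syslog file instead.
SAMPLE = """\
Jan 8 00:06:42 dzfs01 istgt[1211]: Login from iqn.1994-05.com.redhat:d8a183801935 (172.16.30.220) on iqn.dzfs01:alameda LU18 (172.16.30.221:3260,1), ISID=23d0a0000, TSIH=299, CID=0, HeaderDigest=off, DataDigest=off
Jan 8 00:06:44 dzfs01 istgt[1211]: istgt.c:1411:istgt_acceptor: ***ERROR*** accept error: -1
Jan 8 00:06:59 dzfs01 istgt[1211]: istgt_iscsi.c:777:istgt_iscsi_write_pdu_internal: ***ERROR*** iscsi_write() failed (errno=32)
"""

# Hypothetical patterns, written against the lines above only.
LOGIN_RE = re.compile(r"Login from (\S+) \((\S+)\) on (\S+) LU\d+")
ERROR_RE = re.compile(r"istgt\[\d+\]: (\S+?): \*\*\*ERROR\*\*\* (.*)")


def summarize(text):
    """Count logins per target name and occurrences of each error message."""
    logins = Counter()
    errors = Counter()
    for line in text.splitlines():
        m = LOGIN_RE.search(line)
        if m:
            logins[m.group(3)] += 1  # group 3 is the target IQN
            continue
        m = ERROR_RE.search(line)
        if m:
            errors[m.group(2)] += 1  # group 2 is the message after ***ERROR***
    return logins, errors


if __name__ == "__main__":
    logins, errors = summarize(SAMPLE)
    print("logins per target:", dict(logins))
    print("errors:", dict(errors))
```

Many logins to the same targets within seconds, interleaved with accept and write errors, would be consistent with sessions being dropped and re-established rather than with a fault on a single LUN.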
Just take care to monitor the new switch so it never again comes to data loss.
ASKER
Resolved the issue by replacing the network switch.
errno=32 is EPIPE, "broken pipe" (quite logical once permission is denied).
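If you want to check what an errno value from a log means, Python's standard library can decode it. This is a small sketch; `describe_errno` is just an illustrative helper name, and the message text comes from the platform's C library, so the exact wording may differ between systems.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, human-readable message) for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # errno=32 is the value reported by iscsi_write() in the log.
    name, message = describe_errno(32)
    print(name, "-", message)
```

EPIPE on a write means the other end of the connection was already gone when the target tried to send, which fits a session that was cut somewhere on the network path.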