Xetroximyn
asked on
AIX 5.1 - remove drive from volume group so it can be replaced
We are getting some disk operation errors.
The hardware guy says he needs the disk removed from the volume group before he can replace it.
Then of course it will need to be added back in after it is replaced.
I think the disks are mirrored, if that makes a difference.
hdisk4 is the one with errors.
ibm1:/> lspv
hdisk0 00015051814ca2c5 rootvg
hdisk1 000150514226fc44 usr1vg
hdisk2 000150519965a2bb usr1vg
hdisk3 0001505115c7dbce usr1vg
hdisk4 000c925d02a3b3b2 usr1vg
hdisk5 000c925d822f5eda usr1vg
hdisk6 00011784d15410dc rootvg
hdisk7 000150512ffa6367 usr1vg
hdisk8 000150519965a4eb usr1vg
hdisk9 000150512b7bdb40 usr1vg
hdisk10 000c925d02a3a8c3 usr1vg
hdisk11 000c924d87206941 usr1vg
ibm1:/>
ibm1:/> lsvg -l usr1vg
usr1vg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
loglv00 jfslog 16 32 4 open/syncd N/A
usr1 jfs 3943 7886 10 open/syncd /usr1
ibm1:/>
ASKER
I don't think that is a possibility.
Thanks for your help!! IBM is no help since we are on 5.1!!
ibm1:/> lspv -l hdisk4
hdisk4:
LV NAME LPs PPs DISTRIBUTION MOUNT POINT
usr1 1084 1084 217..217..216..217..217 /usr1
ibm1:/> lsvg -p usr1vg
usr1vg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 542 0 00..00..00..00..00
hdisk8 active 542 0 00..00..00..00..00
hdisk1 active 542 0 00..00..00..00..00
hdisk9 active 1084 0 00..00..00..00..00
hdisk3 active 1084 0 00..00..00..00..00
hdisk7 active 542 0 00..00..00..00..00
hdisk4 active 1084 0 00..00..00..00..00
hdisk5 active 1084 0 00..00..00..00..00
hdisk10 active 1084 467 33..00..00..217..217
hdisk11 active 1084 287 00..00..00..70..217
ibm1:/> lslv -l usr1
usr1:/usr1
PV COPIES IN BAND DISTRIBUTION
hdisk3 1084:000:000 20% 217:217:216:217:217
hdisk9 1083:000:000 19% 217:216:216:217:217
hdisk7 542:000:000 19% 109:108:108:108:109
hdisk2 542:000:000 19% 109:108:108:108:109
hdisk8 541:000:000 19% 109:107:108:108:109
hdisk1 542:000:000 19% 109:108:108:108:109
hdisk5 1084:000:000 20% 217:217:216:217:217
hdisk4 1084:000:000 20% 217:217:216:217:217
hdisk11 782:000:000 27% 217:217:216:132:000
hdisk10 602:000:000 36% 169:217:216:000:000
ibm1:/>
ASKER CERTIFIED SOLUTION
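The accepted answer is hidden above, but judging from the asker's follow-up output (hdisk4 drops out of usr1vg, then extendvg/mklvcopy are run to re-add it), the removal side was presumably along these lines. A dry-run sketch, assuming the usr1 copies on hdisk4 are mirrored elsewhere; it only echoes the commands, so it is safe to run anywhere:

```shell
# Dry-run plan for taking a mirrored disk out of a VG (AIX LVM).
# plan_removal only prints the commands; run them by hand on the host.
plan_removal() {
    disk=$1; vg=$2; lv=$3
    echo "rmlvcopy $lv 1 $disk"   # drop the mirror copy living on $disk
    echo "reducevg $vg $disk"     # remove the now-empty disk from the VG
    echo "rmdev -dl $disk"        # delete the device entry so the drive can be pulled
}
plan_removal hdisk4 usr1vg usr1
```

Note that `rmlvcopy usr1 1 hdisk4` reduces usr1 to a single copy, removing the copy residing on hdisk4; the data stays intact on the remaining mirrors.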
ASKER
Thanks! I will be attempting this between 1-2pm Eastern tomorrow. If you can respond quickly then -- all the better :)
BTW -- is this going to reduce the volume size in the interim?
No, the available space will not be reduced, but there will be no redundancy (mirroring) for the usr1 logical volume between "rmlvcopy" and "mklvcopy".
If everything works as expected you should run "syncvg -v usr1vg" after "mklvcopy" to make sure that all copies (mirrors) are in sync again.
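A small sketch for verifying the resync: once syncvg completes, "lsvg usr1vg" should report "STALE PPs: 0". The helper below just extracts that field (the sample line mimics lsvg's layout as seen in the outputs in this thread):

```shell
# Extract the STALE PPs count from 'lsvg <vg>' output read on stdin.
stale_pps() {
    sed -n 's/.*STALE PPs:[ ]*\([0-9][0-9]*\).*/\1/p'
}
# Sample usage with a line shaped like lsvg's output; 0 means all
# mirror copies are in sync.
echo "STALE PVs:    0    STALE PPs:    0" | stale_pps
```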
ASKER
Thanks!! This seemed to mostly work, except the last command failed.
Any suggestions?
ibm1:/> lspv
hdisk0 00015051814ca2c5 rootvg
hdisk1 000150514226fc44 usr1vg
hdisk2 000150519965a2bb usr1vg
hdisk3 0001505115c7dbce usr1vg
hdisk4 none None
hdisk5 000c925d822f5eda usr1vg
hdisk6 00011784d15410dc rootvg
hdisk7 000150512ffa6367 usr1vg
hdisk8 000150519965a4eb usr1vg
hdisk9 000150512b7bdb40 usr1vg
hdisk10 000c925d02a3a8c3 usr1vg
hdisk11 000c924d87206941 usr1vg
ibm1:/> extendvg usr1vg hdisk4
0516-1254 extendvg: Changing the PVID in the ODM.
ibm1:/> lspv
hdisk0 00015051814ca2c5 rootvg
hdisk1 000150514226fc44 usr1vg
hdisk2 000150519965a2bb usr1vg
hdisk3 0001505115c7dbce usr1vg
hdisk4 00015051e2e7d077 usr1vg
hdisk5 000c925d822f5eda usr1vg
hdisk6 00011784d15410dc rootvg
hdisk7 000150512ffa6367 usr1vg
hdisk8 000150519965a4eb usr1vg
hdisk9 000150512b7bdb40 usr1vg
hdisk10 000c925d02a3a8c3 usr1vg
hdisk11 000c924d87206941 usr1vg
ibm1:/> mklvcopy usr1 2 hdisk4
0516-404 allocp: This system cannot fulfill the allocation request.
There are not enough free partitions or not enough physical volumes
to keep strictness and satisfy allocation requests. The command
should be retried with different allocation characteristics.
ibm1:/>
SOLUTION
ASKER
I'm not sure if I have the described requirement....
When you say two different locations do you mean like different servers? Internal/external?
I can tell you that all the drives are internal to a single server.
Does that mean I don't have the described requirement?
Just for good measure here are those outputs anyway
THANKS SO MUCH FOR YOUR HELP!!
ibm1:/> lsvg -p usr1vg
usr1vg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 542 0 00..00..00..00..00
hdisk8 active 542 0 00..00..00..00..00
hdisk1 active 542 482 109..108..108..108..49
hdisk9 active 1084 1083 217..216..216..217..217
hdisk3 active 1084 0 00..00..00..00..00
hdisk7 active 542 542 109..108..108..108..109
hdisk4 active 1084 1084 217..217..216..217..217
hdisk5 active 1084 120 00..120..00..00..00
hdisk10 active 1084 927 202..75..216..217..217
hdisk11 active 1084 459 00..67..105..70..217
ibm1:/> lscfg |grep hdisk
+ hdisk0 10-60-00-8,0 16 Bit LVD SCSI Disk Drive (4500 MB)
+ hdisk1 10-60-00-9,0 16 Bit LVD SCSI Disk Drive (9100 MB)
+ hdisk2 10-60-00-10,0 16 Bit LVD SCSI Disk Drive (9100 MB)
+ hdisk3 10-60-00-11,0 16 Bit LVD SCSI Disk Drive (18200
+ hdisk4 10-60-00-12,0 16 Bit LVD SCSI Disk Drive (18200
+ hdisk5 10-60-00-13,0 16 Bit LVD SCSI Disk Drive (18200
+ hdisk6 10-88-00-8,0 16 Bit SCSI Disk Drive (4500 MB)
+ hdisk7 10-88-00-9,0 16 Bit LVD SCSI Disk Drive (9100 MB)
+ hdisk8 10-88-00-10,0 16 Bit LVD SCSI Disk Drive (9100 MB)
+ hdisk9 10-88-00-11,0 16 Bit LVD SCSI Disk Drive (18200
+ hdisk10 10-88-00-12,0 16 Bit LVD SCSI Disk Drive (18200
+ hdisk11 10-88-00-13,0 16 Bit LVD SCSI Disk Drive (18200
ibm1:/>
Your drives are behind two different SCSI controllers, that's all.
Judging by the lsvg output, your data placement policy doesn't seem to be strict in any way, so you obviously don't have the mentioned requirement.
mklvcopy usr1 2
syncvg -v usr1vg
should work just fine for you then.
ASKER
Thanks!
I got this...
ibm1:/> mklvcopy usr1 2
0516-404 allocp: This system cannot fulfill the allocation request.
There are not enough free partitions or not enough physical volumes
to keep strictness and satisfy allocation requests. The command
should be retried with different allocation characteristics.
Strange.
I'll need some more output:
lsvg usr1vg
lslv usr1
lslv -l usr1
ASKER
Here you go.
Thanks!!
ibm1:/> lsvg usr1vg
VOLUME GROUP: usr1vg VG IDENTIFIER: 00015051a5299fdf
VG STATE: active PP SIZE: 16 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 8672 (138752 megabytes)
MAX LVs: 256 FREE PPs: 4697 (75152 megabytes)
LVs: 2 USED PPs: 3975 (63600 megabytes)
OPEN LVs: 2 QUORUM: 6
TOTAL PVs: 10 VG DESCRIPTORS: 10
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 10 AUTO ON: yes
MAX PPs per PV: 2032 MAX PVs: 16
LTG size: 128 kilobyte(s) AUTO SYNC: no
HOT SPARE: no
ibm1:/> lslv usr1
LOGICAL VOLUME: usr1 VOLUME GROUP: usr1vg
LV IDENTIFIER: 00015051a5299fdf.2 PERMISSION: read/write
VG STATE: active/complete LV STATE: opened/syncd
TYPE: jfs WRITE VERIFY: off
MAX LPs: 7000 PP SIZE: 16 megabyte(s)
COPIES: 1 SCHED POLICY: parallel
LPs: 3943 PPs: 3943
STALE PPs: 0 BB POLICY: relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 32
MOUNT POINT: /usr1 LABEL: /usr1
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
ibm1:/> lslv -l usr1
usr1:/usr1
PV COPIES IN BAND DISTRIBUTION
hdisk3 1084:000:000 20% 217:217:216:217:217
hdisk2 542:000:000 19% 109:108:108:108:109
hdisk8 541:000:000 19% 109:107:108:108:109
hdisk1 060:000:000 0% 000:000:000:000:060
hdisk5 964:000:000 10% 217:097:216:217:217
hdisk11 610:000:000 24% 217:150:111:132:000
hdisk10 142:000:000 100% 000:142:000:000:000
ibm1:/>
OK, I'll have to analyze this. Please give me some time until tomorrow.
ASKER
sure -- thanks so much for your help!! don't know what we would do without your guru input!
SOLUTION
ASKER
The second option... did you mean to say it CAN be done without disrupting? Or it CAN'T?
Either way, I think we want to go the safe route. We will probably only use the server for six more months (half of our stuff is already moved off), but it is crucial.
Will this reduce the size of usr1?
THANKS!
SOLUTION
To your last questions:
- Everything I wrote can be done without disruption, except for the last thing - the varyoffvg/varyonvg or reboot required to make the new quorum checking option effective.
- Volume sizes are never affected by such operations. We still have all primary copies intact and complete, and this will not change.
ASKER
Thanks!! you are the best!! almost ready to run these commands...
mklvcopy -s s usr1 2 hdisk7 hdisk8 hdisk9 hdisk10 hdisk11
syncvg -v usr1vg # This can take quite a long time!
on that last one do I include the # or do I just run..
syncvg -v usr1vg
The # and everything that follows is just a comment of mine.
Since the "#" indicates the start of a comment, you could have included it; it wouldn't have done any harm ....
The same is true here:
chvg -Q n usr1vg # Turn quorum checking off
savebase # Save changes to boot image
ASKER
Thanks! So lslv did in fact just show disks 1-5, so I am running those next two commands now.
mklvcopy -s s usr1 2 hdisk7 hdisk8 hdisk9 hdisk10 hdisk11
syncvg -v usr1vg # This can take quite a long time!
Then after that I just need to run...
chvg -Q n usr1vg # Turn quorum checking off
savebase # Save changes to boot image
and then I can just reboot -- right?
(Are there any checks I can do before the reboot to make sure it will come back up OK? I am paranoid about reboots since we had a big problem a long time ago when the machine got stuck during boot and IBM support couldn't even help us, and that was back when they still supported 5.1. We ended up having to restore from a tape.)
ASKER
shoot... I just got this error
ibm1:/> lslv -l usr1
usr1:/usr1
PV COPIES IN BAND DISTRIBUTION
hdisk3 1084:000:000 20% 217:217:216:217:217
hdisk2 542:000:000 19% 109:108:108:108:109
hdisk4 1084:000:000 20% 217:217:216:217:217
hdisk1 269:000:000 40% 000:108:101:000:060
hdisk5 964:000:000 10% 217:097:216:217:217
ibm1:/> mklvcopy -s s usr1 2 hdisk7 hdisk8 hdisk9 hdisk10 hdisk11
0516-404 allocp: This system cannot fulfill the allocation request.
There are not enough free partitions or not enough physical volumes
to keep strictness and satisfy allocation requests. The command
should be retried with different allocation characteristics.
ibm1:/>
ASKER
Here are these outputs if you need them.
ibm1:/>
ibm1:/> lsvg usr1vg
VOLUME GROUP: usr1vg VG IDENTIFIER: 00015051a5299fdf
VG STATE: active PP SIZE: 16 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 8672 (138752 megabytes)
MAX LVs: 256 FREE PPs: 4697 (75152 megabytes)
LVs: 2 USED PPs: 3975 (63600 megabytes)
OPEN LVs: 2 QUORUM: 6
TOTAL PVs: 10 VG DESCRIPTORS: 10
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 10 AUTO ON: yes
MAX PPs per PV: 2032 MAX PVs: 16
LTG size: 128 kilobyte(s) AUTO SYNC: no
HOT SPARE: no
ibm1:/>
ibm1:/> lslv usr1
LOGICAL VOLUME: usr1 VOLUME GROUP: usr1vg
LV IDENTIFIER: 00015051a5299fdf.2 PERMISSION: read/write
VG STATE: active/complete LV STATE: opened/syncd
TYPE: jfs WRITE VERIFY: off
MAX LPs: 7000 PP SIZE: 16 megabyte(s)
COPIES: 1 SCHED POLICY: parallel
LPs: 3943 PPs: 3943
STALE PPs: 0 BB POLICY: relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 32
MOUNT POINT: /usr1 LABEL: /usr1
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
ibm1:/>
Seems that the partitions of loglv00 are also kind of misplaced, which inhibits our desired superstrict allocation.
So we will clean up this one as well.
Please post
lslv loglv00
lslv -l loglv00
ASKER
Here you go
ibm1:/>
ibm1:/>
ibm1:/> lslv loglv00
LOGICAL VOLUME: loglv00 VOLUME GROUP: usr1vg
LV IDENTIFIER: 00015051a5299fdf.1 PERMISSION: read/write
VG STATE: active/complete LV STATE: opened/syncd
TYPE: jfslog WRITE VERIFY: off
MAX LPs: 512 PP SIZE: 16 megabyte(s)
COPIES: 2 SCHED POLICY: parallel
LPs: 16 PPs: 32
STALE PPs: 0 BB POLICY: relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 32
MOUNT POINT: N/A LABEL: None
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
ibm1:/>
ibm1:/> lslv -l loglv00
loglv00:N/A
PV COPIES IN BAND DISTRIBUTION
hdisk8 001:000:000 100% 000:001:000:000:000
hdisk9 001:000:000 100% 000:001:000:000:000
hdisk10 015:000:000 0% 015:000:000:000:000
hdisk11 015:000:000 0% 000:000:000:015:000
ibm1:/>
ibm1:/>
ibm1:/>
SOLUTION
ASKER
Thanks!!
So sorry to be such a pain!! but I just got this...
ibm1:/> mklvcopy -s s loglv00 2 hdisk1
0516-404 allocp: This system cannot fulfill the allocation request.
There are not enough free partitions or not enough physical volumes
to keep strictness and satisfy allocation requests. The command
should be retried with different allocation characteristics.
ibm1:/>
ASKER
in case you need
ibm1:/> lslv loglv00
LOGICAL VOLUME: loglv00 VOLUME GROUP: usr1vg
LV IDENTIFIER: 00015051a5299fdf.1 PERMISSION: read/write
VG STATE: active/complete LV STATE: opened/syncd
TYPE: jfslog WRITE VERIFY: off
MAX LPs: 512 PP SIZE: 16 megabyte(s)
COPIES: 1 SCHED POLICY: parallel
LPs: 16 PPs: 16
STALE PPs: 0 BB POLICY: relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 32
MOUNT POINT: N/A LABEL: None
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
ibm1:/>
ibm1:/>
ibm1:/>
ibm1:/> lslv -l loglv00
loglv00:N/A
PV COPIES IN BAND DISTRIBUTION
hdisk10 016:000:000 6% 015:001:000:000:000
ibm1:/>
SOLUTION
ASKER
Well on the one hand I have your advice, and on the other hand I have... well... nothing... I'd just about be screwed if it weren't for you. :) I'll take your memory in any shape I can get it :)
Here is the latest error... and some outputs in case you need
ibm1:/> migratepv -l loglv00 hdisk10 hdisk1
ibm1:/> mklvcopy -s s loglv00 2 hdisk10
0516-404 allocp: This system cannot fulfill the allocation request.
There are not enough free partitions or not enough physical volumes
to keep strictness and satisfy allocation requests. The command
should be retried with different allocation characteristics.
ibm1:/>
ibm1:/>
ibm1:/> lslv loglv00
LOGICAL VOLUME: loglv00 VOLUME GROUP: usr1vg
LV IDENTIFIER: 00015051a5299fdf.1 PERMISSION: read/write
VG STATE: active/complete LV STATE: opened/syncd
TYPE: jfslog WRITE VERIFY: off
MAX LPs: 512 PP SIZE: 16 megabyte(s)
COPIES: 1 SCHED POLICY: parallel
LPs: 16 PPs: 16
STALE PPs: 0 BB POLICY: relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 32
MOUNT POINT: N/A LABEL: None
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
ibm1:/>
ibm1:/> lslv -l loglv00
loglv00:N/A
PV COPIES IN BAND DISTRIBUTION
hdisk1 016:000:000 0% 016:000:000:000:000
ibm1:/>
I'm not really sure what's going on here.
Anyway, since we're going to specify the target disks individually we can kind of "mimic" superstrictness without explicitly requesting it.
So let's try this:
mklvcopy loglv00 2 hdisk10
ASKER
That returned without error.
SOLUTION
ASKER
Thanks!! No need to increase size later! :) We are half moved off and should be getting the other half of our stuff moved off by January.... can't wait to be off of this ancient thing!!
BTW -- are there any checks I can do before the reboot to make sure it will come back up OK? I am paranoid about reboots since We had a big problem a long time ago when it got stuck during boot and IBM support couldn't even help us... (this was back when they still supported 5.1 too)) We ended up having to restore from a tape.
SOLUTION
ASKER
THANKS! this all seems to look good.
Should I run the same checks for rootvg?
ibm1:/> mklvcopy usr1 2 hdisk7 hdisk8 hdisk9 hdisk10 hdisk11
ibm1:/> syncvg -v usr1vg
ibm1:/> chvg -Q n usr1vg
ibm1:/> savebase
ibm1:/>
ibm1:/> lsvg usr1vg
VOLUME GROUP: usr1vg VG IDENTIFIER: 00015051a5299fdf
VG STATE: active PP SIZE: 16 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 8672 (138752 megabytes)
MAX LVs: 256 FREE PPs: 754 (12064 megabytes)
LVs: 2 USED PPs: 7918 (126688 megabytes)
OPEN LVs: 2 QUORUM: 1
TOTAL PVs: 10 VG DESCRIPTORS: 10
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 10 AUTO ON: yes
MAX PPs per PV: 2032 MAX PVs: 16
LTG size: 128 kilobyte(s) AUTO SYNC: no
HOT SPARE: no
ibm1:/>
ibm1:/> lsvg -p usr1vg
usr1vg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk2 active 542 0 00..00..00..00..00
hdisk8 active 542 377 109..00..51..108..109
hdisk1 active 542 257 93..00..07..108..49
hdisk9 active 1084 0 00..00..00..00..00
hdisk3 active 1084 0 00..00..00..00..00
hdisk7 active 542 0 00..00..00..00..00
hdisk4 active 1084 0 00..00..00..00..00
hdisk5 active 1084 120 00..120..00..00..00
hdisk10 active 1084 0 00..00..00..00..00
hdisk11 active 1084 0 00..00..00..00..00
ibm1:/>
ibm1:/> lsvg -l usr1vg
usr1vg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
loglv00 jfslog 16 32 2 open/syncd N/A
usr1 jfs 3943 7886 10 open/syncd /usr1
ibm1:/>
ibm1:/>
ibm1:/> synclvodm usr1vg
ibm1:/>
ibm1:/>
ibm1:/> varyonvg usr1vg
ibm1:/>
Perfect, congrats!
Run these checks for rootvg if you wish. I'm rather sure that you'll find no trouble there, because rootvg consists of just 2 disks cleanly mirrored, as it seems.
But please have a look at errpt!
If you're indeed planning to reboot, recreate the boot records beforehand and set up the bootlist - just to be sure.
bosboot -ad hdisk0
bosboot -ad hdisk6
bootlist -m normal hdisk0 hdisk6
savebase
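As an extra pre-reboot sanity check, you can display what you just set; bootlist's "-o" flag prints the current list for the given mode, and "lslv -m" shows where a logical volume's partitions live (hd5 is the conventional rootvg boot LV name, assumed here). A dry-run sketch that only echoes the commands:

```shell
# Pre-reboot verification plan; run the printed commands on the AIX host.
plan_checks() {
    echo "bootlist -m normal -o"   # display the normal-mode boot list (expect hdisk0, hdisk6)
    echo "lslv -m hd5"             # show placement of the boot logical volume
}
plan_checks
```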
ASKER
Only weird thing on rootvg...
ibm1:/> varyonvg rootvg
PV Status: hdisk6 00011784d15410dc PVACTIVE
hdisk0 00015051814ca2c5 PVACTIVE
0516-1437 varyonvg: Varyonvg should not be used to force open or relock the drives of the volume group containing a dump device.
errpt looks good.
ibm1:/> bosboot -ad hdisk0
bosboot: Boot image is 13884 512 byte blocks.
ibm1:/> bosboot -ad hdisk6
bosboot: Boot image is 13884 512 byte blocks.
ibm1:/> bootlist -m normal hdisk0 hdisk6
ibm1:/> savebase
So the rebooting is to make sure quorum checking is on? How important is that? Should I reboot within a week? A month? A couple of months?
The rootvg message is irrelevant. AIX 5.3 and later don't issue it anymore. You can ignore it.
As for the quorum:
Every disk of a VG contains at least one VGDA (Volume Group Descriptor Area).
A 1-disk VG has 2 VGDAs, a 2-disk VG has three (2 on first disk, 1 on second),
VGs with 3 disks and up have one VGDA per hdisk.
Quorum checking means that more than 50% of the VGDAs must be available to keep the VG running, with 50% or less available VGDAs the VG will be forcibly varied off.
Now to your case: You have 10 disks in your VG, 5 containing original partitions, the other 5 containing the copies.
With quorum checking on, the loss of 5 disks will make the VG go down, despite the fact that all data might still be available if only "copy" disks or only "original" disks are lost.
This can e.g. happen when a SCSI adapter fails.
That's why we usually turn off quorum checking. Without this checking 1 VGDA is sufficient to keep the VG running.
I can't estimate the probability of a SCSI adapter failure in your machine.
It's up to you to decide how important this machine is and how reliable your hardware might be. But in any case, there is no reason to rush.
How about avoiding the reboot by following the instructions I gave you in comment #38252955?
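The "QUORUM: 6" value in the earlier lsvg output is exactly this majority rule: one VGDA per disk on a 10-disk VG, and strictly more than half of them must remain readable. The arithmetic as a one-liner:

```shell
# Quorum threshold = strict majority of VGDAs (one VGDA per disk here).
vgdas=10
quorum=$(( vgdas / 2 + 1 ))
echo "$quorum"   # prints 6, matching the 'QUORUM: 6' seen in lsvg
```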
ASKER
I don't know how reliable the hardware is.... (In the last three years we have just had one tape drive and one disk drive fail... but that is no indication of the future I suppose)
I can say that even if we were completely without any mirroring -- we would almost definitely still want to be able to run production while we waited for replacement parts.
Assuming that the only risk is that we lose the data that is not backed up if a drive fails (i.e. there is no risk of making the whole system inoperable in a way that, once the hardware was all back up and running, we could not just restore from tape and go),
it sounds like, if I understand you correctly, we might not want quorum checking on.... Does that sound right??
p.s. I plan to check errpt weekly for warnings
You definitely don't want quorum checking on, you understood correctly.
All this has nothing to do with the ability to restore a failed volume group from tape.
You already turned off quorum checking in the ODM, so even if the VG goes down you will be able to bring it up again, the new setting being in effect then.
But please, why don't you just take down your application for a minute, umount /usr1, vary the VG off and on, mount /usr1 and start the application again?
Left aside the time your application takes to stop and start, this is a matter of less than one minute.
All this has nothing to do with the ability to restore a failed volume group from tape.
You already turned off quorum checking in the ODM, so even if the VG goes down you will be able to bring it up again, the new setting being in effect then.
But please, why don't you just take down your application for a minute, umount /usr1, vary the VG off and on, mount /usr1 and start the application again?
Left aside the time your application takes to stop and start, this is a matter of less than one minute.
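The minute-long cycle described above, sketched as a dry run (echoes only; on the real host you would run the printed commands, with the application stopped first):

```shell
# Offline/online cycle to make the ODM quorum setting take effect.
plan_cycle() {
    vg=$1; fs=$2
    echo "umount $fs"       # application must be stopped before this
    echo "varyoffvg $vg"    # deactivate the VG
    echo "varyonvg $vg"     # reactivate; LVM rereads the ODM setting
    echo "mount $fs"        # bring the filesystem back
}
plan_cycle usr1vg /usr1
```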
ASKER
I am just crazy paranoid... Our entire business would be unbelievably screwed if anything happened that prevented this machine from running production - even for just a couple days.
You are absolutely awesome!! But you are basically our ONLY software support for this thing... Which despite your awesomeness is a bit scary :)
If there is a reason that it is important to do this -- then I guess we will do it...
But it sounded as if the reason for doing it was turning quorum checking on, which we don't even want anyway... is that right? Is there a need for doing the varyoff/on?
No, just the other way around - the reason for doing it is turning quorum checking off, and that's what you want in order to take precautions against the loss of a whole SCSI adapter (not against the loss of a disk or two behind a single adapter - your system will survive that even in its current state).
varyoff/on is needed to make the setting effective which you already configured in the ODM by means of "chvg -Q n usr1vg".
varyoff/on will make LVM read the new setting from ODM and apply it to the VG.
ASKER
I see. I have it all backwards.
So I guess I will try to vary off/on soon....
If I run into any problems would you by chance be around Sunday around 11 PM?
p.s. just curious what does ODM stand for?
Object Data Manager. It's the internal configuration database on AIX, similar to the Windows Registry.
Well, I don't know what timezone you're in.
I'm in Europe here, so talking CUT/UTC I will be available until around 11 PM on Sunday, but not much longer. I'll be back around 8 AM CUT/UTC on Monday.
ASKER
I am eastern US time. I guess I will try tonight and post here if there are any problems... Hopefully you can reply early :)
ASKER
I tried to shut everything down but I got
ibm1:/> umount /usr1
umount: 0506-349 Cannot unmount /dev/usr1: The requested resource is busy.
So I rebooted -- things seem to be fine. Any way I can check to make sure quorum checking is turned off and all is good "under the hood"?
SOLUTION
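The hidden check presumably boils down to reading the QUORUM field from "lsvg usr1vg" (the asker confirms seeing "QUORUM: 1" below; 1 means a single VGDA keeps the VG online, i.e. quorum checking is off). A parsing sketch; the sample line mimics lsvg's layout:

```shell
# Extract the QUORUM value from 'lsvg <vg>' output read on stdin.
quorum_of() {
    sed -n 's/.*QUORUM:[ ]*\([0-9][0-9]*\).*/\1/p'
}
# Sample usage: with quorum checking disabled, lsvg reports QUORUM: 1.
echo "OPEN LVs:           2             QUORUM:         1" | quorum_of
```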
ASKER
Sweet! I see "QUORUM: 1"
Thanks a million!!
Are you able to add a new disk before actually removing the old one?
This is generally no problem with SAN disks.
If you are, just
- add a new disk
- run "cfgmgr"
- find out the name of the new disk (I'll call it hdiskx below)
- run "replacepv hdisk4 hdiskx"
When finished you can safely remove hdisk4, because it's empty now and no longer part of usr1vg.
Now run "rmdev -dl hdisk4" and that's all.
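The steps above, collected as a dry-run sketch ("hdiskx" is a placeholder for whatever name cfgmgr assigns to the new drive; the function only prints the commands):

```shell
# Dry-run plan for replacing a disk when a spare slot is available.
plan_replace() {
    old=$1; new=$2
    echo "cfgmgr"                  # discover the newly inserted disk
    echo "replacepv $old $new"     # migrate all partitions from $old to $new
    echo "rmdev -dl $old"          # remove the emptied disk's device entry
}
plan_replace hdisk4 hdiskx
```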
If there is no possibility to make an additional disk available, we must know some more details. Please post the output of:
lspv -l hdisk4
lsvg -p usr1vg
lslv -l usr1
wmp