Dario Vercelli
asked on
AIX - GPFS slowness problem
Hello.
The customer is complaining about slow transactions. From the following output,
can you tell whether any configuration parameter needs to be changed?
[root@MdapEtlA01 /] $ mmdiag --stats
=== mmdiag: stats ===
Global resources:
OpenFile counts: total created 256020 (in use 256000, free 20)
using 552000K memory
cached 256000, currently open 4918+172, cache limit 256000 (min 10, max 256000), eff limit 256000
stats: steals 179591029 (clean 179531269, dirty 59760)
StatCache counts: total created 256007 (in use 256000, free 7)
using 78000K memory
cache limit 256000
stats: inserts 179766566 steals 157897245 hits 14489 expands 17496150 revokes 127 uses 11359975
OpenInstance counts: total created 25607 (in use 18865, free 6742)
using 10316K memory
BufferDesc counts: total created 118726 (in use 117677, free 1049)
using 11643K memory
cached 117677 cache limit 786432 pseudo 31894 prefetch 516
indBlockDesc counts: total created 206656 (in use 196663, free 9993)
using 7811K memory
cached 196663 cache limit 256000 pseudo 194377
My version of GPFS:
[root@MdapEtlA01 /] $ lslpp -L 'gpfs*'
Fileset Level State Type Description (Uninstaller)
-------------------------- ---------- ---------- ---------- ---------- ----------
gpfs.base 3.5.0.15 C F GPFS File Manager
gpfs.docs.data 3.5.0.4 C F GPFS Server Manpages and
Documentation
gpfs.gnr 3.5.0.11 C F GPFS Native RAID
gpfs.msg.en_US 3.5.0.13 C F GPFS Server Messages - U.S.
English
Hi,
What code release and maintenance level is installed?
mmfsadm dump version | grep Buil
Cheers
ASKER
Solution accepted
Whoever works on this will likely need root ssh access to every machine in the cluster to attempt a resolution.
Tip: The general approach to debugging GFS problems is simple, but it also takes a massive amount of time.
What you'll do is run tshark to watch packet flow + timestamps (to test for latency + large lags).
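A minimal sketch of that tshark step, assuming tshark is installed on the node. The function names, the interface, and the peer address are placeholders I made up, not anything from the poster's setup. It prints one line per packet with the relative time and the delta since the previous packet, then filters for gaps above a threshold.

```shell
# Hypothetical helper: capture traffic to/from one peer for a fixed time and
# print, per packet: relative time, delta to previous packet, source, destination.
# $1 = interface, $2 = peer address, $3 = capture duration in seconds.
capture_deltas() {
    tshark -i "$1" -a "duration:$3" -f "host $2" \
        -T fields -e frame.time_relative -e frame.time_delta -e ip.src -e ip.dst
}

# Keep only packets that arrived more than $1 seconds after the previous one.
# Reads the tab-separated lines produced by capture_deltas on stdin.
flag_gaps() {
    awk -F '\t' -v limit="$1" '$2 + 0 > limit + 0 { print }'
}
```

Usage would be something like `capture_deltas en0 10.0.0.12 60 | flag_gaps 0.5`, where both the interface and the address are made up. What counts as a "large" gap depends on the workload, so the threshold is a guess you will need to tune.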
Start by ensuring the underlying network never lags, using something like iperf3 to test all machines.
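As a rough sketch of the iperf3 pass, assuming `iperf3 -s` is already running on every target and that the hosts are listed one per line in a file (the file name and function name are hypothetical):

```shell
# Run an iperf3 client test against every host listed in a file.
# $1 = file with one hostname per line, $2 = seconds per test (default 10).
probe_hosts() {
    hosts_file=$1
    seconds=${2:-10}
    while read -r host; do
        # Skip blank lines in the host list.
        [ -n "$host" ] || continue
        echo "== $host =="
        # </dev/null stops the client from consuming the rest of the host list.
        iperf3 -c "$host" -t "$seconds" </dev/null || echo "FAILED: $host"
    done < "$hosts_file"
}
```

Call it as `probe_hosts hosts.txt 10` from each machine in turn. A single slow pair is easy to miss if you only test from one node.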
Then continue testing with GFS traffic of various file sizes.
During your GFS tests, you'll check both network latency + cache hits/misses to ensure GFS is caching as expected.
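For the cache side, here is a small sketch that reads output shaped like the `mmdiag --stats` paste in the question and reports, per resource, the "in use" count against the "cache limit". The function name is hypothetical, and the parsing assumes the exact line layout shown in the question. A resource sitting at its limit only hints that the cache may be too small; it does not prove that this is the cause of the slowness.

```shell
# Read "mmdiag --stats"-style text on stdin and print in-use vs. cache limit
# for every "<Name> counts:" section that also reports a "cache limit".
cache_fill() {
    awk '
    / counts: / {
        # Section header, e.g. "OpenFile counts: ... (in use 256000, free 20)"
        name = $1
        inuse = ""
        for (i = 1; i < NF; i++)
            if ($i == "use") { inuse = $(i + 1); sub(/,/, "", inuse) }
    }
    /cache limit/ && name != "" {
        for (i = 1; i + 2 <= NF; i++)
            if ($i == "cache" && $(i + 1) == "limit") {
                limit = $(i + 2)
                status = (inuse + 0 >= limit + 0) ? "AT LIMIT" : "ok"
                printf "%s in_use=%s limit=%s %s\n", name, inuse, limit, status
                name = ""   # report each section once
                break
            }
    }'
}
```

Run it as `mmdiag --stats | cache_fill`. Sampling it several times during a test run shows whether a cache stays pinned at its limit while the steal counters keep climbing.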
Tip: Many people fail to realize that working with massively large files on GFS will... technically work. In other words, remote machines can access the files, but if files are large, or even small with rapid changes, GFS may invalidate its own caching.
Meaning that if remotes would otherwise see invalid data, GFS will invalidate its cache + remotes will have to refetch file blocks, or sometimes entire files.
When using GFS, care must be taken to consider what type + format of files will be served over GFS.
For example, using GFS to expose many mail files is likely wrong. Better to use an IMAP4 server.
Another example: using GFS to expose large backups for people to access + restore at will is likely wrong. Better to use rsync.
So the other part of GFS debugging is to do an audit of which files are served through GFS.
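A quick way to start that audit, sketched with plain `find`. The function name, mount point, and size threshold are placeholders; it counts regular files under a tree and how many of them exceed a byte threshold.

```shell
# Summarise a directory tree: total regular files and how many are larger
# than a given number of bytes. $1 = directory, $2 = threshold in bytes.
# Note: counts are per output line, so file names containing newlines
# would be miscounted.
audit_tree() {
    dir=$1
    bytes=$2
    total=$(find "$dir" -type f | wc -l)
    # "-size +Nc" matches files larger than N bytes (POSIX find).
    large=$(find "$dir" -type f -size +"${bytes}c" | wc -l)
    # $((...)) strips the padding some wc implementations add.
    echo "total=$((total)) larger_than_${bytes}B=$((large))"
}
```

For instance, `audit_tree /gpfs/data 1073741824` would report how many files exceed 1 GiB (the path is made up). On a tree with millions of files the `find` itself generates heavy metadata traffic, so run it off-peak.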