rbtt asked:

Out of Memory Error - Kill PIDs

My architecture is an NFS Linux cluster (Linux Cluster Suite 4) with an active/passive node pair.
During our backups to the NFS servers, we are receiving an out-of-memory error. From the logs below, is it possible to determine the real cause of the low memory?


Oct 15 07:50:53 ttecprodnfs2 su(pam_unix)[3371]: session closed for user root
Oct 15 07:50:55 ttecprodnfs2 sshd(pam_unix)[3344]: session closed for user browna
Oct 15 08:39:22 ttecprodnfs2 ntpd[18984]: time reset -0.132295 s
Oct 15 08:43:39 ttecprodnfs2 ntpd[18984]: synchronized to LOCAL(0), stratum 10
Oct 15 08:43:40 ttecprodnfs2 ntpd[18984]: synchronized to 10.33.2.35, stratum 11
Oct 15 08:44:43 ttecprodnfs2 ntpd[18984]: synchronized to 64.202.112.75, stratum 1
Oct 15 08:58:45 ttecprodnfs2 ntpd[18984]: time reset +0.168468 s
Oct 15 09:03:02 ttecprodnfs2 ntpd[18984]: synchronized to LOCAL(0), stratum 10
Oct 15 09:04:07 ttecprodnfs2 ntpd[18984]: synchronized to 65.111.164.223, stratum 2
Oct 15 09:04:08 ttecprodnfs2 ntpd[18984]: synchronized to 64.202.112.75, stratum 1
Oct 15 15:39:03 ttecprodnfs2 ntpd[18984]: synchronized to 65.111.164.223, stratum 2
Oct 15 15:40:09 ttecprodnfs2 ntpd[18984]: synchronized to 64.202.112.75, stratum 1
Oct 15 15:40:09 ttecprodnfs2 ntpd[18984]: time reset -0.161100 s
Oct 15 15:44:28 ttecprodnfs2 ntpd[18984]: synchronized to LOCAL(0), stratum 10
Oct 15 15:45:31 ttecprodnfs2 ntpd[18984]: synchronized to 64.202.112.75, stratum 1
Oct 15 16:04:53 ttecprodnfs2 ntpd[18984]: time reset +0.305731 s
Oct 15 16:09:10 ttecprodnfs2 ntpd[18984]: synchronized to LOCAL(0), stratum 10
Oct 15 16:10:08 ttecprodnfs2 ntpd[18984]: synchronized to 10.33.2.35, stratum 11
Oct 15 16:10:15 ttecprodnfs2 ntpd[18984]: synchronized to 65.111.164.223, stratum 2
Oct 15 16:11:11 ttecprodnfs2 ntpd[18984]: synchronized to 64.202.112.75, stratum 1
Oct 15 20:32:06 ttecprodnfs2 kernel: oom-killer: gfp_mask=0xd0
Oct 15 20:32:06 ttecprodnfs2 kernel: Mem-info:
Oct 15 20:32:06 ttecprodnfs2 kernel: DMA per-cpu:
Oct 15 20:32:06 ttecprodnfs2 kernel: cpu 0 hot: low 2, high 6, batch 1
Oct 15 20:32:06 ttecprodnfs2 kernel: cpu 0 cold: low 0, high 2, batch 1
Oct 15 20:32:06 ttecprodnfs2 kernel: cpu 1 hot: low 2, high 6, batch 1
Oct 15 20:32:06 ttecprodnfs2 kernel: cpu 1 cold: low 0, high 2, batch 1
Oct 15 20:32:06 ttecprodnfs2 kernel: cpu 2 hot: low 2, high 6, batch 1
Oct 15 20:32:06 ttecprodnfs2 kernel: cpu 2 cold: low 0, high 2, batch 1
Oct 15 20:32:06 ttecprodnfs2 kernel: cpu 3 hot: low 2, high 6, batch 1
Oct 15 20:32:06 ttecprodnfs2 kernel: cpu 3 cold: low 0, high 2, batch 1
Oct 15 20:32:06 ttecprodnfs2 kernel: Normal per-cpu:
Oct 15 20:32:06 ttecprodnfs2 kernel: cpu 0 hot: low 32, high 96, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 0 cold: low 0, high 32, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 1 hot: low 32, high 96, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 1 cold: low 0, high 32, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 2 hot: low 32, high 96, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 2 cold: low 0, high 32, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 3 hot: low 32, high 96, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 3 cold: low 0, high 32, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: HighMem per-cpu:
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 0 hot: low 32, high 96, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 0 cold: low 0, high 32, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 1 hot: low 32, high 96, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 1 cold: low 0, high 32, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 2 hot: low 32, high 96, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 2 cold: low 0, high 32, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 3 hot: low 32, high 96, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel: cpu 3 cold: low 0, high 32, batch 16
Oct 15 20:32:07 ttecprodnfs2 kernel:
Oct 15 20:32:07 ttecprodnfs2 kernel: Free pages:       14812kB (1600kB HighMem)
Oct 15 20:32:07 ttecprodnfs2 kernel: Active:22725 inactive:1351664 dirty:94129 writeback:329 unstable:0 free:3703 slab:171895 mapped:21877 pagetables:614
Oct 15 20:32:07 ttecprodnfs2 kernel: DMA free:12556kB min:16kB low:32kB high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:3593200 all_unreclaimable? yes
Oct 15 20:32:08 ttecprodnfs2 kernel: protections[]: 0 0 0
Oct 15 20:32:08 ttecprodnfs2 kernel: Normal free:656kB min:928kB low:1856kB high:2784kB active:1188kB inactive:135532kB present:901120kB pages_scanned:246543 all_unreclaimable? yes
Oct 15 20:32:08 ttecprodnfs2 kernel: protections[]: 0 0 0
Oct 15 20:32:08 ttecprodnfs2 kernel: HighMem free:1600kB min:512kB low:1024kB high:1536kB active:89888kB inactive:5270820kB present:6160384kB pages_scanned:0 all_unreclaimable? no
Oct 15 20:32:08 ttecprodnfs2 kernel: protections[]: 0 0 0
Oct 15 20:32:08 ttecprodnfs2 kernel: DMA: 3*4kB 4*8kB 4*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12556kB
Oct 15 20:32:08 ttecprodnfs2 kernel: Normal: 0*4kB 36*8kB 21*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 656kB
Oct 15 20:32:08 ttecprodnfs2 kernel: HighMem: 6*4kB 9*8kB 6*16kB 6*32kB 11*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1600kB
Oct 15 20:32:08 ttecprodnfs2 kernel: Swap cache: add 24201, delete 23569, find 9398/11696, race 0+0
Oct 15 20:32:08 ttecprodnfs2 kernel: 0 bounce buffer pages
Oct 15 20:32:08 ttecprodnfs2 kernel: Free swap:       4186800kB
Oct 15 20:32:08 ttecprodnfs2 kernel: 1769472 pages of RAM
Oct 15 20:32:08 ttecprodnfs2 kernel: 1343420 pages of HIGHMEM
Oct 15 20:32:08 ttecprodnfs2 kernel: 211992 reserved pages
Oct 15 20:32:08 ttecprodnfs2 kernel: 1359023 pages shared
Oct 15 20:32:08 ttecprodnfs2 kernel: 633 pages swap cached
Oct 15 20:32:08 ttecprodnfs2 kernel: Out of Memory: Killed process 25436 (klzagent).
Oct 15 23:28:42 ttecprodnfs2 kernel: oom-killer: gfp_mask=0xd0
Oct 15 23:28:42 ttecprodnfs2 kernel: Mem-info:
Oct 15 23:28:42 ttecprodnfs2 kernel: DMA per-cpu:
Oct 15 23:28:42 ttecprodnfs2 kernel: cpu 0 hot: low 2, high 6, batch 1
Oct 15 23:28:42 ttecprodnfs2 kernel: cpu 0 cold: low 0, high 2, batch 1
Oct 15 23:28:42 ttecprodnfs2 kernel: cpu 1 hot: low 2, high 6, batch 1
Oct 15 23:28:42 ttecprodnfs2 kernel: cpu 1 cold: low 0, high 2, batch 1
Oct 15 23:28:42 ttecprodnfs2 kernel: cpu 2 hot: low 2, high 6, batch 1
Oct 15 23:28:42 ttecprodnfs2 kernel: cpu 2 cold: low 0, high 2, batch 1
Oct 15 23:28:42 ttecprodnfs2 kernel: cpu 3 hot: low 2, high 6, batch 1
Oct 15 23:28:42 ttecprodnfs2 kernel: cpu 3 cold: low 0, high 2, batch 1
Oct 15 23:28:42 ttecprodnfs2 kernel: Normal per-cpu:
Oct 15 23:28:42 ttecprodnfs2 kernel: cpu 0 hot: low 32, high 96, batch 16
Oct 15 23:28:43 ttecprodnfs2 clurgmgrd[31568]: <notice> Shutting down
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 0 cold: low 0, high 32, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 1 hot: low 32, high 96, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 1 cold: low 0, high 32, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 2 hot: low 32, high 96, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 2 cold: low 0, high 32, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 3 hot: low 32, high 96, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 3 cold: low 0, high 32, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: HighMem per-cpu:
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 0 hot: low 32, high 96, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 0 cold: low 0, high 32, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 1 hot: low 32, high 96, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 1 cold: low 0, high 32, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 2 hot: low 32, high 96, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 2 cold: low 0, high 32, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 3 hot: low 32, high 96, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel: cpu 3 cold: low 0, high 32, batch 16
Oct 15 23:28:43 ttecprodnfs2 kernel:
Oct 15 23:28:43 ttecprodnfs2 kernel: Free pages:       16340kB (3136kB HighMem)
Oct 15 23:28:43 ttecprodnfs2 kernel: Active:19425 inactive:1353469 dirty:105492 writeback:0 unstable:0 free:4085 slab:173041 mapped:19387 pagetables:579
Oct 15 23:28:43 ttecprodnfs2 kernel: DMA free:12556kB min:16kB low:32kB high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:3603829 all_unreclaimable? yes
Oct 15 23:28:43 ttecprodnfs2 kernel: protections[]: 0 0 0
Oct 15 23:28:43 ttecprodnfs2 kernel: Normal free:648kB min:928kB low:1856kB high:2784kB active:2328kB inactive:130068kB present:901120kB pages_scanned:250272 all_unreclaimable? yes
Oct 15 23:28:43 ttecprodnfs2 kernel: protections[]: 0 0 0
Oct 15 23:28:43 ttecprodnfs2 kernel: HighMem free:3136kB min:512kB low:1024kB high:1536kB active:75372kB inactive:5283808kB present:6160384kB pages_scanned:0 all_unreclaimable? no
Oct 15 23:28:43 ttecprodnfs2 clurgmgrd[31568]: <notice> Stopping service NFS
Oct 15 23:28:43 ttecprodnfs2 kernel: protections[]: 0 0 0
Oct 15 23:28:43 ttecprodnfs2 kernel: DMA: 3*4kB 4*8kB 4*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12556kB
Oct 15 23:28:43 ttecprodnfs2 kernel: Normal: 0*4kB 1*8kB 34*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 648kB
Oct 15 23:28:43 ttecprodnfs2 kernel: HighMem: 34*4kB 21*8kB 15*16kB 33*32kB 16*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 3136kB
Oct 15 23:28:43 ttecprodnfs2 kernel: Swap cache: add 24327, delete 23701, find 9531/11840, race 0+0
Oct 15 23:28:43 ttecprodnfs2 kernel: 0 bounce buffer pages
Oct 15 23:28:43 ttecprodnfs2 kernel: Free swap:       4186820kB
Oct 15 23:28:43 ttecprodnfs2 kernel: 1769472 pages of RAM
Oct 15 23:28:43 ttecprodnfs2 kernel: 1343420 pages of HIGHMEM
Oct 15 23:28:43 ttecprodnfs2 kernel: 211992 reserved pages
Oct 15 23:28:44 ttecprodnfs2 kernel: 1358536 pages shared
Oct 15 23:28:44 ttecprodnfs2 kernel: 627 pages swap cached
Oct 15 23:28:44 ttecprodnfs2 kernel: Out of Memory: Killed process 31568 (clurgmgrd).
Oct 15 23:28:44 ttecprodnfs2 clurgmgrd: [31568]: <info> Removing IPv4 address 10.33.2.37 from bond0.102
Oct 15 23:28:54 ttecprodnfs2 clurgmgrd: [31568]: <info> Removing export: *:/T24
Oct 15 23:28:55 ttecprodnfs2 last message repeated 10 times
Oct 15 23:28:55 ttecprodnfs2 clurgmgrd: [31568]: <info> unmounting /T24
Oct 15 23:28:56 ttecprodnfs2 clurgmgrd: [31568]: <notice> Forcefully unmounting /T24
Oct 15 23:28:56 ttecprodnfs2 clurgmgrd: [31568]: <warning> Dropping node-wide NFS locks
Oct 15 23:29:06 ttecprodnfs2 clurgmgrd: [31568]: <info> unmounting /T24
Oct 15 23:29:06 ttecprodnfs2 clurgmgrd: [31568]: <notice> Forcefully unmounting /T24
Oct 15 23:29:07 ttecprodnfs2 clurgmgrd: [31568]: <info> Sending reclaim notifications via ttecprodnfs2
Oct 15 23:29:07 ttecprodnfs2 rpc.statd[24171]: Version 1.0.6 Starting
Oct 15 23:29:07 ttecprodnfs2 rpc.statd[24171]: Flags: No-Daemon Notify-Only
Oct 15 23:29:07 ttecprodnfs2 rpc.statd[24171]: statd running as root. chown /tmp/statd-ttecprodnfs2.23993/sm to choose different user
Oct 15 23:29:10 ttecprodnfs2 rpc.statd[24171]: Caught signal 15, un-registering and exiting.
Oct 15 23:29:10 ttecprodnfs2 clurgmgrd: [31568]: <err> 'umount /T24' failed, error=0
Oct 15 23:29:10 ttecprodnfs2 clurgmgrd[31568]: <notice> stop on fs "T24" returned 2 (invalid argument(s))
Oct 15 23:29:10 ttecprodnfs2 clurgmgrd[31568]: <crit> #12: RG NFS failed to stop; intervention required
Oct 15 23:29:10 ttecprodnfs2 clurgmgrd[31568]: <notice> Service NFS is failed
Oct 15 23:29:32 ttecprodnfs2 clurgmgrd[31568]: <notice> Shutdown complete, exiting
Oct 16 00:10:19 ttecprodnfs2 sshd(pam_unix)[24196]: session opened for user thomasr by (uid=0)
Oct 16 00:10:22 ttecprodnfs2 su(pam_unix)[24223]: session opened for user root by thomasr(uid=0)
Oct 16 00:22:37 ttecprodnfs2 clurgmgrd[24316]: <notice> Resource Group Manager Starting
Oct 16 00:22:37 ttecprodnfs2 clurgmgrd[24316]: <info> Loading Service Data
Oct 16 00:22:37 ttecprodnfs2 rgmanager: clurgmgrd startup succeeded
Oct 16 00:22:37 ttecprodnfs2 clurgmgrd[24316]: <info> Initializing Services
Oct 16 00:22:37 ttecprodnfs2 clurgmgrd: [24316]: <info> Removing export: *:/T24
Oct 16 00:22:37 ttecprodnfs2 last message repeated 10 times
Oct 16 00:22:38 ttecprodnfs2 clurgmgrd: [24316]: <info> unmounting /T24
Oct 16 00:22:38 ttecprodnfs2 clurgmgrd: [24316]: <info> Removing export: *:/SIG
Oct 16 00:22:38 ttecprodnfs2 last message repeated 6 times
Oct 16 00:22:38 ttecprodnfs2 clurgmgrd: [24316]: <info> unmounting /SIG
Oct 16 00:22:38 ttecprodnfs2 clurgmgrd: [24316]: <info> Removing export: *:/backup
Oct 16 00:22:38 ttecprodnfs2 clurgmgrd: [24316]: <info> Removing export: *:/backup
Oct 16 00:22:39 ttecprodnfs2 clurgmgrd: [24316]: <info> unmounting /backup
Oct 16 00:22:40 ttecprodnfs2 clurgmgrd: [24316]: <info> Removing export: *:/DataBridge
Oct 16 00:22:40 ttecprodnfs2 last message repeated 5 times
Oct 16 00:22:41 ttecprodnfs2 clurgmgrd: [24316]: <info> unmounting /DataBridge
Oct 16 00:22:41 ttecprodnfs2 clurgmgrd[24316]: <info> Services Initialized
Oct 16 00:22:41 ttecprodnfs2 clurgmgrd[24316]: <info> Logged in SG "usrm::manager"
Oct 16 00:22:41 ttecprodnfs2 clurgmgrd[24316]: <info> Magma Event: Membership Change
Oct 16 00:22:41 ttecprodnfs2 clurgmgrd[24316]: <info> State change: Local UP
Oct 16 00:22:41 ttecprodnfs2 clurgmgrd[24316]: <info> State change: ttecprodnfs1 UP
Oct 16 00:48:08 ttecprodnfs2 shutdown: shutting down for system reboot
Oct 16 00:48:09 ttecprodnfs2 init: Switching to runlevel: 6
Oct 16 00:48:10 ttecprodnfs2 rgmanager: [25401]: <notice> Shutting down Cluster Service Manager...
Oct 16 00:48:10 ttecprodnfs2 clurgmgrd[24316]: <notice> Shutting down
Oct 16 00:48:10 ttecprodnfs2 ccsd[31530]: Unable to write package back to sender: Broken pipe
Oct 16 00:48:10 ttecprodnfs2 ccsd[31530]: Error while processing request: Operation not permitted
Oct 16 00:48:12 ttecprodnfs2 clurgmgrd[24316]: <notice> Shutdown complete, exiting
Oct 16 00:48:12 ttecprodnfs2 rgmanager: [25401]: <notice> Cluster Service Manager is stopped.
Oct 16 00:48:12 ttecprodnfs2 haldaemon: haldaemon -TERM succeeded
Oct 16 00:48:12 ttecprodnfs2 messagebus: messagebus -TERM succeeded
Oct 16 00:48:12 ttecprodnfs2 psacct: Shutting down process accounting:  succeeded
Oct 16 00:48:12 ttecprodnfs2 mountd[4592]: Caught signal 15, un-registering and exiting.
Oct 16 00:48:12 ttecprodnfs2 nfs: rpc.mountd shutdown succeeded
Oct 16 00:48:16 ttecprodnfs2 kernel: lockd: couldn't shutdown host module!
Oct 16 00:48:16 ttecprodnfs2 kernel: nfsd: last server has exited
Oct 16 00:48:16 ttecprodnfs2 kernel: nfsd: unexporting all filesystems
Oct 16 00:48:17 ttecprodnfs2 nfs: nfsd shutdown succeeded
Oct 16 00:48:17 ttecprodnfs2 nfs: rpc.rquotad shutdown succeeded
Oct 16 00:48:17 ttecprodnfs2 nfs: Shutting down NFS services:  succeeded
Oct 16 00:48:17 ttecprodnfs2 sshd: sshd -TERM succeeded
Oct 16 00:48:17 ttecprodnfs2 acpid: acpid shutdown succeeded
Oct 16 00:48:17 ttecprodnfs2 crond: crond shutdown succeeded
Oct 16 00:48:17 ttecprodnfs2 ntpd[18984]: ntpd exiting on signal 15
Oct 16 00:48:17 ttecprodnfs2 ntpd: ntpd shutdown succeeded
Oct 16 00:48:17 ttecprodnfs2 fenced: Stopping fence domain:
Oct 16 00:48:17 ttecprodnfs2 fenced: shutdown succeeded
Oct 16 00:48:17 ttecprodnfs2 fenced:
Oct 16 00:48:17 ttecprodnfs2 fenced:
Oct 16 00:48:17 ttecprodnfs2 rc: Stopping fenced:  succeeded
Oct 16 00:48:17 ttecprodnfs2 lock_gulmd: Stopping lock_gulmd:
Oct 16 00:48:17 ttecprodnfs2 lock_gulmd: shutdown succeeded
Oct 16 00:48:17 ttecprodnfs2 lock_gulmd:
Oct 16 00:48:17 ttecprodnfs2 lock_gulmd:
Oct 16 00:48:17 ttecprodnfs2 rc: Stopping lock_gulmd:  succeeded
Oct 16 00:48:17 ttecprodnfs2 cman: Stopping cman:
Oct 16 00:48:21 ttecprodnfs2 cman: failed to stop cman failed
Oct 16 00:48:21 ttecprodnfs2 cman:
Oct 16 00:48:21 ttecprodnfs2 cman:
Oct 16 00:48:21 ttecprodnfs2 rc: Stopping cman:  failed
Oct 16 00:48:21 ttecprodnfs2 ccsd[31530]: Stopping ccsd, SIGTERM received.
Oct 16 00:48:22 ttecprodnfs2 ccsd: shutdown succeeded
Oct 16 00:48:22 ttecprodnfs2 rpc.statd[3958]: Caught signal 15, un-registering and exiting.
Oct 16 00:48:22 ttecprodnfs2 nfslock: rpc.statd shutdown succeeded
Oct 16 00:48:22 ttecprodnfs2 auditd[3974]: The audit daemon is exiting.
Oct 16 00:48:22 ttecprodnfs2 kernel: audit(1224132502.355:26382): audit_pid=0 old=3974 by auid=4294967295
Oct 16 00:48:22 ttecprodnfs2 auditd: auditd shutdown succeeded
Oct 16 00:48:22 ttecprodnfs2 irqbalance: irqbalance shutdown succeeded
Oct 16 00:48:22 ttecprodnfs2 portmap: portmap shutdown succeeded
Oct 16 00:48:22 ttecprodnfs2 kernel: Kernel logging (proc) stopped.
Oct 16 00:48:22 ttecprodnfs2 kernel: Kernel log daemon terminating.
Oct 16 00:48:24 ttecprodnfs2 syslog: klogd shutdown succeeded
Oct 16 00:48:24 ttecprodnfs2 exiting on signal 15
Oct 16 00:50:54 ttecprodnfs2 syslogd 1.4.1: restart.
Oct 16 00:50:54 ttecprodnfs2 syslog: syslogd startup succeeded
Oct 16 00:50:54 ttecprodnfs2 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Oct 16 00:50:54 ttecprodnfs2 kernel: Linux version 2.6.9-42.ELsmp (bhcompile@hs20-bc1-1.build.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-2)) #1 SMP Wed Jul 12 23:27:17 EDT 2006
Oct 16 00:50:54 ttecprodnfs2 kernel: BIOS-provided physical RAM map:
Oct 16 00:50:54 ttecprodnfs2 kernel:  BIOS-e820: 0000000000000000 - 000000000009d000 (usable)
Oct 16 00:50:54 ttecprodnfs2 kernel:  BIOS-e820: 000000000009d000 - 00000000000a0000 (reserved)
Oct 16 00:50:54 ttecprodnfs2 kernel:  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Oct 16 00:50:54 ttecprodnfs2 kernel:  BIOS-e820: 0000000000100000 - 00000000cffbce80 (usable)
Oct 16 00:50:54 ttecprodnfs2 kernel:  BIOS-e820: 00000000cffbce80 - 00000000cffd0000 (ACPI data)
Oct 16 00:50:54 ttecprodnfs2 kernel:  BIOS-e820: 00000000cffd0000 - 00000000d0000000 (reserved)
Oct 16 00:50:54 ttecprodnfs2 kernel:  BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
Oct 16 00:50:54 ttecprodnfs2 kernel:  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
Oct 16 00:50:54 ttecprodnfs2 kernel:  BIOS-e820: 0000000100000000 - 00000001b0000000 (usable)
Oct 16 00:50:54 ttecprodnfs2 syslog: klogd startup succeeded
Oct 16 00:50:54 ttecprodnfs2 kernel: 6016MB HIGHMEM available.
Oct 16 00:50:54 ttecprodnfs2 kernel: 896MB LOWMEM available.
Oct 16 00:50:54 ttecprodnfs2 kernel: found SMP MP-table at 0009d140
Oct 16 00:50:55 ttecprodnfs2 irqbalance: irqbalance startup succeeded
Oct 16 00:50:55 ttecprodnfs2 kernel: Using x86 segment limits to approximate NX protection
Oct 16 00:50:55 ttecprodnfs2 portmap: portmap startup succeeded
Oct 16 00:50:55 ttecprodnfs2 kernel: DMI 2.3 present.
Oct 16 00:50:55 ttecprodnfs2 kernel: ServerWorks chipset detected. Disabling timer routing over 8254.
Oct 16 00:50:55 ttecprodnfs2 kernel: Using APIC driver default
Oct 16 00:50:55 ttecprodnfs2 rpc.statd[3589]: Version 1.0.6 Starting
Oct 16 00:50:55 ttecprodnfs2 nfslock: rpc.statd startup succeeded
Oct 16 00:50:55 ttecprodnfs2 rpc.statd[3589]: statd running as root. chown /var/lib/nfs/statd/sm to choose different user
Oct 16 00:50:55 ttecprodnfs2 kernel: ACPI: PM-Timer IO Port: 0x588
Oct 16 00:50:55 ttecprodnfs2 kernel: ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Oct 16 00:50:55 ttecprodnfs2 auditd[3610]: Init complete, auditd 1.0.14 listening for events
Oct 16 00:50:55 ttecprodnfs2 auditd: auditd startup succeeded
Oct 16 00:50:55 ttecprodnfs2 kernel: Processor #0 6:15 APIC version 20
Oct 16 00:50:55 ttecprodnfs2 kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x06] enabled)
Oct 16 00:50:55 ttecprodnfs2 kernel: Processor #6 6:15 APIC version 20
Oct 16 00:50:55 ttecprodnfs2 kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Oct 16 00:50:55 ttecprodnfs2 kernel: Processor #1 6:15 APIC version 20
Oct 16 00:50:55 ttecprodnfs2 kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
Oct 16 00:50:55 ttecprodnfs2 kernel: Processor #7 6:15 APIC version 20
Oct 16 00:50:55 ttecprodnfs2 kernel: ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
Oct 16 00:50:55 ttecprodnfs2 kernel: ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
Oct 16 00:50:55 ttecprodnfs2 kernel: ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
Oct 16 00:50:55 ttecprodnfs2 kernel: ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
Oct 16 00:50:56 ttecprodnfs2 kernel: Enabling APIC mode:  Flat.  Using 0 I/O APICs
Oct 16 00:50:56 ttecprodnfs2 kernel: ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0])
Oct 16 00:50:56 ttecprodnfs2 kernel: IOAPIC[0]: apic_id 14, version 32, address 0xfec00000, GSI 0-23
Oct 16 00:50:56 ttecprodnfs2 kernel: ACPI: IOAPIC (id[0x0d] address[0xfec80000] gsi_base[24])
Oct 16 00:50:56 ttecprodnfs2 kernel: IOAPIC[1]: apic_id 13, version 32, address 0xfec80000, GSI 24-47
Oct 16 00:50:56 ttecprodnfs2 rpcidmapd: rpc.idmapd startup succeeded
Oct 16 00:50:56 ttecprodnfs2 kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
Oct 16 00:50:56 ttecprodnfs2 kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Oct 16 00:50:56 ttecprodnfs2 kernel: Using ACPI (MADT) for SMP configuration information
Oct 16 00:50:56 ttecprodnfs2 kernel: Allocating PCI resources starting at d1000000 (gap: d0000000:10000000)
Oct 16 00:50:56 ttecprodnfs2 kernel: Built 1 zonelists
Oct 16 00:50:56 ttecprodnfs2 kernel: Kernel command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet
Oct 16 00:50:56 ttecprodnfs2 kernel: Initializing CPU#0
Oct 16 00:50:56 ttecprodnfs2 kernel: CPU 0 irqstacks, hard=c03ee000 soft=c03ce000
Oct 16 00:50:56 ttecprodnfs2 kernel: PID hash table entries: 4096 (order: 12, 65536 bytes)
Oct 16 00:50:56 ttecprodnfs2 kernel: Detected 2000.969 MHz processor.
Oct 16 00:50:56 ttecprodnfs2 kernel: Using pmtmr for high-res timesource
Oct 16 00:50:56 ttecprodnfs2 kernel: Console: colour VGA+ 80x25
Oct 16 00:50:56 ttecprodnfs2 kernel: Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Oct 16 00:50:56 ttecprodnfs2 kernel: Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Oct 16 00:50:56 ttecprodnfs2 ccsd[3681]: Starting ccsd 1.0.7:
Oct 16 00:50:57 ttecprodnfs2 kernel: Memory: 6227796k/7077888k available (1876k kernel code, 62440k reserved, 759k data, 184k init, 5373680k highmem)
Oct 16 00:50:57 ttecprodnfs2 ccsd[3681]:  Built: Jun 22 2006 18:15:41
Oct 16 00:50:57 ttecprodnfs2 kernel: Calibrating delay using timer specific routine.. 4003.08 BogoMIPS (lpj=2001542)
Oct 16 00:50:57 ttecprodnfs2 ccsd[3681]:  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
Oct 16 00:50:57 ttecprodnfs2 kernel: Security Scaffold v1.0.0 initialized
Oct 16 00:50:57 ttecprodnfs2 kernel: SELinux:  Initializing.
Oct 16 00:50:57 ttecprodnfs2 kernel: SELinux:  Starting in permissive mode
Oct 15 20:50:43 ttecprodnfs2 rc.sysinit: -e
Oct 16 00:50:57 ttecprodnfs2 kernel: There is already a security framework initialized, register_security failed.
Oct 16 00:50:57 ttecprodnfs2 ccsd:  succeede
ASKER CERTIFIED SOLUTION
larsga

rbtt (ASKER):

Is there a known problem with 32-bit RHEL as far as memory management is concerned?

From what information is in the log and what I can find in RH's support database, the following might apply to your situation:

http://kbase.redhat.com/faq/FAQ_85_8968.shtm
http://kbase.redhat.com/faq/FAQ_43_8555.shtm
http://kbase.redhat.com/faq/FAQ_85_10725.shtm

The workarounds/fixes mentioned in those articles are (1) tuning a kernel variable (set /proc/sys/vm/lower_zone_protection to 100) to make reclamation of lowmem pages more aggressive and (2) changing to the hugemem kernel.
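
For workaround (1), here is a minimal Python sketch of setting the tunable and reading it back. The path and the value 100 come from the articles above; the function name `set_lower_zone_protection` is just something I made up for illustration. It assumes you run it as root on a kernel that actually exposes this file, and a value written this way does not survive a reboot.

```python
from pathlib import Path

# Path taken from the post; it only exists on kernels that expose this tunable.
LOWER_ZONE_PROTECTION = Path("/proc/sys/vm/lower_zone_protection")


def set_lower_zone_protection(value, path=LOWER_ZONE_PROTECTION):
    """Write the tunable (hypothetical helper) and return the value read back."""
    if value < 0:
        raise ValueError("value must be non-negative")
    path.write_text(f"{value}\n")
    # Read the file again so the caller can confirm what the kernel accepted.
    return int(path.read_text().split()[0])
```

Calling `set_lower_zone_protection(100)` as root should return 100 if the kernel accepted the write. To keep the setting across reboots you would normally add the matching `vm.lower_zone_protection` entry to /etc/sysctl.conf rather than rely on a script.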

In general, running a 32-bit kernel on a machine with 3GB+ of RAM can cause problems. The 0-4GB physical address space gets full (it is also used by PCI memory-mapped IO etc.), so some of the RAM has to be mapped above the 4GB limit and accessed through PAE. More importantly, a 32-bit kernel can map only a small part of the RAM directly: your boot log shows 896MB of "lowmem" against 6016MB of "highmem". This causes a split in memory: kernel data structures (and some DMA buffers) must live in lowmem and cannot use highmem. Thus you can get into a situation where you have lots of highmem available but run out of lowmem, which is what the "Normal" zone in your OOM reports shows (free 656kB, below its min of 928kB). It is analogous to the "DOS memory"/extended-/expanded-memory issues one had in the MS-DOS days.
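
The OOM reports in the question print one summary line per memory zone (DMA, Normal, HighMem) with `free` and `min` values. A quick way to see which zone is exhausted is to compare the two. The sketch below does that for the zone lines of the first report, with the wrapped lines rejoined and the syslog prefix dropped. The function name `zones_below_min` is made up for illustration, and the regular expression is only tested against this particular log format.

```python
import re

# Zone summary lines from the first OOM report in the question.
ZONE_LINES = """\
DMA free:12556kB min:16kB low:32kB high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:3593200 all_unreclaimable? yes
Normal free:656kB min:928kB low:1856kB high:2784kB active:1188kB inactive:135532kB present:901120kB pages_scanned:246543 all_unreclaimable? yes
HighMem free:1600kB min:512kB low:1024kB high:1536kB active:89888kB inactive:5270820kB present:6160384kB pages_scanned:0 all_unreclaimable? no
"""

ZONE_RE = re.compile(r"^(?P<zone>\w+) free:(?P<free>\d+)kB min:(?P<min>\d+)kB")


def zones_below_min(text):
    """Return the names of zones whose free memory is under the 'min' watermark."""
    exhausted = []
    for line in text.splitlines():
        m = ZONE_RE.match(line.strip())
        if m and int(m.group("free")) < int(m.group("min")):
            exhausted.append(m.group("zone"))
    return exhausted


print(zones_below_min(ZONE_LINES))  # prints ['Normal']
```

Only the Normal zone is below its minimum, while HighMem is above its own, which fits the picture of lowmem running out while highmem is still available.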

This 32-bit/3GB+ issue is not limited to RHEL or to Linux. It is a PC architecture limit. 32-bit versions of Windows have the same problem.
I've looked into this a bit more. I feel pretty certain that you are running out of lowmem. Changing to the 'hugemem' kernel should fix the issue, albeit at the cost of a little CPU overhead. That beats the alternative of being hit with OOM kills, though.
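
As a rough sanity check of the lowmem conclusion, here is some arithmetic on the counters from the first OOM report, written as a small Python sketch. Two things in it are my assumptions rather than anything from the log: the 4kB x86 page size, and the gfp flag bit values, which are what I recall from 2.6-era kernel headers. It also assumes slab memory is allocated from lowmem on a 32-bit kernel. The helper name `decode_gfp` is made up.

```python
PAGE_KB = 4  # assumed x86 page size in kB

# Counters copied from the first OOM report in the question.
slab_pages = 171895
dma_present_kb = 16384
normal_present_kb = 901120
gfp_mask = 0xD0

# Assumed flag values for 2.6-era kernels; not taken from the log itself.
GFP_FLAGS = {
    "__GFP_DMA": 0x01,
    "__GFP_HIGHMEM": 0x02,
    "__GFP_WAIT": 0x10,
    "__GFP_HIGH": 0x20,
    "__GFP_IO": 0x40,
    "__GFP_FS": 0x80,
}


def decode_gfp(mask):
    """Return the sorted names of the known flags set in a gfp mask."""
    return sorted(name for name, bit in GFP_FLAGS.items() if mask & bit)


lowmem_kb = dma_present_kb + normal_present_kb  # DMA + Normal zones
slab_kb = slab_pages * PAGE_KB
slab_share = 100 * slab_kb / lowmem_kb

print(decode_gfp(gfp_mask))  # no __GFP_HIGHMEM bit in the failing request
print(lowmem_kb // 1024, "MB lowmem")
print(round(slab_share, 1), "% of lowmem held by slab")
```

Under those assumptions, the failing allocation did not ask for highmem, the DMA and Normal zones add up to the 896MB of lowmem reported at boot, and the slab caches alone account for roughly three quarters of it.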