amigan_99

Linux /var/tmp/sos directories are filling up disk on a regular basis. Help please.

I have a Linux system that is regularly getting filled up by these /var/tmp/sos directories.
They seem to be created every week and take up tens of gigabytes of space. Is this a normal part of
Linux operation? It's causing the disk to become critically full too often. Is there a way to automatically
delete the old sos directories?

Also - do you know if the directories systemd-private-* need to be kept and what generates them?

observium tmp]$ ls -l
total 24
-rw-r--r--. 1 root   root      0 Dec 27 20:15 need_home_synced
-rw-r--r--. 1 nagios nagios    6 Dec 28 09:16 retrans_state.txt
drwx------. 2 root   root   4096 Dec 28 04:13 sos.t92dy0
drwx------. 2 root   root   4096 Dec 21 03:35 sos.y5KVwk
drwx------. 3 root   root   4096 May 17  2018 systemd-private-04196687cdcc41d2be170a72a0dcb5fc-httpd.service-b7fs9F
drwx------. 3 root   root   4096 May 17  2018 systemd-private-04196687cdcc41d2be170a72a0dcb5fc-mariadb.service-HPqubD
drwx------. 3 root   root   4096 May 17  2018 systemd-private-04196687cdcc41d2be170a72a0dcb5fc-ntpd.service-A5CYzA

Linux observium.internal.acme.com 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Scott Silva

You have 2 issues.
The first one with /var/tmp/sos filling up is from what I believe is a program called sosreport. But it shouldn't be running automatically since it is a tool to collect reports for forwarding to a support entity. You should be able to delete older runs.

The second part is the systemd-private-* directories. systemd creates them for services that run with PrivateTmp=true, which gives each such service (here httpd, mariadb, and ntpd) its own private /tmp and /var/tmp.

This might give you a clue on the second...   https://support.plesk.com/hc/en-us/articles/115000063849-Directories-like-tmp-systemd-private-overflow-cause-server-crash-due-to-lack-of-disk-space

For the first, you could create a cron job that clears out /var/tmp/sos.* directories older than, say, a week.
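
As a rough sketch of that cleanup (not a tested production script), something like the following could work. The function name clean_old_sos and its two parameters are made up for illustration, and it assumes GNU find, which is what CentOS/RHEL 7 ships. Nothing is deleted until the function is actually called.

```shell
#!/bin/sh
# Hypothetical helper: remove sos.* work directories older than N days.
# Both the directory and the age are parameters, so it can be tried on a
# scratch directory before being pointed at /var/tmp.
clean_old_sos() {
    dir="${1:-/var/tmp}"   # where the sos.* directories live
    days="${2:-7}"         # age threshold in days

    # -mindepth 1 -maxdepth 1: only direct children of "$dir"
    # -type d -name 'sos.*':   only directories named like sos.t92dy0
    # -mtime +N:               last modified more than N whole days ago
    find "$dir" -mindepth 1 -maxdepth 1 -type d -name 'sos.*' \
        -mtime +"$days" -exec rm -rf -- {} +
}
```

If this were saved as a script in /etc/cron.weekly/ with a final line such as `clean_old_sos /var/tmp 7`, it would run as root once a week. Root is needed because the sos.* directories are owned by root with mode drwx------. This only treats the symptom; it is still worth finding out what is generating the reports.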

A caution on deleting the systemd-private directories and not just the files contained in them...  https://access.redhat.com/discussions/3027351
noci

Strange....
sos report is meant to pass info to CentOS or RedHat when you have trouble with your system.
And they request a report of the system from you.
It is meant to give those organisations the logging needed to investigate problems.

Those reports are (AFAIK) not meant to be made on a regular basis.
Here is more info:  https://access.redhat.com/solutions/3592

So I guess you may need to check whether some cron job creates them automatically.
OTOH they may get triggered because some error happens. In that case you need to resolve the error.

You can check the report contents yourself if needed:
Here is a description how/what/where:  https://www.ostechnix.com/sosreport-a-tool-to-collect-system-logs-and-diagnostic-information/
I'm with noci. This is very strange.

Try providing the following.

/bin/ls -d /tmp
/bin/ls -d /var/tmp
mount
df
find /var/tmp -type f -ls
lsof 2>/dev/null | grep "(deleted)"

These will show oddball linkages + binds + ghost files.
amigan_99

ASKER

I ran sudo crontab -l | grep sos and sudo ps -ef | grep sos, but neither shows sosreport. Any other thoughts on where else I might spot how these are getting spawned?
/bin/ls -d /tmp
/tmp

observium tmp]$ /bin/ls -d /var/tmp
/var/tmp

observium tmp]$ mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=16255652k,nr_inodes=4063913,mode=755)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,seclabel)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio,net_cls)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
configfs on /sys/kernel/config type configfs (rw,relatime)
/dev/sda4 on / type ext4 (rw,relatime,seclabel,data=ordered)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel)
nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=32,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel)
/dev/sda2 on /boot type ext4 (rw,relatime,seclabel,data=ordered)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)

observium tmp]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda4       864G  642G  179G  79% /
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G  914M   15G   6% /run
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/sda2       190M   77M   99M  44% /boot

observium tmp]$ find /var/type -type f -ls
find: ‘/var/type’: No such file or directory

 lsof 2>/dev/null | grep "(deleted)"
{no output}
Perhaps I should run a script that deletes any sos* directories every week? Can I do that from a non-root user account that needs sudo to delete those directories manually?
You named it observium, so I assume that is what it is running. Is it the community version or a paid version?
Did you try their support channels? They might have more experience with what is happening.
That's a good thought. I didn't set it up - a former team member did and obviously my sysadmin skills are marginal.
cron also uses the /etc/cron* directories.
A job can run under another account, so crontab -l may not show it.
ls -lR /var/spool/cron may show other users' crontabs.
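
One way to sweep all of those places at once is sketched below. The function name find_sos_jobs is hypothetical, the default path list is an assumption based on a typical CentOS/RHEL 7 layout, and the pattern "sos" is deliberately broad, so expect to eyeball the results.

```shell
#!/bin/sh
# Hypothetical helper: print the names of files that mention "sos"
# anywhere under the given paths (or under common cron locations when
# no paths are given).
find_sos_jobs() {
    if [ "$#" -eq 0 ]; then
        # Assumed default locations; adjust for your distribution.
        set -- /etc/crontab /etc/anacrontab /etc/cron.d /etc/cron.hourly \
               /etc/cron.daily /etc/cron.weekly /etc/cron.monthly \
               /var/spool/cron
    fi
    # -r recurse, -l file names only, -i ignore case,
    # -s stay quiet about missing or unreadable files
    grep -rlis -- 'sos' "$@"
}
```

Run it as root so the per-user crontabs under /var/spool/cron are readable. If it turns up nothing, systemd timers are another place a scheduled job can hide; `systemctl list-timers --all` lists them.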

You can get rid of the files with: rm -rf /var/tmp/sos*

The files should not get created unless there are problems that need to be reported to RedHat or CentOS support.
Rerun this command; I made a typo. It should be /var/tmp rather than /var/type, which doesn't exist.

find /var/tmp -type f -ls

Other commands show nothing of use.

Note: Be sure to run these commands when the problem occurs.

The problem must be occurring to determine the cause, so just running the commands when everything's working won't help.

In particular, find /var/tmp -type f -ls and the lsof command that shows deleted/ghost files are the critical ones to run while the problem is occurring.
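
For a quick per-directory summary alongside the find listing, a small sketch like this could help. The name sos_usage is invented for illustration, and the default of /var/tmp is taken from the question.

```shell
#!/bin/sh
# Hypothetical helper: show how much disk space each sos.* directory
# under "$1" uses, largest first. Sizes are in KiB.
sos_usage() {
    dir="${1:-/var/tmp}"
    # du -sk: one summary line per directory, in KiB
    # sort -rn: numeric sort, biggest directory at the top
    du -sk "$dir"/sos.* 2>/dev/null | sort -rn
}
```

Because the sos.* directories are readable only by root, run it with root privileges (for example from a root shell) or du will report nothing for them.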
observium ~]$ sudo find /var/tmp -type f -ls
[sudo] password for mememe:
33824258  676 -rw-------   1 root     root       688394 Dec 28 04:01 /var/tmp/sos.t92dy0/tmpsxP6UK
33824263 101922356 -rw-------   1 root     root     104368414720 Dec 28 04:08 /var/tmp/sos.t92dy0/sosreport-observium.internal.acme.com-20181228035405.tar
33816634    4 -rw-------   1 root     root          955 Dec 28 04:01 /var/tmp/sos.t92dy0/tmp3G_gs8
33816633  176 -rw-------   1 root     root       178443 Dec 28 04:13 /var/tmp/sos.t92dy0/tmpKLsv8K
33824256 1568 -rw-------   1 root     root      1605040 Dec 28 04:01 /var/tmp/sos.t92dy0/tmpar3hIj
32637056    0 -rw-r--r--   1 root     root            0 Dec 28 20:45 /var/tmp/need_home_synced
32637128    4 -rw-r--r--   1 nagios   nagios          6 Dec 29 10:46 /var/tmp/retrans_state.txt
observium ~]$ sudo ls -lR /var/spool/cron
/var/spool/cron:
total 0
ASKER CERTIFIED SOLUTION
David Favor
Thank you very much.
You're welcome!