Docker gave '503 Service Unavailable' after hardening various network parameters

sunhux
As part of hardening, I added the following settings to /etc/sysctl.conf and also
issued 'sysctl -w ...' so they would take effect immediately.

My apps colleague then rebooted the RHEL 7 VMs & now
Docker gives the error '503 Service Unavailable'.

How should I revert the changes: just remove
those lines from sysctl.conf & reboot (sysctl.conf was
almost empty initially),
OR
re-issue "sysctl -w ..." with the opposite value (i.e. if
it's 0, set it to 1 & if it's 1, set it to 0)? That doesn't
seem right, since we don't know what the default
values were. So how do we find the initial default
value of each parameter before it was changed? (See the sketch after the list of settings below.)


sysctl -w fs.suid_dumpable=0
sysctl -w kernel.randomize_va_space=2
sysctl -w net.ipv4.conf.default.accept_redirects=0
sysctl -w net.ipv4.conf.all.secure_redirects=0
sysctl -w net.ipv4.conf.default.secure_redirects=0
sysctl -w net.ipv4.conf.all.rp_filter=1
sysctl -w net.ipv4.conf.default.rp_filter=1
sysctl -w net.ipv4.ip_forward=0
sysctl -w net.ipv4.conf.all.send_redirects=0
sysctl -w net.ipv4.conf.default.send_redirects=0
sysctl -w net.ipv4.conf.all.accept_source_route=0
sysctl -w net.ipv4.conf.default.accept_source_route=0
sysctl -w net.ipv4.conf.all.accept_redirects=0
sysctl -w net.ipv4.conf.all.log_martians=1
sysctl -w net.ipv4.conf.default.log_martians=1
sysctl -w net.ipv4.icmp_echo_ignore_broadcasts=1
sysctl -w net.ipv4.icmp_ignore_bogus_error_responses=1
sysctl -w net.ipv4.tcp_syncookies=1
sysctl -w net.ipv6.conf.all.accept_ra=0
sysctl -w net.ipv6.conf.default.accept_ra=0
sysctl -w net.ipv6.conf.all.accept_redirects=0
sysctl -w net.ipv6.conf.default.accept_redirects=0
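
For reference, a minimal sketch of how the 'what was the default value?' problem can be handled, assuming a standard RHEL 7 layout (the baseline file name is just an example):

# Snapshot every runtime value BEFORE any hardening, so there is always
# something to diff against and restore from:
sysctl -a 2>/dev/null > /root/sysctl-baseline.txt

# After the changes (or after a reboot), compare the running values to the snapshot:
sysctl -a 2>/dev/null | diff /root/sysctl-baseline.txt - | less

# Read the current runtime value of a single parameter:
sysctl -n net.ipv4.ip_forward

# Distro-shipped defaults live under /usr/lib/sysctl.d/ (plus /etc/sysctl.d/ and
# /run/sysctl.d/), so grepping there - or checking an identical, un-hardened VM -
# shows whether a parameter was explicitly configured and to what value:
grep -r ip_forward /usr/lib/sysctl.d/ /etc/sysctl.d/ 2>/dev/null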
David Favor, Fractional CTO
Distinguished Expert 2018
Commented:
Simple way to debug.

1) Remove all these settings by commenting them out.

2) Run this command - sysctl --system - to re-apply the settings from the standard config files.

3) Add them back one by one + retest till you find the problem.

Note: sysctl --system only re-applies values found in the following files, so for any setting you changed that isn't listed in one of them, you may have to reboot to get it back...

              /run/sysctl.d/*.conf
              /etc/sysctl.d/*.conf
              /usr/local/lib/sysctl.d/*.conf
              /usr/lib/sysctl.d/*.conf
              /lib/sysctl.d/*.conf
              /etc/sysctl.conf



In other words, for any parameter that isn't explicitly set by one of the above files, you'll have to either reboot or look up its default value and set it back yourself.

Tip: Never trust a hardening guide... ever... To be safe, you must test every single setting one-by-one, for every setting you think you should change.
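
If you want to automate step 3, a rough sketch of that one-by-one retest loop (hardening-settings.txt and the health-check URL are placeholders; adjust both to your environment):

# hardening-settings.txt holds one key=value pair per line (the settings listed in the question).
# Re-apply them one at a time and probe the service after each change; stop at the
# first setting that makes the endpoint return 503 again.
while read -r setting; do
    sysctl -w "$setting"
    sleep 2
    code=$(curl -s -o /dev/null -w '%{http_code}' http://internal_ip:1888/main/ui/)
    echo "$setting -> HTTP $code"
    if [ "$code" = "503" ]; then
        echo "Culprit: $setting"
        break
    fi
done < hardening-settings.txt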

Author

Commented:
For the network parameters, I did take a backup copy
of sysctl.conf prior to the change & I restored the
original (almost empty) copy.

Then I issued 'sysctl --system': still getting the '503
Service Unavailable' message, so I rebooted & still no
joy.

The reboot process took extremely long (20 times the
usual time, i.e. about 30 mins). At the console, I could see:
"dracut Warning: Cannot umount /oldroot
 dracut Warning: Blocking umount of /oldroot [14015]
   /usr/lib/systemd/systemd-shutdown reboot --log-level 6 --log-target kmsg
 dracut Warning: lrwxrwxrwx.  1  root  0  0 ...
   /proc/14015/exe -> /oldroot/usr/lib/systemd/systemd-shutdown

A few links suggest disabling firewalld, but after the VM
boots up, I can't see that firewalld is running:
$ firewall-cmd --list-all |more
FirewallD is not running
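
(A couple of other standard ways to confirm firewalld really is stopped; the output shown is what you'd expect if it is down:)

$ systemctl is-active firewalld
inactive
$ firewall-cmd --state
not running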

Author

Commented:
The important thing now is to fix this '503 Service Unavailable'
& not the slow bootup; sorry for the distraction.

Author

Commented:
On the public-facing VM that runs the nginx web server:
$ ps -ef |grep -i nginx
root     17668     1  0 14:05 ?        00:00:00 nginx: master process /opt/mesosphere/packages/adminrouter--a66d04237956623d1688a9ce19d8db630662185e/nginx/sbin/nginx -c /opt/mesosphere/packages/adminrouter--a66d04237956623d1688a9ce19d8db630662185e/nginx/conf/nginx.agent.conf
nobody   17669 17668  0 14:05 ?        00:00:00 nginx: worker process


On the 2 internal VMs, nginx is probably running as a load balancer:
Internal VM 1:
$ ps -ef |grep -i nginx
root     26937     1  0 15:06 ?        00:00:00 nginx: master process /opt/mesosphere/packages/adminrouter--a66d04237956623d1688a9ce19d8db630662185e/nginx/sbin/nginx -c /opt/mesosphere/packages/adminrouter--a66d04237956623d1688a9ce19d8db630662185e/nginx/conf/nginx.agent.conf
nobody   26938 26937  0 15:06 ?        00:00:00 nginx: worker process
root     27614 27581  0 15:07 ?        00:00:00 docker -H unix:///var/run/docker.sock run --cpu-shares 512 --cpu-quota 50000 --memory 536870912 -e HOST=10.121.2.77 -e LIBPROCESS_IP=10.121.2.77 -e LIBPROCESS_SSL_CA_FILE=/mnt/mesos/sandbox/.ssl/ca-bundle.crt -e LIBPROCESS_SSL_CERT_FILE=/mnt/mesos/sandbox/.ssl/scheduler.crt -e LIBPROCESS_SSL_CIPHERS=ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:AES128-SHA:AES256-SHA -e LIBPROCESS_SSL_ECDH_CURVES=auto -e LIBPROCESS_SSL_ENABLED=true -e LIBPROCESS_SSL_ENABLE_SSL_V3=false -e LIBPROCESS_SSL_ENABLE_TLS_V1_0=false -e LIBPROCESS_SSL_ENABLE_TLS_V1_1=false -e LIBPR ...

On internal VM 2:
$ ps -ef |grep -i nginx
root     18496     1  0 14:49 ?        00:00:00 nginx: master process /opt/mesosphere/packages/adminrouter--a66d04237956623d1688a9ce19d8db630662185e/nginx/sbin/nginx -c /opt/mesosphere/packages/adminrouter--a66d04237956623d1688a9ce19d8db630662185e/nginx/conf/nginx.agent.conf
nobody   18497 18496  0 14:49 ?        00:00:00 nginx: worker process
root     22544 18029  0 14:57 ?        00:00:00 mesos-journald-logger --destination_type=logrotate --help=false --journald_labels={"labels":[{"key":"HAPROXY_0_MODE","value":"http"},{"key":"HAPROXY_GROUP","value":"external"},{"key":"HAPROXY_0_PORT","value":"1399"},{"key":"DCOS_SPACE","value":"/nginx-local"},{"key":"FRAMEWORK_ID","value":"895dd639-02ac-4dd2-b0eb-249ef7d7a44a-0000"},{"key":"EXECUTOR_ID","value":"nginx-local.758e6a09-0907-11ea-aae8-b20d48a0505a"},{"key":"AGENT_ID","value":"895dd639-02ac-4dd2-b0eb-249ef7d7a44a-S3"},{"key":"CONTAINER_ID","value":"84cd2cd3-8b68-4036-9a4c-23253450d069"},{"key":"SYSLOG_IDENTIFIER","value":"Command Executor (Task: nginx-local.758e6a09-0907-11ea-aae8-b20d48a0505a) (Command: NO EXECUTABLE)"},{"key":"STREAM","value":"STDOUT"}]} --logrotate_filename=/var/lib/mesos/slave/slaves/895dd639-02ac-4dd2-b0eb-249ef7d7a44a-S3...

Author

Commented:
This is not a network issue, because when I browse the internal IP,
I get the same '503 Service Unavailable':

http://internal_ip:1888/main/ui

On the nginx VM that's serving the web, the port is listening & I could telnet to port 1888 from my internal LAN:

$ netstat -ltnp |grep 1888
tcp        0      0 0.0.0.0:1888            0.0.0.0:*               LISTEN      26951/haproxy  

[root@mesopub1]:/root
$ ps -ef |grep 26951
root     26951 26805  0 16:56 ?        00:00:01 /usr/local/sbin/haproxy -W -f /marathon-lb/haproxy.cfg -sf 67 -x /var/run/haproxy/socket
root     29622 29335  0 17:17 pts/0    00:00:00 grep --color=auto 26951

Author

Commented:
The URL I was testing was missing the "/" at the end; all's well after reverting the parameter settings to the system defaults. Thanks very much.
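
For anyone who lands here later, a quick way to see what the trailing slash changes (same placeholder URL as above; the status codes are illustrative of this setup):

# Without the trailing slash the admin router answered 503 here; with it, the UI responds:
$ curl -s -o /dev/null -w '%{http_code}\n' http://internal_ip:1888/main/ui
503
$ curl -s -o /dev/null -w '%{http_code}\n' http://internal_ip:1888/main/ui/
200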
David Favor, Fractional CTO
Distinguished Expert 2018

Commented:
Glad you got this resolved.

Tip: HAProxy or NGINX or any other code provides no more security than raw Apache.

Generally, the more tech you put between your visitor + Apache, the worse security gets (more moving parts to secure), along with new problems introduced by every layer of code.

My preference is the Andrew Carnegie Approach - “Put all your eggs in one basket — and watch that basket.”

So I run Apache only, no other cruft either onsite (HAProxy/NGINX/Varnish/Squid/Anything). Also no other offsite cruft (CDNs).

Easier to secure + maintain Apache, rather than many layers between...

If maximum security + maximum speed + least resource usage are your goals...
David Favor, Fractional CTO
Distinguished Expert 2018

Commented:
Suggestion: Open a new question describing your entire infrastructure, requesting design related comments.

Likely you'll receive some great info.
