Solved

rc.local no longer runs on boot under CentOS 5.3

Posted on 2012-04-06
Last Modified: 2012-04-16
I had problems with too many extraneous services running, so I pruned them.
Now the machine runs much more reliably.
But for some reason, even though I did not change rc.local or anything else about the boot time configuration, rc.local no longer runs.

I can run rc.local by hand with 'source /etc/rc.local', and it sets up some global requirements just as it always did, but it will not run on power up.

I noticed this:
[172.16.20.70:BMC_7 /etc/rc.d/rc3.d]# ls S99* -l
lrwxrwxrwx 1 root root 19 Aug 12  2010 S99firstboot -> ../init.d/firstboot
lrwxrwxrwx 1 root root 11 Aug 12  2010 S99local -> ../rc.local

[172.16.20.70:BMC_7 /etc/rc.d]# pwd
/etc/rc.d
[172.16.20.70:BMC_7 /etc/rc.d]# ls -l -a
total 136
drwxr-xr-x  10 root root  4096 Apr  4 15:31 .
drwxrwxrwx 105 root root 12288 Apr  4 15:32 ..
drwxr-xr-x   2 root root  4096 Aug 12  2010 init.d
-rwxr-xr-x   1 root root  2255 Nov 13  2008 rc
drwxr-xr-x   2 root root  4096 Jan 25 21:58 rc0.d
drwxr-xr-x   2 root root  4096 Jan 25 21:58 rc1.d
drwxr-xr-x   2 root root  4096 Jan 25 21:58 rc2.d
drwxr-xr-x   2 root root  4096 Jan 25 21:58 rc3.d
drwxr-xr-x   2 root root  4096 Jan 25 21:58 rc4.d
drwxr-xr-x   2 root root  4096 Jan 25 21:58 rc5.d
drwxr-xr-x   2 root root  4096 Jan 25 21:58 rc6.d
-rw-rw-rw-   1 root root   541 Dec 14  2010 rc.local
-rwxr-xr-x   1 root root 27420 Mar  5  2009 rc.sysinit
[172.16.20.70:BMC_7 /etc/rc.d]#

So there is an S99local that links to a copy of rc.local that is in /etc/rc.d, and it is an exact copy of /etc/rc.local, but it is not marked executable?!?!?!?!?

Could that be the issue?
What the #^$%%$# could have caused this to mysteriously happen, just by reconfiguring some services?

If that is NOT the issue, then what could cause rc.local to stop executing on reboot/power-up?

Thanks - Frank
0
Question by:fklein23
19 Comments
 
LVL 21

Expert Comment

by:Papertrip
ID: 37816890
I have no idea what could have caused this in the background, but rc.local needs the execute bit set.  You should talk to anyone else who has root access to see if they modified the permissions.

chmod 755 /etc/rc.local


0
 

Author Comment

by:fklein23
ID: 37817125
Thanks for the input, Papertrip.

I am the only one who EVER modifies the linux code on these controllers.
They have no other users, and I didn't change any attributes.

But you misunderstand the problem. /etc/rc.local is marked as executable:

[172.16.20.70:BMC_7 /etc]# ls -l rc.local
lrwxrwxrwx 1 root root 13 May 11  2009 rc.local -> rc.d/rc.local
[172.16.20.70:BMC_7 /etc]#

It is the actual runtime copy of rc.local to which S99local is linked that is not executable:

[172.16.20.70:BMC_7 /etc/rc.d/rc3.d]# ls S99local -l
lrwxrwxrwx 1 root root 11 Aug 12  2010 S99local -> ../rc.local
(no problem so far!)

HERE is the problem:
[172.16.20.70:BMC_7 /etc/rc.d]# ls -l -a rc.local
-rw-rw-rw-   1 root root   541 Dec 14  2010 rc.local


My understanding of the boot sequence is that /etc/rc.d/rc3.d contains a bunch of symbolic links to the actual executable scripts that apply to runlevel 3. My understanding is that at boot time, when going to runlevel 3, the system uses the S99local script to execute the rc.local script to which S99local is linked.

So to recap:
/etc/rc.local is executable
/etc/rc.d/rc.local is an exact copy of /etc/rc.local except that it is NOT executable
/etc/rc.d/rc3.d/S99local is executable and is symbolically linked to ../rc.local, which is not marked executable.
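
The recap above can be checked in one pass instead of being read off separate ls listings. The following is a minimal sketch, assuming GNU stat and readlink (as shipped with CentOS); the helper name describe is made up for illustration. For each path it reports whether it is a symlink and shows the permission bits of the file it finally resolves to, since a symlink's own lrwxrwxrwx bits say nothing about whether the target can be executed.

```shell
#!/bin/sh
# Describe one path: whether it is a symlink, where it resolves to,
# and the permission bits of the file that would actually be run.
describe() {
    path=$1
    if [ -L "$path" ] && [ ! -e "$path" ]; then
        # The link exists but its target does not.
        echo "$path: dangling symlink -> $(readlink "$path")"
    elif [ -L "$path" ]; then
        target=$(readlink -f "$path")
        echo "$path: symlink -> $target ($(stat -c '%A' "$target"))"
    elif [ -e "$path" ]; then
        echo "$path: not a symlink ($(stat -c '%A' "$path"))"
    else
        echo "$path: missing"
    fi
}

for p in /etc/rc.local /etc/rc.d/rc.local /etc/rc.d/rc3.d/S99local; do
    describe "$p"
done
```

If /etc/rc.local is reported as "not a symlink", it is a separate file from /etc/rc.d/rc.local, and edits to one will not reach the other.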

But no one ever copies /etc/rc.local to /etc/rc.d/rc.local (the Linux system does that as a result of system configuration) and certainly no one ever changes its attributes.

The problem is that it appears that several (if not all) of our 20 controllers now have this same disease, even though all I did was turn off unneeded services and configure these services to remain off at boot time. Some of them are powered down and are in a remote location, so can't be powered up for a few weeks.

Is it possible that some service is required to correctly set up the boot sequence that leads to rc.local being executed at boot time?????

Thanks - Frank
0
 
LVL 21

Assisted Solution

by:Papertrip
Papertrip earned 50 total points
ID: 37817975
Hey Frank,

Sorry I meant chmod 755 /etc/rc.d/rc.local


My understanding of the boot sequence is that /etc/rc.d/rc3.d contains a bunch of symbolic links to the actual executable scripts that apply to runlevel 3. My understanding is that at boot time, when going to runlevel 3, the system uses the S99local script to execute the rc.local script to which S99local is linked.
Correct.  These are run by the init process.

Is it possible that some service is required to correctly set up the boot sequence that leads to rc.local being executed at boot time?????
Almost certain rc.local is not attached to any services.  It's provided by the initscripts package and executed by the init process during boot.  Do you have a list of packages you removed and/or services you disabled?  Did you do anything more than like yum erase's and chkconfig off's during your clean up, like any automation and/or scripts?  Did all the machines break at the same time or was it staggered?  Any correlation between them breaking and you doing changes?  Like did a problem pop up on a specific box immediately after your clean up while the as-of-yet untouched boxes still had the correct permissions?

Long story short -- I can't think of any reason why the permissions would change on /etc/rc.d/rc.local; I've never seen or heard of anything like this happening.  If any other Experts have run across this issue before, please chime in.

Sorry I can't be of more help but the cause of this is eluding me.  Fortunately the fix is relatively painless.
0
 
LVL 30

Expert Comment

by:Kerem ERSOY
ID: 37818133
Hi,

How do you know that it has not been executed? Can you post the contents of your rc.local so we can see what is in there?  I want to stress that rc.local should always be the last resort for running a program: it can only launch programs after the other init scripts have finished, and once a program has been started from it you cannot control its status or shut it down properly. The best way is to create a script in /etc/init.d, using an existing service script as a template, and define proper start/shutdown sequences for your programs.

Here's how to create a proper one:
http://wiki.pylonshq.com/display/pylonscookbook/Red+Hat+friendly+init.d+script+with+chkconfig+support

Cheers,
K.
0
 
LVL 14

Expert Comment

by:kronostm
ID: 37822823
I had a similar issue about 2 years ago on an Ubuntu machine. I was running about 7 scripts from rc.local, and unfortunately for me the first one did not end with a 0 exit code, so rc.local did NOT run the others.
In order to check if rc.local is parsed at all, put a marker after each line, something like
echo -n >/tmp/first_command_was_executed
echo -n >/tmp/second_command_was_executed
and so on, and see if it actually gets stuck somewhere by checking the existence of the files.
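
The marker-file idea can be tightened a little so that each step also records its exit status. This is only a sketch: the step helper and the log path /tmp/rc.local.trace are hypothetical names chosen for illustration, and the commands run here are harmless placeholders rather than the real rc.local lines.

```shell
#!/bin/sh
# Hypothetical trace file; any location writable at boot would do.
LOG=${LOG:-/tmp/rc.local.trace}

# Run one command and log a line before and after it, including the
# exit status, so a failing or hanging step can be found after a reboot.
step() {
    echo "start: $*" >> "$LOG"
    "$@"
    rc=$?
    echo "done ($rc): $*" >> "$LOG"
    return "$rc"
}

# First line of the script: proves the script was entered at all.
echo "rc.local entered at $(date)" >> "$LOG"

# Placeholders. In a real rc.local each existing line would be prefixed
# with "step", for example: step /etc/init.d/ntpd start
step true
step false || echo "a step failed, see $LOG"
```

If the trace file does not exist at all after a reboot, the script was never started, which points at the link or the execute bit rather than at any command inside it.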
0
 

Author Comment

by:fklein23
ID: 37828604
Thanks, guys. Let me address these responses 1 at a time:

Papertrip:
"Sorry I meant chmod 755 /etc/rc.d/rc.local"
--> OK, that is better, but I can't do that.
--> these 20 computers are under strict source control and I cannot do such an uncontrolled modification to them all without a new ECO. If I can't find a REASON for this problem to have happened, I can't close the ECO. :-(

I already thought of doing this, and if I have to I will try it.
My boss will not approve doing this to any of these machines as an experiment.
He MUST know "why" the change is required, and I can't tell him. Plus it may not work for reasons discussed below at (*)
If this system weren't in production, it would be SO much simpler.

No I didn't do ANYTHING to the services except configure them off: Specifically, I did this:

1. disable some cron tasks that are completely optional:
    chmod a-x /etc/cron.weekly/makewhatis.cron
    chmod a-x /etc/cron.daily/makewhatis.cron
    chmod a-x /etc/cron.daily/rpm
    chmod a-x /etc/cron.daily/cups
AND

2. use 'chkconfig --level 3 <service-name> off'
    chkconfig --level 3 acpid off
    chkconfig --level 3 auditd off
    chkconfig --level 3 autofs off
    chkconfig --level 3 avahi-daemon off
    chkconfig --level 3 bluetooth off
    chkconfig --level 3 cups off
    chkconfig --level 3 haldaemon off
    chkconfig --level 3 hidd off
    chkconfig --level 3 hplip off
    chkconfig --level 3 ip6tables off  
    chkconfig --level 3 iptables off
    chkconfig --level 3 isdn off
    chkconfig --level 3 kdump off
    chkconfig --level 3 lm_sensors off
    chkconfig --level 3 netfs off
    chkconfig --level 3 pcscd off
    chkconfig --level 3 setroubleshoot off
    chkconfig --level 3 smartd off
    chkconfig --level 3 yum-updateonboot off
    chkconfig --level 3 yum-updatesd off
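
A list like the one above is easier to review and repeat across 20 machines when it is driven from a single variable. The sketch below prints the commands by default and only runs them when APPLY=1 is set; APPLY and the function name disable_services are inventions of this sketch, not chkconfig features. The service list is abbreviated and would need to match the full list above.

```shell
#!/bin/sh
# Abbreviated copy of the service list from the post; extend as needed.
SERVICES="acpid auditd autofs avahi-daemon bluetooth cups haldaemon hidd hplip"

# Print the chkconfig command for each service, or run it when APPLY=1.
# Usage: disable_services LEVEL SERVICE...
disable_services() {
    level=$1
    shift
    for svc in "$@"; do
        if [ "${APPLY:-0}" = 1 ]; then
            chkconfig --level "$level" "$svc" off
        else
            echo "chkconfig --level $level $svc off"
        fi
    done
}

# $SERVICES is deliberately unquoted so it splits into one word per service.
disable_services 3 $SERVICES
```

Afterwards, 'chkconfig --list' shows what is still enabled at each runlevel. As far as I know, chkconfig works by adding and removing the S/K symlinks in the rcN.d directories and does not change the mode of the scripts they point to.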

Now I wonder: If I just turn the executable bit on on the local copy of rc.local that is in /etc/rc.d/rc.local will this change stick?

(*) What part of Linux makes that copy? It shouldn't have made the copy and not preserved the executable bit. How do I know that on powerdown and restart it won't just make another non-executable copy of it?

As for a correlation between what I did and being broken: As I said, these systems are under VERY strict source control. The config changes mentioned above were made and
the next time each system was power-cycled, they came up without executing rc.local.

I can't see the details for ALL the machines, because most are still down due to the maintenance outage. I will not have them all up for a couple more weeks.

KeremE and kronostm: I will answer your questions in a separate post.
0
 

Author Comment

by:fklein23
ID: 37829067
KeremE:

I am not familiar with pylons, and I don't know how that is pertinent.
I am not having trouble with an application.
I am having trouble in that the /etc/rc.local script does not seem to execute.

Here is my rc.local

#!/bin/sh
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
touch /var/lock/subsys/local
echo 'Starting up smb...'
/etc/init.d/smb start
echo 'Starting up ntp service...'
/etc/init.d/ntpd start
echo 'Defining CANBUS devices...'
/usr/bin/make -f /usr/src/Ecan_Linux_V02.02.01/driver/Makefile devices1000
echo 'Mounting SUPERVISOR shared files...'
mount -o ro -o passwd= //172.16.20.9/sysroot /mnt/SUPER

At WHAT point does a copy of /etc/rc.local get copied to /etc/rc.d/ ??
Are you saying that some script in  /etc/init.d/  does this?

We have been rebooting and power cycling these Linux boxes. Altogether we have had about 50 or 60 of them, and have rebooted them many hundreds of times during development, and never have we seen anything like this.

I have not now nor EVER changed anything in /etc/init.d

I have on numerous occasions changed rc.local and the file has routinely made its way to the right place to execute during reboot.

I have tried to duplicate the problem we have at the remote site with a computer here in our lab, and can't duplicate the problem of /etc/rc.d/rc.local being non-executable.

I also cannot get /etc/rc.d/rc.local to be replaced by /etc/rc.local

I tried making /etc/rc.d/rc.local non-executable and then edited /etc/rc.local, to make SURE the timestamp of /etc/rc.local is much newer than /etc/rc.d/rc.local.

Then I typed "reboot" and after waiting a long while to let Linux finish rebooting,

I expected that rc.local or its copy in /etc/rc.d/rc.local would execute, but NO, the bad version of rc.local that is unexecutable is still lodged in /etc/rc.d and rc.local did not execute.

So the question is more than "how did /etc/rc.d/rc.local end up being unexecutable?" It is "why can't I just edit rc.local and have it reliably just execute on reboot or power cycle?"

... because that ALWAYS worked before!!!

Thanks - Frank
0
 

Author Comment

by:fklein23
ID: 37829085
Finally, kronostm:

If I execute this directly:

'source /etc/rc.local'  the script runs just as it's supposed to and all is well.

I put touch commands in rc.local and NONE of them produced a marker file, so I am certain rc.local never even attempted to execute.

- Frank
0
 
LVL 30

Expert Comment

by:Kerem ERSOY
ID: 37829257
I am having trouble in that the /etc/rc.local script does not seem to execute.

It is impossible for it to run if it is not set as executable. In fact, that is how you disable it. So the problem is that you've removed executable attributes too generously, and your system is close to breaking down.

You have disabled even required services such as:
acpi, audit and smartd

Which are responsible for monitoring your hardware.

The system would be OK if you had not removed the x flag from so many of the system files.
0

 

Accepted Solution

by:fklein23
fklein23 earned 0 total points
ID: 37829291
EUREKA!!! I think I found the problem.
I found a post on http://www.tek-tips.com/viewthread.cfm?qid=1451444
that seems to indicate my thinking about rc.local is completely wrong!!!

Notice:

[root@B-2 etc]# ls rc* -all
lrwxrwxrwx   1 root root    7 Feb 15 16:01 rc -> rc.d/rc
lrwxrwxrwx   1 root root   10 Feb 15 16:01 rc0.d -> rc.d/rc0.d
lrwxrwxrwx   1 root root   10 Feb 15 16:01 rc1.d -> rc.d/rc1.d
lrwxrwxrwx   1 root root   10 Feb 15 16:01 rc2.d -> rc.d/rc2.d
lrwxrwxrwx   1 root root   10 Feb 15 16:01 rc3.d -> rc.d/rc3.d
lrwxrwxrwx   1 root root   10 Feb 15 16:01 rc4.d -> rc.d/rc4.d
lrwxrwxrwx   1 root root   10 Feb 15 16:01 rc5.d -> rc.d/rc5.d
lrwxrwxrwx   1 root root   10 Feb 15 16:01 rc6.d -> rc.d/rc6.d
lrwxrwxrwx   1 root root   13 Feb 15 16:01 rc.local -> rc.d/rc.local
lrwxrwxrwx   1 root root   15 Feb 15 16:01 rc.sysinit -> rc.d/rc.sysinit

This indicates that /etc/rc.local is a symbolic link to /etc/rc.d/rc.local

If this is universally true of Linux distros, then my problem is that on those 20 controllers in our plant, I have been thinking about this backwards. I expected that /etc/rc.local was the primary file. I noticed a copy of it in /etc/rc.d/ and assumed that something in the Linux O/S would copy /etc/rc.local to /etc/rc.d/rc.local and that the symbolic link /etc/rc.d/rc3.d/S99local, which links to /etc/rc.d/rc.local, would take care of everything.

How my system got /etc/rc.d/rc.local to not be executable is the big mystery.
But my mistake is that all this time I have been treating /etc/rc.local as the primary copy of the script, when in fact it is just a link to the underlying /etc/rc.d/rc.local

Because of this mistake, our source control was copying a new version of /etc/rc.local into that place as if it were a real file, thus breaking the link between where the REAL file is supposed to be and the symbolic link /etc/rc.local

Whenever we were editing /etc/rc.local using vi, all was taken care of because vi could thread through the symlink to the real file. But our file-copying source control method can't do that.
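
The difference between editing through a symlink and replacing it can be reproduced safely in a scratch directory. This sketch assumes the deployment tool replaces the destination path (for example by writing a temporary file and renaming it over the target); the thread does not say how the tool actually writes files, so that part is a guess. The scratch layout only mimics /etc and touches nothing on the real system.

```shell
#!/bin/sh
# Scratch directory that mimics /etc/rc.local -> rc.d/rc.local.
demo=$(mktemp -d)
mkdir "$demo/rc.d"
echo 'original' > "$demo/rc.d/rc.local"
chmod 755 "$demo/rc.d/rc.local"
ln -s rc.d/rc.local "$demo/rc.local"

# 1. Write through the link, as an in-place editor does.
#    The link survives and the real file receives the new text.
echo 'edited in place' > "$demo/rc.local"
[ -L "$demo/rc.local" ] && echo "after in-place write: still a symlink"

# 2. Replace the path by renaming a new file over it.
#    rename() replaces the link itself, so rc.d/rc.local is left behind.
echo 'deployed copy' > "$demo/rc.local.new"
mv -f "$demo/rc.local.new" "$demo/rc.local"
[ -L "$demo/rc.local" ] || echo "after rename: now a regular file"

# The real file still holds the text from step 1, not the deployed copy.
cat "$demo/rc.d/rc.local"
```

After step 2 the two paths are independent files, which matches the symptom in this thread: updates land in /etc/rc.local while init keeps using whatever is in /etc/rc.d/rc.local.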

Since this file copying fiasco was done to all 20 of the controllers, I guess I have to fix this problem 20 times.

Solution: make /etc/rc.d/rc.local the primary place to hold the script and make /etc/rc.local a symbolic link pointing to it.
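
A repair along those lines might look like the following sketch. The function name restore_rc_local and its root argument are hypothetical; the argument exists only so the steps can be tried on a scratch tree before being run against / as root. It also assumes that a stray regular file at /etc/rc.local holds the script text you want to keep, which should be verified by hand first.

```shell
#!/bin/sh
# Make rc.d/rc.local the real, executable script and rc.local a relative
# symlink to it, under the given root ("" means the live system).
restore_rc_local() {
    root=${1:-}
    real="$root/etc/rc.d/rc.local"
    link="$root/etc/rc.local"

    # If the link was replaced by a regular file, assume that file holds
    # the current script text and move it into place as the real file.
    if [ -f "$link" ] && [ ! -L "$link" ]; then
        mv -f "$link" "$real"
    fi
    chmod 755 "$real"
    # -n keeps ln from following an existing link; -f replaces it.
    ln -sfn rc.d/rc.local "$link"
}

# Try it on a scratch tree that reproduces the broken state.
scratch=$(mktemp -d)
mkdir -p "$scratch/etc/rc.d"
echo 'old text' > "$scratch/etc/rc.d/rc.local"
chmod 666 "$scratch/etc/rc.d/rc.local"
echo 'new text' > "$scratch/etc/rc.local"
restore_rc_local "$scratch"
ls -l "$scratch/etc/rc.local" "$scratch/etc/rc.d/rc.local"
```

Putting /etc/rc.d/rc.local under source control instead of /etc/rc.local avoids the need for the move step on later deployments.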

Thanks for all your help.
- Frank
0
 

Author Comment

by:fklein23
ID: 37829458
KeremE:

Sorry, that isn't it.
I am pretty sure the problem is as I just posted.

I intentionally turned off executables in the cron.weekly and cron.daily directories, because I REALLY DO NOT WANT THEM TO EXECUTE, and it was preferable to unexecute them than to delete them. After making them unexecutable our observed problems with logrotate got better. But we had to continue searching for the inevitable solution to the problem, which ended up being the hal daemon, which apparently had a memory leak in the version we were using, and was consuming up to 70% of system memory after a few weeks, causing applications to mysteriously crash from being memory-starved.

Once the hal daemon was configured OFF (and I isolated several other non-essential services to turn off) the memory starvation problem was fixed.

But SOMEHOW (and I still don't know how) /etc/rc.d/rc.local got changed to being non-executable, and I subsequently found that it didn't execute on boot. But the real problem was that I was modifying the wrong file, thinking it was the ACTUAL file, when in reality it was only a symlink.

Now I have the unenviable task ahead of me of explaining what went wrong to management, and their eyes will totally glaze over as soon as I start talking about symbolic links.



Our systems are NOT desktop systems in which hardware comes and goes.
They are systems with very limited RAM resources and NEVER have hardware come and go during runtime. These processes that ran at midnight were causing resource problems and causing difficulties during logrotate.

I have not made any services non-executable.

The problem was misinterpretation of the connection between /etc/rc.local and /etc/rc.d/rc.local

I thought /etc/rc.local was the real thing. In reality /etc/rc.d/rc.local is what I should have been maintaining as the primary file all the time. Up until recently we maintained it with vi. Once the system "matured" we had to move to a source controlled version and I was simply source controlling the wrong thing!!!

If I hadn't made the mistake (however I did it) of making /etc/rc.d/rc.local non-executable, I would never have figured this out!!! It just happens that between the time a couple of years ago when we transitioned the Linux scripts to source control and today, there have been no changes to rc.local. So I never noticed the problem.

Thanks for all your input, I appreciate it, but this problem is SOLVED!!
0
 

Author Comment

by:fklein23
ID: 37829659
KeremE
Oh, and by the way, I did NOT unexecute ANY services; I only unexecuted scripts in the cron directories that were responsible for executing certain jobs at midnight.

For example: here are ls -l for the three services you mentioned:

-rwxr-x--- 1 root root 169952 Jan 21  2009 /sbin/auditd
-rwxr-xr-x 1 root root 269208 Jan 21  2009 /usr/sbin/smartd
-rwxr-x--- 1 root root 22156 Jan 21  2009 /usr/sbin/acpid
... etc.,...

I merely configured those services off. Since we never use smartcards on these controllers, and acpi is a power monitoring system (and we are not using Xwindows or PCs or batteries or anything that requires acpi) and so on, until I justified the removal of all the services that I viewed as optional to our system!!!

So no, I have NOT been indiscriminately turning off the executable attributes of any services.
0
 
LVL 30

Assisted Solution

by:Kerem ERSOY
Kerem ERSOY earned 300 total points
ID: 37829785
I merely configured those services off. Since we never use smartcards on these controllers, and acpi is a power monitoring system (and we are not using Xwindows or PCs or batteries or anything that requires acpi) and so on, until I justified the removal of all the services that I viewed as optional to our system!!!

You got it completely wrong. smartd is not for smartcards. It monitors your hard disk health status and warns you when there's a problem with one of the sectors on the disk.
acpi is not only responsible for the battery but also for several features such as the power button, power states, fan speeds and lots of things like that. So I would never suggest disabling them, nor disable them myself on a production system. I hate to say it, but it seems that you're indiscriminately shutting down services.
0
 

Author Comment

by:fklein23
ID: 37830022
KeremE:

I accept your criticism about smartd. Perhaps I got carried away on that one. Thanks for that advice. I will reconsider. I guess I accidentally lumped it in with pcscd.

acpi, however just doesn't apply to our system. We have two states: ON and OFF and never hibernate or turn monitors off (because there aren't any), etc.

And the other services I turned off were turned off specifically on the recommendations of people on Experts Exchange and elsewhere who seemed to know what they were talking about.

I turned off services related to printers (we have none) human interface devices (have none), hal (which I definitely identified as the culprit in our mysterious crashed application problem that was the sole reason for this system-wide re-evaluation of services).

yum-updatesd is worthless to us because we are isolated and can't access the yum repos.

I agonized over things like auditd before deciding to take it down.

I removed the firewall and secure Linux services because we use firewall services at the Exchange server and the conflicts between Exchange server and anything related to Linux security were simply too much of a tangle to continue fighting them. We are confident the firewall service at the Exchange Server level and in our secure routers  protects us sufficiently.

I could recap the rest of my decisions about this, but the key question is this:
I STILL do not know how /etc/rc.d/rc.local had its executable bits turned off.

My main concern over services (other than smartd, which you've gotten me worried about)  is this: Is it remotely possible that the absence of any of the shut-down services could have had anything to do with this happening?

And, really, thank you for your input. It is very kind of you to hang in there and continue offering your advice in spite of my resistance. I don't mean to come off as a know-it-all, because I am NOT! I am first and foremost an engineer and scientist and often get completely beyond my limits of patience when wrestling with O/S issues. They obey arbitrary rules made up by people instead of universal laws of physics!  I appreciate all the help I can get wrestling with these obscure issues!

Thanks - Frank


0
 
LVL 30

Expert Comment

by:Kerem ERSOY
ID: 37830272
OK, thanks for the explanation. But Frank, I really mean it: you should not actually shut down the acpid service. This service is responsible for hardware monitoring and actually manages events such as shutdown. Here's an article from Red Hat Magazine, and the authors agree that you should never disable it unless explicitly told to for debugging. auditd is also in this category. Maybe you think it is only needed with SELinux, but it is not only that. It also logs su, logins, etc., and is a valuable tool for security.

Here's the article about services and "know thy services" : http://magazine.redhat.com/2007/03/09/understanding-your-red-hat-enterprise-linux-daemons

That being said, let's come to your question. /etc/rc.local is actually a link to the real file, which is also linked from /etc/rc.d/rc2.d, rc3.d, rc4.d and rc5.d: the S99local entry in each of those directories points to ../rc.local. So it could be a side effect of your having disabled some of the level 3 services. But none of those files could do it, so I believe you removed the x attribute by mistake at some point. Normally no service should be affected if you manually set the execute bit, because all you have are links, which are neutral to attribute changes.

Cheers,
K.
0
 
LVL 30

Expert Comment

by:Kerem ERSOY
ID: 37830301
You have some problem with your /etc/rc.d, and it seems that you've broken the link from /etc/rc.local to the actual file in /etc/rc.d/rc.local. This should be why it was never executed:

# ls -al
total 136
drwxr-xr-x  10 root root  4096 Mar 16 02:49 .
drwxr-xr-x 119 root root 12288 Apr 10 20:39 ..
drwxr-xr-x   2 root root  4096 Apr 10 20:39 init.d
-rwxr-xr-x   1 root root  2255 Dec 19 19:00 rc
drwxr-xr-x   2 root root  4096 Apr 11 00:48 rc0.d
drwxr-xr-x   2 root root  4096 Apr 11 00:48 rc1.d
drwxr-xr-x   2 root root  4096 Apr 11 00:48 rc2.d
drwxr-xr-x   2 root root  4096 Apr 11 00:49 rc3.d
drwxr-xr-x   2 root root  4096 Apr 11 00:48 rc4.d
drwxr-xr-x   2 root root  4096 Apr 11 00:48 rc5.d
drwxr-xr-x   2 root root  4096 Apr 11 00:48 rc6.d
-rwxr-xr-x   1 root root   220 Dec 19 19:00 rc.local
-rwxr-xr-x   1 root root 27052 Feb 22 16:47 rc.sysinit

and the link itself should be at /etc/rc.local:

lrwxrwxrwx 1 root root 13 Mar 16 02:49 rc.local -> rc.d/rc.local

So I guess I was right: you've intervened heavily in the system boot files and broken some links that should be there.

Cheers,
K.
0
 

Assisted Solution

by:fklein23
fklein23 earned 0 total points
ID: 37832472
Thanks, KeremE:

I will read your page about services, and I appreciate your input.
As to the links involving rc.local, I had already fixed the problem and can boot successfully, by restoring the symlink for rc.local.
The only mistake I made there is that I have /etc/rc.local under source control instead of /etc/rc.d/rc.local, because I had it firmly stuck in my mind that the real file was in /etc/rc.local

It still doesn't explain /etc/rc.d/rc.local having its execute attribute turned off. And, since that file has never been under source control and its timestamp is old enough to predate the last several changes to the source, I still don't understand how this file could have been modified on ALL the Linux machines.

Is it possible that something unpredictable might happen if I have this arrangement:

/etc/rc.local --> links symbolically to /etc/rc.d/rc.local
Now, if /etc/rc.local is a symlink (not an actual file) and I overwrite it with a brute-force file write, probably using samba, is it possible that the file to which the symlink points can somehow have its attributes damaged?

Unfortunately, since we are in plant shutdown, I cannot absolutely claim that ALL the machines are affected, but all the ones I have been allowed to dark-start over the past few days have the problem.

I will be extra-vigilant about this issue with regard to all the other system files that are under source control. So far all the symlinks of which I am aware are in good shape.

As for intentionally changing an attribute bit, I have long since gotten burned by that sort of thing, so I have a personal rule that I adhere to pretty stringently to NEVER do that.

I have had things like mail stop working because of system files being made writeable. I learned that by trying to make the job of cloning one system onto another by using file copies from one machine to another using a samba client. Since one must write to the destination drive to do this, one must turn on the "write-attribute" of the destination. This is a terrible idea, of course not only because system processes require certain files to NOT be world-writeable, but the file attributes under windows and Linux are inherently different and some attributes (like the 'S' attribute) simply do not translate. There is also the problem that symlinks can create recurrent loops which Windows simply can't naturally sort out.

Once burned by this, I have been adamant that any such future file synchronizations are done carefully and by using
'scp -p'  from the destination machine. This is a more tedious method than just using BeyondCompare from a Windows machine, but is much safer. The systems I contaminated using that method have long since been erased and rebuilt using other means :-)

So, thanks again for the link about essential services. I have been trying to find a good source for demystifying exactly which of the mind-numbingly long list of services are absolutely essential for supporting a system that is only used as an embedded controller.

What I find in the various on-line "expert" sites is that there is considerable controversy around this topic and little guidance for embedded systems users. We REALLY can't afford to have any gratuitous CPU use by services that are not completely essential. The hal daemon is a classic case of what NOT to keep alive in such a system. From my reading of war stories I am not the first to encounter problems with this service. Perhaps in my zeal to "trim down", now that our application is finally in shape for production and I had time to breathe and look at the big picture, I got carried away.

Thanks for your input.
Keep in touch, Frank
(aka docrudolph AT gmail DOT com)
0
 
LVL 30

Expert Comment

by:Kerem ERSOY
ID: 37833223
Now, if /etc/rc.local is a symlink (not an actual file) and I overwrite it with a brute-force file write, probably using samba, is it possible that the file to which the symlink points can somehow have its attributes damaged?

I think this is unlikely. First of all, most files under /etc are writable only by root. To be able to modify files under it you would need to log on with the uid/gid of root, and this is very unlikely for obvious reasons. Secondly, you normally create shares using some subpath in the directory tree, not entire disks. You should not share the entire disk, for security reasons.
 
I understand that you're running lots of servers with limited resources and I am sure you're doing a wonderful job. But it seems that at some point you have overwritten some file and directory rights, and maybe even removed some links, as in the case of /etc/rc.local. I suggest that you set up another system as a reference and compare the rights and links in /etc against it. I understand your concern: you baseline the config on one system and then copy it to the other systems, so you're afraid of amplifying the problem by distributing a config to several systems. You should check the entire /etc boot stanza, since we already know that there are some discrepancies there. I've checked on my system and found that no service removes rc.local's execute bit when you disable it.
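
One way to do such a comparison is to dump modes, owners and link targets to a text file on each machine and diff the results. This is a rough sketch that assumes GNU find (for -printf); the function name manifest and the scratch directories are made up for the demonstration, which fakes a reference tree and a suspect tree instead of reading a real /etc.

```shell
#!/bin/sh
# One line per entry under a directory: mode, owner, group, relative
# path, and link target (empty unless the entry is a symlink).
manifest() {
    ( cd "$1" && find . -printf '%M %u %g %p -> %l\n' | sort -k4 )
}

# Demonstration on two scratch trees that differ only in one mode.
ref=$(mktemp -d)
sus=$(mktemp -d)
for d in "$ref" "$sus"; do
    echo '#!/bin/sh' > "$d/rc.local"
done
chmod 755 "$ref/rc.local"
chmod 666 "$sus/rc.local"

manifest "$ref" > "$ref.txt"
manifest "$sus" > "$sus.txt"
# diff exits non-zero when the files differ, which is expected here.
diff "$ref.txt" "$sus.txt" || true
```

On real systems, something like 'manifest /etc/rc.d > rc.d.txt' on each box followed by a diff of the two files would show a missing execute bit or a symlink that has become a regular file.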

I have my ee gtalk address in my profile. Just keep in touch.

Cheers,
K.
0
 

Author Closing Comment

by:fklein23
ID: 37850335
I found the specific fix on my own. However two users offered sound advice, and I awarded points appropriately.
Thanks to you both.
0
