Suggestions for shell script optimization articles and/or software

I have a 2000+ line bash script that performs a maintenance activity, enumerates storage devices, and reports state. It needs to run every 10 seconds, and after a few rounds of optimization I've got it down to 4-8 seconds, depending on what changed between the last and current invocation. This script runs in an appliance, and rewriting it in C is not an option (the company that contracted me wants it as a shell script for readability and ease of maintenance). This runs on a multi-core system, and CPU overhead is not much of a bottleneck, but as we all know, shell scripts are basically pigs, so I want to be frugal with CPU resources without obsessing over them.

So rather than post a section and ask for suggestions, I thought I would open this up for something more useful to the community as a whole.

Q1: Are there any freeware or open source shell script profiling tools that could help find the areas of the code that are the biggest problems? Ideally, is there a low-overhead way to create a precision timer and start/stop it as necessary to show elapsed time, and what the holdup was? I.e., for a given chunk of code, can I see whether the process performed disk I/O, had any wait states, or context switched, which would then make it a candidate for some tweaking?

Q2: Any good articles, tips, or anything else in general on minimizing the execution time of shell scripts? This particular environment is the latest Linux kernel and I am using bash, so while general advice is useful, please keep in mind that if there is a product that can help, it has to work on bash/Linux.

For example, a tip might be, "If you have to perform frequent lookups that involve reading data files, make use of /dev/shm to store those results, and just see if whatever you are looking up has changed since the last polling time. If it has not changed, then just read the results from a tmp file stored in the /dev/shm ramdisk. If it has changed, or no cross reference exists, then enumerate as before, and save the results in /dev/shm."
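A minimal sketch of that /dev/shm tip, to make the idea concrete. The names CACHE_DIR, cached_lookup and expensive_lookup are made up for illustration, and the "expensive" step is just a line count standing in for the real enumeration. It assumes the thing being looked up is a file whose modification time changes when its contents do.

```shell
#!/usr/bin/env bash
# Hypothetical cache location on the /dev/shm ramdisk.
CACHE_DIR="${CACHE_DIR:-/dev/shm/myscript-cache}"
mkdir -p "$CACHE_DIR"

# Stand-in for the slow enumeration step; here it only counts lines.
expensive_lookup() {
    wc -l < "$1"
}

# Print the lookup result for file $1, recomputing only when the source
# file is newer than the cached copy or no cached copy exists yet.
cached_lookup() {
    local src=$1
    local cache="$CACHE_DIR/${src//\//_}"   # flatten the path into one file name
    if [[ ! -e $cache || $src -nt $cache ]]; then
        expensive_lookup "$src" > "$cache"
    fi
    printf '%s\n' "$(<"$cache")"            # $(<file) avoids running cat
}
```

Calling cached_lookup /path/to/datafile on every polling round then only pays for the enumeration when the data file has actually been modified.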

Or another: use the native (( )) feature to perform numerical calculations rather than calling the expr utility. I.e., (( x = x + 10 )) is more efficient than x=`expr $x + 10`
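For anyone who has not used it, the arithmetic tip looks roughly like this in practice. The variable names and values are only illustrative.

```shell
#!/usr/bin/env bash
x=5

# Slow form, shown for comparison only: forks and execs /usr/bin/expr
#   x=`expr $x + 10`

(( x = x + 10 ))      # builtin arithmetic, no fork; x is now 15
(( x += 10 ))         # shorthand form; x is now 25
y=$(( x * 2 ))        # arithmetic expansion yields a value; y is 50
```

One thing to watch for: (( ... )) returns a non-zero exit status when the expression evaluates to 0, which matters if the script runs under set -e.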

Thank you.  (yes, expect a lot of split points over next day or two)

David (President, Author) commented:
I did find a great one after I posted this that resolves Q1: a kernel timer. Here are the details for others. It works quite nicely, BTW.

timer_stats - timer usage statistics
------------------------------------

timer_stats is a debugging facility to make the timer (ab)usage in a Linux
system visible to kernel and userspace developers. If enabled in the config
but not used it has almost zero runtime overhead, and a relatively small
data structure overhead. Even if collection is enabled runtime all the
locking is per-CPU and lookup is hashed.

timer_stats should be used by kernel and userspace developers to verify that
their code does not make unduly use of timers. This helps to avoid unnecessary
wakeups, which should be avoided to optimize power consumption.

It can be enabled by CONFIG_TIMER_STATS in the "Kernel hacking" configuration
section.

timer_stats collects information about the timer events which are fired in a
Linux system over a sample period:

- the pid of the task(process) which initialized the timer
- the name of the process which initialized the timer
- the function where the timer was initialized
- the callback function which is associated to the timer
- the number of events (callbacks)

timer_stats adds an entry to /proc: /proc/timer_stats

This entry is used to control the statistics functionality and to read out the
sampled information.

The timer_stats functionality is inactive on bootup.

To activate a sample period issue:
# echo 1 >/proc/timer_stats

To stop a sample period issue:
# echo 0 >/proc/timer_stats

The statistics can be retrieved by:
# cat /proc/timer_stats

The readout of /proc/timer_stats automatically disables sampling. The sampled
information is kept until a new sample period is started. This allows multiple
readouts.

Sample output of /proc/timer_stats:

Timerstats sample period: 3.888770 s
  12,     0 swapper          hrtimer_stop_sched_tick (hrtimer_sched_tick)
  15,     1 swapper          hcd_submit_urb (rh_timer_func)
   4,   959 kedac            schedule_timeout (process_timeout)
   1,     0 swapper          page_writeback_init (wb_timer_fn)
  28,     0 swapper          hrtimer_stop_sched_tick (hrtimer_sched_tick)
  22,  2948 IRQ 4            tty_flip_buffer_push (delayed_work_timer_fn)
   3,  3100 bash             schedule_timeout (process_timeout)
   1,     1 swapper          queue_delayed_work_on (delayed_work_timer_fn)
   1,     1 swapper          queue_delayed_work_on (delayed_work_timer_fn)
   1,     1 swapper          neigh_table_init_no_netlink (neigh_periodic_timer)
   1,  2292 ip               __netdev_watchdog_up (dev_watchdog)
   1,    23 events/1         do_cache_clean (delayed_work_timer_fn)
90 total events, 30.0 events/sec

The first column is the number of events, the second column the pid, the third
column is the name of the process. The fourth column shows the function which
initialized the timer and in parenthesis the callback function which was
executed on expiry.

    Thomas, Ingo

Added flag to indicate 'deferrable timer' in /proc/timer_stats. A deferrable
timer will appear as follows
  10D,     1 swapper          queue_delayed_work_on (delayed_work_timer_fn)

I've been writing shell scripts for over 20 years, and the single biggest thing that really slows down shell scripts is processing a file line by line, e.g.:

while read line

This construct is OK for smallish files, but the slowdown gets worse and worse as the files get larger. In a lot of cases, a simple awk (or possibly sed/awk) command can be used to speed things up.

For example a loop with

while read f1 f2
do
   echo "$f1" >>output
done <file

is going to be far slower on a large file, often by orders of magnitude, than

awk '{print $1}' file >output

The other area which can slow things down a lot is unnecessary calls to external commands when you can use the shell builtin functions.  bash is quite powerful (for a shell) for things like string manipulation, so quite often, bash scripts that use cut and sed can be rewritten and sped up using the bash builtins.
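As a rough illustration of the kind of rewrite meant here, the following sketch replaces a few typical cut, sed, basename and dirname calls with bash parameter expansion and read. The sample record and path are invented for the example.

```shell
#!/usr/bin/env bash
line="sda:8:0:running"            # hypothetical colon-separated record

first=${line%%:*}                 # like: echo "$line" | cut -d: -f1
last=${line##*:}                  # last field, like: awk -F: '{print $NF}'
replaced=${line/running/ok}       # like: echo "$line" | sed 's/running/ok/'

# Split every field in one go without calling cut once per field.
IFS=: read -r dev major minor state <<< "$line"

path=/sys/block/sda/device        # hypothetical path
base=${path##*/}                  # like: basename "$path"
dir=${path%/*}                    # like: dirname "$path"
```

Each of the commented alternatives costs at least one fork and exec per call, so inside a loop that runs hundreds of times the builtin forms add up to a noticeable saving.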


Regarding Q1, the bash debugger may be useful.  I know it's not a profiler, but it can be useful for stepping through code to see what steps it is taking.

For the type of detailed info that you're after, I think DTrace would give you everything you need. If you're not familiar with DTrace, it was originally developed for Solaris kernels, but has since been ported to Linux, FreeBSD and OS X.


It is a very powerful and flexible tool, although it can take a while to learn how to use it.  There are a bunch of existing DTrace scripts you should be able to find that will help you.
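If all that is needed is elapsed time for a chunk of the script, a start/stop timer can also be built in bash itself. This is only a sketch: the function names now_us, timer_start and timer_stop are made up, it assumes bash 4 or later for the associative array, and it uses EPOCHREALTIME where available (bash 5.0+) with a fallback to GNU date. It says nothing about why a chunk was slow, only how long it took.

```shell
#!/usr/bin/env bash
declare -A TIMER_START=()         # start times in microseconds, keyed by label

# Set NOW_US to the current wall-clock time in microseconds.
now_us() {
    if [[ -n ${EPOCHREALTIME-} ]]; then
        # bash 5.0+: no fork; strip the decimal separator (. or ,)
        NOW_US=${EPOCHREALTIME/[.,]/}
    else
        # Fallback: one fork per reading; %N needs GNU date
        NOW_US=$(( $(date +%s%N) / 1000 ))
    fi
}

# timer_start LABEL: remember when the labelled section began.
timer_start() {
    now_us
    TIMER_START[$1]=$NOW_US
}

# timer_stop LABEL: set ELAPSED_US and report the elapsed time on stderr.
timer_stop() {
    now_us
    local start=${TIMER_START[$1]}
    ELAPSED_US=$(( NOW_US - start ))
    printf '%s: %d.%06d s\n' "$1" \
        $(( ELAPSED_US / 1000000 )) $(( ELAPSED_US % 1000000 )) >&2
}
```

Usage would be timer_start enumerate, then the code being measured, then timer_stop enumerate. The values are passed back in variables rather than through command substitution so that taking a reading does not itself fork a subshell on bash 5.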

For Q1, with "strace -f -T ..." you can see which system calls actually took time. Also check the calls that return errors ("grep ' E'" on the strace output).
For Q2, please check that PATH and LD_LIBRARY_PATH are set sensibly: keep them minimal, with the entries in the correct/optimized order.
David (President, Author) commented:
All wonderful tips. Every one of them allowed me to squeeze a few seconds off my script, so now it is much more efficient.

Another trick I used in addition to the timer was pre-processing the configuration and some metadata that I refer to in subsequent calls, and placing it in the /dev/shm directory.