I have a Linux-based application that was recently moved onto a faster server with lots of memory and fast disks. The application, written in C, basically performs the following steps (sketched in code just below the list):
Read data from a file
Allocate memory for the data
Add the data to a linked list
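To make that concrete, the core of the application looks roughly like this (the 64-byte record size, the file name, and the variable names are my simplification, not the actual code):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct node {
        char data[64];          /* assumed fixed-size record */
        struct node *next;
    };

    int main(void)
    {
        FILE *f = fopen("input.dat", "rb");   /* hypothetical input file */
        if (!f) return 1;

        struct node *head = NULL;
        char buf[64];
        while (fread(buf, 1, sizeof buf, f) == sizeof buf) {
            struct node *n = malloc(sizeof *n);   /* one malloc per record */
            if (!n) break;
            memcpy(n->data, buf, sizeof buf);
            n->next = head;                       /* prepend to the list */
            head = n;
        }
        fclose(f);
        /* ... walk the list, do the real work, exit ... */
        return 0;
    }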
When I run the application on the older server and monitor CPU usage, it spends about 20% of its time in I/O wait and 15% in kernel mode. When I move the application to the new, fast server, it spends about 1% in I/O wait and 40% in kernel mode. While this is sort of a good thing, if I run about 16 of these processes I can max out the CPUs on my new dual quad-core server. I need to do other things on this server, and this gets in the way.

When I analyze where the application is spending its time, on the old server it is mostly in read(), and on the new server it is mostly in malloc(). The new server has plenty of free memory, while memory on the old server is more limited. In top, I can watch the application's memory usage climb as the linked list is built. The list can be very long, and building it can take several minutes.
The obvious path is to rewrite the application so that the memory is allocated once at the beginning of data retrieval. The list is not open-ended, so the amount of memory required can be determined in advance (or at least close enough). However, changing the application will take some time and I would like to see if there is a way to tune the Linux OS to optimize performance of this application in its current state.
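For reference, when I do get around to the rewrite, I expect it to look something like this sketch: one big allocation up front, with nodes carved out of the pool (the bound of 10 million nodes is made up):

    #include <stdio.h>
    #include <stdlib.h>

    struct node {
        char data[64];
        struct node *next;
    };

    int main(void)
    {
        size_t max_nodes = 10UL * 1000 * 1000;   /* bound known in advance */
        struct node *pool = malloc(max_nodes * sizeof *pool);  /* one allocation */
        if (!pool) return 1;

        FILE *f = fopen("input.dat", "rb");
        if (!f) { free(pool); return 1; }

        struct node *head = NULL;
        size_t used = 0;
        while (used < max_nodes &&
               fread(pool[used].data, 1, sizeof pool[used].data, f)
                   == sizeof pool[used].data) {
            pool[used].next = head;    /* no per-node malloc */
            head = &pool[used];
            used++;
        }
        fclose(f);
        /* ... use the list ... */
        free(pool);
        return 0;
    }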
I can tweak system configuration parameters, user account limits, hardware, process priorities, and things like that. I could probably even patch the kernel if that would help. The one thing I can't easily do is modify the application.
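To give a concrete example of the kind of thing I could do without touching the binary: I could interpose the allocator with LD_PRELOAD. Below is a crude, untested sketch of a bump allocator that grabs one big anonymous mapping up front; the 8 GiB arena size and the no-op free() are my assumptions, and they only hold because this process builds its list and then exits:

    /* bump.c -- build: gcc -shared -fPIC -pthread -o bump.so bump.c
       run:   LD_PRELOAD=./bump.so ./the_application              */
    #define _GNU_SOURCE
    #include <string.h>
    #include <pthread.h>
    #include <sys/mman.h>

    #define ARENA_SIZE ((size_t)8 << 30)  /* assumed bound on total allocation */

    static unsigned char *arena;
    static size_t offset;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *malloc(size_t size)
    {
        void *p = NULL;
        pthread_mutex_lock(&lock);
        if (arena == NULL) {
            /* One mmap() instead of many brk()/mmap() calls; MAP_POPULATE
               could be added here to take all the page faults up front. */
            void *m = mmap(NULL, ARENA_SIZE, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (m == MAP_FAILED) { pthread_mutex_unlock(&lock); return NULL; }
            arena = m;
        }
        size = (size + 15) & ~(size_t)15;          /* keep 16-byte alignment */
        if (offset + size <= ARENA_SIZE) {
            p = arena + offset;
            offset += size;
        }
        pthread_mutex_unlock(&lock);
        return p;
    }

    /* The application builds its list once and exits, so never reclaim. */
    void free(void *ptr) { (void)ptr; }

    void *calloc(size_t n, size_t size)
    {
        if (size != 0 && n > (size_t)-1 / size) return NULL;   /* overflow */
        void *p = malloc(n * size);
        if (p) memset(p, 0, n * size);
        return p;
    }

    void *realloc(void *ptr, size_t size)
    {
        void *p = malloc(size);
        /* Block sizes aren't tracked, so this over-copies; the extra source
           bytes still lie inside the mapped arena, so it is safe here. */
        if (p && ptr) memcpy(p, ptr, size);
        return p;
    }

The idea would be to collapse thousands of brk()/mmap() calls into a single mapping; enabling transparent hugepages or MAP_POPULATE should also cut the page-fault portion of the kernel time. I don't know whether this is the right direction, though, or whether plain allocator tuning (e.g. the glibc MALLOC_* environment variables) would get me most of the way there.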
I realize that by changing the priorities of the processes, I can "buy back" some of the throttling effect that the slower I/O subsystem provided on the older machine. But that has the net effect of extending the time it takes to build the linked list, so it doesn't count as a valid answer to this question. What I am looking for is an approach that reduces the total time the application spends in kernel mode.
Thank you in advance.