Tomcat, Arch Linux memory optimization

I have a web app running in Apache Tomcat 5.5 under Arch Linux. The application has very high memory usage - the JVM is collecting about 10 GB/minute. The whole thing is running on a box with 8 GB of memory, and due to the high rate at which the application consumes memory, we've allocated 6 GB of heap memory to the Tomcat JVM. Now, what I've discovered (or at least how it looks to me) is that Arch Linux allocates a file cache in RAM which appears to grow to use all available memory. My sense is that in our case this is not ideal, since the JVM's memory usage fluctuates wildly (a full GC collects over 2 GB of memory). The problem we're having is that the server regularly runs out of free memory and starts to swap, forcing a restart.
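
For context, the 6 GB heap cap is set in the environment Tomcat starts with, roughly like this (a sketch only - the exact file and flags on our box may differ slightly):

# e.g. in $CATALINA_HOME/bin/setenv.sh, or wherever JAVA_OPTS is exported
# before catalina.sh runs (values below are illustrative)
JAVA_OPTS="$JAVA_OPTS -Xms6g -Xmx6g"
export JAVA_OPTS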

Is there a way in Arch Linux to configure a maximum size for the file cache? I want to be able to cap both the JVM's and the file cache's memory usage to get a more predictable memory usage profile. A lot of the application works against a file-based BDB on an SSD, so file caching in RAM should not be crucial for performance.

Of course, other perspectives/ideas on the problem would also be appreciated.
pellepAsked:

woolmilkporcCommented:
Hi,
there is a kernel option that helps balance the use of memory between the block device cache and swap space. vm.swappiness is a kernel parameter you can tune to adjust how the kernel balances keeping the block device cache against swapping application memory to disk. The default is 60 and the scale is 0-100, with 0 meaning "reclaim cache pages before swapping applications" and 100 meaning "swap as much as you can before giving up block device cache". To view the current value use
sysctl vm.swappiness
and to change it use
sysctl -w vm.swappiness=x
So, if you lower x, the kernel will reclaim more of the cache pages before swapping, leaving less memory for disk buffering.
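If you want the setting to survive a reboot, you would typically put it into /etc/sysctl.conf (assuming your Arch setup reads that file at boot - newer installs use /etc/sysctl.d/ instead), for example:
# example value only - tune to taste
echo "vm.swappiness = 10" >> /etc/sysctl.conf
sysctl -p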
HTH
wmp
 
ksivananthCommented:
If yours is the only application running on the 8 GB box and you have set 6 GB as the max heap, there is no reason for the system to swap memory out to virtual memory. The server will swap only if the running processes need more memory than is available (8 GB in this case)!

It looks like your app needs more than the allocated 6 GB... run a profiler and see if you can optimize the memory utilization!
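For example, with the standard JDK tools you could grab a heap dump and watch GC behaviour roughly like this (a sketch only - jmap -dump needs a Java 6+ JDK, and the PID and file name are placeholders):
# find the Tomcat JVM's process id
jps -l
# dump the live heap for offline analysis (e.g. in Eclipse MAT or jhat)
jmap -dump:live,format=b,file=/tmp/tomcat-heap.hprof <pid>
# print heap/GC utilisation every 5 seconds to watch the fluctuation
jstat -gcutil <pid> 5000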
woolmilkporcCommented:
There is a nice discussion about swappiness. Find it here -
http://kerneltrap.org/node/3000
wmp
 

pellepAuthor Commented:
>> woolmilkporc

Swappiness is definitely interesting and I'm going to try playing around with it. It doesn't seem to concern the disk cache directly though, only how/when application memory is swapped out.

What I'm concerned with is the size of the disk cache (the terminology differs: disk cache, file read cache, file IO cache, etc.). The information I've been able to gather so far is that "Linux will use as much RAM as it can to cache file reads". Is there a way to cap the size of this cache, or is that completely inappropriate?
woolmilkporcCommented:
swappiness does exactly that - lowering swappiness means using less memory for disk cache in favor of process memory.
Mick Barry (Java Developer) Commented:
Do you expect your app to be using that much memory? That's the first thing I'd be looking at.

pellepAuthor Commented:
>> objects
Unfortunately, yes. The app wraps a huge lucene index (>40 GB), BDB-backed on an SSD. We're still analyzing heap dumps to confirm, but my sense is that large memory consumption is inevitable. Certainly, re-architecting the solution is an avenue I'm considering (I didn't build the original application), either to size down the lucene index or to split it into smaller, discrete indices spread out over multiple servers. Bottom line, the size of the index (I'm assuming) inevitably leads to lots of file IO and lots of temporary memory used in the web app for the lucene processing.

>> woolmilkporc
I think I'm starting to understand swappiness a little better now. It's definitely something I'm going to tweak, but I'm still curious about my original query: is there a way to set an absolute ceiling on how much RAM the kernel will allocate for the file read cache (I assume this is the same thing as the "block device cache" you referred to above)? My ultimate goal here is to force a scenario where the OS never commits more than 6 GB to the JVM and 1 GB to the file read cache, leaving 1 GB of RAM always guaranteed available for the inevitable overhead (sshd and such). Essentially, taking the decision on how/when to grow or shrink the cache away from the OS.
Or, is there some "hidden" or less obvious switch I can give to the mount command to completely disable caching for my SSD, for instance? Since it's solid-state, I'm hoping it would be fast enough as it is to not need caching.

If all this smells of madness, please enlighten me. I'm not a hard-core Linux admin so if I'm making faulty assumptions here I'd happily be corrected.
woolmilkporcCommented:
Yes. I used 'block device cache' and 'disk cache' synonymously here.
There is no 'absolute ceiling' option I'm aware of. You can only influence the page stealing algorithm:
Quoting myself (slightly modified) - "0 means steal all cache pages before swapping applications, and 100 means steal as many process memory pages as you can before using block device cache."
If no page stealing is necessary (enough memory available), cache memory will be allocated regardless of the 'swappiness' setting.

As for mount options - you can always try the 'sync' option, meaning: write data to disk as soon as possible.
Of course I can't tell how this will influence I/O performance in your environment.  
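For illustration, that would look something like the line below in /etc/fstab (device, mount point and filesystem type are placeholders, not a recommendation):
# mount the SSD holding the BDB files with synchronous writes
/dev/sdb1   /data/bdb   xfs   defaults,sync   0  2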
I don't know if this is equivalent to the 'directio' or 'dio' option found on some systems to bypass the cache completely.
wmp
 
pellepAuthor Commented:
>> woolmilkporc

Do you know if "sync" has any effect on reading, or is it only relevant for write operations?

Directio seems like it might be interesting to play around with as well. Although, all the info I've found online about it indicates that it's a flag you specify programmatically when doing file IO. Are you aware of a switch/configuration that forces all IO operations against a device/mount to use directio?

Another (minor) follow-up, if you'll allow me (since you seem knowledgeable): I'm assuming that XFS would be the best choice of FS for storing the large (>40 GB) lucene index file we're using. Would you agree that's a fair assumption? Or is there a better one? Our priority is performance in read/scan operations.
woolmilkporcCommented:
Hi again,
no, 'sync' is an option only for write operations.
Some systems (AIX for example) recognize a 'dio' mount option.
Basically you're right,  I/O on a filesystem mounted that way will behave as if all the files had been opened with O_DIRECT specified in the open() system call. Consequently, if your system doesn't support such a mount option, all concerned files have to be opened using O_DIRECT. There is no other switch to achieve direct I/O, afaik.
XFS is very well suited for large files. If striping is in use, the stripe unit can be customized at XFS filesystem creation time to ensure better alignment. Block sizes can also be tuned; larger sizes can give better performance for large files (at the cost of a bit more memory). Direct I/O is supported by XFS, but there is no mount option for it that I'd be aware of.
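For illustration only (the values are made up and xfsprogs syntax may vary between versions), that tuning happens at filesystem creation time, roughly:
# 4k blocks, 64k stripe unit, 4-disk stripe width - adjust to your volume layout
mkfs.xfs -b size=4096 -d su=64k,sw=4 /dev/sdb1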
Personal note - since we switched our bigger (production) systems to AIX (jfs2), my working knowledge of XFS is surely outdated and might not be entirely accurate either.
HTH anyway,
Cheers
wmp
 
pellepAuthor Commented:
Thanks for all the input and guidance, it definitely set me a good bit further along the way. Your suggestions sound very reasonable given the somewhat limited information I managed to provide, and you've explained them with commendable clarity and detail. I have a few more "knobs" to experiment with now. Hopefully this will help stabilize our system so I don't end up having to re-design the whole mess...