Throughout my experience working on eCommerce web applications, I have seen applications succumb to increased user demand and throughput. Under increased load, response times spike, which leads to user frustration and lost sales. I have seen several outages during the holiday season and promotional events, which can be catastrophic for a business.
The cause is usually a performance bottleneck or a scalability issue. Web applications usually get their data from relational databases, and if an application is not designed well, the database becomes the bottleneck. Scaling an enterprise database like Oracle or DB2 is expensive and only takes us so far, because most relational databases are not easy to scale horizontally. The other common bottleneck is web service calls. Under high load, web service calls can bring down a website that was not designed for scalability and performance.
I recommend caching for almost any application. Though I have used memcached as a caching provider and will use it for the examples in this article, I am not here to propose any one caching solution. For simple websites, a simple map-based solution such as WhirlyCache might work pretty well, though you can still use more advanced solutions like EHCache, memcached, JCS, Oracle Coherence, and many more. There is an upfront cost to using some of the advanced distributed caching solutions, but it pays off over the longer term with better scalability and maintainability.
I will use memcached to explain the caching concepts and examples in this article. More information about memcached can be found at http://memcached.org/
Let’s take a hypothetical example where a customer is logged in to the site and is browsing different pages. We might show a welcome message in the header of each page and some personalized content on different pages. To fetch the user information we would need to query the database on each page, which is unnecessary because the user information does not change with each request. So we will cache the customer information to reduce requests to the database.
memcached is a distributed memory object caching system, which means the cache runs outside the application's Java virtual machine. memcached has two components: the server and the client. For web applications written in Java, one client library that implements the memcached protocol is spymemcached. For client libraries in other popular languages, refer to https://code.google.com/p/memcached/wiki/Clients
Once you set up the server, you can refer to it by host and port. On login, you can cache the customer object as follows:
// Connect to the memcached server by host and port.
MemcachedClient c = new MemcachedClient(
        new InetSocketAddress("hostname of memcache server", portNum));
// Cache the customer object for 1 hour (3600 seconds). Here customerKey is a unique key per customer.
c.set(customerKey, 3600, customerObject);
You can build the customer key from the session ID, the customer ID, or a combination of both, depending on your application's needs.
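As a minimal sketch of that key choice, the helper below builds a key from the customer ID and, optionally, the session ID. The class name `CacheKeys`, the method name, and the `customer:` prefix are all hypothetical. The checks reflect memcached's key rules: at most 250 bytes, with no whitespace or control characters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the class name, method name, and "customer:" prefix are illustrative only.
final class CacheKeys {
    // memcached does not accept keys longer than 250 bytes.
    private static final int MAX_KEY_BYTES = 250;

    private CacheKeys() {}

    // Builds a key from the customer ID, optionally scoped to a single session.
    static String customerKey(String customerId, String sessionId) {
        String key = (sessionId == null)
                ? "customer:" + customerId
                : "customer:" + customerId + ":" + sessionId;
        if (key.getBytes(StandardCharsets.UTF_8).length > MAX_KEY_BYTES) {
            throw new IllegalArgumentException("Key exceeds 250 bytes");
        }
        // memcached keys must not contain whitespace or control characters.
        for (int i = 0; i < key.length(); i++) {
            char ch = key.charAt(i);
            if (Character.isWhitespace(ch) || Character.isISOControl(ch)) {
                throw new IllegalArgumentException("Key contains whitespace or control characters");
            }
        }
        return key;
    }
}
```

A key scoped only to the customer ID lets every session of that customer share one cached object, while adding the session ID keeps entries separate per login.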
We can now bypass the database for any future use of the customer object by fetching it from the cache:
// Retrieve customer object (synchronously).
Object customerObject = c.get(customerKey);
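A cache lookup can miss, for example after the one-hour expiry, so the read path usually falls back to the database and repopulates the cache. The following is a minimal sketch of that pattern, not the article's original code. `CustomerCache`, `InMemoryCustomerCache`, and `CustomerLookup` are hypothetical names, and the in-memory map is only a stand-in so the example runs without a memcached server. In a real application a memcached client would sit behind the interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a cache client; a memcached client would fill this role.
interface CustomerCache {
    Object get(String key);
    void set(String key, int ttlSeconds, Object value);
}

// In-memory implementation so the sketch runs on its own. It ignores the TTL.
final class InMemoryCustomerCache implements CustomerCache {
    private final Map<String, Object> store = new ConcurrentHashMap<>();

    public Object get(String key) {
        return store.get(key);
    }

    public void set(String key, int ttlSeconds, Object value) {
        store.put(key, value);
    }
}

final class CustomerLookup {
    private final CustomerCache cache;
    // Hypothetical database query, passed in as a function.
    private final Function<String, Object> loadFromDatabase;

    CustomerLookup(CustomerCache cache, Function<String, Object> loadFromDatabase) {
        this.cache = cache;
        this.loadFromDatabase = loadFromDatabase;
    }

    // Tries the cache first; on a miss, loads from the database and caches for 1 hour.
    Object findCustomer(String customerKey) {
        Object customer = cache.get(customerKey);
        if (customer == null) {
            customer = loadFromDatabase.apply(customerKey);
            if (customer != null) {
                cache.set(customerKey, 3600, customer);
            }
        }
        return customer;
    }
}
```

With this shape, repeated calls to `findCustomer` for the same key reach the database only on the first call or after the entry has expired.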
One might now ask: why not use a session-scoped object instead? The answer is that session-scoped objects load the application heap. We would offload calls from the database only to push the problem to a different layer. With memcached we can scale horizontally by adding more application instances and/or memcached instances.
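To illustrate how several memcached instances can share the load, here is a simplified sketch of client-side key distribution: the client hashes each key and picks one server from its list. This is an assumption-laden illustration, not the algorithm spymemcached actually uses. `ServerSelector` is a hypothetical class, and real clients apply their own hashing strategies.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of client-side key distribution; not a real client's algorithm.
final class ServerSelector {
    private final List<String> servers;

    ServerSelector(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("At least one server is required");
        }
        this.servers = new ArrayList<>(servers);
    }

    // Maps a key to one server. The same key always maps to the same server
    // as long as the server list does not change.
    String serverFor(String key) {
        int index = Math.floorMod(key.hashCode(), servers.size());
        return servers.get(index);
    }
}
```

Because every application instance computes the same mapping, any instance can find an object that another instance cached. With plain modulo hashing, changing the server list remaps most keys, which is one reason real clients offer other strategies.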
Similarly, one can cache web service responses when the same request always produces the same response (or at least the same response for calls made close together).
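The "same for immediate calls" case maps naturally to a short expiry time. The sketch below shows the idea with a small in-memory cache so that it runs on its own. `ResponseCache` and its parameters are hypothetical, and the clock is injected only to make the expiry easy to demonstrate. With memcached, the expiry argument passed to `set` plays the same role.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical short-lived cache for web service responses, keyed by request.
final class ResponseCache {
    private static final class Entry {
        final String response;
        final long expiresAtMillis;

        Entry(String response, long expiresAtMillis) {
            this.response = response;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // supplies the current time in milliseconds

    ResponseCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached response while it is fresh; otherwise calls the service and caches the result.
    String get(String request, Function<String, String> callService) {
        long now = clock.getAsLong();
        Entry cached = entries.get(request);
        if (cached != null && now < cached.expiresAtMillis) {
            return cached.response;
        }
        String fresh = callService.apply(request);
        entries.put(request, new Entry(fresh, now + ttlMillis));
        return fresh;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`, and the expiry would be chosen based on how stale a response the application can tolerate.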
If you are using Spring, spymemcached provides an integration that avoids the boilerplate code for putting objects into and getting them from the cache: http://code.google.com/p/spymemcached/wiki/SpringIntegration
Caching is a huge topic and I am pretty sure I have left lots of room for questions and clarification. Feel free to send me your comments and feedback.