
Improving Web Site Performance via PHP Cache

Introduction
This article explores the design of a cache system that can improve the performance of a web site or web application.  The assumption is that the web site has many more “read” operations than “write” operations (this is commonly the case for informational sites).  For this reason, the site should be able to recognize repeated identical requests and return an immediate cached response, rather than repeating the database queries needed to rebuild the original response.

The rationale for this strategy comes from the difference in speed between in-memory processes and disk-based processes.  While memory access is typically measured in nanoseconds, even a fast disk spinning at 7200 RPM requires about 8.3 milliseconds for a single rotation, and file lookups or database operations may require a great many disk rotations for some queries.  Since a millisecond is one million nanoseconds, a difference of six orders of magnitude, it follows that a cache can produce substantial, measurable improvements in server performance.
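
The arithmetic behind that comparison can be checked with a few lines of PHP.  This is only a sketch: the 100 nanosecond memory access time is an illustrative assumption, not a measurement, and real figures vary by hardware.

```php
<?php
// Rotation time for a 7200 RPM disk, in milliseconds
$rpm        = 7200;
$rotationMs = 60 / $rpm * 1000;

// ASSUMPTION: 100 ns IS AN ILLUSTRATIVE MEMORY ACCESS TIME, NOT A MEASUREMENT
$memoryNs = 100;

// Convert the rotation time to nanoseconds before comparing
$ratio = ($rotationMs * 1000000) / $memoryNs;

printf("One rotation: %.2f ms\n", $rotationMs);
printf("Memory accesses per rotation: roughly %d\n", $ratio);
```

Under that assumption, tens of thousands of memory accesses fit into the time of a single disk rotation.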

Characteristics of a Cache
Popular cache systems include Memcached and Redis.  It is also possible to use the file system for cache storage, but in-memory systems will give the best performance.  All cache systems work in similar ways: they are key:value data storage systems.  Access to a value in the cache is made by reference to the key; the cache system associates the key with the value, much like the keys of an associative array or the property names of an object.  The cache system provides for expiration of the cached values based on elapsed time since the key:value pair was most recently stored.  The API for a cache system includes methods to put() information into the cache, get() information out of the cache, and delete() information from the cache.  There may be other methods in the API; for example, Laravel offers a has() method to ask whether a key exists without actually returning the associated value.  Our Cache interface also defines a flush() method to remove all data from the cache.

Because there could be more than one concrete implementation of the cache, we will "code to the interface" and follow the standard practices of abstraction in object-oriented design.  By doing our design this way we can write code that uses the cache without any need to concern ourselves with the exact implementation of the cache.  Interface-driven design has several advantages.  One obvious advantage is that the programmers who agree upon the interface can work independently to develop the implementations and the use cases.  Although we are using the PHP session in our demonstration script below, we could easily switch to Redis by replacing only the CacheDemo class.  None of the rest of our script would require a change.

A simple example of the cache API is shown here:
<?php

/**
 * A Cache API
 */
error_reporting(E_ALL);

Interface Cache
{
    public static function put($key, $value, $duration);
    public static function get($key);
    public static function delete($key);
    public static function flush();
}
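
To illustrate what "coding to the interface" buys us, here is a minimal sketch of a second implementation.  The ArrayCache class is hypothetical (it is not part of the original demonstration); it keeps its data in a static array, so it only lives for the duration of one request, which may be enough for unit tests.

```php
<?php
interface Cache
{
    public static function put($key, $value, $duration);
    public static function get($key);
    public static function delete($key);
    public static function flush();
}

// HYPOTHETICAL: A PER-REQUEST, IN-MEMORY IMPLEMENTATION OF THE SAME INTERFACE
class ArrayCache implements Cache
{
    // EACH ENTRY IS AN ARRAY WITH 'value' AND 'expiry' KEYS
    private static $data = [];

    public static function put($key, $value, $duration)
    {
        self::$data[$key] = ['value' => $value, 'expiry' => time() + $duration];
    }

    public static function get($key)
    {
        if (!isset(self::$data[$key])) return FALSE;
        if (self::$data[$key]['expiry'] > time()) return self::$data[$key]['value'];
        self::delete($key);
        return FALSE;
    }

    public static function delete($key)
    {
        unset(self::$data[$key]);
    }

    public static function flush()
    {
        self::$data = [];
    }
}

// THE CALLING CODE USES ONLY THE INTERFACE METHODS, SO THE CLASS NAME
// IS THE ONLY THING THAT CHANGES WHEN THE IMPLEMENTATION IS SWAPPED
$cacheClass = 'ArrayCache';
$cacheClass::put('greeting', 'hello', 60);
echo $cacheClass::get('greeting'), PHP_EOL;   // prints "hello"
```

Because the calling code never touches the storage directly, a Memcached or Redis implementation could be substituted in the same way.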



Using the Cache
When the server receives a request, it can check the cache to see if the requested resource is available.  A call to the get() method will return the resource, or FALSE if the key does not exist or the resource has expired.  When the server must generate a resource from database queries or other time-consuming activities, it can store the generated resource in the cache with a call to the put() method.  The call to put() provides the key and value to be stored, as well as a duration in seconds.  The keys in the cache are unique; a put() with a key that matches an existing resource will overwrite the stored data.  When the server needs to remove a resource from the cache, a call to delete() with the named key will eliminate the cached resource.  In certain circumstances it may be necessary to remove all cached information; the flush() method will remove all resources, resulting in an empty cache.
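
The get-then-put sequence described above can be wrapped in a single helper.  The following is a sketch under stated assumptions: TinyCache is a hypothetical stand-in for any class that offers put() and get(), and remember() is a hypothetical helper name, not part of the article's API.

```php
<?php
// HYPOTHETICAL STAND-IN FOR ANY CLASS THAT OFFERS put() AND get()
class TinyCache
{
    private static $data = [];

    public static function put($key, $value, $duration)
    {
        self::$data[$key] = ['value' => $value, 'expiry' => time() + $duration];
    }

    public static function get($key)
    {
        if (!isset(self::$data[$key])) return FALSE;
        if (self::$data[$key]['expiry'] <= time())
        {
            unset(self::$data[$key]);
            return FALSE;
        }
        return self::$data[$key]['value'];
    }
}

// HYPOTHETICAL HELPER: RETURN THE CACHED VALUE, OR BUILD IT AND CACHE IT
function remember($key, $duration, callable $producer)
{
    $value = TinyCache::get($key);
    if ($value !== FALSE) return $value;    // CACHE HIT

    $value = $producer();                   // CACHE MISS: DO THE SLOW WORK
    TinyCache::put($key, $value, $duration);
    return $value;
}

$calls = 0;
$slowQuery = function () use (&$calls) {
    $calls++;                               // COUNT HOW OFTEN THE SLOW PATH RUNS
    return 'report data';
};

remember('report', 30, $slowQuery);
remember('report', 30, $slowQuery);
echo "Slow path ran $calls time(s)", PHP_EOL;   // prints "Slow path ran 1 time(s)"
```

One consequence of using FALSE to signal a miss is that a resource whose value is FALSE cannot be told apart from a missing one, so it would be regenerated on every request.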

A “resource” can be any data element that the application needs to store.  It might be a query results set, a web page fragment, or even an entire web page – HTML and all.  A key can be any data element that identifies the resource.  It can be an entire HTTP request, identified by the full URL and query string.  The WWW follows a RESTful design: the web browser makes a GET-method request to a URL, and the server responds with a web page containing the requested information.  It follows that a web page made up from several complex queries, API calls, etc., can be an excellent candidate for cache storage.
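
As one possible way to turn a URL and query string into a key, the sketch below normalizes the query arguments before hashing.  The cacheKeyForUrl() function and the 'page:' prefix are hypothetical, and the sorting step assumes that the order of the query arguments does not matter to the application.

```php
<?php
// HYPOTHETICAL HELPER: DERIVE A CACHE KEY FROM A REQUEST URL
function cacheKeyForUrl($url)
{
    $path  = (string) parse_url($url, PHP_URL_PATH);
    $query = (string) parse_url($url, PHP_URL_QUERY);

    // SORT THE QUERY ARGUMENTS SO ?a=1&b=2 AND ?b=2&a=1 SHARE ONE KEY
    parse_str($query, $args);
    ksort($args);

    // HASH THE NORMALIZED URL TO GET A SHORT, FIXED-LENGTH KEY
    return 'page:' . md5($path . '?' . http_build_query($args));
}

$first  = cacheKeyForUrl('/products.php?color=red&size=9');
$second = cacheKeyForUrl('/products.php?size=9&color=red');
var_dump($first === $second);   // bool(true)
```

Hashing keeps the key short and free of awkward characters, at the cost of making the stored keys harder to read when debugging.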

A cached resource is considered to be an element.  It cannot be subdivided or augmented; it can only be added, removed or replaced.  A cached resource can be represented by an object instance of the CacheElement class:
Class CacheElement
{
    public $key;
    public $value;
    public $expiry;

    public function __construct($key, $value, $duration)
    {
        $this->key    = $key;
        $this->value  = $value;
        $this->expiry = time() + $duration;
    }
}



A Concrete Implementation of the Cache Methods
Each cache element is represented by a CacheElement object. 

The put() method stores a CacheElement object in the collection of elements, using the key to create or overwrite any existing element with a matching key. 

The delete() method removes the CacheElement object with the matching key, and removes the key from the collection. 

The flush() method removes all CacheElement objects and removes all keys from the collection. 

The get() method attempts to locate the element associated with the key argument.  If no matching key is found, it returns FALSE.  If a matching key is found in the collection, get() checks the current time against the expiration time of the CacheElement object.  If the object has not expired, get() returns the value property.  If the object has expired, it calls delete() and returns FALSE. 

For methods other than get() the return value is undefined.  Here is the code that implements the Cache:
Class CacheDemo implements Cache
{
    public static function put($key, $value, $duration)
    {
        $_SESSION['cache_data'][$key] = new CacheElement($key, $value, $duration);
    }

    public static function get($key)
    {
        if (empty($_SESSION['cache_data'][$key])) return FALSE;
        if ($_SESSION['cache_data'][$key]->expiry > time()) return $_SESSION['cache_data'][$key]->value;
        self::delete($key);
        return FALSE;
    }

    public static function delete($key)
    {
        unset($_SESSION['cache_data'][$key]);
    }

    public static function flush()
    {
        // RESET THE COLLECTION; THIS ALSO WORKS WHEN NOTHING HAS BEEN CACHED YET
        $_SESSION['cache_data'] = array();
    }
}
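
The expiration logic in get() can be exercised without a browser or a session.  In this sketch a plain array stands in for $_SESSION['cache_data'], and lookup() is a hypothetical function that repeats the same decision sequence as get().  A duration of zero produces an element that is already expired by the time it is read.

```php
<?php
class CacheElement
{
    public $key;
    public $value;
    public $expiry;

    public function __construct($key, $value, $duration)
    {
        $this->key    = $key;
        $this->value  = $value;
        $this->expiry = time() + $duration;
    }
}

// ASSUMPTION: A PLAIN ARRAY STANDS IN FOR $_SESSION['cache_data']
$store = [];

// HYPOTHETICAL: THE SAME DECISIONS AS get(): MISSING, STILL FRESH, OR EXPIRED
function lookup(array &$store, $key)
{
    if (empty($store[$key])) return FALSE;
    if ($store[$key]->expiry > time()) return $store[$key]->value;
    unset($store[$key]);            // EXPIRED: REMOVE IT, THEN REPORT A MISS
    return FALSE;
}

$store['fresh'] = new CacheElement('fresh', 'still good', 30);
$store['stale'] = new CacheElement('stale', 'too old', 0);   // EXPIRES AT ONCE

var_dump(lookup($store, 'fresh'));      // string(10) "still good"
var_dump(lookup($store, 'stale'));      // bool(false)
var_dump(lookup($store, 'missing'));    // bool(false)
var_dump(isset($store['stale']));       // bool(false)
```

The last line shows that reading an expired element also removes it from the collection.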



A Simulation Strategy to Show the Value of Cache in Action
For our example we will use the PHP session as the cache storage repository.  This is not as high-performance as Memcached or Redis, and because session data is private to each visitor it is not shared among clients the way a production cache would be.  It does, however, provide a durable and stateful data storage tool that will allow us to make a few requests, and thereby see the behavior and value of cache.  We simulate the complexity and time requirements of uncached data requests by using PHP sleep() to add a few seconds to each request that cannot be satisfied by cache.  As each response is created, it will be cached.  The cached responses will be stored in an array of objects located at $_SESSION['cache_data'].

Our simulation script will be intentionally very simple so we can illustrate the cache behavior without other activities that might distract from the message.  The script will use the GET argument, key=, to determine its action.  If the key is omitted or empty, the script will report "No key" and stop.  If key=flush, the script will flush the cache.

For all other values of key= the script will act like a well-behaved web server that is aware of its ability to cache responses.  It will attempt to get() a matching response from the cache, and will return the cached response if one is available.  Failing that, it will generate the response (a slow process) and store the response in the cache before returning the response.  Subsequent requests can then be satisfied from cache (a fast process) until the cache expires and the response must be regenerated.

Here is the complete demonstration script.  You can copy it and install it on your own server to study, test and experiment.
<?php // demo/EE_cache_demo.php

/**
 * A Cache API
 */
error_reporting(E_ALL);

Interface Cache
{
    public static function put($key, $value, $duration);
    public static function get($key);
    public static function delete($key);
    public static function flush();
}

Class CacheElement
{
    public $key;
    public $expiry;
    public $value;

    public function __construct($key, $value, $duration)
    {
        $this->key    = $key;
        $this->value  = $value;
        $this->expiry = time() + $duration;
    }
}

Class CacheDemo implements Cache
{
    public static function put($key, $value, $duration)
    {
        $_SESSION['cache_data'][$key] = new CacheElement($key, $value, $duration);
    }

    public static function get($key)
    {
        if (empty($_SESSION['cache_data'][$key])) return FALSE;
        if ($_SESSION['cache_data'][$key]->expiry > time()) return $_SESSION['cache_data'][$key]->value;
        self::delete($key);
        return FALSE;
    }

    public static function delete($key)
    {
        unset($_SESSION['cache_data'][$key]);
    }

    public static function flush()
    {
        // RESET THE COLLECTION; THIS ALSO WORKS WHEN NOTHING HAS BEEN CACHED YET
        $_SESSION['cache_data'] = array();
    }
}


// WE USE THE PHP SESSION FOR THE CACHE
session_start();


// THE ACTIONS THAT CAN BE PERFORMED BY THIS SCRIPT
$key = (!empty($_GET['key'])) ? strtolower($_GET['key']) : NULL;


// SPECIAL CASE KEY: OMITTED
if (!$key) die('No key');


// SPECIAL CASE KEY: FLUSH THE CACHE
if ($key == 'flush')
{
    CacheDemo::flush();
    $response = 'At ' . date('H:i:s') . ', the cache was flushed';
    die($response);
}


// TRY THE CACHE FOR THIS KEY
$response = CacheDemo::get($key);

// IF THE RESPONSE HAS BEEN CACHED
if ($response !== FALSE)
{
    die('Retrieved from cache at ' . date('H:i:s ') . $response);
}


// IF NO CACHED REPRESENTATION IS AVAILABLE YET
else
{
    // PRETEND IT TAKES A LONG TIME TO GENERATE THIS COMPLICATED RESPONSE
    sleep(3);

    // CREATE THE RESPONSE (ESCAPING THE CLIENT-SUPPLIED KEY) AND CACHE IT FOR HALF A MINUTE
    $response = 'This response to the request for: ' . htmlspecialchars($key) . ' was created at ' . date('H:i:s');
    CacheDemo::put($key, $response, 30);

    // RETURN THE NEWLY CREATED RESPONSE
    die($response);
}



What Happens in Practice?
We can see the behavior illustrated in these screenshots, which include the browser URL, the browser output, and the console.  First, we flush the cache.  This is a fast process, returning a complete response in less than 1/10 second.
[Screenshot: flush-cache.png]
Next we make a request for a page that is not cached.  As we can see, this is a slow process.  The console shows us that the request was not completed for more than three seconds.  The next request to this URL will fare much better.
[Screenshot: slow-response-before-cache.png]
When we request the same resource again, we see a nearly instantaneous response.  Because the data was in the cache, we are able to return it without any delay.  From the timestamp at the end of the response, we can see that this is the same response data we created above.  Instead of taking three seconds to generate the page, the cached response was completed in less than 1/10 second.
[Screenshot: fast-response-from-cache.png]
After a period of time, the cache expiration will have passed, and the page must be regenerated.  When that happens it will again take a long time to generate the page, but there will be a new timestamp on the end of the response, showing that we have gotten the latest data from the server.
[Screenshot: after-cache-expiration.png]
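
The same slow-then-fast pattern can also be measured from the command line.  This is a rough sketch rather than a benchmark: generatePage() and fetchPage() are hypothetical names, a plain array stands in for the cache, and usleep() with a 0.2 second delay replaces the three-second sleep() so the script finishes quickly.

```php
<?php
// HYPOTHETICAL STAND-IN FOR SLOW PAGE GENERATION
function generatePage($key)
{
    usleep(200000);   // 0.2 SECONDS OF SIMULATED WORK
    return 'Page for ' . $key;
}

// ASSUMPTION: A PLAIN ARRAY STANDS IN FOR THE CACHE
$cache = [];

function fetchPage(array &$cache, $key)
{
    if (isset($cache[$key])) return $cache[$key];   // FAST PATH
    $cache[$key] = generatePage($key);              // SLOW PATH, THEN STORE
    return $cache[$key];
}

// FIRST REQUEST: NOTHING IS CACHED YET
$start = microtime(TRUE);
fetchPage($cache, 'home');
$uncached = microtime(TRUE) - $start;

// SECOND REQUEST: SATISFIED FROM THE CACHE
$start = microtime(TRUE);
fetchPage($cache, 'home');
$cached = microtime(TRUE) - $start;

printf("Uncached: %.3f s, cached: %.6f s\n", $uncached, $cached);
```

The exact numbers will differ from machine to machine, but the second call should take only a small fraction of the time of the first.
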
Summary
This article has shown the design and desirability of a server-side cache algorithm.  The benefits of cache include faster responses to the client, decreased loads on the server, and reduced calls to external APIs.  Cache is appropriate whenever generation of a web page is "slow" relative to the arrival of requests.  In a heavily used web site, even a short-term cache can provide great performance benefits.  Cache is also appropriate when there is a desire to capture an "image" of a web resource and serve the same image over a period of time, for example, a church sermon that only changes weekly.  In contrast, cache is inappropriate in a test and development environment, where an exact representation of each server response is necessary for debugging new code.  But once an application is deployed, the ability to cache the server responses can provide great improvements in performance at very little cost.
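
For the test and development case, one option is an implementation of the same interface that never stores anything, so every request is regenerated.  The NullCache class below is a hypothetical sketch of that idea, not part of the original demonstration.

```php
<?php
interface Cache
{
    public static function put($key, $value, $duration);
    public static function get($key);
    public static function delete($key);
    public static function flush();
}

// HYPOTHETICAL: A CACHE THAT STORES NOTHING, FOR TEST AND DEVELOPMENT USE
class NullCache implements Cache
{
    public static function put($key, $value, $duration)
    {
        // INTENTIONALLY DISCARD THE VALUE
    }

    public static function get($key)
    {
        return FALSE;   // EVERY LOOKUP IS A MISS
    }

    public static function delete($key)
    {
        // NOTHING TO DELETE
    }

    public static function flush()
    {
        // NOTHING TO FLUSH
    }
}

NullCache::put('home', 'cached page', 30);
var_dump(NullCache::get('home'));   // bool(false)
```

Because it satisfies the same interface, the rest of the application would not need to know whether caching is switched on or off.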

Author:Ray Paseur