Web Caching
Introduction
Ehcache provides a set of general purpose web caching filters in the ehcache-web
module.
Using these can make an amazing difference to web application performance. A typical server can deliver
5000+ pages per second from the page cache. With built-in gzipping, storage and network transmission is
highly efficient. Cache pages and fragments make excellent candidates for DiskStore
storage, because the
object graphs are simple and the largest part is already a byte[]
.
SimplePageCachingFilter
This is a simple caching filter suitable for caching compressible HTTP responses such as HTML, XML or JSON. It uses a Singleton CacheManager created with the default factory method. Override to use a different CacheManager It is suitable for:
- complete responses i.e. not fragments.
- A content type suitable for gzipping. For example, text or text/html
For fragments see the SimplePageFragmentCachingFilter.
Keys
Pages are cached based on their key. The key for this cache is the URI followed by the query string. An example
is /admin/SomePage.jsp?id=1234&name=Beagle
.
This key technique is suitable for a wide range of uses. It is independent of hostname and port number, so will
work well in situations where there are multiple domains which get the same content, or where users access
based on different port numbers.
A problem can occur with tracking software, where unique ids are inserted into request query strings. Because
each request generates a unique key, there will never be a cache hit. For these situations it is better to
parse the request parameters and override calculateKey(javax.servlet.http.HttpServletRequest)
with
an implementation that takes account of only the significant ones.
Configuring the cacheName
A cache entry in ehcache.xml should be configured with the name of the filter.
Names can be set using the init-param cacheName
, or by sub-classing this class and overriding the name.
Concurrent Cache Misses
A cache miss will cause the filter chain, upstream of the caching filter to be processed. To avoid threads requesting
the same key to do useless duplicate work, these threads block behind the first thread.
The thead timeout can be set to fail after a certain wait by setting the init-param blockingTimeoutMillis
.
By default threads wait indefinitely. In the event upstream processing never returns, eventually the web server
may get overwhelmed with connections it has not responded to. By setting a timeout, the waiting threads will only block
for the set time, and then throw a {@link net.sf.ehcache.constructs.blocking.LockTimeoutException}. Under either
scenario an upstream failure will still cause a failure.
Gzipping
Significant network efficiencies, and page loading speedups, can be gained by gzipping responses. Whether a response can be gzipped depends on:
-
Whether the user agent can accept GZIP encoding. This feature is part of HTTP1.1.
If a browser accepts GZIP encoding it will advertise this by including in its HTTP header: All common browsers except IE 5.2 on Macintosh are capable of accepting gzip encoding. Most search engine robots do not accept gzip encoding.
-
Whether the user agent has advertised its acceptance of gzip encoding. This is on a per request basis. If they will accept a gzip response to their request they must include the following in the HTTP request header:
Accept-Encoding: gzip
Responses are automatically gzipped and stored that way in the cache. For requests which do not accept gzip encoding the page is retrieved from the cache, ungzipped and returned to the user agent. The ungzipping is high performance.
Caching Headers
The SimpleCachingHeadersPageCachingFilter
extends SimplePageCachingFilter
to provide
the HTTP cache headers: ETag, Last-Modified and Expires. It supports conditional GET.
Because browsers and other HTTP clients have the expiry information returned in the response headers,
they do not even need to request the page again. Even once the local browser copy has expired, the browser
will do a conditional GET.
So why would you ever want to use SimplePageCachingFilter, which does not set these headers?
The answer is that in some caching scenarios you may wish to remove a page before its natural expiry. Consider a scenario where a web page shows dynamic data. Under Ehcache the Element can be removed at any time. However if a browser is holding expiry information, those browsers will have to wait until the expiry time before getting updated. The caching in this scenario is more about defraying server load rather than minimising browser calls.
Init-Params
The following init-params are supported:
cacheName
- the name in ehcache.xml used by the filter.blockingTimeoutMillis
- the time, in milliseconds, to wait for the filter chain to return with a response on a cache miss. This is useful to fail fast in the event of an infrastructure failure.-
varyHeader
- set to true to set Vary:Accept-Encoding in the response when doing Gzip. This header is needed to support HTTP proxies however it is off by default.<init-param> <param-name>varyHeader</param-name> <param-value>true</param-value> </init-param>
Re-entrance
Care should be taken not to define a filter chain such that the same CachingFilter
class is reentered.
The CachingFilter
uses the BlockingCache
. It blocks until the thread which
did a get which results in a null does a put. If reentry happens a second get happens before the first put. The second
get could wait indefinitely. This situation is monitored and if it happens, an IllegalStateException will be thrown.
SimplePageFragmentCachingFilter
The SimplePageFragmentCachingFilter does everything that SimplePageCachingFilter does, except it never gzips, so the fragments can be combined. There is a variant of this filter which sets browser caching headers, because that is only applicable to the entire page.
Example web.xml configuration
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
version="2.5">
<filter>
<filter-name>CachePage1CachingFilter</filter-name>
<filter-class>net.sf.ehcache.constructs.web.filter.SimplePageCachingFilter
</filter-class>
<init-param>
<param-name>suppressStackTrace</param-name>
<param-value>false</param-value>
</init-param>
<init-param>
<param-name>cacheName</param-name>
<param-value>CachePage1CachingFilter</param-value>
</init-param>
</filter>
<filter>
<filter-name>SimplePageFragmentCachingFilter</filter-name> <filter-class>net.sf.ehcache.constructs.web.filter.SimplePageFragmentCachingFilter
</filter-class>
<init-param>
<param-name>suppressStackTrace</param-name>
<param-value>false</param-value>
</init-param>
<init-param>
<param-name>cacheName</param-name>
<param-value>SimplePageFragmentCachingFilter</param-value>
</init-param>
</filter>
<filter>
<filter-name>SimpleCachingHeadersPageCachingFilter</filter-name> <filter-class>net.sf.ehcache.constructs.web.filter.SimpleCachingHeadersPageCachingFilter
</filter-class>
<init-param>
<param-name>suppressStackTrace</param-name>
<param-value>false</param-value>
</init-param>
<init-param>
<param-name>cacheName</param-name>
<param-value>CachedPage2Cache</param-value>
</init-param>
</filter>
<!-- This is a filter chain. They are executed in the order below.
Do not change the order. -->
<filter-mapping>
<filter-name>CachePage1CachingFilter</filter-name>
<url-pattern>/CachedPage.jsp</url-pattern>
<dispatcher>REQUEST</dispatcher>
<dispatcher>INCLUDE</dispatcher>
<dispatcher>FORWARD</dispatcher>
</filter-mapping>
<filter-mapping>
<filter-name>SimplePageFragmentCachingFilter</filter-name>
<url-pattern>/include/Footer.jsp</url-pattern>
</filter-mapping>
<filter-mapping>
<filter-name>SimplePageFragmentCachingFilter</filter-name>
<url-pattern>/fragment/CachedFragment.jsp</url-pattern>
</filter-mapping>
<filter-mapping>
<filter-name>SimpleCachingHeadersPageCachingFilter</filter-name>
<url-pattern>/CachedPage2.jsp</url-pattern>
</filter-mapping>
An ehcache.xml configuration file, matching the above would then be:
<Ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="../../main/config/ehcache.xsd">
<diskStore path="auto/default/path"/>
<defaultCache
maxEntriesLocalHeap="10"
eternal="false"
timeToIdleSeconds="5"
timeToLiveSeconds="10">
<persistence strategy="localTempSwap"/>
/>
<!-- Page and Page Fragment Caches -->
<cache name="CachePage1CachingFilter"
maxEntriesLocalHeap="10"
eternal="false"
timeToIdleSeconds="10000"
timeToLiveSeconds="10000">
<persistence strategy="localTempSwap"/>
</cache>
<cache name="CachedPage2Cache"
maxEntriesLocalHeap="10"
eternal="false"
timeToLiveSeconds="3600">
<persistence strategy="localTempSwap"/>
</cache>
<cache name="SimplePageFragmentCachingFilter"
maxEntriesLocalHeap="10"
eternal="false"
timeToIdleSeconds="10000"
timeToLiveSeconds="10000">
<persistence strategy="localTempSwap"/>
</cache>
<cache name="SimpleCachingHeadersTimeoutPageCachingFilter"
maxEntriesLocalHeap="10"
eternal="false"
timeToIdleSeconds="10000"
timeToLiveSeconds="10000">
<persistence strategy="localTempSwap"/>
</cache>
</ehcache>
CachingFilter Exceptions
Additional exception types have been added to the Caching Filter.
FilterNonReentrantException
Thrown when it is detected that a caching filter’s doFilter method is reentered by the same thread. Reentrant calls will block indefinitely because the first request has not yet unblocked the cache.
ResponseHeadersNotModifiableException
Same as FilterNonReentrantException.
AlreadyGzippedException
This exception is thrown when a gzip is attempted on already gzipped content.
The web package performs gzipping operations. One cause of problems on web browsers is getting content that is double or triple gzipped. They will either get unreadable content or a blank page.
ResponseHeadersNotModifiableException
A gzip encoding header needs to be added for gzipped content. The HttpServletResponse#setHeader()
method
is used for that purpose. If the header had already been set, the new value normally overwrites the previous one.
In some cases according to the servlet specification, setHeader silently fails. Two scenarios where this happens are:
- The response is committed.
RequestDispatcher#include
method caused the request.