Smarter, domain-specific caching

December 18, 2011, by Edvin Syse

Most enterprise applications employ some kind of caching to boost performance. Traditionally, the database is the most common and most beneficial layer to cache. However, caching only your database often results in some major drawbacks:

  • The cache becomes tied to your particular RDBMS or ORM, making an eventual switch down the road more troublesome
  • You have no way to cache other parts of your application, like the result of remoting calls or expensive calculations

On top of that, a traditional, automatic database cache will often invalidate too much of your cached data, making the cache less efficient. I will explain why that is, and then show you the solution I came up with and implemented in my Spring Framework alternative, Tornado Inject.

My first thought was to cache the actual result of method calls in my services and/or DAOs. This way, I would not be tied to any RDBMS, and I could cache much more than just the plain data retrieved from the database. Sure, you can do this with any of the available caching solutions, but they require boilerplate code for every lookup and cache insertion. I wanted a solution that didn’t pollute my service methods with cache-specific code. I’ve been using Apache iBATIS (now MyBatis) for my database layer for years, and I really like the way you configure caching there. I took this paradigm and applied it to Tornado Inject.

I will use a traditional CustomerDao as an example. This is how you would declare a cache in your ApplicationContext:

To apply this cache to a method call, let’s say in your CustomerDAO, you would annotate the method to be cached like this:

Next, you need a way to invalidate the data in your cache, for example when a customer is saved:
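The three steps above (declare a cache, cache a method's result, invalidate on save) can be sketched in plain Java with a dynamic proxy. This is only an illustration of the pattern, not the Tornado Inject API: the annotations `@Cached` and `@Invalidates`, the `CacheProxy` class, and the sample `CustomerDao` methods are all hypothetical names chosen for this sketch.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical annotations for illustration; these are NOT the Tornado Inject API.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Cached { String value(); }       // name of the cache to read from and store into

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Invalidates { String value(); }  // name of the cache to flush after the call

final class CacheProxy {
    // Named caches; a container would normally own these declarations.
    private final Map<String, Map<List<Object>, Object>> caches = new ConcurrentHashMap<>();

    // Returns a proxy that applies the annotations found on the interface methods.
    @SuppressWarnings("unchecked")
    <T> T wrap(Class<T> iface, T target) {
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface},
            (proxy, method, args) -> {
                Cached cached = method.getAnnotation(Cached.class);
                Invalidates invalidates = method.getAnnotation(Invalidates.class);
                if (cached == null) {
                    Object result = call(method, target, args);
                    if (invalidates != null) cacheNamed(invalidates.value()).clear();
                    return result;
                }
                // The cache key is the method plus its arguments.
                List<Object> key = new ArrayList<>();
                key.add(method);
                if (args != null) key.addAll(Arrays.asList(args));
                Map<List<Object>, Object> store = cacheNamed(cached.value());
                Object hit = store.get(key);
                if (hit != null) return hit;
                Object result = call(method, target, args);
                if (result != null) store.put(key, result); // null results are not cached
                return result;
            });
    }

    private Map<List<Object>, Object> cacheNamed(String name) {
        return caches.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
    }

    // Invokes the real method and rethrows its original exception, if any.
    private static Object call(Method m, Object target, Object[] args) throws Throwable {
        try {
            return m.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause();
        }
    }
}

// Sample DAO: the service code itself contains no cache-specific logic.
interface CustomerDao {
    @Cached("customerCache")
    String findName(int id);

    @Invalidates("customerCache")
    void save(int id, String name);
}

final class InMemoryCustomerDao implements CustomerDao {
    final Map<Integer, String> names = new ConcurrentHashMap<>();
    int lookups = 0; // counts how often the "expensive" lookup really runs

    public String findName(int id) { lookups++; return names.get(id); }
    public void save(int id, String name) { names.put(id, name); }
}
```

Wrapping the DAO with `new CacheProxy().wrap(CustomerDao.class, new InMemoryCustomerDao())` means repeated `findName` calls with the same id reach the real DAO only once, until `save` flushes the whole `customerCache`. That coarse flush is exactly the limitation the rest of this post addresses.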

This pattern, taken from iBATIS, works extremely well and is very easy to use. However, in many applications, your cache could be a lot smarter. I’ll use our Tornado CMS as an example. The CMS is virtually hosted, so the database contains information about hundreds of different websites (called instances here). When you change a page owned by one instance, there is no need to invalidate the page cache for all the other instances. The solution was both simple and elegant:

Use indexes to invalidate parts of your cache

When you put something into the cache, you also add an index that can be used later to invalidate only the parts of the cache that are tied to that index. In our application, many entities reference the instance they are tied to, so this is a perfect hook for us. Let’s first cache the call that lists all pages:

The @CacheIndex annotation tells the pageCache to index all pages that are tied to a specific instance. Then, when we save a Page:

The CacheIndex annotation is used again, this time with a property parameter, telling the cache to extract the page.instance field and use it as the “instance” index. The savePage call will now only invalidate the pageCache entries that are tied to the same instance as the page that was saved. This makes for a dramatic performance increase, since we now only drop a fragment of the total pageCache on each save.
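The bookkeeping behind index-based invalidation can be sketched in plain Java. This is a minimal illustration under my own assumptions, not Tornado Inject's implementation: the class `IndexedCache` and its method names are hypothetical, and the indexes are passed in explicitly instead of being derived from annotations.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical cache that remembers which entries belong to which index value.
final class IndexedCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    // index name -> index value -> keys of the entries stored under that value
    private final Map<String, Map<Object, Set<K>>> indexes = new HashMap<>();

    // Stores a value and registers it under every supplied index, e.g. {"instance": 7}.
    synchronized void put(K key, V value, Map<String, ?> entryIndexes) {
        entries.put(key, value);
        entryIndexes.forEach((name, indexValue) ->
            indexes.computeIfAbsent(name, n -> new HashMap<>())
                   .computeIfAbsent(indexValue, v -> new HashSet<>())
                   .add(key));
    }

    synchronized V get(K key) {
        return entries.get(key);
    }

    // Drops only the entries registered under the given index value.
    // Simplification: keys registered under several indexes may leave stale
    // references in the other index sets; they are harmless but not cleaned up.
    synchronized void invalidate(String indexName, Object indexValue) {
        Map<Object, Set<K>> byValue = indexes.get(indexName);
        if (byValue == null) return;
        Set<K> keys = byValue.remove(indexValue);
        if (keys != null) entries.keySet().removeAll(keys);
    }

    // Fallback when no index is known: drop everything.
    synchronized void invalidateAll() {
        entries.clear();
        indexes.clear();
    }

    synchronized int size() {
        return entries.size();
    }
}
```

In terms of the CMS example, listing pages for an instance would call `put` with an `"instance"` index, and saving a page would call `invalidate("instance", page.instance)`, leaving the cached page lists of every other instance untouched.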

What to do with corner cases?

Sometimes, however, you don’t have enough information in your method calls to determine the indexes. You can then choose to delete the whole cache by not supplying a @CacheIndex annotation, or use a cache interceptor to add indexes. In our case this turned out to be a non-issue, because each HTTP request to the CMS configures a ThreadLocal containing the instance id, based on the hostname of the request. We then configure the cache with an interceptor that augments the indexes available to the invalidate/store methods in the cache:

Et voilà, now we don’t even need the @CacheIndex annotation, and all caches will only evict entries pertaining to the current instance.
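The interceptor idea can be illustrated roughly as follows. The names here (`IndexInterceptor`, `RequestContext`, `InstanceIndexInterceptor`, `Indexes.effectiveIndexes`) are assumptions made for this sketch and are not taken from Tornado Inject; the point is only to show a ThreadLocal feeding an extra index into every store and invalidate operation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical hook: lets outside code add indexes before a cache store or invalidate.
interface IndexInterceptor {
    void addIndexes(Map<String, Object> indexes);
}

// Holds the instance id for the current request thread.
// A servlet filter would typically set it from the request hostname and clear it afterwards.
final class RequestContext {
    private static final ThreadLocal<Long> INSTANCE_ID = new ThreadLocal<>();

    private RequestContext() {}

    static void setInstanceId(Long id) { INSTANCE_ID.set(id); }
    static Long getInstanceId() { return INSTANCE_ID.get(); }
    static void clear() { INSTANCE_ID.remove(); }
}

// Adds the current instance id as the "instance" index unless the caller supplied one.
final class InstanceIndexInterceptor implements IndexInterceptor {
    @Override
    public void addIndexes(Map<String, Object> indexes) {
        Long id = RequestContext.getInstanceId();
        if (id != null) {
            indexes.putIfAbsent("instance", id);
        }
    }
}

final class Indexes {
    private Indexes() {}

    // Combines explicitly supplied indexes with whatever the interceptors contribute.
    static Map<String, Object> effectiveIndexes(Map<String, Object> explicit,
                                                List<IndexInterceptor> interceptors) {
        Map<String, Object> result = new HashMap<>(explicit);
        for (IndexInterceptor interceptor : interceptors) {
            interceptor.addIndexes(result);
        }
        return result;
    }
}
```

With this in place, a cache would call something like `Indexes.effectiveIndexes(...)` before each store or invalidate, so methods that carry no instance information still end up scoped to the current instance.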


From version 1.0.2, Tornado Inject also supports individual timeouts for cached entries and an arbitrary number of indexes per cache. The caches are LRU with a configurable number of items, you can query them for the total number of bytes stored, and there are a lot of other small but important features in there. The bottom line is that this small library covers both our container/injection needs and our caching needs in a very transparent, easy-to-work-with manner.
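For readers unfamiliar with the combination, an LRU cache with per-entry timeouts can be sketched on top of the JDK's `LinkedHashMap` in access order. This is a generic illustration, not the library's code: the class `LruTimeoutCache` and its constructor parameters are hypothetical, and the clock is injected only to keep the example testable.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical LRU cache where every entry carries its own expiry time.
final class LruTimeoutCache<K, V> {
    // A cached value together with the time at which it stops being valid.
    private static final class Timed<T> {
        final T value;
        final long expiresAtMillis;

        Timed(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final LongSupplier clock;
    private final LinkedHashMap<K, Timed<V>> entries;

    LruTimeoutCache(int maxEntries, LongSupplier clock) {
        this.clock = clock;
        // accessOrder = true keeps the least recently used entry first in iteration order.
        this.entries = new LinkedHashMap<K, Timed<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
                return size() > maxEntries; // evict once the configured size is exceeded
            }
        };
    }

    synchronized void put(K key, V value, long ttlMillis) {
        entries.put(key, new Timed<>(value, clock.getAsLong() + ttlMillis));
    }

    // Returns the value, or null if it is missing or has timed out.
    synchronized V get(K key) {
        Timed<V> entry = entries.get(key);
        if (entry == null) return null;
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key); // expired entries are dropped lazily on access
            return null;
        }
        return entry.value;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

In production code the clock would usually be `System::currentTimeMillis`. Expired entries that are never read again still occupy a slot until LRU eviction pushes them out, which is a deliberate simplification here.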

One comment

  • Bård Johannessen

    December 18, 2011 at 14:54

    Brilliant solution! And it is brilliant to have a SYSE blog, so you can follow the development even while on vacation!