If some of your functionality fails, should it take your website down?

Let's say you have a webshop frontpage displaying promoted items, a campaign and a list of the most popular items. If fetching the popular items fails, or they contain invalid data that throws an exception, having the whole frontpage fail would be really bad. You want the rest of the page to render.
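To make that concrete, here is a minimal sketch of rendering each part of the frontpage independently. The loader functions and their data are made up for illustration, and this is not the code we actually run; it only shows the idea that an exception in one feature is caught at the feature boundary.

```python
import asyncio

# Hypothetical feature loaders; names and data are made up for this sketch.
async def promoted_items():
    return ["espresso machine", "grinder"]

async def campaign():
    return "Summer sale"

async def popular_items():
    # Simulates invalid data blowing up inside one feature.
    raise ValueError("invalid item data")

async def render_frontpage():
    features = {
        "promoted": promoted_items,
        "campaign": campaign,
        "popular": popular_items,
    }
    # return_exceptions=True hands back the exception object instead of
    # propagating it, so one failing feature cannot abort the others.
    results = await asyncio.gather(
        *(load() for load in features.values()), return_exceptions=True
    )
    page = {}
    for name, result in zip(features, results):
        # A failed feature becomes None (rendered as an empty box or a small
        # error message); the rest of the page is unaffected.
        page[name] = None if isinstance(result, BaseException) else result
    return page

if __name__ == "__main__":
    print(asyncio.run(render_frontpage()))
```

The promoted items and the campaign still come back, and only the popular items slot is empty. In a real page you would also log the exception so the failure does not go unnoticed.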

This happened to us recently: a part of the frontpage failed. But instead of killing the whole page, only the failing feature failed. Here's how it looks when you enter the page:

[Image: Frontpage with error]

Below you have the failing functionality. In this case it was not even visible to the user until they clicked the "sist sent" (recently aired) tab. Crashing our whole page because of this would be just plain stupid.

[Image: Frontpage showing error failure]

This is as much about isolating features as it is about isolating failures. By isolating features we can start applying several patterns that help create a responsive and well-behaved page:

  1. Circuit breaker for each feature/system. If the system is down, stop calling it for a while and deliver what you have.

  2. Timeouts per feature - if the most popular items feature exceeds your response time requirements, treat it as a failure and deliver what you have. That's better than letting the request queue build up with waiting clients.

  3. Cache - we can cache each part by itself. If we know when features A and B change but not C, we can pre-compute A and B and update the cache, or at least invalidate it. C can then have a less aggressive cache setting.
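The three patterns above can be sketched together as a small per-feature wrapper. This is only an illustration under stated assumptions: the `FeatureGuard` class, its parameter names and its default values are hypothetical, and the cache is a simple time-to-live cache with a manual `invalidate()` rather than real pre-computation.

```python
import asyncio
import time

class FeatureGuard:
    """Wraps one feature with a timeout, a circuit breaker and a cache.

    All names and default values are illustrative assumptions.
    """

    def __init__(self, fetch, timeout=0.2, max_failures=3,
                 reset_after=30.0, cache_ttl=60.0, clock=time.monotonic):
        self.fetch = fetch                # coroutine function doing the real work
        self.timeout = timeout            # seconds before a call counts as failed
        self.max_failures = max_failures  # consecutive failures that open the circuit
        self.reset_after = reset_after    # seconds to wait before trying again
        self.cache_ttl = cache_ttl        # seconds a cached value counts as fresh
        self.clock = clock
        self.failures = 0
        self.opened_at = None
        self.cached = None
        self.cached_at = None

    def invalidate(self):
        # Call this when you know the feature's data has changed.
        self.cached_at = None

    def _circuit_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.reset_after:
            # Half-open: allow a trial call; a single failure re-opens.
            self.opened_at = None
            self.failures = self.max_failures - 1
            return False
        return True

    async def get(self):
        now = self.clock()
        fresh = self.cached_at is not None and now - self.cached_at < self.cache_ttl
        if fresh or self._circuit_open():
            # Fresh cache hit, or the circuit is open: deliver what we have
            # (possibly stale, possibly None) without calling the feature.
            return self.cached
        try:
            value = await asyncio.wait_for(self.fetch(), self.timeout)
        except Exception:
            # Timeouts and errors both count as failures.
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return self.cached
        self.failures = 0
        self.cached, self.cached_at = value, self.clock()
        return value
```

Each feature on the page would get its own guard, so a slow or broken popular items service trips only its own breaker, and each guard can have a `cache_ttl` that fits how often that feature changes. When a call fails, the guard returns the last known value if there is one, which is one way of "delivering what you have".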

So, by isolating your features you should be able to deliver some value to the user even if a part of your system is failing.