Data - Cache


About

In computer science, a data cache is a component that aims to:

  • improve performance
  • reduce load on the server.

The cache will:

  • transparently store the response to a request
  • and reuse it for later requests until a cache time limit has been reached
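
The store-then-reuse behaviour above can be sketched as a small time-limited cache. This is only an illustration under stated assumptions: the class name TTLCache, the method get_or_compute, and the injectable clock parameter are hypothetical names chosen for this example, not part of any particular library.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Keep responses until they are older than ttl_seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None:
            stored_at, value = entry
            if now - stored_at < self.ttl_seconds:
                return value  # still within the time limit: reuse the stored response
        value = compute()  # missing or expired: go back to the original source
        self._entries[key] = (now, value)
        return value


# Usage: the second call within 60 seconds would be answered from the cache.
cache = TTLCache(ttl_seconds=60)
page = cache.get_or_compute("/home", lambda: "rendered home page")
```

The caller only ever invokes get_or_compute and never learns whether the value came from the store or from a fresh computation.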

The data that is stored within a cache might be:

  • values that have been computed earlier
  • or duplicates of original values that are stored elsewhere.

Properties

Cache Request:

Cache Store

Goal

Which information is loaded into the cache depends on algorithms and on certain assumptions about the program's code. The goal of the cache system is to ensure that the data the application will need next is already loaded into the cache by the time the application looks for it (a cache hit).

Cache hit, cache miss

If the requested data is contained in the cache, a cache hit occurs and the request can be served by simply reading the cache, which is comparatively fast.

Otherwise, a cache miss occurs and the data has to be recomputed or fetched from its original storage location, which is comparatively slow.

Hence, the more requests that can be served from the cache, the better the overall system performance.
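
As a rough illustration of hits and misses, Python's standard functools.lru_cache keeps counters that can be read with cache_info(). The function slow_square is a hypothetical stand-in for an expensive computation or a remote fetch.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def slow_square(n: int) -> int:
    # Stand-in for an expensive computation or a fetch from the original storage.
    return n * n


# The first call for each distinct argument is a miss; repeated arguments are hits.
for n in [2, 3, 2, 2, 3]:
    slow_square(n)

info = slow_square.cache_info()
hit_ratio = info.hits / (info.hits + info.misses)
```

With this call sequence, two calls are misses (the first 2 and the first 3) and the remaining three are hits, so the hit ratio is 0.6.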

A cache is transparent

As opposed to a buffer, which is managed explicitly by a client, a cache stores data transparently. A client requesting data from a system is not aware that the cache exists, which is the origin of the name (from the French “cacher”, to conceal).

Validity Cache Time vs Business Level

This section applies to data-processing caches, where data needs to be aggregated.

Architecture Level     Data Latency
Raw Source             Immediate
Real Time Cache        Seconds
Business Process       Minutes
Business Management    Day
Business Lead          Week, Month, Year

Key

Whenever a cache receives a request, it needs to decide:

  • whether it already has a saved copy of the response to this exact request and can reply with that,
  • or whether it needs to forward the request to the server.

Caches tackle this problem with cache keys, which uniquely identify the requested resource.

See HTTP cache key
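
One possible way to derive such a key is sketched below for an HTTP-style request. The function make_cache_key is hypothetical, and the normalization choices (uppercasing the method, lowercasing scheme and host, sorting query parameters) are assumptions for this example; real caches decide for themselves which parts of a request belong in the key.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit


def make_cache_key(method: str, url: str) -> str:
    """Build a key from the request method and a normalized URL (illustrative only)."""
    parts = urlsplit(url)
    # Assumption: parameter order is not significant, so ?a=1&b=2 equals ?b=2&a=1.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    # Assumption: the URL carries no user info, so lowercasing netloc is safe.
    canonical = (
        f"{method.upper()} {parts.scheme.lower()}://{parts.netloc.lower()}"
        f"{parts.path or '/'}?{query}"
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


key_a = make_cache_key("GET", "https://example.com/items?a=1&b=2")
key_b = make_cache_key("get", "https://EXAMPLE.com/items?b=2&a=1")
key_c = make_cache_key("GET", "https://example.com/other?a=1&b=2")
```

Here key_a and key_b are equal because the two requests normalize to the same canonical string, while key_c differs because the path differs.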

Implementation


Cachedir tag to identify a cache directory

To keep backup, archiving, copy, and move tools from processing cache directories, you can add a file named CACHEDIR.TAG that identifies a directory as containing cache data.

Example: every CACHEDIR.TAG file starts with the same signature line

Signature: 8a477f597d28d172789f06886806bc55
# This file is a cache directory tag created by (application name).
# For information about cache directory tags, see:
#	http://www.brynosaurus.com/cachedir/

See: https://bford.info/cachedir/
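
A minimal sketch of how an application might write the tag and how a tool might detect it follows. The helper names tag_cache_dir and is_cache_dir are hypothetical; the check compares only the leading signature bytes and skips symbolic links, which is how this example reads the specification linked above.

```python
from pathlib import Path

SIGNATURE = "Signature: 8a477f597d28d172789f06886806bc55"


def tag_cache_dir(directory: Path, app_name: str) -> Path:
    """Create the directory if needed and write a CACHEDIR.TAG file into it."""
    directory.mkdir(parents=True, exist_ok=True)
    tag = directory / "CACHEDIR.TAG"
    tag.write_text(
        f"{SIGNATURE}\n"
        f"# This file is a cache directory tag created by {app_name}.\n"
        "# For information about cache directory tags, see:\n"
        "#\thttps://bford.info/cachedir/\n",
        encoding="ascii",
    )
    return tag


def is_cache_dir(directory: Path) -> bool:
    """Return True if the directory holds a regular tag file with the signature."""
    tag = directory / "CACHEDIR.TAG"
    if tag.is_symlink() or not tag.is_file():
        return False
    with tag.open("rb") as f:
        # Only the leading signature bytes matter; the comments after it are free-form.
        return f.read(len(SIGNATURE)) == SIGNATURE.encode("ascii")
```

A backup script could call is_cache_dir on each directory it visits and skip those for which it returns True.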

Task Runner