Google BigTable


BigTable is a NoSQL database in which each value can be versioned by timestamp, which makes it usable as a time-series database. Apache HBase is an open-source database modeled on it.

It was built at Google to support data that is actively updated.

  • Described in an OSDI paper in 2006 (some overlap with the authors of the MapReduce paper)
  • Complementary to MapReduce

Data model

BigTable organizes data as follows:

  • Data is stored in tables.
  • A table contains rows, each identified by a row key.
  • Data in a row is organized into column families, which are groups of columns. A column qualifier identifies a single column within a column family.
  • A cell is the intersection of a row and a column, and holds a versioned value.

BigTable is a column-oriented database that stores data in a sparse, distributed, persistent, multidimensional sorted map with the following format:

(row:string, column:string, time:int64) -> string


  • row and column together identify a cell (the value)
  • time is a timestamp, which makes it possible to store the history of this value.
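
The map format above can be sketched in Python as a dictionary keyed by (row, column, timestamp). This is only a minimal in-memory illustration, not how BigTable stores data; the helper names `put` and `get_latest` and the sample row key are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Key: (row key, "family:qualifier", timestamp); value: an uninterpreted string.
CellKey = Tuple[str, str, int]
table: Dict[CellKey, str] = {}


def put(row: str, column: str, timestamp: int, value: str) -> None:
    """Store one version of a cell."""
    table[(row, column, timestamp)] = value


def get_latest(row: str, column: str) -> Optional[str]:
    """Return the value with the highest timestamp for (row, column)."""
    versions = [(ts, v) for (r, c, ts), v in table.items() if r == row and c == column]
    return max(versions)[1] if versions else None


put("com.example.www", "contents:html", 1, "<html>v1</html>")
put("com.example.www", "contents:html", 2, "<html>v2</html>")
latest = get_latest("com.example.www", "contents:html")
```

Iterating over `sorted(table)` visits the keys ordered by row, then column, then timestamp, which mirrors the "sorted map" part of the description.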


  • Each cell can hold multiple versions of its value
  • Each new version is stored under a new timestamp
  • Garbage-collection policies for old versions:
    • “keep only latest n versions”
    • “keep only versions since time t”
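
The two retention policies above could be expressed as small filters over the version list of a single cell. This is a sketch under assumptions: the function names are hypothetical, versions are modeled as (timestamp, value) pairs, and "since time t" is taken to include t itself.

```python
from typing import List, Tuple

Version = Tuple[int, str]  # (timestamp, value)


def keep_latest_n(versions: List[Version], n: int) -> List[Version]:
    """Keep the n versions with the highest timestamps, newest first."""
    return sorted(versions, reverse=True)[:n]


def keep_since(versions: List[Version], t: int) -> List[Version]:
    """Keep versions whose timestamp is at or after t, newest first."""
    return sorted((v for v in versions if v[0] >= t), reverse=True)


history = [(10, "a"), (20, "b"), (30, "c")]
```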


  • Data is sorted lexicographically by row key.
  • The row key range is broken into tablets (rows are contiguous within a tablet)
  • A tablet is the unit of distribution and load balancing
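
Because rows are sorted and each tablet holds a contiguous key range, finding the tablet for a row key amounts to a binary search over the split points. The sketch below assumes hypothetical split keys and tablet names, and treats each split key as the inclusive end of its tablet.

```python
import bisect
from typing import List

# Hypothetical split points: tablet-0 holds keys <= "g", tablet-1 holds
# keys in ("g", "p"], and tablet-2 holds everything after "p".
split_keys: List[str] = ["g", "p"]
tablet_names: List[str] = ["tablet-0", "tablet-1", "tablet-2"]


def find_tablet(row_key: str) -> str:
    """Return the tablet whose row range contains row_key."""
    index = bisect.bisect_left(split_keys, row_key)
    return tablet_names[index]
```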

Column families

  • Column names of the form family:qualifier
  • “family” is the basic unit of:
    • access control
    • memory accounting
    • disk accounting (move around on disk)
  • Typically, all columns in a family hold the same type of data, which helps compression, for instance
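
As a small illustration of the family:qualifier naming scheme, a column name can be split at its first colon. The function name `split_column` is hypothetical, and rejecting names without a family is an assumption of this sketch.

```python
from typing import Tuple


def split_column(name: str) -> Tuple[str, str]:
    """Split 'family:qualifier' at the first colon."""
    family, sep, qualifier = name.partition(":")
    if not sep or not family:
        raise ValueError(f"expected 'family:qualifier', got {name!r}")
    return family, qualifier
```

For example, `split_column("anchor:cnnsi.com")` yields the family `anchor` and the qualifier `cnnsi.com`. A qualifier may itself contain colons, since only the first one separates the family.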

Tablet management

  • Master assigns tablets to tablet servers
  • A tablet server handles reads and writes to its tablets
  • Clients communicate directly with tablet servers
  • Tablet server splits tablets that have grown too large.
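
Splitting a tablet that has grown too large can be pictured as cutting its sorted row range in two. The sketch below is a simplification under stated assumptions: the threshold `MAX_ROWS_PER_TABLET` and the function `maybe_split` are hypothetical, and it counts rows and cuts at the middle key, whereas a real system would more likely decide based on data size.

```python
from typing import List

MAX_ROWS_PER_TABLET = 4  # hypothetical threshold, counted in rows


def maybe_split(rows: List[str]) -> List[List[str]]:
    """Split a sorted list of row keys in half once it exceeds the threshold."""
    if len(rows) <= MAX_ROWS_PER_TABLET:
        return [rows]
    middle = len(rows) // 2
    return [rows[:middle], rows[middle:]]
```

Each resulting half is still a contiguous, sorted key range, so it remains a valid tablet that can be assigned to a tablet server on its own.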

Write processing

  • Writes are buffered in memory in a memtable; compactions move the data into SSTables and keep read throughput high
    • Minor compaction: when the memtable size reaches a given threshold, the memtable is written to a new SSTable
    • Major compaction: rewrites all SSTables into one SSTable and removes all deleted data
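
The interplay between the memtable and the two kinds of compaction can be sketched with plain dictionaries standing in for SSTables. This is a toy model under assumptions, not BigTable's implementation: the class `Tablet`, the entry-count threshold `MEMTABLE_LIMIT`, and the use of `None` as a delete marker are all hypothetical.

```python
from typing import Dict, List, Optional

TOMBSTONE = None    # hypothetical marker for a deleted key
MEMTABLE_LIMIT = 2  # hypothetical threshold, counted in entries

Store = Dict[str, Optional[str]]


class Tablet:
    def __init__(self) -> None:
        self.memtable: Store = {}
        self.sstables: List[Store] = []  # oldest first

    def write(self, key: str, value: Optional[str]) -> None:
        """Buffer a write (or a delete, when value is TOMBSTONE) in memory."""
        self.memtable[key] = value
        if len(self.memtable) >= MEMTABLE_LIMIT:
            self.minor_compaction()

    def minor_compaction(self) -> None:
        """Write the memtable out as a new key-sorted SSTable."""
        if self.memtable:
            self.sstables.append({k: self.memtable[k] for k in sorted(self.memtable)})
            self.memtable = {}

    def major_compaction(self) -> None:
        """Merge all SSTables into one and drop deleted entries."""
        merged: Store = {}
        for sstable in self.sstables:  # newer tables overwrite older ones
            merged.update(sstable)
        live = {k: merged[k] for k in sorted(merged) if merged[k] is not TOMBSTONE}
        self.sstables = [live] if live else []

    def read(self, key: str) -> Optional[str]:
        """Check the memtable first, then SSTables from newest to oldest."""
        for source in [self.memtable, *reversed(self.sstables)]:
            if key in source:
                return source[key]
        return None


tablet = Tablet()
tablet.write("a", "1")
tablet.write("b", "2")        # threshold reached: first SSTable written
tablet.write("a", TOMBSTONE)  # delete "a"
tablet.write("c", "3")        # threshold reached: second SSTable written
sstables_before = len(tablet.sstables)
tablet.major_compaction()
```

Reads get slower as SSTables accumulate, because each lookup may have to consult several of them; merging them back into one is what keeps read throughput high in this model.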


