Cassandra - Partition

Data Modeling Chebotko Logical

About

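A partition in Cassandra is the unit of storage and distribution: all the rows of a partition are stored together on the same node (and on each of its replicas) and are never split across nodes.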

A partition is an ordered dictionary: its rows are kept sorted by clustering key.
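The ordered dictionary idea can be pictured with a small in-memory sketch. This is only a toy model and not how Cassandra stores data: the `Partition` class and its `upsert` and `slice` methods are hypothetical names chosen for illustration. It shows why a range of clustering keys can be read as one contiguous slice.

```python
import bisect


class Partition:
    """Toy in-memory model of a partition: rows kept sorted by clustering key."""

    def __init__(self):
        self._keys = []   # clustering keys, always sorted
        self._rows = {}   # clustering key -> row

    def upsert(self, clustering_key, row):
        # Insert the key at its sorted position the first time it is seen.
        if clustering_key not in self._rows:
            bisect.insort(self._keys, clustering_key)
        self._rows[clustering_key] = row

    def slice(self, start, end):
        """Return the rows with start <= clustering key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [self._rows[k] for k in self._keys[lo:hi]]


p = Partition()
for message_id in (30, 10, 20):
    p.upsert(message_id, {"message_id": message_id})
```

Whatever the insertion order, `p.slice(10, 30)` returns the rows in clustering-key order.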

A partition is stored on disk inside SSTable files and should be kept small (a common guideline is less than 100 MB). Small partitions are easier to move around the cluster.

To keep partitions small, you can split them into buckets by adding a bucket id to the partition key (for instance, a new bucket id every n days).
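A bucket id of this kind can be derived from a timestamp on the application side. The sketch below is one possible way to do it, under assumptions: the 10-day width (`BUCKET_DAYS`) and the `bucket_id` helper are hypothetical, and buckets are simply counted from the Unix epoch.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical bucket width: the "n days" above is left to the data modeler.
BUCKET_DAYS = 10
BUCKET_SECONDS = BUCKET_DAYS * 24 * 60 * 60


def bucket_id(ts: datetime, bucket_seconds: int = BUCKET_SECONDS) -> int:
    """Return the number of the bucket a timestamp falls into.

    Buckets are counted from the Unix epoch, so every timestamp inside the
    same n-day window maps to the same integer.
    """
    return int(ts.timestamp()) // bucket_seconds


first = datetime(2024, 1, 1, tzinfo=timezone.utc)
same_window = first + timedelta(days=3)    # still inside the same 10-day window
next_window = first + timedelta(days=10)   # exactly one bucket width later
```

Rows written at `first` and `same_window` would share a partition, while `next_window` would start a new one. A reader has to compute the same bucket id, or iterate over several of them, to find the data again.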

The partition key determines:

  • how much data will be stored in each partition
  • how the data is organized on disk

It therefore affects how quickly Cassandra processes read queries.

Key

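The partition key is made up of one or more columns and is defined by the first element of the primary key definition. When it has several columns (a composite partition key), they are wrapped in an extra pair of parentheses.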

Range

A partition range is the range of tokens (hashed partition key values) for which a node, called a replica, owns the data.
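Range ownership can be sketched with a toy token ring. Everything here is an assumption made for illustration: the three node names, the tiny token space (`RING_SIZE`), and the SHA-256 based `token_for` helper. Cassandra's default partitioner uses Murmur3 over a much larger token space. The sketch only shows the lookup principle, where a node owns the tokens between the previous node's token (exclusive) and its own token (inclusive).

```python
import bisect
import hashlib

# Hypothetical 3-node ring: each node owns the tokens in (previous token, own token].
RING = [(100, "node-a"), (200, "node-b"), (300, "node-c")]
RING_SIZE = 300
TOKENS = [token for token, _ in RING]


def token_for(partition_key: str) -> int:
    """Toy hash of a partition key onto 1..RING_SIZE (not Cassandra's Murmur3)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % RING_SIZE + 1


def owner_of(token: int) -> str:
    """Return the node whose range contains the token, wrapping around the ring."""
    index = bisect.bisect_left(TOKENS, token) % len(RING)
    return RING[index][1]


# Partition key of the form "channel_id:bucket" (illustrative encoding only).
replica = owner_of(token_for("42:7"))
```

Because the token depends only on the partition key, every row with the same partition key resolves to the same owner.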

CQL

Example of a table partitioned by (channel_id, bucket):

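CREATE TABLE messages (
   channel_id bigint,   -- partition key, part 1
   bucket int,          -- partition key, part 2: bounds the partition size
   message_id bigint,   -- clustering key: orders the rows inside a partition
   author_id bigint,
   content text,
   PRIMARY KEY ((channel_id, bucket), message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);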





Discover More
Cassandra - Primary Key

This page talks about the primary key of a Cassandra table. The primary key is composed of: in first position, the partition key columns that define the data location and partition; in successive position:...
Cassandra - Replica Node

A replica node is a node that owns the data for a partition range.
Cassandra - Storage Layout

The data is automatically distributed by horizontal partition (sharding). Request processing: When data is inserted into the cluster, the first step is to apply a hash function to the partition key...
Cassandra - Time Series

This page is about time series in Cassandra. They are stored in a wide partition, where the time is used as part of the partition key. Measurements at specific time intervals: business analysis, sensor...
Cassandra NoSql Database

Cassandra is a NoSql database for transactional workloads that require high scale and maximum availability. Cassandra is suited for transactional workloads at high volume and shouldn’t be considered...
Wide Partition (Wide row pattern)

Wide Partition is a data modeling pattern where multiple related rows are grouped in a partition in order to support fast access to multiple rows in a single query (within the partition). This pattern...


