About
A partition in Cassandra is a unit of storage that does not get divided across nodes.
A partition is an ordered dictionary (ordered by clustering key).
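A partition is not a single file: on each replica it is stored in one or more SSTables. It should be kept small (a common guideline is less than 100MB) so that it stays cheap to read and easy to move around the cluster.
To keep partitions small, you can add a bucket to the partition key (i.e. a new bucket id every n days, for instance).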
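As an illustration of the bucketing idea, here is a minimal Python sketch that derives a bucket id from a timestamp. The names BUCKET_EPOCH, BUCKET_DAYS and bucket_for are hypothetical, and the epoch and window size are arbitrary values chosen for the example; nothing here is part of Cassandra itself.

```python
from datetime import datetime, timezone

# Assumed values for illustration only; pick ones that fit your data volume.
BUCKET_EPOCH = datetime(2015, 1, 1, tzinfo=timezone.utc)
BUCKET_DAYS = 10  # the "n days" window of one bucket


def bucket_for(ts: datetime, bucket_days: int = BUCKET_DAYS) -> int:
    """Return the bucket id of a timestamp: the number of whole
    bucket_days windows elapsed since BUCKET_EPOCH."""
    elapsed_days = (ts - BUCKET_EPOCH).days
    return elapsed_days // bucket_days
```

The application computes the bucket before each write and each read, so that all rows written within the same window share a partition and a new partition starts when the window rolls over.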
The partition key determines:
- how much data will be stored in each partition
- how the data is organized on disk
It therefore affects how quickly Cassandra processes read queries.
Key
The partition key is made of one or more columns and is the first element of the primary key definition. When it spans several columns (a composite partition key), these columns are wrapped in their own pair of parentheses.
Range
A partition range is the range of tokens (hashed partition key values) for which a node (called a replica) owns the data.
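The ownership rule can be sketched in Python as follows. This is a simplified model under stated assumptions: the three-node RING and its tokens are made up, SHA-256 is used as a stand-in hash (Cassandra's default partitioner uses Murmur3 instead), and replication is ignored, so only one owner per token is returned.

```python
import bisect
import hashlib

# Hypothetical ring: each node is assigned one token and owns the range
# (previous token, own token], wrapping around after the last node.
RING = [
    (-6_000_000_000_000_000_000, "node-a"),
    (0, "node-b"),
    (6_000_000_000_000_000_000, "node-c"),
]


def token_for(partition_key: str) -> int:
    """Map a partition key to a signed 64-bit token.
    Stand-in hash for illustration, not Cassandra's Murmur3."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def owner_of(token: int) -> str:
    """Return the node whose range contains the token."""
    tokens = [t for t, _ in RING]
    # First node whose token is >= the given token; wrap to the first node.
    idx = bisect.bisect_left(tokens, token)
    return RING[idx % len(RING)][1]
```

For example, owner_of(token_for("42:7")) returns the node that would hold the partition whose key hashes to that token. Because the token depends only on the partition key, every row of a partition lands on the same node.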
CQL
Example of a table partitioned by (channel_id, bucket):
CREATE TABLE messages (
    channel_id bigint,
    bucket int,
    message_id bigint,
    author_id bigint,
    content text,
    PRIMARY KEY ((channel_id, bucket), message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);
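To make the effect of this primary key concrete, the following Python sketch groups a few rows the way the table definition would: one partition per (channel_id, bucket) pair, with rows inside each partition sorted by message_id in descending order. SAMPLE_ROWS and group_into_partitions are hypothetical names, the row values are made up, and this is an in-memory model only, not how Cassandra stores data on disk.

```python
from collections import defaultdict

# Made-up rows for illustration; only the key columns are shown.
SAMPLE_ROWS = [
    {"channel_id": 1, "bucket": 0, "message_id": 10},
    {"channel_id": 1, "bucket": 0, "message_id": 30},
    {"channel_id": 1, "bucket": 1, "message_id": 40},
    {"channel_id": 2, "bucket": 0, "message_id": 20},
]


def group_into_partitions(rows):
    """Group rows by partition key (channel_id, bucket) and order each
    partition by the clustering key message_id, newest first."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[(row["channel_id"], row["bucket"])].append(row)
    for partition_rows in partitions.values():
        partition_rows.sort(key=lambda r: r["message_id"], reverse=True)
    return dict(partitions)
```

With this layout, reading the latest messages of a channel for a given bucket touches a single partition and returns rows already in the desired order.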