A partition in Cassandra is a unit of storage that does not get divided across nodes.
A partition behaves like an ordered dictionary: its rows are kept sorted by clustering key.
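A partition is not a file of its own: its rows are stored in SSTables and may be spread across several of them until compaction merges them.
Partitions should be kept small (a common guideline is less than 100MB), which makes them cheaper to read, compact, repair, and stream between nodes in the cluster.
To keep partitions small, you can add a bucket to the partition key (e.g. a new bucket id every n days).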
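The bucketing idea can be sketched in a few lines of Python. This is only an illustration: the 10-day width and the names `BUCKET_DAYS`, `bucket_for`, and `buckets_between` are assumptions made for the example, not part of Cassandra or of any driver.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000
BUCKET_DAYS = 10  # hypothetical bucket width; tune it to the write rate


def bucket_for(timestamp_ms: int, bucket_days: int = BUCKET_DAYS) -> int:
    """Return the bucket id for a Unix timestamp given in milliseconds."""
    return timestamp_ms // (bucket_days * MS_PER_DAY)


def buckets_between(start_ms: int, end_ms: int,
                    bucket_days: int = BUCKET_DAYS) -> list:
    """Return every bucket id touched by [start_ms, end_ms], newest first."""
    first = bucket_for(start_ms, bucket_days)
    last = bucket_for(end_ms, bucket_days)
    return list(range(last, first - 1, -1))
```

A writer would store `bucket_for(now_ms)` next to each row, and a reader that needs a time range would query the partitions returned by `buckets_between` one at a time, newest first.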
The partition key determines:
- how much data will be stored in each partition
- how the data is organized on disk
It therefore affects how quickly Cassandra processes read queries.
A partition range (token range) is the range of partition key hash values (tokens) for which a node, called a replica, owns the data.
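A toy model of range ownership, assuming one token per node and no replication, could look like the sketch below. `TokenRing` and `token_for` are hypothetical names, and the MD5-based hash is only a stand-in: Cassandra's default partitioner uses Murmur3, and real clusters usually assign many tokens (vnodes) to each node.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Stand-in hash for the example; NOT Cassandra's Murmur3 partitioner."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class TokenRing:
    """Toy ring: each node owns the range (previous token, its own token]."""

    def __init__(self, node_tokens: dict):
        # Sort (token, node) pairs so ownership can be found by binary search.
        self._entries = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def owner_of_token(self, token: int) -> str:
        # First node whose token is >= the given token owns it.
        idx = bisect.bisect_left(self._tokens, token)
        if idx == len(self._tokens):
            idx = 0  # past the highest token: wrap around to the first node
        return self._entries[idx][1]

    def owner(self, partition_key: str) -> str:
        return self.owner_of_token(token_for(partition_key))
```

With a replication factor above one, the next nodes on the ring would also hold a copy of the range; the sketch only returns the first owner.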
Example of a table partitioned by (channel_id, bucket):
CREATE TABLE messages (
    channel_id bigint,
    bucket int,
    message_id bigint,
    author_id bigint,
    content text,
    PRIMARY KEY ((channel_id, bucket), message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);
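As a rough mental model of this table, each (channel_id, bucket) pair maps to its own ordered collection of rows, read newest first. The in-memory sketch below only illustrates that layout; `Partition` and `MessageStore` are hypothetical names and do not reflect how Cassandra stores data internally.

```python
import bisect


class Partition:
    """Rows of one partition, kept ordered by the clustering key message_id."""

    def __init__(self):
        self._ids = []   # message_ids in ascending order
        self._rows = {}  # message_id -> (author_id, content)

    def upsert(self, message_id, author_id, content):
        # Writing an existing key overwrites the row, like a CQL INSERT.
        if message_id not in self._rows:
            bisect.insort(self._ids, message_id)
        self._rows[message_id] = (author_id, content)

    def latest(self, limit):
        # Newest first, mirroring CLUSTERING ORDER BY (message_id DESC).
        newest = self._ids[::-1][:limit]
        return [(mid, *self._rows[mid]) for mid in newest]


class MessageStore:
    """Maps the partition key (channel_id, bucket) to its Partition."""

    def __init__(self):
        self._partitions = {}

    def insert(self, channel_id, bucket, message_id, author_id, content):
        part = self._partitions.setdefault((channel_id, bucket), Partition())
        part.upsert(message_id, author_id, content)

    def latest(self, channel_id, bucket, limit):
        part = self._partitions.get((channel_id, bucket))
        return part.latest(limit) if part else []
```

Reading the latest messages of a channel then means asking for one partition at a time, starting with the most recent bucket.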