About
The data is automatically distributed across the nodes by horizontal partitioning (sharding).
Request processing:
- When data is inserted into the cluster, the first step is to apply a hash function to the partition key to get a numeric token
- The coordinator does a token lookup and retrieves the node that owns the numeric token, called a replica (each node owns particular ranges of numeric tokens)
- The coordinator assigns the data to the partition identified by that token
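The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MD5 stands in for whatever hash function the real partitioner uses, the 32-bit token space is arbitrary, and the node names and range boundaries in `RING` are hypothetical.

```python
import bisect
import hashlib

# Assumed token space for illustration only; a real system may use a different range.
RING_SIZE = 2**32


def token_for(partition_key: str) -> int:
    """Hash the partition key to a numeric token (MD5 is a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


# Hypothetical ring: each node owns every token up to and including its upper bound,
# starting just after the previous node's upper bound.
RING = [
    (RING_SIZE // 4 - 1, "node-a"),
    (RING_SIZE // 2 - 1, "node-b"),
    (3 * RING_SIZE // 4 - 1, "node-c"),
    (RING_SIZE - 1, "node-d"),
]
UPPER_BOUNDS = [bound for bound, _ in RING]


def owner_of(token: int) -> str:
    """Token lookup: find the node whose range contains the token."""
    index = bisect.bisect_left(UPPER_BOUNDS, token)
    return RING[index][1]


if __name__ == "__main__":
    key = "user-42"
    token = token_for(key)
    print(f"{key} -> token {token} -> {owner_of(token)}")
```

Keeping the upper bounds sorted lets the lookup use a binary search, so the cost grows with the logarithm of the number of ranges, not with the number of stored keys.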
In a cluster that keeps three replicas of the data, a request can still be fulfilled when one of the replicas has gone down, because the other two remain available.
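A minimal sketch of that availability check follows. The node names are hypothetical, and the `required` parameter (how many replicas must be reachable to serve a request) is an assumption added for illustration; the text above only states that the surviving replicas can fulfill the request.

```python
def live_replicas(replicas, down_nodes):
    """Return the replicas that are still reachable, in their original order."""
    return [node for node in replicas if node not in down_nodes]


def can_serve(replicas, down_nodes, required=1):
    """True if at least `required` replicas are still available (assumed policy)."""
    return len(live_replicas(replicas, down_nodes)) >= required


if __name__ == "__main__":
    replicas = ["node-a", "node-b", "node-c"]  # three replicas of the same data
    down = {"node-b"}                          # one replica has gone down
    print(live_replicas(replicas, down))
    print(can_serve(replicas, down))
```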
Data Storage Hierarchy
- Table: Logical
- Partition: Physical (one or more files)