Spark DataSet - Bucket

1 - About

A partition may be further divided into buckets. Rows are assigned to a bucket according to a hash of the bucketing columns.
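The idea of splitting one partition into buckets can be sketched in plain Python. This is only an illustration: the helper names bucket_for and split_into_buckets are hypothetical, and zlib.crc32 is a stand-in for the hash function, since Spark uses its own hash internally.

```python
import zlib


def bucket_for(key: str, num_buckets: int) -> int:
    """Return the bucket index for a key (illustrative hash, not Spark's)."""
    # zlib.crc32 is a deterministic stand-in; Spark uses its own hash function.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def split_into_buckets(rows, column, num_buckets):
    """Group the rows of one partition into num_buckets lists by hashing `column`."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_for(str(row[column]), num_buckets)].append(row)
    return buckets
```

Because the bucket index depends only on the value of the bucketing column, all rows sharing that value end up in the same bucket.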

3 - Management

3.1 - Write

DataFrameWriter.bucketBy buckets the output by the given columns. If specified, the output is laid out on the file system in a way similar to Hive's bucketing scheme.

This is applicable for all file-based data sources (e.g. Parquet, JSON) starting with Spark 2.1.0.
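A minimal PySpark sketch of a bucketed write could look as follows. The function name write_bucketed, the table name users_bucketed, the column user_id and the bucket count 8 are all hypothetical, and the sketch assumes that bucketed output has to go through saveAsTable, because save() has rejected bucketBy in the Spark 2.x and 3.x releases.

```python
def write_bucketed(df, table_name, num_buckets, bucket_column):
    """Save df as a table bucketed (and sorted within each bucket) by bucket_column."""
    (
        df.write
        .bucketBy(num_buckets, bucket_column)  # hash rows into num_buckets buckets
        .sortBy(bucket_column)                 # optional: sort rows inside each bucket
        .mode("overwrite")
        .format("parquet")
        .saveAsTable(table_name)               # assumption: save() does not accept bucketBy
    )


def demo():
    """Hypothetical usage; requires a working PySpark installation."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("bucket-demo").getOrCreate()
    df = spark.range(1000).withColumnRenamed("id", "user_id")
    write_bucketed(df, "users_bucketed", 8, "user_id")
    spark.stop()
```

Calling demo() writes a Parquet-backed table whose files are grouped into 8 buckets on user_id; the sortBy call is optional and can be dropped if the order inside a bucket does not matter.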
