Spark Engine - Aggregation

Before an aggregation, a shuffle generally takes place, so that all rows sharing the same grouping key end up in the same partition.

A shuffle moves data, row by row, between partitions. The spark.sql.shuffle.partitions setting configures the number of partitions to use when shuffling data for joins or aggregations (200 by default).
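
To make the idea concrete, here is a minimal pure-Python sketch of a shuffle followed by an aggregation. It is a simulation, not Spark code: the functions shuffle and aggregate, the sample rows, and the variable num_shuffle_partitions (standing in for spark.sql.shuffle.partitions) are all hypothetical names chosen for illustration, and zlib.crc32 is only a stand-in for whatever hash function the engine really uses.

```python
import zlib
from collections import defaultdict


def shuffle(partitions, num_shuffle_partitions):
    """Move (key, value) rows so that equal keys land in the same partition."""
    shuffled = [[] for _ in range(num_shuffle_partitions)]
    for partition in partitions:
        for key, value in partition:
            # Hash partitioning: the target partition depends only on the key.
            # crc32 is an illustrative, deterministic stand-in hash.
            target = zlib.crc32(key.encode("utf-8")) % num_shuffle_partitions
            shuffled[target].append((key, value))
    return shuffled


def aggregate(partition):
    """Sum values per key inside one partition (no cross-partition access)."""
    totals = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


# Three input partitions; the key "a" is spread over all of them.
input_partitions = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
    [("b", 5), ("a", 6)],
]

# Hypothetical stand-in for spark.sql.shuffle.partitions.
num_shuffle_partitions = 2

shuffled = shuffle(input_partitions, num_shuffle_partitions)
per_partition = [aggregate(p) for p in shuffled]

# After the shuffle each key lives in exactly one partition, so the
# per-partition results can simply be combined without further merging.
result = {k: v for part in per_partition for k, v in part.items()}
print(result)
```

Because every row for a given key is sent to the same target partition, each partition can compute its sums independently. Changing num_shuffle_partitions changes how many output partitions the rows are spread over, but not the aggregated totals.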
