Spark Engine - (Operations | Functions)

Spark Query Plan Generation


Operations are divided into transformations and actions.

Transformations are pipelined functions (each produces the same type as its input), while actions trigger the computation and return results.

Transformations are recorded and computed only when an action is called. This lazy evaluation lets Spark optimize the whole chain of transformations before running it.
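
The deferred execution described above can be mimicked in plain Python. The sketch below is a toy model only: LazyDataset, its explain method, and the sample rows are hypothetical names made up for illustration and are not part of Spark's API. Transformations (select, distinct) only record a step, and actions (collect, count) run the recorded steps.

```python
# Toy model of lazy evaluation; LazyDataset is hypothetical, not a Spark class.
class LazyDataset:
    def __init__(self, rows, plan=()):
        self._rows = rows   # source data, never modified
        self._plan = plan   # recorded transformations, not yet executed

    # --- transformations: record a step and return a new LazyDataset ---
    def select(self, *columns):
        def step(rows):
            return [{c: r[c] for c in columns} for r in rows]
        return LazyDataset(self._rows, self._plan + (("select", step),))

    def distinct(self):
        def step(rows):
            seen, out = set(), []
            for r in rows:
                key = tuple(sorted(r.items()))
                if key not in seen:
                    seen.add(key)
                    out.append(r)
            return out
        return LazyDataset(self._rows, self._plan + (("distinct", step),))

    # --- actions: execute the recorded plan and return a result ---
    def collect(self):
        rows = self._rows
        for _name, step in self._plan:
            rows = step(rows)
        return rows

    def count(self):
        return len(self.collect())

    # Helper for inspection: names of the recorded steps, in order.
    def explain(self):
        return [name for name, _step in self._plan]


rows = [
    {"city": "Paris", "year": 2020},
    {"city": "Paris", "year": 2021},
    {"city": "Lyon", "year": 2021},
]

cities = LazyDataset(rows).select("city").distinct()  # nothing is computed here
plan = cities.explain()    # the recorded steps
n = cities.count()         # action: the plan runs now
result = cities.collect()  # action: the plan runs again
```

Unlike this toy, which replays the recorded steps in order, Spark can rewrite the recorded plan before executing it. That is what makes deferring the work until an action worthwhile.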


Transformation    Action
select            show
distinct          count
groupBy           collect
sum               save

