Spark Engine - (Operations | Functions)


About

Operations are divided into transformations and actions.

Transformations are pipelined functions that return a new dataset of the same type as their input. Actions trigger the computation and return results.

Transformations are lazy: they are recorded and computed only when an action is called. Deferring the work this way lets Spark optimize the whole chain of transformations before running it.
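The deferred behaviour can be illustrated with a small toy model in plain Python. This is only a sketch under stated assumptions: `LazyDataset` and `is_even` are hypothetical names invented for the illustration, not part of the Spark API, and Python iterators stand in for Spark's query planner. Transformations only record a step and return a new dataset of the same type; actions run the recorded steps.

```python
import itertools
from typing import Callable, Iterable, Iterator, List, Tuple


class LazyDataset:
    """Toy stand-in for a dataset (hypothetical, NOT the Spark API)."""

    def __init__(self, source: Iterable[int], plan: Tuple = ()):
        self._source = source
        self._plan = plan  # recorded transformations, not yet executed

    # Transformations: record a step and return a new LazyDataset.
    def filter(self, predicate: Callable[[int], bool]) -> "LazyDataset":
        return LazyDataset(self._source, self._plan + (("filter", predicate),))

    def select(self, func: Callable[[int], int]) -> "LazyDataset":
        return LazyDataset(self._source, self._plan + (("select", func),))

    def limit(self, n: int) -> "LazyDataset":
        return LazyDataset(self._source, self._plan + (("limit", n),))

    # Internal helper: chain the recorded steps into one iterator pipeline.
    def _execute(self) -> Iterator[int]:
        rows: Iterator[int] = iter(self._source)
        for kind, arg in self._plan:
            if kind == "filter":
                rows = filter(arg, rows)
            elif kind == "select":
                rows = map(arg, rows)
            elif kind == "limit":
                rows = itertools.islice(rows, arg)
        return rows

    # Actions: trigger the computation and return a plain result.
    def collect(self) -> List[int]:
        return list(self._execute())

    def count(self) -> int:
        return sum(1 for _ in self._execute())


calls: List[int] = []  # log of the rows the predicate actually inspected


def is_even(x: int) -> bool:
    calls.append(x)
    return x % 2 == 0


ds = LazyDataset(range(10)).filter(is_even).select(lambda x: x * 10).limit(2)
nothing_ran_yet = (calls == [])  # True: only a plan has been built so far

result = ds.collect()  # the action runs the pipeline
print(result)          # [0, 20]
print(calls)           # [0, 1, 2]
```

Because the steps are pipelined, the `limit` stops the run after three input rows instead of scanning all ten. Spark's real optimizations are performed by its own planner and are far more extensive; the toy only shows why deferring execution makes them possible.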

List

Transformation   Action
select           show
distinct         count
groupBy          collect
sum              save
orderBy
filter
limit







