Spark - (RDD) Transformation

About

Transformations	Description
filter	returns a new data set that's formed by selecting those elements of the source on which a function returns true.
distinct([numTasks])	returns a new data set that contains the distinct elements of the source data set.
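map and flatMap	return a new distributed data set that's formed by passing each element of the source through a function. With map, each input element produces exactly one output element; with flatMap, the function returns a sequence for each input element and the sequences are flattened into the result.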
zip (optionally with index or id)	returns key-value pairs formed from the i-th element of each RDD: <math>\forall i \in \{0, \dots, N\}: (rdd1_i, rdd2_i)</math>
split	splits a data set into several data sets.
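pipe	pipes each partition of the RDD through a shell command: the elements are written to the process's stdin, and the lines it writes to stdout are returned as an RDD of strings.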
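
The behaviour of several transformations in the table can be illustrated with a small sketch. It is plain Python working on in-memory lists, not Spark code: the helper names (rdd_filter, rdd_distinct, rdd_map, rdd_flat_map, rdd_zip, rdd_zip_with_index) are hypothetical and only model what the corresponding RDD methods would return once collected. Ordering is an assumption of the model; Spark does not guarantee the order of distinct results.

```python
# Plain-Python model of a few RDD transformations (no Spark required).
# Each helper mimics the collected result of the RDD method named in its comment.

def rdd_filter(data, predicate):
    # Models rdd.filter(predicate): keep elements for which predicate is true.
    return [x for x in data if predicate(x)]

def rdd_distinct(data):
    # Models rdd.distinct(). First-seen order is kept here for readability;
    # Spark itself makes no ordering promise.
    return list(dict.fromkeys(data))

def rdd_map(data, f):
    # Models rdd.map(f): exactly one output element per input element.
    return [f(x) for x in data]

def rdd_flat_map(data, f):
    # Models rdd.flatMap(f): f returns a sequence, and the sequences are flattened.
    return [y for x in data for y in f(x)]

def rdd_zip(left, right):
    # Models rdd1.zip(rdd2): pairs the i-th elements of both data sets.
    # Both sides must hold the same number of elements.
    if len(left) != len(right):
        raise ValueError("zip needs data sets of equal length")
    return list(zip(left, right))

def rdd_zip_with_index(data):
    # Models rdd.zipWithIndex(): pairs each element with its position.
    return [(x, i) for i, x in enumerate(data)]

lines = ["a b", "b c", "a b"]
print(rdd_filter(lines, lambda s: "a" in s))
print(rdd_distinct(lines))
print(rdd_map(lines, str.split))       # one list of words per line
print(rdd_flat_map(lines, str.split))  # one flat list of words
print(rdd_zip([1, 2, 3], lines))
print(rdd_zip_with_index(lines))
```

Comparing the output of rdd_map and rdd_flat_map on the same input shows the difference between the two: map keeps one (nested) result per line, whereas flatMap produces a single flat collection of words.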