RDD - Pipe

About

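pipe is a transformation.

pipe returns an RDD created by piping elements to a forked external process. Each element of a partition is written to the process's standard input as one line, and each line the process writes to its standard output becomes a string element of the resulting RDD. Because pipe is a transformation, the external process only runs once an action such as collect is called.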
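To make the line-based contract concrete, here is a minimal sketch in plain Python that emulates what happens to a single partition. It is not Spark's implementation: the helper name pipe_partition is hypothetical, it buffers all input and output in memory instead of streaming, and it assumes a POSIX system where the command is available.

```python
import shlex
import subprocess


def pipe_partition(elements, command, env=None):
    """Hypothetical helper: emulate pipe() for one partition, without Spark.

    Each element is sent to the command's stdin as one line, and each
    stdout line comes back as one string element.
    """
    # One newline-terminated line per element.
    payload = "".join(str(e).rstrip("\n") + "\n" for e in elements)

    # Fork the external process; env=None inherits the current environment.
    result = subprocess.run(
        shlex.split(command),
        input=payload,
        capture_output=True,
        text=True,
        env=env,
        check=True,  # raise if the command exits with a non-zero status
    )

    # Split stdout back into elements, dropping the empty string
    # that follows the final newline.
    lines = result.stdout.split("\n")
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

Under these assumptions, pipe_partition(['1', '2', '', '3'], 'cat') returns ['1', '2', '', '3']: the empty string survives as an empty line, and every result is a string regardless of the input type.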

Snippet

pipe(command, env={})

Example with a Spark RDD - Spark Context (sc, sparkContext)

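# spark is an existing SparkSession
sc = spark.sparkContext
# cat echoes its input, so each element comes back unchanged as a string
sc.parallelize(['1', '2', '', '3']).pipe('cat').collect()
# Output under Python 2 (Python 3 shows the same strings without the u prefix):
# [u'1', u'2', u'', u'3']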
