randomSplit randomly splits an RDD into several RDDs according to the provided weights.
randomSplit(weights, seed=None)
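where:
weights is a list of weights for the splits; they are normalized if they do not sum to 1
seed is an optional random seed; pass a fixed value to get a repeatable split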
Example of an 80/10/10 percentage split:
weights = [.8, .1, .1]
seed = 42  # any fixed integer works; older Python 2 examples write seed=0L
# Use randomSplit with weights and seed
rawTrainData, rawValidationData, rawTestData = rawData.randomSplit(weights, seed)
The exact number of entries in each dataset varies slightly from the requested proportions, because randomSplit() assigns each element to a split at random rather than cutting the RDD at fixed positions.
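
To see why the sizes are only approximate, here is a minimal pure-Python sketch of a weighted random split. It is an illustration, not Spark's implementation: the function name random_split_sketch is hypothetical, it works on an ordinary list instead of an RDD, and it assumes one uniform random draw per element decides which split receives it.

```python
import random
from itertools import accumulate


def random_split_sketch(items, weights, seed=None):
    """Split items into len(weights) lists, roughly in proportion to weights.

    Hypothetical helper for illustration only; not part of the Spark API.
    """
    total = float(sum(weights))
    # Normalize the weights so they sum to 1, then build cumulative upper bounds.
    bounds = list(accumulate(w / total for w in weights))
    bounds[-1] = 1.0  # guard against floating-point rounding in the last bound
    rng = random.Random(seed)
    splits = [[] for _ in weights]
    for item in items:
        x = rng.random()  # uniform draw in [0, 1)
        # The first range whose upper bound exceeds x receives the item.
        for i, upper in enumerate(bounds):
            if x < upper:
                splits[i].append(item)
                break
    return splits


train, validation, test = random_split_sketch(range(1000), [.8, .1, .1], seed=42)
print(len(train), len(validation), len(test))
```

Every element lands in exactly one split, so the three sizes always add up to the input size, but each individual size only comes close to 80%, 10% and 10%. Reusing the same seed on the same input gives the same split in this sketch.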