PySpark
PySpark provides two types of shared variables: broadcast variables and accumulators.
Broadcast
Broadcast variables are an efficient way to send read-only data to each worker once, instead of having Spark ship it automatically inside the closure of every task that uses it.
Accumulator
Accumulators are write-only for workers, which can add to them but cannot read them; only the driver program can read their value.
They allow us to aggregate values from workers back to the driver.