About
This article covers how to configure Spark to read from and write to HDFS.
Management
Configuration
If you plan to read from and write to HDFS using Spark, there are two Hadoop configuration files that should be included on Spark’s classpath:
- hdfs-site.xml, which provides default behaviors for the HDFS client.
- core-site.xml, which sets the default filesystem name (a sketch follows this list).
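As an illustration, a minimal core-site.xml might look like the following. The fs.defaultFS property is the standard one, but the namenode host and port below are hypothetical placeholders:

```xml
<!-- core-site.xml: sets the default filesystem name.
     Replace the host and port with your namenode's address. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
```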
To make these files visible to Spark, set HADOOP_CONF_DIR in $SPARK_HOME/conf/spark-env.sh to a location containing the configuration files.
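For example, a minimal spark-env.sh entry could look like this; /etc/hadoop/conf is a common but assumed location for the Hadoop configuration directory:

```bash
# $SPARK_HOME/conf/spark-env.sh
# Point HADOOP_CONF_DIR at whatever directory holds your
# hdfs-site.xml and core-site.xml (location assumed here).
export HADOOP_CONF_DIR=/etc/hadoop/conf
```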
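Once the configuration is on the classpath, reading and writing HDFS paths works like any other Spark I/O. The following Scala sketch assumes the hypothetical namenode host from the core-site.xml example above, plus made-up input and output paths:

```scala
import org.apache.spark.sql.SparkSession

object HdfsReadWrite {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hdfs-read-write")
      .getOrCreate()

    // With HADOOP_CONF_DIR on the classpath, a bare path resolves
    // against fs.defaultFS from core-site.xml ...
    val lines = spark.read.textFile("/user/alice/input.txt")

    // ... and a fully qualified hdfs:// URI works as well.
    lines.write.text("hdfs://namenode.example.com:8020/user/alice/output")

    spark.stop()
  }
}
```

Note that the bare path and the fully qualified URI are interchangeable once fs.defaultFS is set; using bare paths keeps the code portable across clusters.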