About

The Hadoop configuration has two entry points:

Configuration

HADOOP_CONF_DIR

HADOOP_CONF_DIR is the environment variable that sets the location of the configuration directory.

The default is:

HADOOP_HOME/etc/hadoop/

Example:

C:\hadoop\hadoop-2.7.5\etc\hadoop

To display the current value from the command line (the second line is the output):

hdfs envvars | grep -i HADOOP_CONF_DIR
HADOOP_CONF_DIR='/usr/hdp/2.6.2.25-1/hadoop/conf'
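The lookup described above can be sketched in a few lines of Python. This is only an illustration of the rule "use HADOOP_CONF_DIR if set, otherwise fall back to HADOOP_HOME/etc/hadoop"; the function name resolve_conf_dir is hypothetical and is not part of Hadoop, whose own scripts perform this resolution.

```python
import os
from pathlib import Path


def resolve_conf_dir(env=None):
    """Return the configuration directory implied by the environment.

    Hypothetical helper: mirrors the rule described in the text,
    not Hadoop's actual startup scripts.
    """
    env = os.environ if env is None else env

    # An explicit HADOOP_CONF_DIR always wins.
    explicit = env.get("HADOOP_CONF_DIR")
    if explicit:
        return Path(explicit)

    # Otherwise fall back to the default location under HADOOP_HOME.
    home = env.get("HADOOP_HOME")
    if home:
        return Path(home) / "etc" / "hadoop"

    # Neither variable is set: nothing can be resolved.
    return None


if __name__ == "__main__":
    print(resolve_conf_dir())
```

Passing a dictionary instead of relying on os.environ makes the rule easy to try out without changing the real environment.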

File

Hadoop configuration is driven by two types of important configuration files:

  • default files, which hold the read-only default configuration. Example: core-default.xml
  • site files, which hold the site-specific configuration that overrides the values of the default files. Example: core-site.xml overrides values in core-default.xml.

They are loaded from the classpath in order: the default file first, then the site file.

Examples of configuration files:

Site file          Default file
core-site.xml      core-default.xml
hdfs-site.xml      hdfs-default.xml
yarn-site.xml      yarn-default.xml
mapred-site.xml    mapred-default.xml
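The effect of the load order can be illustrated with a minimal Python sketch. It is not Hadoop's loader: the function load_resources, the inline XML strings, and the host name namenode.example.com are assumptions made for the example. It only shows that a resource loaded later replaces values set by one loaded earlier.

```python
import xml.etree.ElementTree as ET

# Stand-in for core-default.xml (sample values for illustration only).
CORE_DEFAULT = """<configuration>
  <property><name>fs.defaultFS</name><value>file:///</value></property>
  <property><name>io.file.buffer.size</name><value>4096</value></property>
</configuration>"""

# Stand-in for core-site.xml (hypothetical cluster address).
CORE_SITE = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://namenode.example.com:8020</value></property>
</configuration>"""


def load_resources(*xml_texts):
    """Merge resources in the order given; later values replace earlier ones."""
    props = {}
    for text in xml_texts:
        for prop in ET.fromstring(text).iter("property"):
            props[prop.findtext("name")] = prop.findtext("value")
    return props


# Default first, then site: the site value wins where both define a name.
conf = load_resources(CORE_DEFAULT, CORE_SITE)

if __name__ == "__main__":
    for name, value in sorted(conf.items()):
        print(name, "=", value)
```

Properties that appear only in the default resource keep their default value, which is why a site file normally lists just the settings that differ.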

Environment variable

See Hadoop - Environment variable

Class

In Java, the configuration is accessed through the org.apache.hadoop.conf.Configuration class:

http://hadoop.apache.org/docs/r2.7.3/api/org/apache/hadoop/conf/Configuration.html

Management

final

Configuration parameters may be declared final. Once a resource declares a value final, no subsequently-loaded resource can alter that value. For example, one might define a final parameter with:

<property>
	<name>dfs.hosts.include</name>
	<value>/etc/hadoop/conf/hosts.include</value>
	<final>true</final>
</property>

Administrators typically define parameters as final in core-site.xml for values that user applications may not alter.
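The final rule can be sketched as follows. This is an illustrative Python model of the behavior described above, not Hadoop's code: the function load_with_final, the inline resources, and the /tmp/my-hosts path are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Resource loaded first, e.g. by an administrator; marks one property final.
ADMIN_RESOURCE = """<configuration>
  <property>
    <name>dfs.hosts.include</name>
    <value>/etc/hadoop/conf/hosts.include</value>
    <final>true</final>
  </property>
  <property><name>dfs.replication</name><value>3</value></property>
</configuration>"""

# Resource loaded later, e.g. by a user application (hypothetical values).
USER_RESOURCE = """<configuration>
  <property><name>dfs.hosts.include</name><value>/tmp/my-hosts</value></property>
  <property><name>dfs.replication</name><value>2</value></property>
</configuration>"""


def load_with_final(*xml_texts):
    """Merge resources in order, ignoring later changes to final properties."""
    props, finals = {}, set()
    for text in xml_texts:
        for prop in ET.fromstring(text).iter("property"):
            name = prop.findtext("name")
            if name in finals:
                continue  # an earlier resource declared this value final
            props[name] = prop.findtext("value")
            if (prop.findtext("final") or "").strip().lower() == "true":
                finals.add(name)
    return props


conf = load_with_final(ADMIN_RESOURCE, USER_RESOURCE)

if __name__ == "__main__":
    print(conf["dfs.hosts.include"])  # value from the first resource is kept
    print(conf["dfs.replication"])    # non-final value is overridden
```

In this sketch the later resource can still change dfs.replication because it was not declared final, while its attempt to change dfs.hosts.include is silently ignored.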

Variable Expansion

Value strings are first processed for variable expansion. The available properties are:

  • other properties defined in this Configuration; and, if a name is undefined here,
  • properties in System.getProperties().

For example, if a configuration resource contains the following property definitions:

<property>
	<name>basedir</name>
	<value>/user/${user.name}</value>
</property>

<property>
	<name>tempdir</name>
	<value>${basedir}/tmp</value>
</property>

When conf.get("tempdir") is called, ${basedir} is resolved to another property in this Configuration, while ${user.name} is ordinarily resolved to the value of the System property with that name.

By default, a warning is logged for every deprecated configuration parameter. These warnings can be suppressed by configuring log4j.logger.org.apache.hadoop.conf.Configuration.deprecation in the log4j.properties file.
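The expansion of the tempdir example can be traced with a small Python sketch. It follows the lookup order stated above (configuration properties first, then system properties) and is not Hadoop's implementation: the function expand, the max_depth guard, and the user name alice are assumptions made for the example.

```python
import re

# Matches ${name} where name contains no "}" or "$".
VAR = re.compile(r"\$\{([^}$]+)\}")


def expand(value, props, system_props, max_depth=20):
    """Expand ${name} references in value, one substitution per pass.

    props stands in for the Configuration, system_props for
    System.getProperties(). max_depth is a hypothetical guard
    against self-referencing definitions.
    """
    for _ in range(max_depth):
        match = VAR.search(value)
        if match is None:
            return value  # nothing left to expand
        name = match.group(1)
        # Look in the configuration first, then in the system properties.
        replacement = props.get(name, system_props.get(name))
        if replacement is None:
            return value  # unknown variable: leave it unexpanded
        value = value[:match.start()] + replacement + value[match.end():]
    return value


props = {
    "basedir": "/user/${user.name}",
    "tempdir": "${basedir}/tmp",
}
system_props = {"user.name": "alice"}  # hypothetical user

tempdir = expand(props["tempdir"], props, system_props)

if __name__ == "__main__":
    print(tempdir)  # prints /user/alice/tmp
```

The first pass replaces ${basedir} with /user/${user.name}, and the second pass replaces ${user.name} from the system properties, which is why the substitution has to be repeated until no reference remains.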

Documentation / Reference