HDFS - NameNode



The NameNode is an HDFS daemon that runs on the head node.

It's the head process of the cluster. It:

  • manages the file system namespace
  • regulates access to files by clients.

The NameNode:

  • executes file system namespace operations like opening, closing, and renaming files and directories.
  • determines the mapping of blocks to DataNodes

The NameNode is the arbitrator and repository for all HDFS metadata.

The NameNode makes all decisions regarding replication of blocks.

It periodically receives a Heartbeat and a Blockreport from each of the DataNodes in the cluster.

The NameNode manages the file system metadata. See HDFS - File System Metadata

The NameNode constantly tracks which blocks need to be replicated and initiates replication whenever necessary.



A browser admin client (the NameNode web UI) is available on the NameNode's HTTP port.

  • Default HTTP port is 50070 (Hadoop 2.x; Hadoop 3.x uses 9870).
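
As a minimal sketch, the web UI address can be assembled from a host name and the HTTP port. The function name `namenode_ui_url` and the host `nn-host.example.com` are hypothetical; 50070 is used as the fallback only because it is the default port given above.

```shell
# Hypothetical helper: print the NameNode web UI URL.
# $1 = NameNode host, $2 = HTTP port (optional, falls back to 50070)
namenode_ui_url() {
  printf 'http://%s:%s/\n' "$1" "${2:-50070}"
}

namenode_ui_url nn-host.example.com        # prints http://nn-host.example.com:50070/
namenode_ui_url nn-host.example.com 9870   # prints http://nn-host.example.com:9870/
```

On a configured client, the actual host:port pair can be read with `hdfs getconf -confKey dfs.namenode.http-address` instead of being passed by hand.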


hdfs namenode --help
Usage: java NameNode [-backup] |
        [-checkpoint] |
        [-format [-clusterid cid ] [-force] [-nonInteractive] ] |
        [-upgrade [-clusterid cid] [-renameReserved<k-v pairs>] ] |
        [-upgradeOnly [-clusterid cid] [-renameReserved<k-v pairs>] ] |
        [-rollback] |
        [-rollingUpgrade <rollback|downgrade|started> ] |
        [-finalize] |
        [-importCheckpoint] |
        [-initializeSharedEdits] |
        [-bootstrapStandby] |
        [-recover [ -force] ] |
        [-metadataVersion ]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]


The process IDs of the NameNode and of the SecondaryNameNode can be retrieved with Java - jps (Java Process Utility):

# namenode
NNPID=$("$JAVA_HOME"/bin/jps | grep -E '^[0-9]+[ ]+NameNode$' | awk '{print $1}')
# secondary namenode
SNNPID=$("$JAVA_HOME"/bin/jps | grep -E '^[0-9]+[ ]+SecondaryNameNode$' | awk '{print $1}')


On startup, the NameNode enters a special state called Safemode. Replication of data blocks does not occur when the NameNode is in the Safemode state.

The NameNode receives Heartbeat and Blockreport messages from the DataNodes. After a configurable percentage of safely replicated data blocks checks in with the NameNode (plus an additional 30 seconds), the NameNode exits the Safemode state. It then determines the list of data blocks (if any) that still have fewer than the specified number of replicas. The NameNode then replicates these blocks to other DataNodes.
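
The exit condition described above can be sketched as a small shell check. The function name `safemode_threshold_reached` and its arguments are hypothetical; 0.999 is assumed as the fallback threshold (the default of `dfs.namenode.safemode.threshold-pct`), and the sketch ignores the additional 30-second extension.

```shell
# Hypothetical helper: succeeds (exit status 0) when the fraction of safely
# replicated blocks reaches the threshold, fails (exit status 1) otherwise.
# $1 = safely replicated blocks reported so far
# $2 = total number of blocks
# $3 = threshold fraction (optional, assumed default 0.999)
safemode_threshold_reached() {
  awk -v safe="$1" -v total="$2" -v pct="${3:-0.999}" \
    'BEGIN { exit !(total == 0 || safe / total >= pct) }'
}

if safemode_threshold_reached 990 1000; then
  echo "threshold reached"
else
  echo "still below threshold"   # 990/1000 = 0.99 is below 0.999
fi
```

On a live cluster, `hdfs dfsadmin -safemode get` reports the current state and `hdfs dfsadmin -safemode wait` blocks until the NameNode has left Safemode.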


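See the refreshNamenodes option of dfsadmin.

For the given datanode, it reloads the configuration files, stops serving the removed block pools, and starts serving new block pools.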


HDFS - hdfs command line

  • get the list of NameNodes and secondary NameNodes in the cluster:
hdfs getconf -namenodes
hdfs getconf -secondaryNameNodes

RPC addresses

  • get the NameNode RPC addresses:
hdfs getconf -nnRpcAddresses
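
As a sketch of how this output might be consumed in a script, the helper below splits `host:port` pairs into separate fields. The function name `split_rpc_addresses` and the sample hosts are hypothetical, and the sketch assumes that the command prints one `host:port` pair per line.

```shell
# Hypothetical helper: read host:port pairs on stdin (one per line, assumed)
# and print "host port" for each non-empty line.
split_rpc_addresses() {
  while IFS=: read -r host port; do
    if [ -n "$host" ]; then
      printf '%s %s\n' "$host" "$port"
    fi
  done
}

# Sample input standing in for the output of: hdfs getconf -nnRpcAddresses
printf 'nn1.example.com:8020\nnn2.example.com:8020\n' | split_rpc_addresses
```

On a configured client, the sample `printf` can be replaced by `hdfs getconf -nnRpcAddresses | split_rpc_addresses`.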


  • start the JournalNode first
# $HDFS_USER is the HDFS user, normally hdfs
su -l $HDFS_USER -c "/usr/hdp/current/hadoop-hdfs-journalnode/../hadoop/sbin/hadoop-daemon.sh start journalnode"
  • then start the NameNode
# $HDFS_USER is the HDFS user, normally hdfs
su -l $HDFS_USER -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode"




