HDFS - Checkpoint

About

During a checkpoint, the changes recorded in the transaction log (EditLog) are applied to the metadata store (FsImage). Changes are written to the EditLog first because recording each change directly in the FsImage would not be efficient.

Checkpoint process

When the NameNode starts up, or when a checkpoint is triggered by a configurable threshold:

  • it reads the FsImage and EditLog from disk
  • it applies all the transactions from the EditLog to the in-memory representation of the FsImage
  • it flushes out this new version into a new FsImage on disk
  • it truncates the old EditLog because its transactions have been applied to the persistent FsImage
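
The steps above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: the FsImage is stored as a JSON file, the EditLog is a list of hypothetical `(op, path, value)` tuples, and the names `apply_edits` and `checkpoint` are invented for this sketch. The real NameNode uses its own on-disk formats and code paths.

```python
import json
import os


def apply_edits(namespace, edits):
    """Replay each logged transaction on the in-memory namespace."""
    for op, path, value in edits:
        if op == "create":
            namespace[path] = value
        elif op == "delete":
            namespace.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return namespace


def checkpoint(fsimage_path, edits):
    """Merge the edits into the image, persist it, and return an empty log."""
    # 1. Read the last persisted image from disk.
    with open(fsimage_path) as f:
        namespace = json.load(f)
    # 2. Apply every transaction to the in-memory representation.
    apply_edits(namespace, edits)
    # 3. Flush the merged state into a new image file, then swap it in.
    tmp_path = fsimage_path + ".ckpt"
    with open(tmp_path, "w") as f:
        json.dump(namespace, f)
    os.replace(tmp_path, fsimage_path)
    # 4. The applied transactions are now in the image: start a fresh log.
    return []
```

Writing the new image to a temporary file and renaming it afterwards is a design choice of this sketch: the old image stays usable if the process stops halfway through the write.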

Management

Trigger / Run

A checkpoint can be triggered:

  • at a given time interval (dfs.namenode.checkpoint.period) expressed in seconds,
  • or after a given number of filesystem transactions have accumulated (dfs.namenode.checkpoint.txns).

If both of these properties are set, the first threshold to be reached triggers a checkpoint.
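
The "first threshold reached" rule can be written as a small predicate. The sketch below is an assumption-laden illustration, not Hadoop code: the function name `should_checkpoint` and its parameters are hypothetical, and the default values are simply the ones used in this page's configuration sample.

```python
def should_checkpoint(seconds_since_last, pending_txns,
                      period=21600, txns=1000000):
    """Return True as soon as either threshold is reached.

    period mirrors dfs.namenode.checkpoint.period (seconds);
    txns mirrors dfs.namenode.checkpoint.txns (transaction count).
    """
    return seconds_since_last >= period or pending_txns >= txns
```

For example, `should_checkpoint(600, 1000000)` is true because the transaction threshold is reached even though only ten minutes have elapsed.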

From the config file:

<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>21600</value>
</property>

<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value>
</property>

or from the command line:

hdfs getconf -confKey dfs.namenode.checkpoint.period

Location

From the config file:

<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>/hadoop/hdfs/namesecondary</value>
</property>

<property>
  <name>dfs.namenode.checkpoint.edits.dir</name>
  <value>${dfs.namenode.checkpoint.dir}</value>
</property>

or from the command line:

hdfs getconf -confKey dfs.namenode.checkpoint.dir
/hadoop/hdfs/namesecondary
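
In the configuration above, `dfs.namenode.checkpoint.edits.dir` refers to another property through the `${...}` syntax. The following Python sketch shows how such a reference could be expanded when reading the file by hand. It assumes the properties are wrapped in a `<configuration>` root element, and the helpers `load_properties` and `resolve` are hypothetical names for this example; it does not claim to reproduce Hadoop's own substitution logic.

```python
import re
import xml.etree.ElementTree as ET

# Sample file content; the <configuration> wrapper is an assumption.
SAMPLE = """
<configuration>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>/hadoop/hdfs/namesecondary</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.edits.dir</name>
    <value>${dfs.namenode.checkpoint.dir}</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Return a {name: raw value} dict for every <property> element."""
    root = ET.fromstring(xml_text.strip())
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}


def resolve(props, key, max_depth=20):
    """Expand ${other.key} references, leaving unknown keys untouched."""
    value = props[key]
    for _ in range(max_depth):  # bounded, so a reference cycle cannot loop forever
        expanded = re.sub(r"\$\{([^}]+)\}",
                          lambda m: props.get(m.group(1), m.group(0)),
                          value)
        if expanded == value:
            break
        value = expanded
    return value
```

With this sample, resolving `dfs.namenode.checkpoint.edits.dir` yields the same directory as `dfs.namenode.checkpoint.dir`, so checkpointed edits and images share one location.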
