About
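During a checkpoint, the changes recorded in the transaction log (EditLog) are applied to the metadata store (FsImage). Checkpoints exist because it is not efficient to record each change directly in the FsImage: changes are logged in the EditLog and merged into the FsImage only periodically.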
Checkpoint process
When the NameNode starts up, or when a checkpoint is triggered by a configurable threshold:
- it applies all the transactions from the EditLog to the in-memory representation of the FsImage,
- it flushes this new version out to a new FsImage on disk,
- it truncates the old EditLog, because its transactions have now been applied to the persistent FsImage.
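The three steps above can be illustrated with a toy model. This is only a sketch and not how the NameNode is implemented: the namespace is a plain dictionary, the EditLog is a list, the image is written as JSON, and the names apply_edit and checkpoint as well as the three operation types are hypothetical.

```python
import json
import os


def apply_edit(namespace, edit):
    """Apply one logged operation to the in-memory namespace (a plain dict)."""
    op = edit["op"]
    if op == "create":
        namespace[edit["path"]] = {"replication": edit.get("replication", 3)}
    elif op == "delete":
        namespace.pop(edit["path"], None)
    elif op == "rename":
        namespace[edit["dst"]] = namespace.pop(edit["src"])
    else:
        raise ValueError(f"unknown op: {op}")


def checkpoint(namespace, edit_log, image_path):
    """Toy checkpoint: replay the log, persist a new image, truncate the log."""
    # Step 1: replay every logged transaction on the in-memory image.
    for edit in edit_log:
        apply_edit(namespace, edit)

    # Step 2: write the new image to a temporary file, then swap it in,
    # so a crash never leaves a half-written image under image_path.
    tmp_path = image_path + ".ckpt"
    with open(tmp_path, "w") as f:
        json.dump(namespace, f, sort_keys=True)
    os.replace(tmp_path, image_path)

    # Step 3: the logged transactions are now part of the persisted image,
    # so the old log can be emptied.
    edit_log.clear()
    return namespace
```

Writing to a temporary file before replacing the image is a design choice of this sketch; it keeps the previous image usable until the new one is complete.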
Management
Trigger / Run
A checkpoint can be triggered:
- at a given time interval (dfs.namenode.checkpoint.period) expressed in seconds,
- or after a given number of filesystem transactions have accumulated (dfs.namenode.checkpoint.txns).
If both of these properties are set, the first threshold to be reached triggers a checkpoint.
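The "first threshold reached" rule can be sketched as a simple predicate. This is a minimal illustration, not Hadoop code: the function name should_checkpoint and its parameters are hypothetical, and the default values are the ones used in the sample configuration of this article.

```python
def should_checkpoint(seconds_since_last, txns_since_last,
                      period_s=21600, txn_limit=1000000):
    """Return True as soon as either threshold has been reached.

    period_s  stands for dfs.namenode.checkpoint.period (seconds)
    txn_limit stands for dfs.namenode.checkpoint.txns (transactions)
    """
    return seconds_since_last >= period_s or txns_since_last >= txn_limit
```

For instance, a quiet cluster reaches the time threshold first, while a busy cluster may accumulate the transaction count long before the period elapses.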
From the config file (hdfs-site.xml):
<property>
<name>dfs.namenode.checkpoint.period</name>
<value>21600</value>
</property>
<property>
<name>dfs.namenode.checkpoint.txns</name>
<value>1000000</value>
</property>
or from the command line:
hdfs getconf -confKey dfs.namenode.checkpoint.period
Location
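From the config file (hdfs-site.xml), where dfs.namenode.checkpoint.dir is the local directory for the checkpoint images and dfs.namenode.checkpoint.edits.dir the one for the checkpoint edits: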
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/hadoop/hdfs/namesecondary</value>
</property>
<property>
<name>dfs.namenode.checkpoint.edits.dir</name>
<value>${dfs.namenode.checkpoint.dir}</value>
</property>
or from the command line:
hdfs getconf -confKey dfs.namenode.checkpoint.dir
/hadoop/hdfs/namesecondary
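In the configuration above, dfs.namenode.checkpoint.edits.dir refers to another property through the ${...} syntax, which Hadoop expands when the value is read. The sketch below imitates that expansion for property-to-property references only, to show why both settings end up pointing at the same directory. It is an illustration under assumptions: load_properties, resolve, and the max_depth guard are hypothetical helpers, and the surrounding configuration element is assumed.

```python
import re
import xml.etree.ElementTree as ET

# Sample taken from the configuration shown above, wrapped in a root element.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>/hadoop/hdfs/namesecondary</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.edits.dir</name>
    <value>${dfs.namenode.checkpoint.dir}</value>
  </property>
</configuration>"""

_VAR = re.compile(r"\$\{([^}]+)\}")


def load_properties(xml_text):
    """Read <property> entries into a name -> raw value dictionary."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value", default="")
            for p in root.iter("property")}


def resolve(props, key, max_depth=20):
    """Expand ${other.property} references; unknown names are left as-is."""
    value = props[key]
    for _ in range(max_depth):  # guard against circular references
        expanded = _VAR.sub(lambda m: props.get(m.group(1), m.group(0)), value)
        if expanded == value:
            break
        value = expanded
    return value


if __name__ == "__main__":
    props = load_properties(SAMPLE)
    print(resolve(props, "dfs.namenode.checkpoint.edits.dir"))
```

On a real cluster, hdfs getconf remains the reliable way to read the effective value, since this sketch ignores default files and any other substitution sources.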