HDFS - Trash

About

If trash is enabled, files removed through the FS Shell are not immediately deleted from HDFS.

Instead, HDFS moves them to a trash directory. A file can be restored quickly as long as it remains in trash.

There can be an appreciable delay between the time a user deletes a file and the time the corresponding free space becomes available in HDFS.
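
Whether trash is enabled depends on the cluster configuration. The following is a minimal sketch, assuming the setting in question is Hadoop's `fs.trash.interval` property in core-site.xml (a value in minutes, where 0 disables trash). The property name is not given in the text above, and the sample XML and the helper names are hypothetical, for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical core-site.xml content, used only for illustration.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
</configuration>
"""


def trash_interval_minutes(core_site_xml):
    """Return fs.trash.interval in minutes, or 0 if the property is absent.

    Assumption: an absent property means the default of 0 (trash disabled).
    """
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == "fs.trash.interval":
            return int((prop.findtext("value") or "0").strip())
    return 0


def trash_enabled(core_site_xml):
    """Trash is treated as enabled only when the interval is greater than zero."""
    return trash_interval_minutes(core_site_xml) > 0


if __name__ == "__main__":
    print(trash_interval_minutes(SAMPLE_CORE_SITE))
    print(trash_enabled(SAMPLE_CORE_SITE))
```

On a live cluster the effective value may come from other configuration sources, so treat this only as a way to read one file.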

Process

The most recently deleted files are moved to the current trash directory (/user/<username>/.Trash/Current). At a configurable interval, HDFS creates checkpoints (under /user/<username>/.Trash/<date>) for the files in the current trash directory and deletes old checkpoints once they have expired.

After a file's lifetime in trash expires, the NameNode deletes it from the HDFS namespace. Deleting the file frees the blocks associated with it.

Management

Location

Each user has their own trash directory under /user/<username>/.Trash

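
The per-user layout can be turned into a small helper for locating and restoring a deleted file. This is a sketch under two assumptions: that home directories live under /user, and that the original absolute path is recreated beneath the Current directory. The function names and the user name "alice" are hypothetical. `hdfs dfs -mv` is the standard FS shell move command.

```python
import posixpath
import subprocess


def trash_location(username, original_path):
    """Map an absolute HDFS path to its assumed location in the current trash."""
    if not original_path.startswith("/"):
        raise ValueError("expected an absolute HDFS path")
    current = posixpath.join("/user", username, ".Trash", "Current")
    # Assumption: the original absolute path is preserved beneath Current.
    return posixpath.join(current, original_path.lstrip("/"))


def restore_command(username, original_path):
    """Build the FS shell command that moves a file from trash back to its origin."""
    return ["hdfs", "dfs", "-mv", trash_location(username, original_path), original_path]


def restore(username, original_path):
    """Run the restore; requires the hdfs client to be available on PATH."""
    subprocess.run(restore_command(username, original_path), check=True)


if __name__ == "__main__":
    # Only prints the command; nothing is executed against a cluster.
    print(" ".join(restore_command("alice", "/data/report.csv")))
```

Building the command as a list and printing it first makes it easy to check the source path before anything is moved.
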
Trash Checkpoint

See the expunge command of the FS shell for details on trash checkpointing.
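
The expiry rule behind checkpoint cleanup can be illustrated with a short sketch. It assumes that checkpoint directories are named with a yyMMddHHmmss timestamp and that a checkpoint expires once it is older than the configured retention interval in minutes. Both are assumptions not stated above, and the function name and sample values are hypothetical. This code only models the decision and does not touch HDFS. The cleanup itself is triggered with `hdfs dfs -expunge`.

```python
from datetime import datetime, timedelta

# Assumption: checkpoint directories are named with a yyMMddHHmmss timestamp.
CHECKPOINT_FORMAT = "%y%m%d%H%M%S"

# FS shell command that triggers checkpointing and cleanup of expired checkpoints.
EXPUNGE_COMMAND = ["hdfs", "dfs", "-expunge"]


def expired_checkpoints(names, interval_minutes, now):
    """Return the checkpoint names that are older than the retention interval."""
    cutoff = now - timedelta(minutes=interval_minutes)
    expired = []
    for name in names:
        if name == "Current":
            continue  # the current trash directory is not a checkpoint
        try:
            created = datetime.strptime(name, CHECKPOINT_FORMAT)
        except ValueError:
            continue  # skip names that do not parse as checkpoint timestamps
        if created < cutoff:
            expired.append(name)
    return expired


if __name__ == "__main__":
    listing = ["Current", "240610120000", "240612110000"]
    print(expired_checkpoints(listing, 1440, datetime(2024, 6, 12, 12, 0, 0)))
```
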
