HDFS - Client Connection


A client establishes a connection to a configurable TCP port on the NameNode machine and speaks the ClientProtocol with the NameNode.
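As a rough illustration of the "configurable TCP port", the sketch below pulls the NameNode host and port out of a file system URI of the kind normally set in the `fs.defaultFS` property. It uses only the Java standard library; the class name, the example host, and the fallback port of 8020 are assumptions made for this example and are not taken from the Hadoop code base.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Hadoop.
final class NameNodeAddressSketch {
    // Assumed fallback when the URI carries no explicit port.
    static final int ASSUMED_DEFAULT_PORT = 8020;

    // Host part of a URI such as hdfs://namenode.example.com:9000
    static String host(String defaultFs) {
        return URI.create(defaultFs).getHost();
    }

    // Port part of the URI; java.net.URI reports -1 when none is given.
    static int port(String defaultFs) {
        int p = URI.create(defaultFs).getPort();
        return p == -1 ? ASSUMED_DEFAULT_PORT : p;
    }

    public static void main(String[] args) {
        String defaultFs = "hdfs://namenode.example.com:9000"; // example value
        System.out.println(host(defaultFs) + ":" + port(defaultFs));
    }
}
```

A real client would not parse the address by hand; it would hand the configuration to the Hadoop client library, which opens the RPC connection itself.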

A Remote Procedure Call (RPC) abstraction wraps both the Client Protocol and the DataNode Protocol.

Client Operations


When a client retrieves file contents, it performs a data integrity check on the blocks by verifying their checksums. If the check fails, the client can opt to retrieve that block from another DataNode that holds a replica.
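The read-side check can be pictured with a minimal, self-contained sketch. It is not the HDFS implementation: replicas are modelled as plain byte arrays, the checksum is a single CRC32 over the whole block, and the names `ReplicaReadSketch`, `checksum`, and `readBlock` are hypothetical. It only shows the idea of rejecting a corrupt copy and falling back to the next replica.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical illustration; real HDFS checksum handling is more involved.
final class ReplicaReadSketch {
    // CRC32 over the whole block (an assumption made for brevity).
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    // Returns the first replica whose checksum matches the expected value.
    // Each list element stands in for the copy held by one DataNode.
    static byte[] readBlock(List<byte[]> replicas, long expectedChecksum) {
        for (byte[] replica : replicas) {
            if (checksum(replica) == expectedChecksum) {
                return replica;
            }
            // Checksum mismatch: skip this copy and try the next replica.
        }
        throw new IllegalStateException("no replica passed the integrity check");
    }

    public static void main(String[] args) {
        byte[] good = "block contents".getBytes(StandardCharsets.UTF_8);
        byte[] corrupt = good.clone();
        corrupt[0] ^= 0x01; // flip one bit to simulate corruption
        long expected = checksum(good);
        byte[] result = readBlock(List.of(corrupt, good), expected);
        System.out.println(new String(result, StandardCharsets.UTF_8));
    }
}
```

In this sketch the corrupt first copy is skipped and the second copy is returned; if every copy fails the check, the read fails.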


Lazy Persist writes: DataNodes flush in-memory data to disk asynchronously, which removes expensive disk I/O and checksum computations from the performance-sensitive I/O path. See Memory Storage Support in HDFS.
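The asynchronous flush can be sketched in a few lines of plain Java. This is a toy model under stated assumptions, not DataNode code: the class `LazyPersistSketch`, its `write` and `read` methods, and the one-file-per-block layout are all invented for the example. The write returns as soon as the data is in memory, and the disk write happens on a background thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical toy model of a lazily persisted block store.
final class LazyPersistSketch {
    private final Path dir;
    private final Map<String, byte[]> memory = new ConcurrentHashMap<>();

    LazyPersistSketch(Path dir) {
        this.dir = dir;
    }

    // Stores the block in memory, then schedules the disk write in the background.
    // The returned future completes once the block has reached disk.
    CompletableFuture<Path> write(String blockId, byte[] data) {
        memory.put(blockId, data);
        return CompletableFuture.supplyAsync(() -> {
            try {
                return Files.write(dir.resolve(blockId), data);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    // Reads are served from memory, whether or not the flush has finished.
    byte[] read(String blockId) {
        return memory.get(blockId);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("lazy-persist-sketch");
        LazyPersistSketch store = new LazyPersistSketch(dir);
        CompletableFuture<Path> flushed = store.write("blk_1", new byte[] {1, 2, 3});
        System.out.println("in memory: " + (store.read("blk_1") != null));
        System.out.println("on disk:   " + Files.exists(flushed.get()));
    }
}
```

The trade-off visible even in this toy version is durability: data that has been accepted into memory but not yet flushed is lost if the process stops first.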



Web UI

Command line


  • NFS gateway: HDFS can be mounted as part of the client’s local file system.

