HDFS - Client Connection



A client establishes a connection to a configurable TCP port on the NameNode machine. It speaks the ClientProtocol with the NameNode.
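
As a rough illustration of the connection step, the sketch below splits a NameNode address into the host and TCP port a client would connect to. The helper name `namenode_endpoint`, the example host names, and the fallback port 8020 are assumptions made for illustration; the actual port is whatever the cluster configuration sets.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Assumed fallback port for illustration only; the real value comes from the cluster configuration.
ASSUMED_DEFAULT_RPC_PORT = 8020


def namenode_endpoint(uri: str) -> Tuple[str, int]:
    """Return the (host, port) pair a client would open a TCP connection to."""
    parts = urlsplit(uri)
    if parts.scheme != "hdfs" or parts.hostname is None:
        raise ValueError(f"not an HDFS URI with a host: {uri!r}")
    # Use the explicit port when the URI carries one, otherwise the assumed fallback.
    return parts.hostname, parts.port or ASSUMED_DEFAULT_RPC_PORT
```

For example, `namenode_endpoint("hdfs://namenode.example.com:9000/user/data")` yields the host `namenode.example.com` and the port `9000`.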

A Remote Procedure Call (RPC) abstraction wraps both the Client Protocol and the DataNode Protocol.

Client Operations


When a client retrieves file contents, it performs a data integrity check on the blocks. If the check fails, the client can opt to retrieve a replica of that block from another DataNode.
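
The read path described above can be sketched as a loop over replicas. This is a simplified model, not the HDFS client code: the in-memory `replicas` dictionary stands in for the DataNodes, `zlib.crc32` stands in for the checksum HDFS uses, and the names `read_block`, `dn1`, `dn2`, and `blk_1` are hypothetical.

```python
import zlib
from typing import Dict, List, Optional


def read_block(block_id: str,
               replicas: Dict[str, Dict[str, bytes]],
               expected_crc: int,
               datanodes: List[str]) -> Optional[bytes]:
    """Try each DataNode in turn and return the first replica whose checksum matches."""
    for node in datanodes:
        data = replicas.get(node, {}).get(block_id)
        if data is None:
            continue  # this DataNode holds no replica of the block
        if zlib.crc32(data) == expected_crc:
            return data  # integrity check passed
        # Checksum mismatch: treat this replica as corrupt and try the next DataNode.
    return None  # no valid replica was found


# Hypothetical data: dn1 holds a corrupted replica, dn2 holds a good one.
good = b"hello hdfs"
replicas = {"dn1": {"blk_1": b"hellX hdfs"}, "dn2": {"blk_1": good}}
block = read_block("blk_1", replicas, zlib.crc32(good), ["dn1", "dn2"])
```

Here the replica on `dn1` fails the check, so the function falls through to `dn2` and returns its copy.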


Lazy Persist writes: the DataNodes flush in-memory data to disk asynchronously, which removes expensive disk I/O and checksum computations from the performance-sensitive write path. See Memory Storage Support in HDFS.
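
The idea of acknowledging a write from memory and persisting it later can be modelled with a background thread. This is a toy sketch of the general pattern only, not how DataNodes implement Lazy Persist; the class `LazyPersistStore` and its methods are hypothetical.

```python
import os
import queue
import threading


class LazyPersistStore:
    """Toy model: writes are acknowledged from memory, then flushed to disk in the background."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        self.memory = {}                  # block name -> bytes held in RAM
        self._pending = queue.Queue()     # block names waiting to be flushed
        self._worker = threading.Thread(target=self._flush_loop, daemon=True)
        self._worker.start()

    def write(self, name: str, data: bytes) -> None:
        """Keep the data in memory and return at once; the disk write happens later."""
        self.memory[name] = data
        self._pending.put(name)

    def _flush_loop(self) -> None:
        while True:
            name = self._pending.get()
            if name is None:              # sentinel: stop the worker
                break
            with open(os.path.join(self.directory, name), "wb") as f:
                f.write(self.memory[name])

    def close(self) -> None:
        """Flush everything still pending and stop the background thread."""
        self._pending.put(None)
        self._worker.join()
```

The caller of `write` never waits for the disk; the trade-off is that data not yet flushed is lost if the process dies before `close` or the background flush completes.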



  • Web UI
  • Command line
  • NFS gateway: HDFS can be mounted as part of the client’s local file system.

