Azure HDInsight is a cluster distribution of the Hadoop components from the Hortonworks Data Platform (HDP).
It bundles open-source frameworks, including:
- Hadoop,
- Spark,
- Hive,
- Kafka,
- Storm,
- R,
- and more.
Post-install script
Additional components can be installed at the end of cluster creation with a script action.
Example: installing Hue
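A script action is a Bash script that HDInsight runs on the cluster nodes. The following is a minimal sketch of such a script, not the actual Hue installer: it only acts when it runs on a head node. The `is_headnode` helper and the `hn` host name prefix are assumptions for illustration, based on head node names such as `hn0-...`.

```shell
#!/usr/bin/env bash
# Hypothetical script-action skeleton: run the install step on head nodes only.

# Returns 0 when the given host name looks like a head node.
# Assumption: head node host names start with "hn" (for example hn0-...).
is_headnode() {
  case "$1" in
    hn*) return 0 ;;
    *)   return 1 ;;
  esac
}

if is_headnode "$(uname -n)"; then
  echo "head node: install the component here"
else
  echo "not a head node: nothing to do"
fi
```

The real installation commands would replace the first `echo`.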
- On a headnode (mycluster is the name of the cluster)
# List the root
hdfs dfs -D "fs.default.name=hdfs://mycluster/" -ls /
# Report
hdfs dfsadmin -D "fs.default.name=hdfs://mycluster/" -report
- Check the integrity of HDFS on the HDInsight cluster with the following command:
hdfs fsck -D "fs.default.name=hdfs://mycluster/" /
Connecting to namenode via http://hn0-ha.ax.internal.cloudapp.net:30070/fsck?ugi=hdsshadm&path=%2F
FSCK started by hdsshadm (auth:SIMPLE) from / for path / at Thu Jan 17 16:04:30 UTC 2019
................................Status: HEALTHY
Total size: 1200 B
Total dirs: 139
Total files: 32
Total symlinks: 0 (Files currently being written: 15)
Total blocks (validated): 15 (avg. block size 80 B)
Minimally replicated blocks: 15 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 5
Number of racks: 1
FSCK ended at Thu Jan 17 16:04:30 UTC 2019 in 12 milliseconds
The filesystem under path '/' is HEALTHY
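For scripted checks, the last line of the fsck output can be tested instead of reading the whole report. A minimal sketch, assuming the summary line ends with `is HEALTHY` as in the output above; the `fsck_is_healthy` helper name is hypothetical:

```shell
# Reads fsck output on stdin; succeeds only if the summary reports HEALTHY.
fsck_is_healthy() {
  grep -q "is HEALTHY"
}

# Usage on a headnode (needs the hdfs client, so it is left commented out):
#   hdfs fsck -D "fs.default.name=hdfs://mycluster/" / | fsck_is_healthy \
#     && echo "HDFS is healthy" || echo "HDFS needs attention"
```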
- Leave Safe Mode
hdfs dfsadmin -D "fs.default.name=hdfs://mycluster/" -safemode leave
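Before forcing the NameNode out of safe mode, it can help to check whether safe mode is on at all. A minimal sketch, assuming the output of `hdfs dfsadmin -safemode get` contains `Safe mode is ON` when it is active; the `safemode_is_on` helper name is hypothetical:

```shell
# Reads the output of `hdfs dfsadmin -safemode get` on stdin;
# succeeds when it reports that safe mode is ON.
safemode_is_on() {
  grep -q "Safe mode is ON"
}

# Usage on a headnode (needs the hdfs client, so it is left commented out):
#   if hdfs dfsadmin -D "fs.default.name=hdfs://mycluster/" -safemode get | safemode_is_on; then
#     hdfs dfsadmin -D "fs.default.name=hdfs://mycluster/" -safemode leave
#   fi
```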
SSH connection
- To headnode
ssh -i ~/.ssh/myPrivatekey -p 22 [email protected] # Primary HeadNode
ssh -i ~/.ssh/myPrivatekey -p 23 [email protected] # Secondary HeadNode
- To edge
ssh [email protected]
- To a worker node (from a head or edge node). If the SSH account is secured with SSH keys, make sure that SSH agent forwarding is enabled on the client.
ssh sshuser@wn0-myhdi
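Agent forwarding lets the head node reuse the key loaded on the client, so the private key never has to be copied to the cluster. A minimal sketch: the `hdi_ssh_endpoint` helper is hypothetical, and the `<clustername>-ssh.azurehdinsight.net` naming pattern, the cluster name `myhdi`, and the user `sshuser` are assumptions for illustration.

```shell
# Builds the public SSH endpoint for a cluster name.
# Assumption: endpoints follow the <clustername>-ssh.azurehdinsight.net pattern.
hdi_ssh_endpoint() {
  printf '%s-ssh.azurehdinsight.net\n' "$1"
}

# Usage (left commented out because it needs a real cluster and key):
#   eval "$(ssh-agent -s)"                               # start an agent on the client
#   ssh-add ~/.ssh/myPrivatekey                          # load the private key into it
#   ssh -A -p 22 "sshuser@$(hdi_ssh_endpoint myhdi)"     # -A enables agent forwarding
#   ssh sshuser@wn0-myhdi                                # then, from the head node
```

Setting `ForwardAgent yes` for the host in `~/.ssh/config` has the same effect as passing `-A` on each call.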