A cluster of computers.
Each cluster has:
It is referred to as the default storage account.
An HDInsight cluster and its default storage account must be located in the same Azure region.
Only the following cluster types support the Enterprise Security Package:
By default, the cluster comes with:
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-port-settings-for-services
Azure HDInsight using an Azure Virtual Network
Azure provides name resolution for Azure services that are installed in a virtual network.
The cluster nodes can communicate directly with each other, and other nodes in HDInsight, by using internal DNS names. Example of internal DNS names assigned to HDInsight worker nodes:
Using Azure Data Factory, you can create HDInsight clusters on demand, and configure a TimeToLive setting to delete the clusters automatically.
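As a rough illustration of the TimeToLive setting, the sketch below builds the JSON body of an on-demand HDInsight linked service as a Python dictionary. The property names (`HDInsightOnDemand`, `clusterSize`, `timeToLive`, `linkedServiceName`) are assumed from the Data Factory linked service schema and should be checked against the current documentation. The function name, the default values and the linked service names are hypothetical. Other required properties (subscription, tenant, resource group, credentials) are deliberately left out.

```python
import json


def on_demand_linked_service(name, storage_linked_service,
                             ttl="00:15:00", cluster_size=4):
    """Build a partial on-demand HDInsight linked service definition.

    ttl is the idle time (hh:mm:ss) after which Data Factory deletes
    the cluster. All default values here are illustrative assumptions.
    """
    return {
        "name": name,
        "properties": {
            "type": "HDInsightOnDemand",
            "typeProperties": {
                "clusterSize": cluster_size,
                "timeToLive": ttl,
                # Reference to the storage linked service used by the cluster
                "linkedServiceName": {
                    "referenceName": storage_linked_service,
                    "type": "LinkedServiceReference",
                },
            },
        },
    }


definition = on_demand_linked_service("OnDemandHDInsight", "DefaultStorage")
print(json.dumps(definition, indent=2))
```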
Note: Cluster creation (Provisioning)
When you create a metastore for Hive or Oozie, don't use dashes, hyphens, or spaces in the database name. This can cause the cluster creation process to fail.
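The naming rule above can be checked before provisioning. The following is a minimal sketch: the function name is hypothetical, and it rejects only what the text names (dashes, hyphens and whitespace), not every character Azure SQL might refuse.

```python
import re

# Whitespace, ASCII hyphen-minus, and the Unicode hyphen/dash range
# U+2010..U+2015 (hyphen, non-breaking hyphen, figure dash, en dash,
# em dash, horizontal bar).
_FORBIDDEN = re.compile(r"[\s\-\u2010-\u2015]")


def is_safe_metastore_db_name(name: str) -> bool:
    """Return True if name is non-empty and has no dash, hyphen or space."""
    return bool(name) and _FORBIDDEN.search(name) is None


for candidate in ("hivemetastore01", "hive-metastore", "hive metastore"):
    print(candidate, is_safe_metastore_db_name(candidate))
```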
User creation example:
-- Create a dedicated database user and a schema owned by that user
CREATE USER hi_hive WITH PASSWORD = 'the pwd';
CREATE SCHEMA hi_hive AUTHORIZATION hi_hive;
-- Allow the user to connect and to create tables and views
GRANT CONNECT TO hi_hive;
GRANT CREATE TABLE TO hi_hive;
GRANT CREATE VIEW TO hi_hive;
ALTER USER hi_hive WITH DEFAULT_SCHEMA = hi_hive;
-- https://social.technet.microsoft.com/wiki/contents/articles/7662.use-sql-azure-database-as-a-hive-metastore.aspx
-- Add the user to the DDL admin, data writer and data reader roles
EXEC sp_addrolemember 'db_ddladmin', 'hi_hive';
EXEC sp_addrolemember 'db_datawriter', 'hi_hive';
EXEC sp_addrolemember 'db_datareader', 'hi_hive';
Data is stored in Azure Storage, so a cluster can be safely deleted without losing it.
Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they are not in use.
With the Azure CLI 1.0, from the documentation:
azure hdinsight cluster delete clusterName
Azure - Template (Resource): https://github.com/Azure/azure-quickstart-templates/tree/master/101-hdinsight-linux-ssh-password
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-component-versioning
Deployment failed. Correlation ID: 6d6465b6-8727-409a-aaa5-4f754112ee1c. {
  "status": "Failed",
  "error": {
    "code": "ResourceDeploymentFailure",
    "message": "The resource operation completed with terminal provisioning state 'Failed'.",
    "details": [
      {
        "code": "HiveMetastoreSchemaInitializationFailedErrorCode",
        "message": "Failed to start Hive Metastore due to metastore schema initialization error. If you are using a custom Hive metastore, please run 'Hive Schema Tool' against your metastore to check for possible issues with metastore configuration."
      }
    ]
  }
}
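When several deployments fail, it can help to pull only the error codes out of such a message. The sketch below assumes the JSON part of the message (everything from the first `{`) has already been separated from the "Deployment failed. Correlation ID" prefix. The function name is hypothetical, and the sample is a shortened copy of the error above.

```python
import json
from typing import List


def failure_codes(error_body: str) -> List[str]:
    """Return the top-level error code followed by the codes in 'details'."""
    error = json.loads(error_body).get("error", {})
    codes = [error["code"]] if "code" in error else []
    codes += [d["code"] for d in error.get("details", []) if "code" in d]
    return codes


sample = """
{
  "status": "Failed",
  "error": {
    "code": "ResourceDeploymentFailure",
    "details": [
      {"code": "HiveMetastoreSchemaInitializationFailedErrorCode"}
    ]
  }
}
"""
print(failure_codes(sample))
# ['ResourceDeploymentFailure', 'HiveMetastoreSchemaInitializationFailedErrorCode']
```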
Verify that your SQL Azure server firewall allows inbound connections from the subnet of your cluster.