Azure - Data Lake (ADLS)

About

Azure Data Lake Store (ADLS) offers more enterprise features than Azure Blob storage. Microsoft's documentation includes a comparison of the two.

Management

Command line

az dls fs ...

URI

adl://<Account Name>.azuredatalakestore.net/
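The URI form above can be built and taken apart with the standard library. The following is a minimal sketch; the helper names `build_adl_uri` and `split_adl_uri` and the account name `myadls` are hypothetical and only serve the illustration.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Host suffix taken from the URI form shown above.
ADLS_SUFFIX = ".azuredatalakestore.net"


def build_adl_uri(account_name: str, path: str = "/") -> str:
    """Return the adl:// URI for a path in the given account (hypothetical helper)."""
    return f"adl://{account_name}{ADLS_SUFFIX}/{path.lstrip('/')}"


def split_adl_uri(uri: str) -> Tuple[str, str]:
    """Return (account_name, path) for an adl:// URI (hypothetical helper)."""
    parts = urlsplit(uri)
    host = parts.hostname or ""  # hostname is None when the URI has no authority
    if parts.scheme != "adl" or not host.endswith(ADLS_SUFFIX):
        raise ValueError(f"not an ADLS URI: {uri}")
    return host[: -len(ADLS_SUFFIX)], parts.path or "/"


if __name__ == "__main__":
    uri = build_adl_uri("myadls", "/data/a.csv")
    print(uri)
    print(split_adl_uri(uri))
```

Note that `urlsplit` lower-cases the host name, so the account name comes back in lower case.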

API

<property>
      <name>fs.AbstractFileSystem.wasbs.impl</name>
      <value>org.apache.hadoop.fs.azure.Wasbs</value>
</property>
<property>
      <name>fs.adl.impl</name>
      <value>org.apache.hadoop.fs.adl.HdiAdlFileSystem</value>
</property>
<property>
      <name>fs.defaultFS</name>
      <value>adl://home</value>
      <final>true</final>
</property>
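To check which file system settings a cluster actually uses, the property list can be read programmatically. The sketch below assumes the properties are wrapped in a `<configuration>` root element, as in a Hadoop `core-site.xml`; the function name `read_properties` and the embedded sample are illustrative only.

```python
import xml.etree.ElementTree as ET

# Sample fragment: two of the properties above, wrapped in a <configuration> root.
SAMPLE = """
<configuration>
  <property>
    <name>fs.adl.impl</name>
    <value>org.apache.hadoop.fs.adl.HdiAdlFileSystem</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>adl://home</value>
    <final>true</final>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Map each property name to its value and its 'final' flag."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed entries without a <name>
        props[name.strip()] = {
            "value": (prop.findtext("value") or "").strip(),
            # <final> is optional; treat a missing element as false
            "final": (prop.findtext("final") or "false").strip().lower() == "true",
        }
    return props


if __name__ == "__main__":
    for name, entry in read_properties(SAMPLE).items():
        print(name, entry)
```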

Access

Mount

<property>
    <name>dfs.adls.home.hostname</name>
    <value>adlsName.azuredatalakestore.net</value>
</property>
    
<property>
  <name>dfs.adls.home.mountpoint</name>
  <value>/clusters/adlsName/</value>
</property>

<property>
      <name>dfs.adls.home.hostname</name>
      <value>storageAccount.azuredatalakestore.net</value>
</property>

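where adlsName (storageAccount in the last property) is the name of the Data Lake Store account, as in the URI above.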
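The mount properties suggest how a path under `adl://home` relates to the store. The sketch below works under the assumption that `adl://home/<path>` maps to `<mountpoint>/<path>` on the host given by `dfs.adls.home.hostname`; this mapping is inferred from the property names, and the function `resolve_home_path` is hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def resolve_home_path(uri, hostname, mountpoint):
    """Expand an adl://home URI into a full ADLS URI.

    Assumption: adl://home/<path> corresponds to <mountpoint>/<path>
    on the account named by dfs.adls.home.hostname.
    """
    parts = urlsplit(uri)
    if parts.scheme != "adl" or parts.netloc != "home":
        raise ValueError(f"not an adl://home URI: {uri}")
    # Join the mount point and the cluster-relative path without doubling slashes.
    full_path = posixpath.join(mountpoint, parts.path.lstrip("/"))
    return f"adl://{hostname}{full_path}"


if __name__ == "__main__":
    print(
        resolve_home_path(
            "adl://home/user/alice/data.csv",
            hostname="adlsName.azuredatalakestore.net",
            mountpoint="/clusters/adlsName/",
        )
    )
```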

Auth

End user

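Create an Active Directory application to obtain the values below: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-authenticate-using-active-directory#create-an-active-directory-application

  • Client ID
  • Client Secret
  • Token Endpoint

These values go into the Hadoop configuration:

<configuration>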
  <property>
        <name>dfs.adls.oauth2.access.token.provider.type</name>
        <value>ClientCredential</value>
  </property>
  
  <property>
      <name>dfs.adls.oauth2.refresh.url</name>
      <value>YOUR TOKEN ENDPOINT</value>
  </property>
  <property>
      <name>dfs.adls.oauth2.client.id</name>
      <value>YOUR CLIENT ID</value>
  </property>
  <property>
      <name>dfs.adls.oauth2.credential</name>
      <value>YOUR CLIENT SECRET</value>
  </property>
  <property>
      <name>fs.adl.impl</name>
      <value>org.apache.hadoop.fs.adl.AdlFileSystem</value>
  </property>
  <property>
      <name>fs.AbstractFileSystem.adl.impl</name>
      <value>org.apache.hadoop.fs.adl.Adl</value>
  </property>  
</configuration>

Service

https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-service-to-service-authenticate-using-active-directory
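Service-to-service authentication uses the OAuth2 client credentials grant with the client ID, client secret, and token endpoint. The sketch below only builds the HTTP request and does not send it. The resource identifier `https://datalake.azure.net/` is the one used for Data Lake Store tokens, and the function name `build_token_request` is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Resource (audience) requested for Data Lake Store access tokens.
ADLS_RESOURCE = "https://datalake.azure.net/"


def build_token_request(token_endpoint, client_id, client_secret):
    """Build (but do not send) a client credentials token request."""
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "resource": ADLS_RESOURCE,
        }
    ).encode("ascii")
    return Request(
        token_endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values, matching the placeholders in the configuration.
    req = build_token_request("YOUR TOKEN ENDPOINT", "YOUR CLIENT ID", "YOUR CLIENT SECRET")
    print(req.get_method(), req.data.decode("ascii"))
```

Sending the request with `urllib.request.urlopen(req)` should return a JSON document whose `access_token` field carries the bearer token. Keep the client secret out of source control.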

Language

Java

Example value of a service principal credentials file (JSON):

{
    "clientId": "ad735158-65ca-11e7-ba4d-ecb1d756380e",
    "clientSecret": "b70bb224-65ca-11e7-810c-ecb1d756380e",
    "subscriptionId": "bfc42d3a-65ca-11e7-95cf-ecb1d756380e",
    "tenantId": "c81da1d8-65ca-11e7-b1d1-ecb1d756380e",
    "activeDirectoryEndpointUrl": "https://login.microsoftonline.com",
    "resourceManagerEndpointUrl": "https://management.azure.com/",
    "activeDirectoryGraphResourceId": "https://graph.windows.net/",
    "sqlManagementEndpointUrl": "https://management.core.windows.net:8443/",
    "galleryEndpointUrl": "https://gallery.azure.com/",
    "managementEndpointUrl": "https://management.core.windows.net/"
}
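The fields of this JSON can be mapped onto the OAuth2 properties of the Hadoop configuration. The sketch below assumes the token endpoint follows the pattern `<activeDirectoryEndpointUrl>/<tenantId>/oauth2/token`; the function names and the file path passed to `load_auth_file` are hypothetical.

```python
import json


def load_auth_file(path):
    """Read a credentials file in the JSON layout shown above."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def token_endpoint(auth):
    """Derive the OAuth2 token endpoint (assumed pattern, see lead-in)."""
    base = auth["activeDirectoryEndpointUrl"].rstrip("/")
    return f"{base}/{auth['tenantId']}/oauth2/token"


def hadoop_oauth_properties(auth):
    """Map the credentials onto the dfs.adls.oauth2.* property names."""
    return {
        "dfs.adls.oauth2.access.token.provider.type": "ClientCredential",
        "dfs.adls.oauth2.refresh.url": token_endpoint(auth),
        "dfs.adls.oauth2.client.id": auth["clientId"],
        "dfs.adls.oauth2.credential": auth["clientSecret"],
    }


if __name__ == "__main__":
    # Inline sample using the example values; a real run would call load_auth_file().
    sample = {
        "clientId": "ad735158-65ca-11e7-ba4d-ecb1d756380e",
        "clientSecret": "b70bb224-65ca-11e7-810c-ecb1d756380e",
        "tenantId": "c81da1d8-65ca-11e7-b1d1-ecb1d756380e",
        "activeDirectoryEndpointUrl": "https://login.microsoftonline.com",
    }
    for name, value in hadoop_oauth_properties(sample).items():
        print(f"{name} = {value}")
```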

Documentation / Reference
