Hortonworks - Hortonworks Data Platform (HDP) 2.6.4 installation on Docker (Windows 7)

About

This page shows how to install the docker image of the Hortonworks Data Platform (HDP).

This page is based on this Hortonworks article.

Steps

Install and configure your docker VM host

The most important part is the creation and configuration of the docker host machine. You need:

  • at least 8 GB of memory.
  • a large disk. The HDP standalone image alone is 12.4 GB.

Example of host creation:

docker-machine create \
    --driver virtualbox \
    --virtualbox-memory 8192 \
    --virtualbox-disk-size "40960"  \
    default

For more, see: Docker - Installation
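
The two requirements above can be checked before the machine is created. The following is only a sketch: `create_cmd`, `MIN_MEMORY_MB` and `MIN_DISK_MB` are names made up for this page, and the 20 GB disk floor is my own assumption (the 12.4 GB image plus some headroom). The function only prints the command; it does not run it.

```shell
# Sketch: print a docker-machine create command after checking the
# minimums quoted above. All names here are illustrative.
MIN_MEMORY_MB=8192    # 8 GB, the minimal memory stated above
MIN_DISK_MB=20480     # assumption: 12.4 GB image plus headroom

create_cmd() (
    name=$1 memory_mb=$2 disk_mb=$3
    if [ "$memory_mb" -lt "$MIN_MEMORY_MB" ]; then
        echo "memory ${memory_mb} MB is below ${MIN_MEMORY_MB} MB" >&2
        exit 1
    fi
    if [ "$disk_mb" -lt "$MIN_DISK_MB" ]; then
        echo "disk ${disk_mb} MB is below ${MIN_DISK_MB} MB" >&2
        exit 1
    fi
    # Same flags as the example above, printed on one line
    echo "docker-machine create --driver virtualbox --virtualbox-memory $memory_mb --virtualbox-disk-size $disk_mb $name"
)
```

Calling `create_cmd default 8192 40960` prints the command of the example above; a memory value below 8192 makes the function return a non-zero status instead.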

Stop all heavy processes

The HDP container consumes a lot of memory. You need at least 16 GB on your laptop.

This is my memory usage with Outlook and Chrome open and HDP started. If other memory-hungry processes are running that you don't need, stop them.

The scales of the graphic below are not accurate: the committed memory is higher.

Hdp Memory Usage

Pull the image

Hortonworks provides a PowerShell script, but I prefer to pull the image before running it because the pull is a very long operation.

docker pull hortonworks/sandbox-hdp-standalone:2.6.4

Download the install script

Hdp Ps Docker Install

  • Verify the checksum of the downloaded zip
fciv start-sandbox-hdp-standalone_2-6-4.ps1.zip
//
// File Checksum Integrity Verifier version 2.05.
//
a5aed8818d76091bd503529a16eb0f36 start-sandbox-hdp-standalone_2-6-4.ps1.zip
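
If fciv is not installed, the same check can be sketched with `md5sum`, assuming a Unix-like shell such as Git Bash or Cygwin is available. `verify_md5` is a hypothetical helper name, not part of any tool.

```shell
# Sketch: compare a file's MD5 with an expected value.
# Assumes md5sum is on the PATH (Git Bash, Cygwin, Linux).
verify_md5() (
    file=$1 expected=$2
    # md5sum prints "<hash>  <file>"; keep only the hash
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK $file"
    else
        echo "MISMATCH $file: got $actual" >&2
        exit 1
    fi
)
```

Usage with the checksum shown above: `verify_md5 start-sandbox-hdp-standalone_2-6-4.ps1.zip a5aed8818d76091bd503529a16eb0f36`.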

  • Unzip it
unzip start-sandbox-hdp-standalone_2-6-4.ps1.zip

Modify the install script

If you are running a docker machine

This step is only needed if you use a docker machine, in which case the docker host is not reachable via localhost.

  • The script verifies that the docker daemon is running. Because the daemon runs inside the docker machine and not as a local process, this check fails. Remove the following lines from the script:
Write-Host "Checking docker daemon..."
If ((Get-Process | Select-String docker) -ne $null) {
    Write-Host "Docker is up and running"
}
Else {
    $Host.UI.WriteErrorLine("Please start Docker service. https://docs.docker.com/docker-for-windows/")
    return
}
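
Instead of editing by hand, the block can be filtered out with awk. This is a sketch under two assumptions: the block looks exactly as shown above, and the script is an ASCII or UTF-8 file. `strip_daemon_check` is a name made up for this page.

```shell
# Sketch: print a script without the "Checking docker daemon" block.
# Skips from the Write-Host line through the second line starting with "}"
# (the one that closes the Else branch).
strip_daemon_check() {
    awk '
        /Checking docker daemon/ { skip = 1 }
        skip && /^}/ { closes++; if (closes == 2) { skip = 0; next } }
        !skip { print }
    ' "$1"
}
```

For example, `strip_daemon_check start-sandbox-hdp-standalone_2-6-4.ps1 > patched.ps1` writes the filtered script to a new file (patched.ps1 is an arbitrary name) and leaves the original untouched.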

To enable Livy

The Livy port is not published by the script.

Add it to the list of ports:

-p 8999:8999 
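
To check whether a copy of the script already publishes a port, a small grep is enough. A sketch, with `has_port_mapping` as a hypothetical helper that only looks for the `-p PORT:PORT` form:

```shell
# Sketch: succeed if the script file ($1) already contains "-p PORT:PORT"
# for the given port ($2).
has_port_mapping() {
    # -e is needed because the pattern starts with a dash;
    # the trailing group avoids matching a longer port number
    grep -qE -e "-p $2:$2([^0-9]|\$)" "$1"
}
```

For example: `has_port_mapping start-sandbox-hdp-standalone_2-6-4.ps1 8999 || echo "Livy port missing"`.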

Execute the install script

  • Execute the script
powershell -ExecutionPolicy ByPass -File start-sandbox-hdp-standalone_2-6-4.ps1
Found HDP Sandbox image
Running HDP Sandbox for the first time...
2.6.4: Pulling from hortonworks/sandbox-hdp-standalone
Digest: sha256:d8591fdf9d082a0ba4aba2a0b1045b6599103b04e56ca9cfa241498f5166de00
Status: Image is up to date for hortonworks/sandbox-hdp-standalone:2.6.4
8d958bd6038f6fbad29924d8255a81f8bdbce80e08924f558da5113aac9fe198
Starting mysqld:                                           [  OK  ]
Starting postgresql service:                               [  OK  ]
Using python  /usr/bin/python
Starting ambari-server
Ambari Server running with administrator privileges.
Running initdb: This may take up to a minute.
About to start PostgreSQL
Organizing resource files at /var/lib/ambari-server/resources...
Ambari database consistency check started...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start........................................
Server started listening on 8080

DB configs consistency check: no errors and warnings were found.
Ambari Server 'start' completed successfully.
Verifying Python version compatibility...
Using python  /usr/bin/python
Checking for previously running Ambari Agent...
Checking ambari-common dir...
Starting ambari-agent
Verifying ambari-agent process status...
Ambari Agent successfully started
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log

  • If the docker machine is paused (see Support), start it again
docker-machine status
Paused

docker-machine start
  • If HDP started correctly, you should see its container running
docker ps
CONTAINER ID        IMAGE                                      COMMAND               CREATED             STATUS              PORTS                               NAMES
8d958bd6038f        hortonworks/sandbox-hdp-standalone:2.6.4   "/usr/sbin/sshd -D"   27 minutes ago      Up 22 minutes       (a bunch of ports)                  sandbox-hdp
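
The same check can be scripted rather than read from the table. A sketch, assuming the docker client is configured to talk to the docker machine; `container_running` and the `DOCKER` variable are names invented here (the variable only exists so that another client binary can be substituted).

```shell
# Sketch: succeed if a container with exactly this name is running.
container_running() {
    # --format '{{.Names}}' prints one container name per line
    "${DOCKER:-docker}" ps --format '{{.Names}}' | grep -qx -- "$1"
}
```

For example: `container_running sandbox-hdp && echo "sandbox-hdp is up"`.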

Host File

The HDP tutorial accesses the machine through the hostname sandbox-hdp.hortonworks.com.

Add the docker host IP to your hosts file (C:\Windows\System32\drivers\etc\hosts):

#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

# localhost name resolution is handled within DNS itself.
#	127.0.0.1       localhost
#	::1             localhost
192.168.99.100   sandbox-hdp.hortonworks.com
192.168.99.100   docker-host
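
The entries can also be appended from a Unix-like shell. This is a sketch only: `add_host_entry` is a hypothetical helper, it must run from a shell with administrator rights to write to the hosts file, and the IP is assumed to be the one reported by `docker-machine ip default`.

```shell
# Sketch: append "IP name" to a hosts file unless the name is already mapped.
add_host_entry() (
    hosts_file=$1 ip=$2 name=$3
    # Ignore comment lines, strip a trailing CR, then look for the name
    # in any field after the IP address
    if awk -v n="$name" '
            { sub(/\r$/, "") }
            !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file"; then
        echo "already present: $name"
    else
        printf '%s   %s\n' "$ip" "$name" >> "$hosts_file"
        echo "added: $name"
    fi
)
```

For example, from Git Bash (the /c/... path form is an assumption about that shell): `add_host_entry /c/Windows/System32/drivers/etc/hosts "$(docker-machine ip default)" sandbox-hdp.hortonworks.com`.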

Login

Changing the root password with Ssh

  • You are required to change the root password on the sandbox. Make a first connection with ssh as the user root, password hadoop, on port 2222 (port 22 is already taken by the docker host). I changed the password to welcome01.
ssh root@docker-host -p 2222
The authenticity of host '[docker-host]:2222 ([192.168.99.100]:2222)' can't be established.
RSA key fingerprint is SHA256:oCHVVt8XBDItJbjH0XExlhePO93VcXJQGHx5WdiMhLE.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[docker-host]:2222,[192.168.99.100]:2222' (RSA) to the list of known hosts.
root@docker-host's password:
You are required to change your password immediately (root enforced)
Changing password for root.
(current) UNIX password:
New password:
Retype new password:

WinSCP

  • Log in with WinSCP as root with the password welcome01

Hdp Sandbox Winscp

Shell web client

There is also a web shell client at: http://sandbox-hdp.hortonworks.com:4200/

Shell Web Hdp

Ambari

  • Reset the admin password to welcome01 with your favorite ssh client. Example with PuTTY:
ambari-admin-password-reset
Please set the password for admin:
Please retype the password for admin:

The admin password has been set.
Restarting ambari-server to make the password change effective...

Using python  /usr/bin/python
Restarting ambari-server
Waiting for server stop...
Ambari-server failed to stop gracefully. Sending SIGKILL to it
Ambari Server stopped
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Ambari database consistency check started...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start..............................................
Server started listening on 8080

DB configs consistency check: no errors and warnings were found.

Ambari On Localhost

  • Login with the user admin, password welcome01

Hdp Ambari Homepage

Stop it

docker stop sandbox-hdp

Start it again

  • Start
docker start sandbox-hdp
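
A running container does not mean that Ambari answers right away: the logs above show the server taking a while before it listens on 8080. A generic retry loop can wait for it. This is a sketch; `wait_until` is a made-up helper, and the usage line assumes curl is installed and the services have been started.

```shell
# Sketch: retry a command until it succeeds or the attempts run out.
# $1 = number of attempts, $2 = delay in seconds, rest = command to run
wait_until() (
    attempts=$1 delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            echo "ready after $i attempt(s)"
            exit 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "gave up after $attempts attempt(s)" >&2
    exit 1
)
```

For example, `wait_until 60 10 curl -sf http://sandbox-hdp.hortonworks.com:8080/` polls Ambari every 10 seconds for up to 60 attempts.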

Further

That was it.

We now have a functional environment and can learn further.

Support

A connection attempt failed

When starting the image, I got the following error:

read tcp 192.168.99.1:25533->192.168.99.100:2376: wsarecv: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

  • The machine was paused
docker-machine status
Paused

  • Start it again
docker-machine start
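
The two steps above can be combined so that the machine is started only when it is not already running. A sketch: `ensure_machine_running` and the `DOCKER_MACHINE` variable are names invented here, and the machine name `default` comes from the creation example at the top of the page.

```shell
# Sketch: start the docker machine only if its status is not "Running".
ensure_machine_running() (
    machine=${1:-default}
    dm=${DOCKER_MACHINE:-docker-machine}   # lets another binary be substituted
    status=$("$dm" status "$machine")
    if [ "$status" = "Running" ]; then
        echo "$machine is running"
    else
        echo "$machine is $status, starting it"
        "$dm" start "$machine"
    fi
)
```

Run `ensure_machine_running default` before `docker start sandbox-hdp` to avoid the connection error above.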

Note

The run command

docker run --name sandbox-hdp --hostname "sandbox-hdp.hortonworks.com" --privileged -d `
    -p 15500:15500 `
    -p 15501:15501 `
    -p 15502:15502 `
    -p 15503:15503 `
    -p 15504:15504 `
    -p 15505:15505 `
    -p 1111:111 `
    -p 4242:4242 `
    -p 50079:50079 `
    -p 6080:6080 `
    -p 16000:16000 `
    -p 16020:16020 `
    -p 10502:10502 `
    -p 33553:33553 `
    -p 39419:39419 `
    -p 15002:15002 `
    -p 18080:18080 `
    -p 10015:10015 `
    -p 10016:10016 `
    -p 2049:2049 `
    -p 9090:9090 `
    -p 3000:3000 `
    -p 9000:9000 `
    -p 8000:8000 `
    -p 8020:8020 `
    -p 2181:2181 `
    -p 42111:42111 `
    -p 10500:10500 `
    -p 16030:16030 `
    -p 8042:8042 `
    -p 8040:8040 `
    -p 2100:2100 `
    -p 4200:4200 `
    -p 4040:4040 `
    -p 8032:8032 `
    -p 9996:9996 `
    -p 9995:9995 `
    -p 8080:8080 `
    -p 8088:8088 `
    -p 8886:8886 `
    -p 8889:8889 `
    -p 8443:8443 `
    -p 8744:8744 `
    -p 8888:8888 `
    -p 8188:8188 `
    -p 8983:8983 `
    -p 8999:8999 `
    -p 1000:1000 `
    -p 1100:1100 `
    -p 11000:11000 `
    -p 10001:10001 `
    -p 15000:15000 `
    -p 10000:10000 `
    -p 8993:8993 `
    -p 1988:1988 `
    -p 5007:5007 `
    -p 50070:50070 `
    -p 19888:19888 `
    -p 16010:16010 `
    -p 50111:50111 `
    -p 50075:50075 `
    -p 50095:50095 `
    -p 18081:18081 `
    -p 60000:60000 `
    -p 8090:8090 `
    -p 8091:8091 `
    -p 8005:8005 `
    -p 8086:8086 `
    -p 8082:8082 `
    -p 60080:60080 `
    -p 8765:8765 `
    -p 5011:5011 `
    -p 6001:6001 `
    -p 6003:6003 `
    -p 6008:6008 `
    -p 1220:1220 `
    -p 21000:21000 `
    -p 6188:6188 `
    -p 2222:22 `
    hortonworks/sandbox-hdp-standalone:2.6.4 /usr/sbin/sshd -D
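
With that many ports, the `-p` flags are easier to generate than to type. The following is a shell sketch (the script itself is PowerShell), with `port_flags` as a hypothetical helper: a bare port is published on the same host port, and a `host:container` pair is passed through unchanged.

```shell
# Sketch: turn a list of ports into docker -p flags.
# "8080" becomes "-p 8080:8080"; "2222:22" stays "-p 2222:22".
port_flags() (
    flags=""
    for port in "$@"; do
        case $port in
            *:*) mapping=$port ;;
            *)   mapping=$port:$port ;;
        esac
        flags="$flags -p $mapping"
    done
    # Drop the leading space left by the first concatenation
    echo "${flags# }"
)
```

For example, `port_flags 8080 8999 2222:22` prints the flags for Ambari, Livy and ssh as mapped in the run command above.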

The init script

docker exec -t sandbox-hdp /root/start-sandbox-hdp.sh




