Analyze IoT Weather Station Data via Connected Data Architecture


You will build an Internet of Things (IoT) Weather Station using Connected Data Architecture, which incorporates the open source frameworks MiNiFi, Hortonworks DataFlow (HDF) and Hortonworks Data Platform (HDP). In addition, you will work with the Raspberry Pi (R-Pi) and Sense HAT. You will use MiNiFi to route the weather data from the Raspberry Pi to HDF via the Site-to-Site protocol, then you will connect the NiFi service running on HDF to HBase running on HDP. From within HDP, you will learn to visually monitor weather data in HBase using Zeppelin’s Phoenix Interpreter.

Goals And Objectives

By the end of this tutorial series, you will acquire the fundamental knowledge to build IoT-related applications of your own. You will be able to connect MiNiFi, HDF Sandbox and HDP Sandbox. You will learn to transport data across remote systems and visualize data to bring meaningful insight to your customers. You will need a background in the fundamental concepts of programming (any language is adequate) to enrich your experience in this tutorial.

The learning objectives of this tutorial series include:

  • Install an Operating System (Linux) on the R-Pi (Tutorial 1)
  • Set up the HDF and HDP Sandboxes on your local machine (Tutorial 1)
  • Understand the R-Pi’s Place in the IoT Spectrum (Tutorial 1)
  • Understand Barometric Pressure/Temperature/Altitude Sensor’s Functionality (Tutorial 1)
  • Configure R-Pi to communicate with Sensor via I2C (Tutorial 2)
  • Implement a Python Script to Show Sensor Readings (Tutorial 2)
  • Create HBase Table to hold Sensor Readings (Tutorial 3)
  • Build a MiNiFi flow in NiFi using ExecuteProcess Processor and Remote Process Group to Ingest Raw Sensor Data from Python and Transport it to a Remote NiFi (Tutorial 3)
  • Build a Remote NiFi flow that geographically enriches the sensor dataset, converts the data format to JSON and stores the data into HBase (Tutorial 3)
  • Perform Aggregate Functions for Temperature, Pressure & Altitude with Phoenix (Tutorial 5)
  • Visualize the Analyzed Data with Apache Zeppelin (Tutorial 5)


Bill of Materials:

Hardware Requirements

  • At least 12 GB of RAM to run both HDF and HDP Sandboxes on one laptop

Tutorial Series Overview

In this tutorial, we work with barometric pressure, temperature and humidity sensor data gathered from an R-Pi using Apache MiNiFi. We transport the MiNiFi data to NiFi using Site-To-Site, then we upload the data with NiFi into HBase to perform data analytics.

This tutorial series consists of five tutorials, split into two tracks:

If you have an R-Pi and Sense HAT, follow Track 1: tutorials 1 – 4. If you don’t, start at Track 2: tutorial 5. Track 2 is intended for users who don’t have access to an R-Pi and Sense HAT.

Track 1: Tutorial Series with R-Pi and Sense HAT

Tutorial 1 – Set up the IoT Weather Station for processing the sensor data. You will install the Raspbian OS and MiNiFi on the R-Pi, and the HDF and HDP Sandboxes on your local machine.

Tutorial 2 – Program the R-Pi to retrieve the sensor data from the Sense HAT. Embed a MiNiFi agent onto the R-Pi to collect sensor data and transport it to NiFi on HDF via Site-to-Site. Store the raw sensor readings into HDFS on HDP using NiFi.

Tutorial 3 – Enhance the NiFi flow by adding geographic location attributes to the sensor data and converting it to JSON format for easy storage in HBase.

Tutorial 4 – Monitor the weather data with Phoenix and create visualizations of those readings using Zeppelin’s Phoenix Interpreter.

Track 2: Tutorial Series with Simulated Data

Tutorial 5 – Import a new NiFi workflow. This template runs the sensor data simulator, then performs the same processing operations against the data as Tutorials 3 and 4.
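Since Track 2 relies on simulated readings, it helps to picture what a simulator produces. The sketch below is a hypothetical Python stand-in for the template's simulator; the field names are illustrative assumptions, not the template's exact schema.

```python
import json
import random
import time

def simulated_reading():
    """Return one fake Sense HAT-style reading (illustrative fields only)."""
    return {
        "timestamp": int(time.time() * 1000),
        "temperature_c": round(random.uniform(15.0, 35.0), 2),
        "pressure_mbar": round(random.uniform(980.0, 1040.0), 2),
        "humidity_pct": round(random.uniform(20.0, 80.0), 2),
    }

if __name__ == "__main__":
    # Emit a few JSON lines, roughly as a NiFi flow might ingest them
    for _ in range(3):
        print(json.dumps(simulated_reading()))
```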

The tutorial series is broken into multiple tutorials that provide step-by-step instructions, so that you can complete the learning objectives and tasks associated with each one. You are also provided with a dataflow template for each tutorial that you can use for verification. Each tutorial builds on the previous one.

IoT and Connected Data Architecture Concepts


Anytime at least two platforms, such as HDF and HDP, are connected, that environment is called a Connected Data Architecture. For this tutorial, we are building a Connected Data Architecture that incorporates MiNiFi running on the Raspberry Pi, along with the HDF and HDP Docker sandboxes, both located within the Docker Engine. The purpose of the concepts section is to dive into each tool used within this tutorial.

Apache NiFi is a dataflow engine that makes it possible to ingest data from any data source and transport it to any destination, so that developers can focus on data processing implementation. We will use Apache MiNiFi to ingest the sensor readings from the R-Pi and transport that data to NiFi to send to HBase. HBase is a NoSQL database, and we will use it because it is efficient at storing unstructured data. We will use Apache Phoenix to map to the HBase table and perform SQL queries against it. Apache Zeppelin will be the business reporting tool used to visualize the queries we perform against the HBase table.
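To make the enrichment and JSON-conversion steps concrete, here is a small hypothetical Python sketch of the kind of transformation the NiFi flow performs; the raw-reading format and field names are assumptions for illustration, not the tutorial's exact flow output.

```python
import json

def to_json_record(raw_line, latitude, longitude):
    """Convert a raw 'pressure,temperature,altitude' line into a JSON
    record enriched with a geographic location."""
    pressure, temperature, altitude = (float(v) for v in raw_line.split(","))
    return json.dumps({
        "pressure": pressure,
        "temperature": temperature,
        "altitude": altitude,
        "latitude": latitude,
        "longitude": longitude,
    })

# One raw reading from the sensor script, enriched with a sample location
print(to_json_record("1013.25,22.5,30.0", 37.77, -122.42))
```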

In the concepts tutorial, the goal is to give you information on the background of each hardware/software tool used to build this Connected Data Architecture System for use cases, such as the IoT Weather Station.


  • Introduction


  • What is a Raspberry Pi?
  • Internet of Things on R-Pi
  • Sense HAT Sensor Functionality
  • Docker
  • HDF Sandbox Docker Container
  • Apache NiFi
  • Apache MiNiFi
  • HDP Sandbox Docker Container
  • Apache HBase
  • Apache Phoenix
  • Apache Zeppelin

What is a Raspberry Pi?

The Raspberry Pi (R-Pi) 3 is a single-board computer with an open-source platform, used by everyone from beginners learning to code to practitioners building Internet of Things (IoT) applications. This embedded device has a 1.2 GHz ARMv8 CPU, 1 GB of memory, and integrated Wi-Fi and Bluetooth. The R-Pi comes with various General Purpose Input Output (GPIO) pins, input/output ports for connecting the device to external peripherals, such as sensors, keyboards and mice. As can be seen in Figure 1, the R-Pi is connected to the internet via its Ethernet port, to a monitor via its HDMI port, and to a keyboard and mouse via its USB ports, and is powered by a 5V power supply. This device can run various operating systems, such as Linux. Additionally, it can run an instance of Apache NiFi, and on embedded devices with a limited amount of memory, we can run Apache MiNiFi.


Internet of Things on R-Pi

The R-Pi is not just a platform for building IoT projects; it is also a great way to gain practical experience with IoT. According to IBM Watson Internet of Things, the R-Pi IoT platform can be used in factories, the environment, sports, vehicles, buildings, the home and retail. What all these use cases have in common is that data is processed, which can result in augmented productivity in factories, enhanced environmental stewardship initiatives, winning strategies in sports, an enhanced driving experience, better decision making, improved resident safety and security, and a customized and improved shopping experience in retail.

Sense HAT Functionality


What exactly does the Sense HAT Sensor Measure?

The Sense HAT enables users to measure orientation via a 3D accelerometer, 3D gyroscope and 3D magnetometer combined into one chip, the LSM9DS1. The Sense HAT also measures air pressure and temperature via a barometric pressure and temperature sensor combined into the LPS25H chip. The HAT can monitor the percentage of humidity in the air, in correlation with temperature, via the HTS221 humidity and temperature sensor. All three of these sensors communicate over I2C.

How does the Sense HAT Sensor Transfer Data to the R-Pi?

The Sense HAT uses I2C, a communication protocol, to transfer data to the R-Pi and other devices. I2C requires two shared lines: a serial clock signal (SCL) and a bidirectional data line (SDA). Every I2C device uses a 7-bit address, which allows more than 120 devices to share the bus; the controller can communicate with each of them, one at a time, on an as-needed basis.
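As a concrete illustration of 7-bit addressing: the byte actually sent on the bus is the address shifted left one bit, with the read/write flag in the least significant bit. The sketch below is illustrative; 0x5C is a commonly documented address for the LPS25H, stated here as an assumption.

```python
def i2c_address_bytes(addr7):
    """Given a 7-bit I2C address, return the (write, read) bytes sent on
    the bus: the address shifted left one bit, with the LSB as the R/W flag."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("I2C addresses are 7 bits")
    return (addr7 << 1) | 0, (addr7 << 1) | 1  # write byte, read byte

# Example: 0x5C (assumed LPS25H address) -> write 0xB8, read 0xB9
print([hex(b) for b in i2c_address_bytes(0x5C)])
```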

What is an Advantage of I2C Sensors?

I2C makes it possible to connect multiple devices to the R-Pi, each having a unique address that can be set by updating the settings on the Pi. It is also easier to verify everything is working, because one can see all the devices connected to the Pi.


Docker is an open source platform for developers and sysadmins to develop, ship, and run applications. Docker provides faster delivery of applications and allows users to deploy and scale easily, which results in easier maintenance. This platform includes the Docker Engine, a lightweight and powerful open source containerization technology that incorporates a workflow for building and containerizing your applications. Docker containerization is built on Docker images and containers. Docker images contain applications or services that can easily be deployed into testing, staging and production environments as containers. For instance, from one Docker image, you can deploy multiple containers of that image. Containers are a way to package and run an application, such as a framework (Hadoop, Spark, NiFi), in an isolated environment. Containers differ from virtual machines because they do not need the extra layer of a hypervisor; instead they run directly on the host machine’s kernel. Users are able to share their Docker images and containers through Docker Hub.


In the Docker architecture above, a Docker registry is a service used for storing Docker images, such as Docker Hub. The Docker host is the computer Docker runs on. Diving deeper into the host, you can see the Docker daemon, which is used to create and manage Docker objects, such as images, containers, networks and volumes. The user or client interacts with the Docker daemon via the Docker CLI, which accepts scripts or direct commands entered by the user. The Docker daemon is a long-running program, also known as a server. The CLI utilizes Docker’s REST API to interact with the Docker daemon. As you can observe, the Docker Engine is a client-server application comprised of the Docker CLI, the REST API and the Docker daemon.

You will use Docker as the backbone to deploy the Connected Data Architecture: IoT Devices, HDF Sandbox Container and HDP Sandbox Container.


In the Connected Data Architecture, the Sense HAT and Raspberry Pi are located in the Internet of Anything (an alias for the Internet of Things), while the HDF Sandbox and HDP Sandbox each run in their own container, connected in a simulated network by Docker’s default network feature known as bridge.

HDF Sandbox Docker Container

HDF Sandbox comes in different flavors: VirtualBox, VMware and Docker. You will be using the HDF Sandbox Docker container to build this IoT Weather Station via Connected Data Architecture. The Hortonworks DataFlow (HDF) Sandbox is a way to deploy HDF into an isolated environment for testing, staging and sometimes production. HDF is an application stack framework used to process data-in-motion. HDF comes with the frameworks Zookeeper, Storm, Ambari Infra, Ambari Metrics, Kafka, Log Search, Ranger and NiFi.

Apache NiFi

NiFi is a robust and secure framework for ingesting data from various sources, performing simple transformations on that data and transporting it across a multitude of systems. The NiFi UI provides the flexibility for teams to simultaneously change flows on the same machine. NiFi uses SAN or RAID storage for the data it ingests and the provenance events it generates. Provenance is a record of events in the NiFi UI that shows how data manipulation occurs while data flows through the components, known as processors, in the NiFi flow. NiFi, a program approximately 800 MB in size, has over 190 processors for custom integration with various systems and operations.

Apache MiNiFi

MiNiFi was built to live on the edge, ingesting data at the location where it is born and then transporting it to your data center where NiFi lives. MiNiFi comes in two flavors: a Java agent and a C++ agent. These agents access data from IoT- and desktop-level devices. The idea behind MiNiFi is to get as close to the data as possible from any particular location, no matter how small the footprint on a particular embedded device.

Visualization of MiNiFi and NiFi’s Place in IoT


HDP Sandbox Docker Container

HDP Sandbox comes in different flavors: VirtualBox, VMware and Docker. You will be using the HDP Sandbox Docker container to build this IoT Weather Station via Connected Data Architecture. The Hortonworks Data Platform (HDP) Sandbox is a way to deploy HDP into an isolated environment for testing, staging and sometimes production. HDP is an application stack framework used to process data-at-rest. HDP comes with various frameworks: HDFS, YARN + MapReduce2, HBase, Phoenix, Zeppelin, etc.

Apache HBase

Apache HBase is a NoSQL database programmed in Java, but unlike some other NoSQL databases, it provides strong data consistency on reads and writes. HBase is a column-oriented key/value data store implemented to run on top of the Hadoop Distributed File System (HDFS). HBase scales out horizontally in distributed compute clusters and supports rapid table-update rates. HBase focuses on scale, which enables it to handle very large database tables; common scenarios include HBase tables that hold billions of rows and millions of columns. For example, Facebook utilizes HBase as a structured data handler for its messaging infrastructure.
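HBase's column-oriented key/value model can be pictured as a sorted map from row key to family:qualifier to value. The toy Python sketch below (table and column names are made up for illustration) shows why a timestamp-style row key keeps sensor readings in time order:

```python
# Toy model of HBase's logical layout: {row_key: {"family:qualifier": value}}
weather_table = {}

def put(row_key, family, qualifier, value):
    """Store a cell value, keyed by row key and family:qualifier."""
    weather_table.setdefault(row_key, {})[f"{family}:{qualifier}"] = value

# Row keys sort lexicographically, so timestamp-style keys scan in time order
put("2017-06-01T12:00:00", "weather", "temperature", "22.5")
put("2017-06-01T12:00:00", "weather", "pressure", "1013.25")
put("2017-06-01T11:00:00", "weather", "temperature", "21.8")

print(sorted(weather_table))  # rows come back in time order
```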

A critical part of the HBase architecture is its use of master nodes to manage region servers, which distribute and process parts of the data tables. HBase is part of the Hadoop ecosystem, along with other services such as Zookeeper, Phoenix and Zeppelin.

Apache Phoenix

Apache Phoenix provides the flexibility of late-bound, schema-on-read capabilities from NoSQL technology by leveraging HBase as its backing store. Phoenix has the power of standard SQL and JDBC APIs with full ACID transaction capabilities. Phoenix also enables online transaction processing (OLTP) and operational analytics in Hadoop specifically for low latency applications. Phoenix comes fully integrated to work with other products in the Hadoop ecosystem, such as Spark and Hive.
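The aggregate queries you will later run through Phoenix (for example `SELECT MIN(temperature), MAX(temperature), AVG(temperature) FROM ...`) behave like the plain-Python equivalents below; the sample values are made up for illustration:

```python
# Sample temperature readings standing in for rows in the Phoenix-mapped table
readings = [21.8, 22.5, 23.1, 22.0]

# Equivalents of MIN(temperature), MAX(temperature), AVG(temperature)
minimum = min(readings)
maximum = max(readings)
average = sum(readings) / len(readings)

print(minimum, maximum, round(average, 2))
```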

Apache Zeppelin

Apache Zeppelin is a data science notebook that allows users to use interpreters to visualize their data with line, bar, pie and scatter charts, among other visualizations. One particular interpreter we will utilize is Phoenix, so that we can visualize our weather data.


Congratulations, you’ve finished the concepts tutorial! Now you are familiar with the technologies you will be utilizing in the tutorial series and have a better understanding of each tool’s purpose in deploying this IoT Weather Station by way of Connected Data Architecture.

Further Reading

Deploy IoT Weather Station


You’ll make an IoT Weather Station with a Raspberry Pi and Sense HAT. Additionally, you’ll add on data analytics to this IoT Weather Station Platform with Connected Data Architecture communication between the MiNiFi, HDF Sandbox and HDP Sandbox.


  • Downloaded and Installed Docker Engine on Local Machine
    • Set Docker Memory to 12GB to run both HDF and HDP Sandboxes on one laptop.
      • The link above will take you to Docker preferences for Mac; in the Docker documentation, choose your OS.
  • Downloaded Latest HDF and HDP Sandboxes for Docker Engine
  • Installed Latest HDF and HDP Sandboxes on Local Machine
  • Downloaded and Installed Latest Raspbian OS onto Raspberry Pi
    • If you need help installing Raspbian OS onto the Raspberry Pi, refer to Appendix A.
  • Downloaded the latest MiNiFi Toolkit onto your local machine
  • Read Analyze IoT Weather Station Data via Connected Data Architecture Intro


Steps for Embedding MiNiFi on IoT Device (Raspberry Pi)

  • Step 1: Connect Sense HAT to Raspberry Pi
  • Step 2: SSH into the Raspberry Pi
  • Step 3: Install MiNiFi Java Agent onto Raspberry Pi
    • 3.1: Install OS Dependencies

Steps for Deploying MiNiFi, HDF, HDP Connected Data Architecture

  • Step 4: Start HDF Sandbox
    • 4.1: Configure NiFi to Receive Data
    • 4.2: Restart NiFi
    • 4.3: Add GeoLite2 database
  • Step 5: Start HDP Sandbox
    • 5.1 Disable Oozie, Flume
    • 5.2 Start HBase
  • Step 6: Connect HDF and HDP
    • 6.1: Update hosts file for HDF and HDP CentOS
    • 6.2: Create HBaseClient Service in HDF NiFi
  • Summary
  • Further Reading
  • Appendix A: Install Raspbian OS onto Raspberry Pi
  • Appendix B: Verify Communication between HDF and HDP Sandboxes

You will go through two phases in deploying the IoT Weather Station data analytics: embedding MiNiFi onto the Raspberry Pi, and deploying the Connected Data Architecture between MiNiFi, HDF Sandbox and HDP Sandbox.

Embedding MiNiFi on Raspberry Pi

Step 1: Connect Sense HAT to Raspberry Pi

1. Connect the Sense HAT’s 40 female pins to the Raspberry Pi’s 40 male pins.




Step 2: SSH into the Raspberry Pi

If you haven’t installed Raspbian on your device, refer to Appendix A.

1. Open your terminal and download the Raspberry Pi Finder open source program provided by Adafruit Inc:

wget https://github.com/adafruit/Adafruit-Pi-Finder/releases/download/3.0.0/PiFinder-3.0.0-osx-x64.zip
unzip PiFinder-*-osx-*.zip

2. Open Raspberry Pi Finder and Click Find My Pi!:


3. Results include the IP address of your Raspberry Pi:


4. SSH into the Pi from your laptop using the IP address you just collected with the following command:

ssh pi@<pi-ip-addr> -p 22

Note: the IP address will be different for each Raspberry Pi.

Example of the ssh command used to ssh into a raspberry pi:

ssh pi@ -p 22

Note: You’ll be asked for the password; enter raspberry.

After successfully SSHing into the Raspberry Pi, your console will look similar to this:


2.1 Install the Sense HAT Software

1. Download and install the Sense HAT Software.

sudo apt-get update
sudo apt-get install sense-hat
sudo pip3 install pillow

Now that you have the software needed to program the Sense HAT, you will utilize it in the next tutorial.

Step 3: Install MiNiFi Java Agent onto Raspberry Pi

1. In the Raspbian OS terminal, install the Oracle Java 8 JDK using the following command:

sudo apt-get update && sudo apt-get install oracle-java8-jdk

Note: the install will take approximately 10 minutes depending on Raspbian OS resources being used.

2. Download the MiNiFi Java agent using the following command:

wget http://public-repo-1.hortonworks.com/HDF/

3. Unpack the MiNiFi project using the following command:

tar -zxvf minifi-*-bin.tar.gz

A MiNiFi agent is now embedded on the Raspberry Pi.

Deploying MiNiFi, HDF and HDP Connected Data Architecture

Note: Before deploying HDF and HDP Sandbox, you will need to set the Docker Engine memory to at least 12GB to run both sandboxes on one laptop.

Step 4: Start HDF Sandbox

For starting HDF Sandbox, there are two options listed below. Option 1 is for users who have not downloaded or installed HDF Sandbox in the Docker Engine. Option 2 is for users who have installed and deployed an HDF Sandbox Docker container.

Option 1: For Users Who Haven’t Deployed the HDF Sandbox Container

If you haven’t downloaded the Docker HDF Sandbox, download it here: HDF Sandbox Docker

1. Run the command to load the docker sandbox image into the docker engine:

docker load < {name-of-your-hdf-sandbox-docker-image}

Example of the above docker load command:

docker load < HDF_2.1.2_docker_image_04_05_2017_13_12_03.tar.gz

You’ll need a script to deploy an HDF Sandbox container from the HDF Sandbox Docker image. Download the start HDF Sandbox script here: start_sandbox-hdf.sh

2. Run the start_sandbox-hdf.sh script:


Your HDF Sandbox will be deployed as a container and started. You are now ready to go to substep 4.1.

Option 2: For Users Who Have Deployed the HDF Sandbox Container

1. Turn on your HDF Sandbox using the script:

wget https://raw.githubusercontent.com/james94/data-tutorials/master/tutorials/hdf/hdf-2.1/analyze-traffic-pattern-with-apache-nifi/assets/auto_scripts/docker-scripts/docker_sandbox_hdf.sh

Your Docker HDF Sandbox Container will Start up soon.

4.1: Configure NiFi to Receive Data

2. Log in to Ambari at:


The user/password is admin/admin

3. Head to Advanced NiFi-Properties in Ambari Config Settings for NiFi. Update the input socket port and add the remote host NiFi runs on:

The properties should be updated with the following values:

nifi.remote.input.host = <internal ip address>
nifi.remote.input.socket.port = 17000

Note: your laptop's internal IP address was printed as output earlier when you were learning how to SSH into the Raspberry Pi; you can also find it with `ifconfig | grep inet`.

The updates to Advanced NiFi-Properties should look similar as below:


3.1. Enter NiFi Service in Ambari Stack

3.2. Enter NiFi Configs

3.3. Filter search for nifi.remote

3.4. Insert nifi.remote.input.host with your laptop's internal ip address

3.5. Verify nifi.remote.input.http.enabled is checked

3.6. Insert nifi.remote.input.socket.port with 17000

3.7. Save the configuration.

Now NiFi is configured for Socket Site-To-Site protocol.

4.2: Restart NiFi

3. Restart NiFi from Ambari with the orange restart button for the changes to take effect.

4.3: Add GeoLite2 database to HDF Sandbox CentOS

You will need to add the GeoLite2 database to the HDF Sandbox CentOS for when you add the geographic location enhancement to the NiFi DataFlow.

4. SSH into the HDF Sandbox:

ssh root@localhost -p 12222

5. Create directory for GeoFile:

mkdir -p /tmp/nifi/GeoFile

6. Ensure NiFi has access to that location:

chmod 777 -R /tmp/nifi

7. Download GeoLite2-City.mmdb to the location where the GeoEnrichIP processor looks:

cd /tmp/nifi/GeoFile
wget https://github.com/hortonworks/data-tutorials/raw/master/tutorials/hdp/hdp-2.5/refine-and-visualize-server-log-data/assets/GeoLite2-City.mmdb

The warning message on the GeoEnrichIP processor should disappear:

Step 5: Start HDP Sandbox

For starting HDP Sandbox, there are two options listed below. Option 1 is for users who have not downloaded or installed HDP Sandbox in the Docker Engine. Option 2 is for users who have installed and deployed an HDP Sandbox Docker container.

Option 1: For Users Who Haven’t Deployed the HDP Sandbox Container

If you haven’t downloaded the latest Docker HDP Sandbox, download it here: HDP Sandbox Docker

1. Run the command to load the Docker HDP Sandbox image into the docker engine:

docker load < {name-of-your-hdp-sandbox-docker-image}

Example of the above docker load command:

docker load < HDP_2.6_docker_image_05_05_2017_15_01_40.tar.gz

You’ll need a script to deploy an HDP Sandbox container from the HDP Sandbox Docker image. Download the install and start HDP Sandbox script here: start_sandbox-hdp.sh

2. Run the start_sandbox-hdp.sh script:


Your HDP Sandbox will be deployed as a container and started. You are now ready to go to substep 5.1.

Option 2: For Users Who Have Deployed the HDP Sandbox Container

Download the start HDP Sandbox Container script here: docker_sandbox_hdp.sh

1. Turn on your HDP Sandbox container using the script:


Your Docker HDP Sandbox Container will Start up soon.

5.1 Disable Oozie, Flume

2. Log in to Ambari at:


The user is admin; the password is what you set in Learning the Ropes of Hortonworks Sandbox: Section 2.2.

3. On the left-hand side of the Ambari Stack's HDP services, turn off Oozie and Flume with Ambari Service Actions -> Stop, since you’ll need more memory to run HBase.

5.2 Start HBase

4. Turn on HBase Service with Ambari Service Actions -> Start


An activated HBase service is indicated by a green check mark.

Step 6: Connect HDF and HDP

6.1: Update hosts file for HDF and HDP CentOS

Update the hosts file for HDF and HDP CentOS, so both sandboxes can reach each other using their hostnames.

1. From your local machine's terminal, find the IP address of each Docker sandbox by typing the following command:

docker network inspect bridge


Note the internal IP addresses of the HDP and HDF Sandboxes. You will need these IP addresses so each sandbox can reach the other.

2. Add HDP’s ip address mapped to its hostname into HDF’s hosts file using the following command:

echo '{hdp-ip-address} sandbox.hortonworks.com' | sudo tee -a /private/etc/hosts

3. Add HDF’s ip address mapped to its hostname into HDP’s hosts file using the following command:

echo '{hdf-ip-address} sandbox-hdf.hortonworks.com' | sudo tee -a /private/etc/hosts

Note: the two commands were tested on a Mac, so the paths may be different depending on your OS.

6.2: Create HBaseClient Service in HDF NiFi

1. Go to NiFi UI on HDF Sandbox at:


2. At the top right corner, go to Global Menu -> Controller Settings -> Controller Services -> Plus Button

3. Look for the HBaseClient Service in the list of controller services, then add it.


4. Configure the HBaseClient Service: in the Properties tab, under Hadoop Configuration Files, add the file path:



NiFi needs this file to be able to connect to the remote HBase instance running on HDP Sandbox.


Congratulations! You know how to set up your own IoT Weather Station using the Raspberry Pi, Sense HAT, MiNiFi, HDF Sandbox and HDP Sandbox. You are also familiar with how to embed MiNiFi onto the Raspberry Pi and how to set up the MiNiFi, HDF and HDP Connected Data Architecture. In the next tutorials, you’ll focus on data preprocessing, data storage into a NoSQL database and analyzing the data in real time as it is saved to the database.

Further Reading

Appendix A: Install Raspbian OS onto Raspberry Pi

For users who need help installing Raspbian OS onto their Raspberry Pi, this appendix provides the step-by-step procedure for computers with SD card slots.

A.1 Configure a bootable Raspbian OS on microSD

1. Insert the SanDisk microSD card into the SanDisk microSD adapter and insert that SD card adapter into the SD card slot of your computer.

MicroSD Card on left and SD Card Adapter on right.


MicroSD card connected to SD card adapter, inserted into the computer’s SD card slot.


2. Download Raspbian Jessie Lite operating system image onto your local machine.

3. Open your local machine's terminal, navigate to the Downloads folder, and unzip raspbian-jessie-lite.zip:

cd ~/Downloads
unzip 2017-*-raspbian-jessie-lite.zip

4. See a list of all devices mounted on the laptop using the following command:


5. Note down the device path listed next to the volume; look for the most recently added volume. It will probably have the name /Volumes/BOOT under the Mounted On column.


6. Open Disk Utility, select SD card, then press Unmount, so we can write to the entire card.


7. Head to the terminal; in the Downloads folder where the Raspbian OS image is located, run the dd command to write a bootable Raspbian OS onto the microSD card:

sudo dd bs=1m if=2017-02-16-raspbian-jessie-lite.img of=/dev/rdisk2

Note: Explanation of the three arguments used in dd: bs = block size, if = location of the Raspbian input file, of = location of the peripheral device output file. Notice how in the of= argument, the SD card volume changes from disk2s1 to rdisk2 to ensure the entire SD card is overwritten, not just the partition.

The dd operation will take 1 to 5 minutes to complete.


After the dd operation completes, you should see the Raspbian bootable OS successfully transferred over to the SD card.

8. To set up a headless Raspberry Pi, SSH can be enabled by placing a file named ssh in the boot partition’s base directory:

cd /Volumes/boot
touch ssh
ls -ltr
Note: the path to the SD card is /Volumes/boot. touch ssh creates a new empty file. ls -ltr verifies the new file was created.

9. Eject the microSD card adapter and remove it from your laptop. Insert the microSD card into the microSD card slot of the Raspberry Pi.


10. Connect an Ethernet cable to the Raspberry Pi to give it internet access, connect the 5V power supply, and the Pi should start up.


The Pi’s default login credentials:

username/password = pi/raspberry

Note: you will need the password for ssh access to the Raspberry Pi.

Appendix B: Verify Communication between HDF and HDP Sandboxes

Check the docker networks available:

docker network ls

Check that both Docker containers are connected to the “bridge” Docker network:

docker network inspect bridge

SSH into HDF container

ssh root@localhost -p 12222

SSH into HDP container

ssh root@localhost -p 2222

Check for connectivity to HDP container from HDF container:

ping sandbox.hortonworks.com

Check for connectivity to the HDF container from the HDP container:

ping sandbox-hdf.hortonworks.com

As long as both containers can ping each other, they can communicate. The message you should see looks like:

PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=64 time=0.078 ms

Check connectivity to ZooKeeper from the HDF Sandbox container:

telnet sandbox.hortonworks.com 2181

Check connectivity to the HBase Master and RegionServer.

Connection to the HMaster bind port (default 16000):

telnet sandbox.hortonworks.com 16000

Connection to the HMaster UI (default 16010):

telnet sandbox.hortonworks.com 16010

Connection to the HBase RegionServer bind port (default 16020):

telnet sandbox.hortonworks.com 16020

Connection to the HBase RegionServer UI (default 16030):

telnet sandbox.hortonworks.com 16030

As long as all the commands above check out with no error responses, you have successful communication between the HDF and HDP Sandboxes.
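The telnet checks above can also be scripted. Below is a small Python sketch that tests whether a TCP port accepts connections; the hostname shown is taken from the checks above and only resolves inside this tutorial's setup:

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: check ZooKeeper on the HDP Sandbox (only resolvable in this setup)
print(port_open("sandbox.hortonworks.com", 2181))
```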