Holen Sie sich die neuesten Updates von Hortonworks per E-Mail

Einmal monatlich erhalten Sie die neuesten Erkenntnisse, Trends und Analysen sowie Fachwissen zu Big Data.

Sign up for the Developers Newsletter

Einmal monatlich erhalten Sie die neuesten Erkenntnisse, Trends und Analysen sowie Fachwissen zu Big Data.

cta

Erste Schritte

Cloud

Sind Sie bereit?

Sandbox herunterladen

Wie können wir Ihnen helfen?

* Ich habe verstanden, dass ich mich jederzeit abmelden kann. Außerdem akzeptiere ich die weiteren Angaben in der Datenschutzrichtlinie von Hortonworks.
SchließenSchaltfläche „Schließen“
HDP > Entwicklung mit Hadoop > Apache Spark

Spark on YARN Example

Cloud Sind Sie bereit?

SANDBOX HERUNTERLADEN

Introduction

In this brief tutorial you will run a pre-built Spark example on YARN

Prerequisites

This tutorial assumes that you are running an HDP Sandbox.

Please ensure you complete the prerequisites before proceeding with this tutorial.

Pi Example

To test compute intensive tasks in Spark, the Pi example calculates pi by “throwing darts” at a circle. The example points in the unit square ((0,0) to (1,1)) and sees how many fall in the unit circle. The fraction should be pi/4, which is used to estimate Pi.

To calculate Pi with Spark in yarn-client mode.

Assuming you start as root user follow these steps depending on your Spark version.

Note: We have provided two examples one for Spark 2.x and another for Spark 1.6.x for reference.

Spark 2.x Version

From the Sandbox terminal (command line):

root@sandbox# export SPARK_MAJOR_VERSION=2
root@sandbox# cd /usr/hdp/current/spark2-client
root@sandbox spark2-client# su spark
spark@sandbox spark2-client$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 examples/jars/spark-examples*.jar 10

Spark 1.6.x Version

From the Sandbox terminal (command line):

root@sandbox# export SPARK_MAJOR_VERSION=1
root@sandbox# cd /usr/hdp/current/spark-client
root@sandbox spark-client# su spark
spark@sandbox spark-client$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10

Note: The Pi job should complete without any failure messages and produce output similar to below, note the value of Pi in the output message:

...
16/02/25 21:27:11 INFO YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/02/25 21:27:11 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:36, took 19.346544 s
Pi is roughly 3.143648
16/02/25 21:27:12 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
...

Next steps

At this point, you might want to set up a complete development environment for writing and debugging your Spark applications.

Checkout one of the following tutorials on how to set up a full development environment for either Python, Scala, or Java.

User Reviews

User Rating
0 No Reviews
5 Star 0%
4 Star 0%
3 Star 0%
2 Star 0%
1 Star 0%
Tutorial Name
Spark on YARN Example

To ask a question, or find an answer, please visit the Hortonworks Community Connection.

No Reviews
Write Review

Registrieren

Please register to write a review

Share Your Experience

Example: Best Tutorial Ever

You must write at least 50 characters for this field.

Success

Thank you for sharing your review!