Apache Spark™ is a fast and general engine for large-scale data processing.
- Install Java
  - Download Oracle Java SE Development Kit 7 or 8 from the Oracle JDK downloads page.
  - Double-click the .dmg file to start the installation.
  - Open a terminal.
  - Type java -version. It should display something like the following:
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)
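To check the installed version from a script rather than by eye, the banner can be parsed. This is a minimal sketch; the function names `parse_java_version` and `installed_java_version` are made up for illustration, and it assumes the banner keeps the `version "1.x.y_zz"` shape shown above.

```python
import re
import subprocess


def parse_java_version(output):
    """Return (major, minor) from `java -version` output, or None if absent."""
    match = re.search(r'version "(\d+)\.(\d+)', output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_java_version():
    """Run `java -version` and parse it; the JVM prints this banner on stderr."""
    proc = subprocess.Popen(["java", "-version"],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return parse_java_version(out.decode("utf-8", "replace"))
```

For JDK 7 or 8, `installed_java_version()` should come back as `(1, 7)` or `(1, 8)`.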
- Set JAVA_HOME
export JAVA_HOME=$(/usr/libexec/java_home)
- Install Homebrew
ruby -e "$(curl -fsSL )"
- Install Scala
brew install scala
- Set SCALA_HOME
export SCALA_HOME=$(brew --prefix scala)/libexec
export PATH=$PATH:$SCALA_HOME/bin
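Both JAVA_HOME and SCALA_HOME are expected to point at directories, not at executables. A quick sanity check might look like the sketch below; `check_home_dirs` is a hypothetical helper, not part of any tool mentioned here.

```python
import os


def check_home_dirs(env, names=("JAVA_HOME", "SCALA_HOME")):
    """Map each variable name to 'ok', 'unset', or 'not a directory'."""
    report = {}
    for name in names:
        value = env.get(name)
        if not value:
            report[name] = "unset"
        elif os.path.isdir(value):
            report[name] = "ok"
        else:
            report[name] = "not a directory"
    return report
```

Run it in a new terminal with `print(check_home_dirs(os.environ))` to confirm that the exports took effect.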
- Download the Spark source package from https://spark.apache.org/downloads.html and unpack it
tar -xvzf spark-1.1.1.tgz
cd spark-1.1.1
- Build and Install Apache Spark
sbt/sbt clean assembly
- Start Spark
For the Scala shell:
./bin/spark-shell
For the Python shell:
./bin/pyspark
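Once the Python shell is up, a small job confirms that the build works. The sketch below assumes it is pasted at the pyspark prompt, where the shell has already created the SparkContext as `sc`; `count_even_numbers` is just an illustrative name.

```python
def count_even_numbers(sc, n):
    """Distribute 1..n as an RDD and count the even values."""
    numbers = sc.parallelize(list(range(1, n + 1)))
    return numbers.filter(lambda x: x % 2 == 0).count()
```

Calling `count_even_numbers(sc, 100)` at the prompt should return 50.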
- Run Examples
Calculate Pi:
./bin/run-example org.apache.spark.examples.SparkPi
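SparkPi estimates pi by Monte Carlo sampling: it draws random points in a square and counts how many land inside the inscribed circle. The following single-process sketch shows the same idea without Spark; it is not the example's own code, and `estimate_pi` and its `seed` parameter are assumptions added for illustration.

```python
import random


def estimate_pi(num_samples, seed=None):
    """Estimate pi from the fraction of random points inside the unit circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y < 1.0:
            inside += 1
    # The circle covers pi/4 of the square, so scale the hit ratio by 4.
    return 4.0 * inside / num_samples
```

Spark spreads the sampling loop across its workers and sums the hit counts, so more samples (and a tighter estimate) cost little extra wall-clock time.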
MLlib Correlations example:
./bin/run-example org.apache.spark.examples.mllib.Correlations
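As far as I can tell, the Correlations example reports how strongly each feature column correlates with the label, using the Pearson coefficient unless told otherwise. Treat that as an assumption about the example. The coefficient itself is standard, and a plain-Python sketch of it (with the made-up name `pearson`) looks like this:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one sequence is constant
    return cov / math.sqrt(var_x * var_y)
```

Values near 1 or -1 mean a strong linear relationship; values near 0 mean little or none.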
MLlib Linear Regression example:
./bin/spark-submit \
  --class org.apache.spark.examples.mllib.LinearRegression \
  examples/target/scala-*/spark-*.jar data/mllib/sample_linear_regression_data.txt
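The MLlib example trains its model with stochastic gradient descent. To show what is being fitted, here is a rough single-machine sketch that swaps in plain full-batch gradient descent and fits weights without an intercept. `fit_linear` and its `step` and `iterations` defaults are illustrative choices, not MLlib's.

```python
def fit_linear(features, labels, step=0.5, iterations=500):
    """Fit weights w that minimize the mean squared error of w.x (no intercept)."""
    n = float(len(features))
    dim = len(features[0])
    weights = [0.0] * dim
    for _ in range(iterations):
        grad = [0.0] * dim
        for x, y in zip(features, labels):
            # Prediction error for this point under the current weights.
            error = sum(w * xi for w, xi in zip(weights, x)) - y
            for j in range(dim):
                grad[j] += error * x[j] / n
        # Move against the averaged gradient.
        weights = [w - step * g for w, g in zip(weights, grad)]
    return weights
```

On data generated exactly as y = 2*x1 - x2, for instance `fit_linear([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]], [2.0, -1.0, 1.0, 3.0])`, the weights converge toward 2 and -1. The step size needs tuning for other data; too large a value makes the iteration diverge.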
Posted on 2017-08-01 16:33