2017. 8. 5. 18:00 · Server Programming
<Spark installation and startup>
$ wget http://d3kbcqa49mib13.cloudfront.net/spark-2.1.0-bin-hadoop2.7.tgz
$ tar xzvf spark-2.1.0-bin-hadoop2.7.tgz
$ ./sbin/start-master.sh
$ ./sbin/start-slave.sh spark://localhost:7077
$ ./bin/pyspark --master spark://localhost:7077
<Scala installation>
$ wget https://downloads.lightbend.com/scala/2.12.3/scala-2.12.3.tgz
$ tar xzvf scala-2.12.3.tgz
$ vi .bashrc
export SCALA_HOME=/home/eduuser/scala-2.12.3
export PATH=$PATH:$SCALA_HOME/bin
$ source .bashrc
$ curl https://bintray.com/sbt/rpm/rpm > bintray-sbt-rpm.repo
$ sudo mv bintray-sbt-rpm.repo /etc/yum.repos.d/
$ sudo yum install sbt
$ cd .sbt
$ cd 0.13
$ mkdir plugins
$ cd plugins
$ vi plugins.sbt
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "4.0.0")
-------------------------------
<Eclipse Spark programming>
$ mkdir simple-spark
$ cd simple-spark
$ vi build.sbt
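The contents of this build.sbt are not shown in the post; judging from the sbt output further down (project "simple", artifact simple_2.11-1.0.jar), it would follow the same pattern as the wordcount-spark build file later in this post:
name := "simple"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0" % "provided"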
$ sbt eclipse
Import the project into Eclipse
Add a new source folder: src/main/scala
Add a new Scala object: simpleapp
Write the code (a sketch follows below)
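The application source is not reproduced in this post. A minimal sketch of what simpleapp could look like, assuming args(0) is the master URL ("local" in the submit command below) and args(1) is the output directory, consistent with the 1-5 output shown afterwards:
import org.apache.spark.{SparkConf, SparkContext}

object simpleapp {
  def main(args: Array[String]): Unit = {
    // args(0) = master URL, args(1) = output directory (assumed from the submit command)
    val conf = new SparkConf().setAppName("simpleapp").setMaster(args(0))
    val sc = new SparkContext(conf)
    // Use a single partition so all five numbers land in one part-00000 file
    sc.parallelize(1 to 5, 1).saveAsTextFile(args(1))
    sc.stop()
  }
}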
$ sbt package
[info] Loading global plugins from /home/eduuser/.sbt/0.13/plugins
[info] Loading project definition from /home/eduuser/Documents/workspace-sts-3.6.4.RELEASE/simple-spark/project
[info] Set current project to simple (in build file:/home/eduuser/Documents/workspace-sts-3.6.4.RELEASE/simple-spark/)
[info] Compiling 1 Scala source to /home/eduuser/Documents/workspace-sts-3.6.4.RELEASE/simple-spark/target/scala-2.11/classes...
[info] 'compiler-interface' not yet compiled for Scala 2.11.8. Compiling...
[info] Compilation completed in 16.249 s
[info] Packaging /home/eduuser/Documents/workspace-sts-3.6.4.RELEASE/simple-spark/target/scala-2.11/simple_2.11-1.0.jar ...
[info] Done packaging.
[success] Total time: 19 s, completed 2017. 8. 3 3:04:31 PM
$ ~/spark/bin/spark-submit --class "simpleapp" --master spark://localhost:7077 --executor-memory 512m --total-executor-cores 1 simple_2.11-1.0.jar local simpleoutput
$ cat simpleoutput/part-00000
1
2
3
4
5
$ mkdir rankingcount
$ cd rankingcount
$ vi build.sbt
$ sbt eclipse
$ cd ~/Downloads
$ wget http://www.grouplens.org/system/files/ml-100k.zip
$ unzip ml-100k.zip
$ cp ml-100k/u.data ~/Documents/workspace-sts-3.6.4.RELEASE/rankingcount
Import the project into Eclipse
Add a new source folder: src/main/scala
Add a new Scala object: RatingCounter
Write the code (a sketch follows below)
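RatingCounter is not listed in this post either; a plausible sketch, assuming it tallies the rating column (third field) of the tab-separated u.data file, which would produce the (rating, count) pairs shown below:
import org.apache.spark.{SparkConf, SparkContext}

object RatingCounter {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("RatingCounter"))
    // u.data format: user id <tab> movie id <tab> rating <tab> timestamp
    val ratings = sc.textFile("u.data").map(_.split("\t")(2))
    // Count occurrences of each rating value, then print them sorted by rating
    ratings.countByValue().toSeq.sortBy(_._1).foreach(println)
    sc.stop()
  }
}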
$ sbt package
$ ~/spark/bin/spark-submit --class "RatingCounter" --master spark://localhost:7077 target/scala-2.11/rankingcount_2.11-1.0.jar
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
(1,6110)
(2,11370)
(3,27145)
(4,34174)
(5,21201)
------------------------------------------------------------
$ mkdir wordcount-spark
$ cd wordcount-spark
$ vi build.sbt
name := "wordcount-spark"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0" % "provided"
$ sbt eclipse
$ vi input.txt
$ cat input.txt
read a book
write a book
Import the project into Eclipse
Add a new source folder: src/main/scala
Add a new Scala object: WordCount
Write the code (a sketch follows below)
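A minimal sketch of the WordCount object, assuming the classic flatMap/map/reduceByKey pipeline over input.txt, consistent with the (word, count) pairs printed below:
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("WordCount"))
    // Split each line of input.txt on spaces, then count every word
    val counts = sc.textFile("input.txt")
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    counts.collect().foreach(println)
    sc.stop()
  }
}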
$ sbt package
$ ~/spark/bin/spark-submit --class "WordCount" --master spark://localhost:7077 target/scala-2.11/wordcount-spark_2.11-1.0.jar
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
[Stage 0:> (0 + 2) / 2]
(read,1)
(book,2)
(a,2)
(write,1)
------------------------------------------------------------
$ ~/spark/bin/spark-submit --class "MaxTemp" --master spark://localhost:7077 target/scala-2.11/maxtemp-spark_2.11-1.0.jar
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
(1901,317)
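The MaxTemp project's setup and source are not shown in this post. Judging from the class name and the (1901,317) output, it keeps the highest temperature per year; a minimal sketch under the assumption of a hypothetical input file temperature.txt with one "year,temperature" record per line:
import org.apache.spark.{SparkConf, SparkContext}

object MaxTemp {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("MaxTemp"))
    // Hypothetical input: "year,temperature" per line (file name is an assumption)
    val maxByYear = sc.textFile("temperature.txt")
      .map(_.split(","))
      .map(fields => (fields(0), fields(1).toInt))
      .reduceByKey(math.max(_, _)) // keep the highest temperature seen for each year
    maxByYear.collect().foreach(println)
    sc.stop()
  }
}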