Preparation Before Installation

  1. Start the ZooKeeper cluster
  2. Start the Hadoop cluster
  3. Upload and extract Spark 2.2.0 (Spark 2.x requires JDK 1.8 or later)
  4. Create the logs and pids directories under the Spark home directory
  5. Create the event log directory on HDFS: hadoop fs -mkdir -p hdfs://ns/spark/eventlog (steps 4 and 5 are shown in the sketch below)
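
A minimal sketch of steps 4 and 5, assuming Spark is extracted to /home/ldl/software/spark-2.2.0 (the path used throughout this guide) and that ns is the HDFS nameservice:

# local directories for daemon logs and PID files (step 4)
mkdir -p /home/ldl/software/spark-2.2.0/logs /home/ldl/software/spark-2.2.0/pids
# event log directory on HDFS (step 5)
hadoop fs -mkdir -p hdfs://ns/spark/eventlog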

Installation Steps

1. Configure environment variables

Edit /etc/profile (for example with vi /etc/profile) and append:

export SPARK_HOME=/home/ldl/software/spark-2.2.0
export PATH=$PATH:$SPARK_HOME/bin

Make the changes take effect:

source /etc/profile
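
A quick way to confirm the variables took effect (spark-submit ships with the distribution and prints a version banner):

echo $SPARK_HOME
spark-submit --version    # should report Spark 2.2.0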

2. Edit the spark-env.sh configuration file
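
spark-env.sh does not exist in a fresh distribution; it is created from the shipped template (the same pattern applies to spark-defaults.conf.template and slaves.template used in steps 3 and 4):

cd /home/ldl/software/spark-2.2.0/conf
cp spark-env.sh.template spark-env.sh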

export JAVA_HOME=/usr/local/jdk1.8
#export SCALA_HOME=/e3base/scala
#export SCALA_LIBRARY_PATH=${SPARK_LIBRARY_PATH}
#export IN_HOME=/e3base
export HADOOP_HOME=/home/ldl/software/hadoop-2.7.1
export HADOOP_CONF_DIR=/home/ldl/software/hadoop-2.7.1/etc/hadoop/
export HIVE_HOME=/home/ldl/software/hive-1.2.1
export SPARK_HOME=/home/ldl/software/spark-2.2.0
export SPARK_CONF_DIR=/home/ldl/software/spark-2.2.0/conf
export SPARK_LIBRARY_PATH=/home/ldl/software/spark-2.2.0/lib
#export SPARK_WORKER_DIR=/home/hadoop/work/spark
export SPARK_LOG_DIR=/home/ldl/software/spark-2.2.0/logs
export SPARK_PID_DIR=/home/ldl/software/spark-2.2.0/pids
export SPARK_MASTER_PORT=17001
export SPARK_MASTER_WEBUI_PORT=17002
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=hadoop01:2181,hadoop02:2181,hadoop03:2181 -Dspark.deploy.zookeeper.dir=/spark"
#export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=17003 -Dspark.history.retainedApplications=100 -Dspark.history.fs.logDirectory=hdfs://ns/spark/eventlog"
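
The SPARK_DAEMON_JAVA_OPTS line above enables master high availability: standby masters recover state from ZooKeeper under the /spark znode set by spark.deploy.zookeeper.dir. Once the masters are running, this can be inspected with ZooKeeper's own CLI (zkCli.sh ships with ZooKeeper):

zkCli.sh -server hadoop01:2181
ls /spark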

3. Edit the spark-defaults.conf configuration file

spark.master spark://hadoop01:17001
spark.eventLog.enabled true
spark.eventLog.dir hdfs://ns/spark/eventlog
spark.local.dir /home/ldl/software/spark-2.2.0/tmp
spark.sql.shuffle.partitions 250
spark.default.parallelism 10
spark.driver.memory 1g
spark.executor.memory 1g
spark.ui.port 17004
spark.yarn.historyServer.address 172.21.3.67:17003
spark.history.ui.port 17003
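
Two quick checks before moving on, assuming the layout above: the HDFS directory behind spark.eventLog.dir was created during preparation, and the scratch directory behind spark.local.dir can be created up front on each node:

hadoop fs -ls hdfs://ns/spark                 # should list eventlog
mkdir -p /home/ldl/software/spark-2.2.0/tmp   # scratch space for spark.local.dir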

4. Edit the slaves configuration file

hadoop01
hadoop02
hadoop03

5. Copy hive-site.xml

cp /home/ldl/software/hive-1.2.1/conf/hive-site.xml /home/ldl/software/spark-2.2.0/conf/
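
With hive-site.xml in place, Spark SQL picks up the Hive metastore configuration. Once the cluster is up (step 7), a quick smoke test, assuming the Hive metastore is reachable:

spark-sql -e "show databases;"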

6. Copy the Spark directory to the other two nodes

scp -r /home/ldl/software/spark-2.2.0/ hadoop02:/home/ldl/software/
scp -r /home/ldl/software/spark-2.2.0/ hadoop03:/home/ldl/software/
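
The same copy as a loop, assuming passwordless SSH between the nodes; note that the /etc/profile changes from step 1 also have to be applied on each node:

for host in hadoop02 hadoop03; do
  scp -r /home/ldl/software/spark-2.2.0/ ${host}:/home/ldl/software/
done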

7. Start Spark

Start from the spark/sbin directory:

./start-all.sh

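On success, jps shows a Master process on hadoop01 and a Worker on every host listed in slaves:

jps
# hadoop01: Master and Worker (alongside the Hadoop/ZooKeeper processes)
# hadoop02, hadoop03: Worker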

Stop:

./stop-all.sh


Individual commands (run from the sbin directory with a ./ prefix):

Start and stop the master process

start-master.sh

stop-master.sh

Start and stop a worker process (start-slave.sh requires the master URL)

start-slave.sh spark://hadoop01:17001

stop-slave.sh

Start and stop the Spark history server

start-history-server.sh

stop-history-server.sh
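
Note that the history server needs spark.history.fs.logDirectory, which in this setup comes from the commented-out SPARK_HISTORY_OPTS line in spark-env.sh; uncomment it before starting. The UI then listens on port 17003:

./start-history-server.sh
# UI: http://hadoop01:17003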

Start and stop the Spark Thrift server

start-thriftserver.sh

stop-thriftserver.sh
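
Once the Thrift server is up, it can be exercised with beeline (also shipped with Spark); 10000 is the default HiveServer2 port, adjust if it has been overridden:

beeline -u jdbc:hive2://hadoop01:10000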

[Screenshot: the master process]

[Screenshot: successful startup]