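Installing and Configuring Hadoop 2.7.0 on CentOS 7.0
Resource preparation
Downloads:
Notes:
- If you download the resources yourself, make sure that Hadoop, the JDK, and CentOS are all 64-bit or all 32-bit, to avoid unpredictable errors. The resources above are all 64-bit.
- I did the setup on a Mac, so the VirtualBox build listed is the OS X one. On any other system, find and install the matching build yourself.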
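Linux virtual machine configuration
System configuration:
- Virtual machines: one master (Master.Hadoop) and two slaves (Slave1.Hadoop, Slave2.Hadoop)
- Network: I used bridged networking. Since this is only an experimental install, I did not set static IPs
- Memory: 1024 MB per virtual machine
- Partitioning: automatic
- Software selection: minimal install; make sure to also select the development tools
- Users: the password is hadoophadoop everywhere; no extra users are created, and everything is done directly as root
Additional software installation:
- A fresh CentOS 7.0 install has no ifconfig command; install it with the following two commands: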
yum search ifconfig
yum install net-tools.x86_64
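As a side note, a minimal CentOS 7 install should still have the `ip` command (from the iproute package), so the addresses can be read even before net-tools is installed. The sketch below only reports which of the two tools is available on the current machine; `have_cmd` and `show_ip_cmd` are hypothetical helper names of my own, not part of any package.

```shell
#!/bin/sh
# have_cmd (hypothetical helper): succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# show_ip_cmd (hypothetical helper): prints the command that can list the
# interface addresses here. Prefers ifconfig (net-tools), then "ip addr".
show_ip_cmd() {
    if have_cmd ifconfig; then
        echo "ifconfig"
    elif have_cmd ip; then
        echo "ip addr"
    else
        echo "none"
    fi
}

show_ip_cmd
```

If this prints "ip addr", running `ip addr` shows the same address information that ifconfig would.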
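Install the other two virtual machines:
- Change the hostnames of the two slaves to Slave1.Hadoop and Slave2.Hadoop so they are easy to tell apart
IP addresses of the virtual machines after installation (for reference)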
Host | IP address
Master.Hadoop | 192.168.1.122
Slave1.Hadoop | 192.168.1.125
Slave2.Hadoop | 192.168.1.124
Configure the local hosts file
vi /etc/hosts
# Add the following entries on every host
192.168.1.122 Master.Hadoop
192.168.1.125 Slave1.Hadoop
192.168.1.124 Slave2.Hadoop
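Since the same three entries have to go into every host, the edit can also be scripted. This is only a sketch under my own assumptions: `add_host_entry` is a hypothetical helper, and it is meant to be tried on a scratch copy first, because writing to the real /etc/hosts needs root.

```shell
#!/bin/sh
# add_host_entry (hypothetical helper, not part of any tool).
# Usage: add_host_entry <hosts-file> <ip> <hostname>
# Appends "<ip> <hostname>" unless a non-comment line already names that host.
add_host_entry() {
    file=$1 ip=$2 name=$3
    if awk -v n="$name" '
        /^[[:space:]]*#/ { next }                      # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }                            # exit 0 only if found
    ' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example against a scratch file (assumed name ./hosts.test):
# add_host_entry ./hosts.test 192.168.1.122 Master.Hadoop
# add_host_entry ./hosts.test 192.168.1.125 Slave1.Hadoop
# add_host_entry ./hosts.test 192.168.1.124 Slave2.Hadoop
```

Running it twice with the same arguments leaves a single line, which makes it safe to repeat on each virtual machine.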
- Test from the master host with the following commands; similar commands can be used to test from the slave hosts
ping Slave1.Hadoop
ping Slave2.Hadoop
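Configure passwordless SSH login from Master to all Slaves
Run the following on the Master host
ssh-keygen
# Generates two files (id_rsa and id_rsa.pub) in the default /root/.ssh/ directory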
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
vi /etc/ssh/sshd_config
# Set the following three options
RSAAuthentication yes
# Enable public/private key authentication
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
service sshd restart
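The three sshd_config options can also be set without opening an editor. The following is a rough sketch, assuming GNU sed as shipped with CentOS; `set_sshd_option` is a hypothetical helper, and it takes the file path as an argument so it can be tried on a copy before touching /etc/ssh/sshd_config.

```shell
#!/bin/sh
# set_sshd_option (hypothetical helper).
# Usage: set_sshd_option <config-file> <key> <value>
# Rewrites an existing (possibly commented-out) "<key> ..." line,
# or appends "<key> <value>" when the key is not present at all.
set_sshd_option() {
    file=$1 key=$2 value=$3
    if grep -Eq "^[#[:space:]]*${key}[[:space:]]" "$file"; then
        # GNU sed assumed for -i; '|' is the delimiter because values contain '/'
        sed -i -E "s|^[#[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Example on a copy of the config (assumed path ./sshd_config.test):
# set_sshd_option ./sshd_config.test RSAAuthentication yes
# set_sshd_option ./sshd_config.test PubkeyAuthentication yes
# set_sshd_option ./sshd_config.test AuthorizedKeysFile .ssh/authorized_keys
```

After changing the real file, sshd still has to be restarted for the options to take effect.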
# General form: scp ~/.ssh/id_rsa.pub <remote user>@<remote server IP>:~/
scp ~/.ssh/id_rsa.pub root@192.168.1.125:~/
scp ~/.ssh/id_rsa.pub root@192.168.1.124:~/
Run the following on each Slave host
mkdir ~/.ssh
# Restrict the permissions
chmod 700 ~/.ssh
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
rm -r ~/id_rsa.pub
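The slave-side steps (create the directory, append the key, tighten the permissions) could be wrapped into one function so they can be repeated safely. This is a sketch under my own assumptions: `install_pubkey` is a hypothetical helper and takes the .ssh directory as a parameter instead of hard-coding ~/.ssh.

```shell
#!/bin/sh
# install_pubkey (hypothetical helper).
# Usage: install_pubkey <public-key-file> <ssh-dir>
# Creates <ssh-dir> with mode 700, appends the key to authorized_keys only
# if that exact line is not already there, and sets the file to mode 600.
install_pubkey() {
    pub=$1 dir=$2
    mkdir -p "$dir" && chmod 700 "$dir" || return 1
    touch "$dir/authorized_keys"
    # -x: whole-line match, -F: literal text, -f: take the pattern from the key file
    if ! grep -qxFf "$pub" "$dir/authorized_keys"; then
        cat "$pub" >> "$dir/authorized_keys"
    fi
    chmod 600 "$dir/authorized_keys"
}

# Example on a slave, after the key has been copied to the home directory:
# install_pubkey ~/id_rsa.pub ~/.ssh
```

Because the key is only appended once, running it again after a repeated scp does not duplicate the entry.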
Test from the master host
ssh 192.168.1.125
ssh 192.168.1.124
# If both logins to slave1 and slave2 succeed without a password prompt, the configuration works
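Install the JDK and Hadoop
JDK installation:
- Create a java directory under /usr
- Use the following commands to copy jdk-8u45-linux-x64.tar.gz from the physical machine to the virtual machines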
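# Run these on the physical machine, from the directory that holds the file; the addresses are the virtual machines' IPs
scp jdk-8u45-linux-x64.tar.gz root@192.168.1.122:/usr/java
scp jdk-8u45-linux-x64.tar.gz root@192.168.1.125:/usr/java
scp jdk-8u45-linux-x64.tar.gz root@192.168.1.124:/usr/java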
cd /usr/java
tar zxvf jdk-8u45-linux-x64.tar.gz
rm jdk-8u45-linux-x64.tar.gz
vi /etc/profile
# Append the following to the end of the file
export JAVA_HOME=/usr/java/jdk1.8.0_45
export JRE_HOME=/usr/java/jdk1.8.0_45/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
source /etc/profile
java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
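If `java -version` does not print something like the above, the usual cause is a JAVA_HOME that points at the wrong directory. A quick check could look like this; `check_java_home` is a hypothetical helper of my own, and it only tests that bin/java exists and is executable.

```shell
#!/bin/sh
# check_java_home (hypothetical helper).
# Succeeds only if the argument is non-empty and <dir>/bin/java is executable.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if check_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks usable: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no bin/java: ${JAVA_HOME:-<unset>}"
fi
```

In a new shell on each virtual machine, this should report /usr/java/jdk1.8.0_45 once /etc/profile has been sourced.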
Hadoop installation
- Use the following command to copy hadoop-2.7.0.tar.gz into /usr
scp hadoop-2.7.0.tar.gz root@192.168.1.122:/usr/
- Extract hadoop-2.7.0.tar.gz and rename the directory
cd /usr
tar zxvf hadoop-2.7.0.tar.gz
mv hadoop-2.7.0 hadoop
# Delete the hadoop-2.7.0.tar.gz archive
rm -rf hadoop-2.7.0.tar.gz
cd /usr/hadoop
mkdir tmp
- Add the Hadoop installation path to /etc/profile
vi /etc/profile
# Append the following to the end of the file
export HADOOP_INSTALL=/usr/hadoop
export PATH=${HADOOP_INSTALL}/bin:${HADOOP_INSTALL}/sbin:${PATH}
export HADOOP_MAPRED_HOME=${HADOOP_INSTALL}
export HADOOP_COMMON_HOME=${HADOOP_INSTALL}
export HADOOP_HDFS_HOME=${HADOOP_INSTALL}
export YARN_HOME=${HADOOP_INSTALL}
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_INSTALL}/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_INSTALL}/lib:${HADOOP_INSTALL}/lib/native"
source /etc/profile
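These export lines are easy to mistype by hand, so one option is to generate them from a single install path. The sketch below only prints the block; `hadoop_env_block` is a hypothetical helper, and the /etc/profile.d/hadoop.sh location mentioned in the comment is my assumption rather than something from the steps above.

```shell
#!/bin/sh
# hadoop_env_block (hypothetical helper): prints the export lines for the
# given install directory (default /usr/hadoop, the path used in this post).
hadoop_env_block() {
    install=${1:-/usr/hadoop}
    # \$ keeps the variable references literal in the generated text
    cat <<EOF
export HADOOP_INSTALL=$install
export PATH=\${HADOOP_INSTALL}/bin:\${HADOOP_INSTALL}/sbin:\${PATH}
export HADOOP_MAPRED_HOME=\${HADOOP_INSTALL}
export HADOOP_COMMON_HOME=\${HADOOP_INSTALL}
export HADOOP_HDFS_HOME=\${HADOOP_INSTALL}
export YARN_HOME=\${HADOOP_INSTALL}
export HADOOP_COMMON_LIB_NATIVE_DIR=\${HADOOP_INSTALL}/lib/native
export HADOOP_OPTS="-Djava.library.path=\${HADOOP_INSTALL}/lib:\${HADOOP_INSTALL}/lib/native"
EOF
}

# Print the block. Redirecting it to a file such as /etc/profile.d/hadoop.sh
# (assumed location) would need root and replaces the manual vi edit.
hadoop_env_block /usr/hadoop
```

Every variable is derived from HADOOP_INSTALL, so a typo in one name cannot silently leave a variable empty.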
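Configure Hadoop (configure it on the Master host only at first, then copy it to the two Slave hosts)
- Set the Java environment variable in hadoop-env.sh and yarn-env.sh
cd /usr/hadoop/etc/hadoop/
vi hadoop-env.sh
# Set JAVA_HOME
export JAVA_HOME=/usr/java/jdk1.8.0_45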
vi core-site.xml
# Replace the file contents with the following
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://Master.Hadoop:9000</value>
    </property>
</configuration>
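All of the Hadoop site files share the same name/value layout, so a file like core-site.xml can also be generated instead of typed. This is a minimal sketch: `hadoop_property` and `core_site` are hypothetical helpers, and the values are assumed to contain no characters that need XML escaping.

```shell
#!/bin/sh
# hadoop_property (hypothetical helper): prints one <property> element.
# Values are written as-is; XML special characters are NOT escaped.
hadoop_property() {
    printf '    <property>\n        <name>%s</name>\n        <value>%s</value>\n    </property>\n' "$1" "$2"
}

# core_site (hypothetical helper): prints a core-site.xml for the given
# NameNode host ($1) and temporary directory ($2), using port 9000 as above.
core_site() {
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    hadoop_property hadoop.tmp.dir "$2"
    hadoop_property fs.default.name "hdfs://$1:9000"
    echo '</configuration>'
}

# Print to the terminal; redirect to /usr/hadoop/etc/hadoop/core-site.xml to use it.
core_site Master.Hadoop /usr/hadoop/tmp
```

The same `hadoop_property` function could be reused for the other site files, one call per name/value pair.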
vi hdfs-site.xml
# Replace the file contents with the following
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///usr/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///usr/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>hadoop-cluster1</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>Master.Hadoop:50090</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
vi mapred-site.xml
# Replace the file contents with the following
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <final>true</final>
    </property>
    <property>
        <name>mapreduce.jobtracker.http.address</name>
        <value>Master.Hadoop:50030</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>Master.Hadoop:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>Master.Hadoop:19888</value>
    </property>
    <property>
        <name>mapred.job.tracker</name>
        <value>http://Master.Hadoop:9001</value>
    </property>
</configuration>
vi yarn-site.xml
# Replace the file contents with the following
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>Master.Hadoop</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>Master.Hadoop:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>Master.Hadoop:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>Master.Hadoop:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>Master.Hadoop:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>Master.Hadoop:8088</value>
    </property>
</configuration>
Configure the Hadoop cluster
- Copy the configured Hadoop directory from the Master to the two Slaves
scp -r /usr/hadoop root@192.168.1.125:/usr/
scp -r /usr/hadoop root@192.168.1.124:/usr/
cd /usr/hadoop/etc/hadoop
vi slaves
# Change the file contents to
Slave1.Hadoop
Slave2.Hadoop
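Once the slaves file exists, it can drive the copy step as well, so that adding a third slave later only means adding one line. The sketch below is a dry run that prints the scp commands without executing them; `print_sync_cmds` is a hypothetical helper of my own.

```shell
#!/bin/sh
# print_sync_cmds (hypothetical helper).
# Usage: print_sync_cmds <slaves-file> <source-dir> <target-parent-dir>
# Prints one scp command per non-empty, non-comment line of the slaves file.
# Nothing is copied; the commands are only printed.
print_sync_cmds() {
    # "|| [ -n "$host" ]" also handles a last line without a trailing newline
    while IFS= read -r host || [ -n "$host" ]; do
        case $host in
            ''|'#'*) continue ;;
        esac
        echo "scp -r $2 root@$host:$3"
    done < "$1"
}

# Example on the master:
# print_sync_cmds /usr/hadoop/etc/hadoop/slaves /usr/hadoop /usr/
```

After checking the printed commands, piping the output to `sh` would run them one by one.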
hadoop namenode -format
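Formatting is meant to be done once: running it again on a NameNode that already holds data wipes the existing metadata. A small guard can help here. This sketch assumes that a formatted name directory contains current/VERSION; `needs_format` is a hypothetical helper, and the directory is the one set as dfs.namenode.name.dir in hdfs-site.xml.

```shell
#!/bin/sh
# needs_format (hypothetical helper).
# Succeeds (exit 0) when <name-dir>/current/VERSION is absent, i.e. the
# directory does not look like an already formatted NameNode directory.
needs_format() {
    [ ! -f "$1/current/VERSION" ]
}

NAME_DIR=/usr/hadoop/dfs/name   # dfs.namenode.name.dir without the file:// prefix
if needs_format "$NAME_DIR"; then
    echo "not formatted yet: run 'hadoop namenode -format'"
else
    echo "already formatted: skipping to keep the existing metadata"
fi
```

The script only prints a recommendation; the format command itself is still run by hand.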
# Stop the firewall on each machine
service iptables stop
cd /usr/hadoop/sbin
./start-all.sh
# Preferred alternative to start-all.sh:
cd /usr/hadoop/sbin
./start-dfs.sh
./start-yarn.sh
The output should look like this:
Starting namenodes on [Master.Hadoop]
Master.Hadoop: starting namenode, logging to /usr/hadoop/logs/hadoop-root-namenode-localhost.localdomain.out
Slave2.Hadoop: starting datanode, logging to /usr/hadoop/logs/hadoop-root-datanode-Slave2.Hadoop.out
Slave1.Hadoop: starting datanode, logging to /usr/hadoop/logs/hadoop-root-datanode-Slave1.out
starting yarn daemons
starting resourcemanager, logging to /usr/hadoop/logs/yarn-root-resourcemanager-localhost.out
Slave1.Hadoop: starting nodemanager, logging to /usr/hadoop/logs/yarn-root-nodemanager-Slave1.out
Slave2.Hadoop: starting nodemanager, logging to /usr/hadoop/logs/yarn-root-nodemanager-Slave2.out
jps
Master:
3930 ResourceManager
4506 Jps
3693 NameNode
Slave:
2792 NodeManager
2920 Jps
2701 DataNode
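Comparing the jps listing by eye gets tedious with more nodes, so the check can be scripted. A minimal sketch, assuming the "<pid> <name>" layout shown in the listing; `missing_daemons` is a hypothetical helper that reads the jps output on stdin.

```shell
#!/bin/sh
# missing_daemons (hypothetical helper).
# Reads jps-style output ("<pid> <name>") on stdin and prints every expected
# daemon name (given as arguments) that does not appear in it.
missing_daemons() {
    running=$(awk '{ print $2 }')      # second column = process name
    for d in "$@"; do
        echo "$running" | grep -qx -- "$d" || echo "$d"
    done
}

# Example on the master:
# jps | missing_daemons NameNode ResourceManager
# Example on a slave:
# jps | missing_daemons DataNode NodeManager
```

Empty output means every expected daemon is running; any name printed points at the log file to look into first.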
hadoop dfsadmin -report
Configured Capacity: 14382268416 (13.39 GB)
Present Capacity: 10538565632 (9.81 GB)
DFS Remaining: 10538557440 (9.81 GB)
DFS Used: 8192 (8 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (2):
Name: 192.168.1.124:50010 (Slave2.Hadoop)
Hostname: Slave2.Hadoop
Decommission Status : Normal
Configured Capacity: 7191134208 (6.70 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 1921933312 (1.79 GB)
DFS Remaining: 5269196800 (4.91 GB)
DFS Used%: 0.00%
DFS Remaining%: 73.27%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Jul 02 10:45:04 CST 2015
Name: 192.168.1.125:50010 (Slave1.Hadoop)
Hostname: Slave1.Hadoop
Decommission Status : Normal
Configured Capacity: 7191134208 (6.70 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 1921769472 (1.79 GB)
DFS Remaining: 5269360640 (4.91 GB)
DFS Used%: 0.00%
DFS Remaining%: 73.28%
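The one number in this report that matters most is the live datanode count, which should equal the number of slaves. A rough way to extract it, assuming the "Live datanodes (N):" line format from the report; `live_datanodes` is a hypothetical helper.

```shell
#!/bin/sh
# live_datanodes (hypothetical helper).
# Reads "dfsadmin -report" output on stdin and prints the number from the
# "Live datanodes (N):" line, or 0 when no such line is present.
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p' | head -n 1 | grep . || echo 0
}

# Example on the master (two slaves expected in this setup):
# [ "$(hadoop dfsadmin -report | live_datanodes)" -eq 2 ] && echo "all datanodes are up"
```

A count below 2 here usually means a datanode could not reach the master, which is where the hosts file and firewall settings are worth rechecking.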
systemctl stop firewalld
http:
References