Last updated: July 23, 2021 (evening)
Problems
The Day 6 defense problems were assigned by random draw, and I drew the first one. (The answer below was written by a classmate; I tidied it up and posted it.)
Hadoop distributed setup. Basic: build a Hadoop cluster with at least 3 nodes, without high availability, with the daemon processes distributed sensibly across nodes. Intermediate: set up a Zookeeper cluster and make Hadoop highly available. Advanced: on top of that, add a history server and implement the black/white list mechanism. Full marks: implement federated HDFS.
(This is the one I worked on.)
Hidden friends. Basic: solve it with 3 or more MapReduce jobs. Intermediate: solve it with 2 MapReduce jobs. Advanced: drive the 2 MapReduce jobs with a workflow scheduler.
Flume collecting MySQL data. Basic: write a custom Source that collects from a fixed table. Intermediate: collect from any table. Advanced: not limited to MySQL; collect any table's data from any database.
Flume writing data to MySQL. Basic: write a custom Sink that writes to a fixed table. Intermediate: let the user specify the table and fields to write to. Advanced: not limited to MySQL; write to any database.
Deploying Ganglia to monitor Flume. Basic: deploy Ganglia successfully. Intermediate: deploy Ganglia and use it to monitor Flume. Advanced: on top of the monitoring, produce monitoring logs and use them to identify peak request times and peak read/write periods.
Common friends & visit counts. Basic: compute common friends. Intermediate: compute common friends & visit counts. Advanced: compute the common friends of any number of people, e.g. of 3/4/5 people.
friend.txt
tom rose
tom jim
tom smith
tom lucy
rose tom
rose lucy
rose smith
jim tom
jim lucy
smith jim
smith tom
smith rose
commonfriend.txt
A:B,C,D,F,E,O
B:A,C,E,K
C:F,A,D,I
D:A,E,F,L
E:B,C,D,M,L
F:A,B,C,D,E,O,M
G:A,C,D,E,F
H:A,C,D,E,O
I:A,O
J:B,O
K:A,C,D
L:D,E,F
M:E,F,G
O:A,H,I,J
tomcat.log
192.168.120.23 -- [30/Apr/2018:20:25:32 +0800] "GET /asf.avi HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:32 +0800] "GET /bupper.png HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:32 +0800] "GET /bupper.css HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:33 +0800] "GET /bg-button HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:33 +0800] "GET /bbutton.css HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:33 +0800] "GET /asf.jpg HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:33 +0800] "GET /tomcat.css HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:33 +0800] "GET /tomcat.png HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:33 +0800] "GET /tbutton.png HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:33 +0800] "GET /tinput.png HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:33 +0800] "GET /tbg.css HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:34 +0800] "GET /tomcat.css HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:34 +0800] "GET /bg.css HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:34 +0800] "GET /bg-button.css HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:34 +0800] "GET /bg-input.css HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:34 +0800] "GET /bd-input.png HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:34 +0800] "GET /bg-input.png HTTP/1.1" 304 -
192.168.120.23 -- [30/Apr/2018:20:25:34 +0800] "GET /music.mp3 HTTP/1.1" 304 -
Answer
Hadoop fully distributed
Environment
Host hadoop:
/opt/hadoop-3.1.3
/opt/jdk1.8
/etc/profile # location of environment variables
/mnt/hgfs # shared folder
node1 and node2 are clones of hadoop
hadoop
Set up hosts
vim /etc/hosts
192.168.244.100 hadoop
192.168.244.101 node1
192.168.244.102 node2
Generate an SSH public key, in preparation for SSH connections between the three hosts
ssh-keygen -t rsa
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys # register the public key
node1
Clone node1 in VMware (Problem 1: emergency mode)
Configure the network
cd /etc/sysconfig/network-scripts
vim ifcfg-ens33 # IPADDR=192.168.244.101 DNS1=114.114.114.114
Configure the hostname mapping
vim /etc/hostname # change the hostname to node1
vim /etc/hosts # update the IP mapping: add node1 and node2
Reconfigure SSH on this node
ssh-keygen -t rsa # set up passwordless login
cd /root/.ssh
cat id_rsa.pub >> authorized_keys # register the public key
chmod 0600 ~/.ssh/authorized_keys # fix the file permissions
ssh-copy-id -i hadoop # send the public key to the master
vim /etc/ssh/sshd_config # allow root login: PermitRootLogin yes
systemctl enable sshd # start sshd on boot
Verify the setup
# terminal
ssh root@192.168.244.101 # is sshd running properly
ping 192.168.244.100 # can we reach the other hosts
ping www.baidu.com # is the network up
ip addr # is the IP correct
java -version # is Java working
hadoop version # is Hadoop working
node2
Clone node2 in VMware
Configure the network
cd /etc/sysconfig/network-scripts
vim ifcfg-ens33 # IPADDR=192.168.244.102 DNS1=114.114.114.114
Configure the hostname mapping
vim /etc/hostname # change the hostname to node2
vim /etc/hosts # update the IP mapping: add node1 and node2
Reconfigure SSH on this node
ssh-keygen -t rsa # set up passwordless login
cd /root/.ssh
cat id_rsa.pub >> authorized_keys # register the public key
chmod 0600 ~/.ssh/authorized_keys # fix the file permissions
ssh-copy-id -i hadoop # send the public key to the master
vim /etc/ssh/sshd_config # allow root login: PermitRootLogin yes
systemctl enable sshd # start sshd on boot
Verify the setup
# terminal
ssh root@192.168.244.102 # is sshd running properly
ping 192.168.244.100 # can we reach the other hosts
ping www.baidu.com # is the network up
ip addr # is the IP correct
java -version # is Java working
hadoop version # is Hadoop working
Putting it together
Copy the hadoop host's public key to the worker nodes and test the connections
# hadoop
ssh-copy-id -i node1
ssh-copy-id -i node2
ssh node1 # connects successfully
ssh node2 # connects successfully
Configure Hadoop
# hadoop
cd /opt/hadoop-3.1.3/etc/hadoop/
① vim core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-3.1.3/tmp</value>
  </property>
</configuration>
② vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/opt/hadoop-3.1.3/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/opt/hadoop-3.1.3/data</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
③ vim mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
④ vim yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
⑤ vim workers
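The workers entries themselves did not survive the copy; with this three-node layout and dfs.replication=3, they would presumably list all three hosts as workers (my assumption):
hadoop
node1
node2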
Copy the Hadoop config files into the shared folder
# hadoop
cp -r /opt/hadoop-3.1.3/etc/hadoop /mnt/hgfs/hadoop_copy
Configuration complete; initialize on the master host hadoop
rm -rf /opt/hadoop-3.1.3/logs
mkdir /opt/hadoop-3.1.3/logs
hdfs namenode -format
Start the distributed cluster (Problem 2: bash v3.2+)
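The start commands are not shown above; presumably they are the stock scripts run on the master (a sketch, assuming the standard Hadoop 3 layout):
# hadoop
/opt/hadoop-3.1.3/sbin/start-dfs.sh
/opt/hadoop-3.1.3/sbin/start-yarn.sh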
The Hadoop distributed cluster started successfully
Uploading a file succeeded
IDEA's Big Data Tools plugin connected successfully
Problem 1: the system enters emergency mode
At first the cause wasn't clear; entering the root password still got into the system
But afterwards, no network configuration would take effect: the NIC showed as not running (systemctl status network) and polkit showed as dead (systemctl status polkit)
Checking the system logs revealed that a mount was failing because its source could not be found
Then, via vim /etc/fstab, I found the culprit: the mounted shared folder was not set up, because VMware disables that option by default in a newly cloned VM. Enabling it solved the problem
Enable shared folders
Problem 2: the error looked like this:
Seeing "bash v3.2+ required" was odd at first; surely the bash version couldn't be the problem
Then I realized the current shell wasn't bash at all, but the zsh I had set up as a convenience tool
So I ran the following command, which fixed it simply.
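The command itself did not survive the copy; presumably it simply reruns the start script under bash explicitly, something like:
bash /opt/hadoop-3.1.3/sbin/start-dfs.sh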
Zookeeper high-performance cluster
Software versions:
Software      Version
java          1.8
hadoop        3.1.3
hbase         2.3.5
zookeeper     3.6.0
HBase and Zookeeper version compatibility:
http://hbase.apache.org/book.html#zookeeper
HBase reference: https://www.w3cschool.cn/hbase_doc/hbase_doc-vxnl2k1n.html
Configure zk
# hadoop
cp /mnt/hgfs/apache-zookeeper-3.6.0-bin.tar.gz /opt
cp /mnt/hgfs/hbase-2.3.5-bin.tar.gz /opt
cd /opt
tar -xvf apache-zookeeper-3.6.0-bin.tar.gz
tar -xvf hbase-2.3.5-bin.tar.gz
rm -rf /opt/apache-zookeeper-3.6.0-bin.tar.gz
rm -rf /opt/hbase-2.3.5-bin.tar.gz
cd /opt
mv apache-zookeeper-3.6.0-bin zookeeper-3.6.0
cd zookeeper-3.6.0/conf
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg
dataDir=/opt/zookeeper-3.6.0/tmp
server.1=hadoop:2888:3888
server.2=node1:2888:3888
server.3=node2:2888:3888
vim /etc/profile
export ZOOKEEPER_HOME=/opt/zookeeper-3.6.0
export PATH=$PATH:$ZOOKEEPER_HOME/bin
source /etc/profile
cd /opt/zookeeper-3.6.0 # in the zookeeper-3.6.0 directory
mkdir tmp
touch tmp/myid
# share the configured zookeeper files
cp -r /opt/zookeeper-3.6.0 /mnt/hgfs
node1
cp -r /mnt/hgfs/zookeeper-3.6.0 /opt/
vim /etc/profile
export ZOOKEEPER_HOME=/opt/zookeeper-3.6.0
export PATH=$PATH:$ZOOKEEPER_HOME/bin
source /etc/profile
node2
cp -r /mnt/hgfs/zookeeper-3.6.0 /opt/
vim /etc/profile
export ZOOKEEPER_HOME=/opt/zookeeper-3.6.0
export PATH=$PATH:$ZOOKEEPER_HOME/bin
source /etc/profile
Putting it together (Error 1 occurred at startup)
# hadoop
echo 1 > /opt/zookeeper-3.6.0/tmp/myid
# node1
echo 2 > /opt/zookeeper-3.6.0/tmp/myid
# node2
echo 3 > /opt/zookeeper-3.6.0/tmp/myid
# hadoop
/opt/zookeeper-3.6.0/bin/zkServer.sh start
# node1
/opt/zookeeper-3.6.0/bin/zkServer.sh start
# node2
/opt/zookeeper-3.6.0/bin/zkServer.sh start
Check the version (Error 2)
zk configured successfully
Startup succeeded
Check the status: one leader, two followers
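The status check itself isn't shown; presumably it is zkServer.sh status run on each node:
/opt/zookeeper-3.6.0/bin/zkServer.sh status # one node reports leader, the other two report follower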
Check the version
Error 1: startup failed
Check the logs directory
Error: Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain
Checking the docs showed that the -bin package is the one we actually need; re-downloaded it and reconfigured
Error 2: command not in the whitelist
Solution:
# hadoop
vim /opt/zookeeper-3.6.0/conf/zoo.cfg
echo stat | nc localhost 2181
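The zoo.cfg edit isn't spelled out above. ZooKeeper 3.5+ gates four-letter-word commands such as stat behind a whitelist, so the added line is presumably something like:
4lw.commands.whitelist=*
After restarting the server, echo stat | nc localhost 2181 should then return the server status.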
Hadoop file changes
# hadoop
cd /opt/hadoop-3.1.3/etc/hadoop/
① vim core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://myha01/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-3.1.3/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop:2181,node1:2181,node2:2181</value>
  </property>
  <property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>1000</value>
    <description>Timeout in ms for Hadoop's connection to zookeeper</description>
  </property>
</configuration>
② vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///opt/hadoop-3.1.3/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///opt/hadoop-3.1.3/data</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>myha01</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.myha01</name>
    <value>nn1,nn2,nn3</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myha01.nn1</name>
    <value>hadoop:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.myha01.nn1</name>
    <value>hadoop:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myha01.nn2</name>
    <value>node1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.myha01.nn2</name>
    <value>node1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myha01.nn3</name>
    <value>node2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.myha01.nn3</name>
    <value>node2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop:8485;node1:8485;node2:8485/myha01</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/hadoop-3.1.3/journaldata</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.myha01</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
    <value>60000</value>
  </property>
</configuration>
③ vim mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.http.address</name>
    <value>hadoop:50030</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop:19888</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>http://hadoop:9001</value>
  </property>
</configuration>
④ vim yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>node1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop:2181,node1:2181,node2:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log.server.url</name>
    <value>http://hadoop:19888/jobhistory/logs</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>/opt/hadoop-3.1.3/etc/hadoop:/opt/hadoop-3.1.3/share/hadoop/common/lib/*:/opt/hadoop-3.1.3/share/hadoop/common/*:/opt/hadoop-3.1.3/share/hadoop/hdfs:/opt/hadoop-3.1.3/share/hadoop/hdfs/lib/*:/opt/hadoop-3.1.3/share/hadoop/hdfs/*:/opt/hadoop-3.1.3/share/hadoop/mapreduce/lib/*:/opt/hadoop-3.1.3/share/hadoop/mapreduce/*:/opt/hadoop-3.1.3/share/hadoop/yarn:/opt/hadoop-3.1.3/share/hadoop/yarn/lib/*:/opt/hadoop-3.1.3/share/hadoop/yarn/*</value>
  </property>
</configuration>
Share the files with the worker nodes
# hadoop
scp -r /opt/hadoop-3.1.3/etc/hadoop node1:/opt/hadoop-3.1.3/etc
scp -r /opt/hadoop-3.1.3/etc/hadoop node2:/opt/hadoop-3.1.3/etc
Initialization (Error 3)
# hadoop, node1, node2
zkServer.sh start
hadoop-daemon.sh start journalnode
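The remaining first-time HA initialization steps are not listed above; a sketch of the usual sequence (my assumption, following the standard QJM HA procedure) is:
# hadoop: format and start the first NameNode
hdfs namenode -format
hadoop-daemon.sh start namenode
# node1, node2: copy the metadata over as standbys
hdfs namenode -bootstrapStandby
# hadoop: initialize the failover state in ZooKeeper, then bring everything up
hdfs zkfc -formatZK
start-all.sh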
Startup commands
# hadoop, node1, node2
zkServer.sh start
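With ZooKeeper up on all three nodes, the Hadoop daemons are presumably brought up with the stock script on the master, as in the initialization above:
# hadoop
start-all.sh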
Shutdown commands (Error 5)
# hadoop
stop-all.sh # (Error 5)
hadoop-daemon.sh stop zkfc
hadoop-daemon.sh stop journalnode
zkServer.sh stop
Running successfully
Background processes running normally (two namenodes)
The distributed file system works normally
Check the namenode state on node1 and node2
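The state check isn't shown; presumably hdfs haadmin against the NameNode IDs configured above:
hdfs haadmin -getServiceState nn2 # node1
hdfs haadmin -getServiceState nn3 # node2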
About to kill node1's namenode (currently active)
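The kill itself is presumably the usual jps-then-kill (the pid below is a placeholder):
# node1
jps # find the NameNode pid
kill -9 <namenode-pid> # hypothetical pid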
You can see that node1's namenode has been killed
node2's namenode automatically switched to active
Web access
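The web UIs are at the HTTP addresses configured in hdfs-site.xml above, i.e. http://hadoop:50070, http://node1:50070 and http://node2:50070.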
Error 3: existing data in journaldata made the journalnode format fail
Changed where journaldata is stored:
cd /opt/hadoop-3.1.3
rm -rf data/journaldata
mkdir journaldata
Error 4: HDFS_JOURNALNODE_USER not specified
At this point Hadoop as a whole was able to run
Recalling the analysis from the first setup, the fix should be to edit the start/stop scripts to specify which user runs each daemon
cd /opt/hadoop-3.1.3/sbin
vim start-dfs.sh
# add:
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root
vim stop-dfs.sh
# add:
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root
Make the same change on the worker nodes
Error 5: namenode did not shut down cleanly
Research showed the problem was tasks terminating abnormally
Hadoop 2 is also known to have an unsafe pid-file problem
First tried clearing the data files and re-initializing, which solved it
cd /opt/hadoop-3.1.3
rm -rf data
rm -rf name
rm -rf tmp
rm -rf journaldata
rm -rf logs
mkdir data
mkdir name
mkdir tmp
mkdir journaldata
mkdir logs
History server and high availability
History server
# hadoop
vim mapred-site.xml
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>hadoop:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>hadoop:19888</value>
</property>
vim yarn-site.xml
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.log.server.url</name>
  <value>http://hadoop:19888/jobhistory/logs</value>
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>604800</value>
</property>
Distribute the config
scp -r /opt/hadoop-3.1.3/etc/hadoop node1:/opt/hadoop-3.1.3/etc
scp -r /opt/hadoop-3.1.3/etc/hadoop node2:/opt/hadoop-3.1.3/etc
mapred --daemon start historyserver
View the job logs at: http://hadoop:19888/jobhistory
Test
# create a test file
hdfs dfs -mkdir /test
vim a.txt
hdfs dfs -put a.txt /test
# delete /output on HDFS if it already exists
hdfs dfs -rm -r /output
# run the mapreduce example
hadoop jar /opt/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /test /output
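To confirm the run (and see the job on the history server), one can inspect the output directory, e.g.:
hdfs dfs -cat /output/part-r-00000 # word counts from a.txt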
Black and white lists
Whitelist
cd /opt/hadoop-3.1.3/etc/hadoop
vim dfs.hosts
# dfs.hosts contents:
hadoop
node1
vim hdfs-site.xml
<property>
  <name>dfs.hosts</name>
  <value>/opt/hadoop-3.1.3/etc/hadoop/dfs.hosts</value>
</property>
Distribute:
scp -r /opt/hadoop-3.1.3/etc/hadoop node1:/opt/hadoop-3.1.3/etc
scp -r /opt/hadoop-3.1.3/etc/hadoop node2:/opt/hadoop-3.1.3/etc
Only hadoop and node1 are allowed into the cluster; node2 can no longer join
Verify:
# refresh the node list
hdfs dfsadmin -refreshNodes
node2 can no longer join the cluster
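A quick way to confirm is the cluster report, which should no longer list node2 among the live datanodes:
hdfs dfsadmin -report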
Blacklist
cd /opt/hadoop-3.1.3/etc/hadoop
vim dfs.hosts.exclude
# dfs.hosts.exclude contents:
node2
vim hdfs-site.xml
<property>
  <name>dfs.hosts.exclude</name>
  <value>/opt/hadoop-3.1.3/etc/hadoop/dfs.hosts.exclude</value>
</property>
Distribute:
scp -r /opt/hadoop-3.1.3/etc/hadoop node1:/opt/hadoop-3.1.3/etc
scp -r /opt/hadoop-3.1.3/etc/hadoop node2:/opt/hadoop-3.1.3/etc
With node2 on the blacklist, it will be decommissioned from the cluster
Verify:
# refresh the node lists
hdfs dfsadmin -refreshNodes
yarn rmadmin -refreshNodes
node2's state will change to: decommission in progress
and after a while to decommissioned (all of its blocks have been re-replicated)