Deploying a Hadoop Cluster

Installation package: hadoop-2.8.3.tar.gz

mkdir /usr/local/hadoop
tar zxvf hadoop-2.8.3.tar.gz -C /usr/local/hadoop

Configure the environment variables (as with the hosts file, hadoop2 and hadoop3 need the same changes)

vi /etc/profile

export HADOOP_HOME=/usr/local/hadoop/hadoop-2.8.3
export PATH=$HADOOP_HOME/bin:$PATH

source /etc/profile
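
For the hostnames hadoop1/hadoop2/hadoop3 used throughout to resolve, each node's /etc/hosts must map them to the cluster IPs. A minimal sketch, with placeholder addresses (replace with your actual network):

192.168.1.101 hadoop1
192.168.1.102 hadoop2
192.168.1.103 hadoop3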

First create the directories that will be needed later

mkdir /usr/local/hadoop
mkdir /usr/local/hadoop/tmp
mkdir /usr/local/hadoop/var
mkdir /usr/local/hadoop/dfs
mkdir /usr/local/hadoop/dfs/name
mkdir /usr/local/hadoop/dfs/data

Modify the core-site.xml file

vi /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop1:9000</value>
  </property>
</configuration>
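
To confirm the NameNode address is being picked up, you can query the effective configuration (assuming the PATH export above is in effect):

hdfs getconf -confKey fs.defaultFS
# expected output: hdfs://hadoop1:9000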

Modify the mapred-site.xml file

cp /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/mapred-site.xml
vi /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop1:19888</value>
  </property>
</configuration>
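
Note that the two jobhistory addresses are only served while the JobHistory daemon is running; start-all.sh (used later) does not start it. It can be launched separately:

/usr/local/hadoop/hadoop-2.8.3/sbin/mr-jobhistory-daemon.sh start historyserver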

Modify the hdfs-site.xml file

vi /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Number of replicas kept for each HDFS block; the default is 3.</description>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
    <description>Disable HDFS permission checking.</description>
  </property>
</configuration>
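
Once files are in HDFS, the replication actually applied to each block can be verified with fsck (run from the Hadoop installation directory):

bin/hdfs fsck / -files -blocks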

Modify the yarn-site.xml file

vi /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop1</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop1:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop1:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop1:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.am.max-attempts</name>
    <value>4</value>
    <description>The maximum number of application master execution attempts.</description>
  </property>
</configuration>
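
After the cluster is started (see below), the NodeManagers that registered with this ResourceManager can be listed to confirm these settings took effect:

yarn node -list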

When starting in Flink on YARN mode, Hadoop YARN acts as the resource scheduler, so we have to plan the memory allocated to each Container. That means configuring yarn.nodemanager.resource.memory-mb, yarn.scheduler.minimum-allocation-mb, yarn.scheduler.maximum-allocation-mb, yarn.app.mapreduce.am.resource.mb and yarn.app.mapreduce.am.command-opts in yarn-site.xml; otherwise containers run out of memory and the Application fails to start with an error such as:

Current usage: 303.2 MB of 1 GB physical memory used; 2.3 GB of 2.1 GB virtual memory used. Killing container.
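
The 2.1 GB virtual memory ceiling in that message is not arbitrary: it is the 1 GB physical allocation multiplied by yarn.nodemanager.vmem-pmem-ratio, whose default is 2.1. That is why the configuration below disables the virtual memory check in addition to raising the memory limits.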

vi /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/yarn-site.xml

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>106496</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>2048</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>106496</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>4096</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>-Xmx3276m</value>
</property>
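
The -Xmx value follows the common rule of thumb (an assumption here, not stated in the original) of giving the JVM heap about 80% of the container allocation, leaving headroom for off-heap usage: 4096 MB × 0.8 ≈ 3276 MB.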

Modify hadoop-env.sh, mapred-env.sh and yarn-env.sh

vi /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/hadoop-env.sh
export JAVA_HOME="/usr/local/jdk/jdk1.8.0_251"

vi /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/mapred-env.sh
export JAVA_HOME="/usr/local/jdk/jdk1.8.0_251"

vi /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/yarn-env.sh
export JAVA_HOME="/usr/local/jdk/jdk1.8.0_251"
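
One more file worth setting before distributing the installation: for start-all.sh (used below) to bring up the DataNode and NodeManager daemons on the worker machines, etc/hadoop/slaves must list them. A minimal sketch, assuming hadoop2 and hadoop3 are the workers:

vi /usr/local/hadoop/hadoop-2.8.3/etc/hadoop/slaves

hadoop2
hadoop3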

Send /usr/local/hadoop to the other two servers

scp -r /usr/local/hadoop hadoop2:/usr/local
scp -r /usr/local/hadoop hadoop3:/usr/local

Start the Hadoop cluster

Initialize the HDFS filesystem

/usr/local/hadoop/hadoop-2.8.3/bin/hdfs namenode -format

Start the NameNode and DataNode daemons (start-all.sh also brings up the YARN daemons)

/usr/local/hadoop/hadoop-2.8.3/sbin/start-all.sh
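
A quick way to confirm the daemons came up is to run jps on each node. On hadoop1 one would expect NameNode, SecondaryNameNode and ResourceManager; on the workers, DataNode and NodeManager (process IDs will of course differ):

jps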

Open http://hadoop1:50070 in a browser to view the HDFS status.

Run the wordcount demo

bin/hdfs dfs -mkdir /input
bin/hdfs dfs -ls /
bin/hdfs dfs -put /usr/local/hadoop/tmp/input_hadoop_demo_test.txt /input/
bin/hdfs dfs -ls /input/
bin/hadoop jar /usr/local/hadoop/hadoop-2.8.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.3.jar wordcount /input/input_hadoop_demo_test.txt /output
bin/hdfs dfs -ls /output
bin/hdfs dfs -cat /output/part-r-00000
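
MapReduce refuses to run if the output directory already exists, so to rerun the demo the previous output has to be removed first:

bin/hdfs dfs -rm -r /output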