
Saturday, August 1, 2015

Hadoop: How to Install Hadoop 2.7.1 on CentOS 7.1.1503 (Single-Node Cluster)

Background

Following the previous post, CentOS: 如何在 VMware Workstation 11 安裝 CentOS 7, 並避免 Easy Install, this article continues by documenting the installation of Hadoop.

After searching around on the web:
1. The first 3 reference documents explain how to install Hadoop 1.0.4 on Ubuntu Desktop 12.04.
2. Since those cover an older version and a different distribution (not CentOS), the last 2 reference documents were consulted as well.

This article simply records the author's own steps for later reference, so they are not forgotten. The focus is on installation; actually running jobs (MapReduce) will be covered in a separate article.

Apologies if this overlaps with articles by other authors.

The article is divided into three main parts:

1. Downloading the Hadoop package
 1.1 Create a dedicated user for running Hadoop
 1.2 Download the Hadoop 2.7.1 package as the hadoop user

2. Checking the installation environment
 2.1 Disable IPv6
 2.2 Update the JDK to the latest version
 2.3 Enable ssh (instead of telnet) for secure connections

3. Installing Hadoop
 3.1 Extract the archive and move it to the target folder
 3.2 Update the login and Hadoop environment scripts
 3.3 Take a breath and check the Hadoop version
 3.4 Create the HDFS folder
 3.5 Format the HDFS folder
 3.6 Start the Hadoop services
 3.7 Open the web management UIs
 3.8 Stop the Hadoop services



1. Downloading the Hadoop package

1.1 Create a dedicated user for running Hadoop

  • Add the group and the user
Note: ll is an alias for ls -l
Note: the grep command searches several account files for entries containing the string hadoop
[root@localhost ~]# groupadd hadoop
[root@localhost ~]# useradd hadoop -g hadoop
[root@localhost ~]# ll -d /home/hadoop
drwx------. 3 hadoop hadoop 4096  7月 28 11:49 /home/hadoop

[root@localhost ~]# grep hadoop /etc/passwd /etc/shadow /etc/group
/etc/passwd:hadoop:x:1001:1001::/home/hadoop:/bin/bash
/etc/shadow:hadoop:!!:16644:0:99999:7:::
/etc/group:hadoop:x:1001:


  • Set the hadoop user's password to hadoop
[root@localhost ~]# passwd hadoop
更改使用者 hadoop 的密碼。
新 密碼:
不良的密碼:密碼短於 8 個字元
再次輸入新的 密碼:
passwd:所有驗證 token 都已成功更新。
  • Check the hadoop user's password aging settings
[root@localhost ~]# chage -l hadoop
最近一次密碼修改時間     : 7月 28, 2015
密碼過期     :從不
密碼失效     :從不
帳戶過期     :從不
最少必須相隔幾天才能改變密碼    :0
最多必須相隔幾天才能改變密碼    :99999
在密碼將要過期之前多少天會發出警告

  • As root, run visudo and add hadoop to the sudoers file
[root@localhost ~]# visudo

## Next comes the main part: which users can run what software on
## which machines (the sudoers file can be shared between multiple
## systems).
## Syntax:
##
##      user    MACHINE=COMMANDS
##
## The COMMANDS section may have other options added to it.
##
## Allow root to run any commands anywhere
root    ALL=(ALL)       ALL
jasper  ALL=(ALL)       ALL
hadoop  ALL=(ALL)       ALL
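
To confirm the new account can actually use sudo, a quick check (a sketch; sudo whoami is expected to report root):
[root@localhost ~]# su - hadoop
[hadoop@localhost ~]$ sudo whoami
[sudo] password for hadoop: 
root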

1.2 Download the Hadoop 2.7.1 package as the hadoop user

Go to http://hadoop.apache.org/releases.html and click the desired version; the page shows the available mirror sites.
[hadoop@localhost ~]$ wget http://ftp.twaren.net/Unix/Web/apache/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz 
--2015-07-28 15:56:30--  http://ftp.twaren.net/Unix/Web/apache/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
正在查找主機 ftp.twaren.net (ftp.twaren.net)... 140.110.123.9, 2001:e10:5c00:5::9
正在連接 ftp.twaren.net (ftp.twaren.net)|140.110.123.9|:80... 連上了。
已送出 HTTP 要求,正在等候回應... 200 OK
長度: 210606807 (201M) [application/x-gzip]
Saving to: ‘hadoop-2.7.1.tar.gz’

100%[======================================>] 210,606,807 5.32MB/s   in 46s    

2015-07-28 15:57:16 (4.41 MB/s) - ‘hadoop-2.7.1.tar.gz’ saved [210606807/210606807]
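
Optionally, verify the downloaded archive before using it (a sketch; compare the digest against the checksum published for hadoop-2.7.1 on the Apache release page):
[hadoop@localhost ~]$ sha256sum hadoop-2.7.1.tar.gz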


2. Checking the installation environment

2.1 Disable IPv6

  • Disable IPv6 globally (add the following lines)
[hadoop@localhost ~]$ sudo gedit /etc/sysctl.conf
[sudo] password for hadoop: 

# disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
  • Reboot, then check that IPv6 is disabled (a value of 1 means disabled)
[hadoop@localhost ~]$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
1
[hadoop@localhost ~]$
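
If you prefer not to reboot, the same settings can usually be applied immediately (a sketch, assuming the lines above are already saved in /etc/sysctl.conf; sysctl -p echoes the values it applies):
[hadoop@localhost ~]$ sudo sysctl -p
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1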

2.2 Update the JDK to the latest version


  • Check the JDK version
The initial check shows version 1.7.0_75.
[hadoop@localhost ~]$ java -version
java version "1.7.0_75"
OpenJDK Runtime Environment (rhel-2.5.4.2.el7_0-x86_64 u75-b13)
OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)

-- http://hadoop.apache.org/docs/current/
-- IMPORTANT notes
--      This release drops support for JDK6 runtime and works with JDK 7+ only.

  • Update the JDK
[hadoop@localhost ~]$ sudo yum install java
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * base: ftp.nsysu.edu.tw
 * extras: ftp.stust.edu.tw
 * updates: mirror01.idc.hinet.net
Resolving Dependencies
--> Running transaction check
---> Package java-1.7.0-openjdk.x86_64 1:1.7.0.75-2.5.4.2.el7_0 will be updated
--> Processing Dependency: java-1.7.0-openjdk = 1:1.7.0.75-2.5.4.2.el7_0 for package: 1:java-1.7.0-openjdk-devel-1.7.0.75-2.5.4.2.el7_0.x86_64
---> Package java-1.7.0-openjdk.x86_64 1:1.7.0.85-2.6.1.2.el7_1 will be an update
--> Processing Dependency: java-1.7.0-openjdk-headless = 1:1.7.0.85-2.6.1.2.el7_1 for package: 1:java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64
--> Running transaction check
---> Package java-1.7.0-openjdk-devel.x86_64 1:1.7.0.75-2.5.4.2.el7_0 will be updated
---> Package java-1.7.0-openjdk-devel.x86_64 1:1.7.0.85-2.6.1.2.el7_1 will be an update
---> Package java-1.7.0-openjdk-headless.x86_64 1:1.7.0.75-2.5.4.2.el7_0 will be updated
---> Package java-1.7.0-openjdk-headless.x86_64 1:1.7.0.85-2.6.1.2.el7_1 will be an update
--> Processing Dependency: libsctp.so.1(VERS_1)(64bit) for package: 1:java-1.7.0-openjdk-headless-1.7.0.85-2.6.1.2.el7_1.x86_64
--> Processing Dependency: libsctp.so.1()(64bit) for package: 1:java-1.7.0-openjdk-headless-1.7.0.85-2.6.1.2.el7_1.x86_64
--> Running transaction check
---> Package lksctp-tools.x86_64 0:1.0.13-3.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                      Arch    Version                    Repository
                                                                           Size
================================================================================
Updating:
 java-1.7.0-openjdk           x86_64  1:1.7.0.85-2.6.1.2.el7_1   updates  204 k
Installing for dependencies:
 lksctp-tools                 x86_64  1.0.13-3.el7               base      87 k
Updating for dependencies:
 java-1.7.0-openjdk-devel     x86_64  1:1.7.0.85-2.6.1.2.el7_1   updates  9.2 M
 java-1.7.0-openjdk-headless  x86_64  1:1.7.0.85-2.6.1.2.el7_1   updates   25 M

Transaction Summary
================================================================================
Install             ( 1 Dependent package)
Upgrade  1 Package  (+2 Dependent packages)

Total size: 35 M
Is this ok [y/d/N]: y
Downloading packages:
警告:/var/cache/yum/x86_64/7/updates/packages/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64.rpm: 表頭 V3 RSA/SHA256 Signature, key ID f4a80eb5: NOKEY
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
Importing GPG key 0xF4A80EB5:
 Userid     : "CentOS-7 Key (CentOS 7 Official Signing Key) <security@centos.org>"
 Fingerprint: 6341 ab27 53d7 8a78 a7c2 7bb1 24c6 a8a7 f4a8 0eb5
 Package    : centos-release-7-1.1503.el7.centos.2.8.x86_64 (@anaconda)
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
Is this ok [y/N]: y
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : lksctp-tools-1.0.13-3.el7.x86_64                             1/7 
  Updating   : 1:java-1.7.0-openjdk-headless-1.7.0.85-2.6.1.2.el7_1.x86_6   2/7 
warning: /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64/jre/lib/security/US_export_policy.jar created as /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64/jre/lib/security/US_export_policy.jar.rpmnew
warning: /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64/jre/lib/security/java.security created as /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64/jre/lib/security/java.security.rpmnew
warning: /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64/jre/lib/security/local_policy.jar created as /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64/jre/lib/security/local_policy.jar.rpmnew
  Updating   : 1:java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64           3/7 
  Updating   : 1:java-1.7.0-openjdk-devel-1.7.0.85-2.6.1.2.el7_1.x86_64     4/7 
  Cleanup    : 1:java-1.7.0-openjdk-devel-1.7.0.75-2.5.4.2.el7_0.x86_64     5/7 
  Cleanup    : 1:java-1.7.0-openjdk-headless-1.7.0.75-2.5.4.2.el7_0.x86_6   6/7 
  Cleanup    : 1:java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64           7/7 
warning: file /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64/jre/lib/amd64/xawt/libmawt.so: remove failed: No such file or directory
warning: file /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64/jre/lib/amd64/libsplashscreen.so: remove failed: No such file or directory
warning: file /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64/jre/lib/amd64/libpulse-java.so: remove failed: No such file or directory
warning: file /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64/jre/lib/amd64/libjsoundalsa.so: remove failed: No such file or directory
  Verifying  : 1:java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64           1/7 
  Verifying  : 1:java-1.7.0-openjdk-headless-1.7.0.85-2.6.1.2.el7_1.x86_6   2/7 
  Verifying  : lksctp-tools-1.0.13-3.el7.x86_64                             3/7 
  Verifying  : 1:java-1.7.0-openjdk-devel-1.7.0.85-2.6.1.2.el7_1.x86_64     4/7 
  Verifying  : 1:java-1.7.0-openjdk-headless-1.7.0.75-2.5.4.2.el7_0.x86_6   5/7 
  Verifying  : 1:java-1.7.0-openjdk-devel-1.7.0.75-2.5.4.2.el7_0.x86_64     6/7 
  Verifying  : 1:java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64           7/7 

Dependency Installed:
  lksctp-tools.x86_64 0:1.0.13-3.el7                                            

Updated:
  java-1.7.0-openjdk.x86_64 1:1.7.0.85-2.6.1.2.el7_1                            

Dependency Updated:
  java-1.7.0-openjdk-devel.x86_64 1:1.7.0.85-2.6.1.2.el7_1                      
  java-1.7.0-openjdk-headless.x86_64 1:1.7.0.85-2.6.1.2.el7_1                   

Complete!

  • Check the JDK version again
After the update it is version 1.7.0_85.
[hadoop@localhost ~]$ java -version
java version "1.7.0_85"
OpenJDK Runtime Environment (rhel-2.6.1.2.el7_1-x86_64 u85-b01)
OpenJDK 64-Bit Server VM (build 24.85-b03, mixed mode)
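
The JAVA_HOME value needed later (section 3.2) can be found by resolving the java binary; on this machine it points into the same OpenJDK directory seen in the yum output above (a sketch):
[hadoop@localhost ~]$ readlink -f $(which java)
/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64/jre/bin/java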


2.3 Enable ssh (instead of telnet) for secure connections


  • Install ssh (note: yum has no package literally named ssh; the actual packages are openssh-server and openssh-clients, handled below)
[hadoop@localhost ~]$ sudo yum install ssh
Loaded plugins: fastestmirror, langpacks
base                                                     | 3.6 kB     00:00     
extras                                                   | 3.4 kB     00:00     
updates                                                  | 3.4 kB     00:00     
updates/7/x86_64/primary_db                                | 2.5 MB   00:01     
Loading mirror speeds from cached hostfile
 * base: ftp.nsysu.edu.tw
 * extras: ftp.stust.edu.tw
 * updates: mirror01.idc.hinet.net
No package ssh available.
Error: Nothing to do

  • Install rsync
[hadoop@localhost ~]$ sudo yum install rsync
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * base: ftp.yzu.edu.tw
 * extras: ftp.yzu.edu.tw
 * updates: mirror01.idc.hinet.net
Package rsync-3.0.9-15.el7.x86_64 already installed and latest version
Nothing to do

  • Configure ssh (generate an RSA key pair with an empty passphrase)
[hadoop@localhost ~]$ ssh-keygen -t rsa -P "" 
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):    (press Enter here to accept the default)
Created directory '/home/hadoop/.ssh'.
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
11:f6:aa:00:5d:21:cd:c2:a4:86:75:21:8e:11:1c:ce hadoop@localhost.localdomain
The key's randomart image is:
+--[ RSA 2048]----+
|+o+o=+o.o        |
|oB =ooo. o       |
|oE= ..  . .      |
| . .     o       |
|    .   S        |
|     . .         |
|      .          |
|                 |
|                 |
+-----------------+

  • Configure ssh (append the newly generated key to the authorized keys)
[hadoop@localhost ~]$ cat $HOME/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDYI7vmi4HqvyrvyYr974Hfy0oWVIEV4Utx9X4HjSCLKcFktht/+lLLa7WJ/lCP+IKbTEJKi8cVue09WUfKlatCI9BrEC8f3RPL4nFamjx/dfJUGyxxbGg52rogPuCgDLi0KgtepjgG/ykgfm2nzL9L+63LEQ9YKNMoyX3tl755rgkiIx9TuFwb4c+8yQ0h+BBaCFkiRAI7jJyGMJVSMoalraedQu5QBpSdIGtNCWTsifF8cv7NwxO9eCWpdguxm3S1OXTek29/ML0blqi3UdsBDZ6jsE/pYPcfZPShLw3v4tjA1gOy8wis3qJ7JA4eW3S90GvtBQkHIyOs2kyFTHSB hadoop@localhost.localdomain
[hadoop@localhost ~]$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
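
Note: if ssh keeps prompting for a password after this (as it does in the sessions below), the usual cause is that sshd rejects the key because ~/.ssh or authorized_keys is too permissive. A sketch of the standard fix:
[hadoop@localhost ~]$ chmod 700 $HOME/.ssh
[hadoop@localhost ~]$ chmod 600 $HOME/.ssh/authorized_keys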

  • Configure ssh (update the ssh server to the latest version)
[hadoop@localhost ~]$ sudo yum install openssh-server
[sudo] password for hadoop: 
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * base: ftp.nsysu.edu.tw
 * extras: ftp.stust.edu.tw
 * updates: mirror01.idc.hinet.net
Resolving Dependencies
--> Running transaction check
---> Package openssh-server.x86_64 0:6.6.1p1-11.el7 will be updated
---> Package openssh-server.x86_64 0:6.6.1p1-12.el7_1 will be an update
--> Processing Dependency: openssh = 6.6.1p1-12.el7_1 for package: openssh-server-6.6.1p1-12.el7_1.x86_64
--> Running transaction check
---> Package openssh.x86_64 0:6.6.1p1-11.el7 will be updated
--> Processing Dependency: openssh = 6.6.1p1-11.el7 for package: openssh-clients-6.6.1p1-11.el7.x86_64
---> Package openssh.x86_64 0:6.6.1p1-12.el7_1 will be an update
--> Running transaction check
---> Package openssh-clients.x86_64 0:6.6.1p1-11.el7 will be updated
---> Package openssh-clients.x86_64 0:6.6.1p1-12.el7_1 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package               Arch         Version                 Repository     Size
================================================================================
Updating:
 openssh-server        x86_64       6.6.1p1-12.el7_1        updates       432 k
Updating for dependencies:
 openssh               x86_64       6.6.1p1-12.el7_1        updates       431 k
 openssh-clients       x86_64       6.6.1p1-12.el7_1        updates       634 k

Transaction Summary
================================================================================
Upgrade  1 Package (+2 Dependent packages)

Total size: 1.5 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Updating   : openssh-6.6.1p1-12.el7_1.x86_64                              1/6 
  Updating   : openssh-clients-6.6.1p1-12.el7_1.x86_64                      2/6 
  Updating   : openssh-server-6.6.1p1-12.el7_1.x86_64                       3/6 
  Cleanup    : openssh-server-6.6.1p1-11.el7.x86_64                         4/6 
  Cleanup    : openssh-clients-6.6.1p1-11.el7.x86_64                        5/6 
  Cleanup    : openssh-6.6.1p1-11.el7.x86_64                                6/6 
  Verifying  : openssh-clients-6.6.1p1-12.el7_1.x86_64                      1/6 
  Verifying  : openssh-server-6.6.1p1-12.el7_1.x86_64                       2/6 
  Verifying  : openssh-6.6.1p1-12.el7_1.x86_64                              3/6 
  Verifying  : openssh-server-6.6.1p1-11.el7.x86_64                         4/6 
  Verifying  : openssh-clients-6.6.1p1-11.el7.x86_64                        5/6 
  Verifying  : openssh-6.6.1p1-11.el7.x86_64                                6/6 

Updated:
  openssh-server.x86_64 0:6.6.1p1-12.el7_1                                      

Dependency Updated:
  openssh.x86_64 0:6.6.1p1-12.el7_1  openssh-clients.x86_64 0:6.6.1p1-12.el7_1 

Complete!

  • Configure ssh (try connecting to the local machine with ssh)
A host-key warning appears on the first connection:
[hadoop@localhost ~]$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is c5:69:94:f8:f2:cf:4f:fb:ad:d6:35:22:d7:50:71:09.
Are you sure you want to continue connecting (yes/no)? y
Please type 'yes' or 'no': yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
hadoop@localhost's password: 
Last login: Tue Jul 28 14:56:13 2015

[hadoop@localhost ~]$ exit
logout
Connection to localhost closed.

  • Configure ssh (generate an ECDSA key pair with an empty passphrase)
[hadoop@localhost ~]$ ssh-keygen -t ecdsa -P "" 
Generating public/private ecdsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_ecdsa): 
Your identification has been saved in /home/hadoop/.ssh/id_ecdsa.
Your public key has been saved in /home/hadoop/.ssh/id_ecdsa.pub.
The key fingerprint is:
41:bf:5e:86:f9:d5:59:b5:e4:f1:2f:73:51:58:08:7f hadoop@localhost.localdomain
The key's randomart image is:
+--[ECDSA  256]---+
|        .   .. *+|
|       . .   .= *|
|        . .   .+E|
|         . +   o=|
|        S + o +.+|
|         . + . + |
|          . .    |
|                 |
|                 |
+-----------------+

  • Configure ssh (append the newly generated ECDSA key to the authorized keys)
[hadoop@localhost ~]$ cat $HOME/.ssh/id_ecdsa.pub >> $HOME/.ssh/authorized_keys
[hadoop@localhost ~]$ 

  • Configure ssh (connect again; this time there is no more host-key warning)
[hadoop@localhost ~]$ ssh localhost
hadoop@localhost's password: 
Last login: Tue Jul 28 15:37:08 2015 from localhost
[hadoop@localhost ~]$


3. Installing Hadoop

3.1 Extract the archive and move it to the target folder

  • Check the home folder
[hadoop@localhost ~]$ ls -l
總計 205704
-rw-rw-r--. 1 hadoop hadoop 210606807  7月  7 08:32 hadoop-2.7.1.tar.gz
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 下載
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 公共
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 圖片
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 影片
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 文件
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 桌面
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 模板
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 音樂

  • Extract the archive
[hadoop@localhost ~]$ sudo tar xzf hadoop-2.7.1.tar.gz
[sudo] password for hadoop:

  • Check the folder again
[hadoop@localhost ~]$ ls -l
總計 205708
drwxr-xr-x. 9  10021  10021      4096  6月 29 14:15 hadoop-2.7.1
-rw-rw-r--. 1 hadoop hadoop 210606807  7月  7 08:32 hadoop-2.7.1.tar.gz
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 下載
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 公共
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 圖片
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 影片
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 文件
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 桌面
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 模板
drwxr-xr-x. 2 hadoop hadoop      4096  7月 28 14:45 音樂

  • Rename hadoop-2.7.1 to hadoop, then move it to /usr/local
A permission error came up along the way; sudo resolved it.
[hadoop@localhost ~]$ mv hadoop-2.7.1 hadoop
[hadoop@localhost ~]$ ls
hadoop  hadoop-2.7.1.tar.gz  下載  公共  圖片  影片  文件  桌面  模板  音樂
[hadoop@localhost ~]$ 

[hadoop@localhost ~]$ mv hadoop /usr/local
mv: 無法建立目錄‘/usr/local/hadoop’: 拒絕不符權限的操作
[hadoop@localhost ~]$ sudo mv hadoop /usr/local
[sudo] password for hadoop: 
[hadoop@localhost ~]$ 


  • Check the folder
The owner and group show up as the numeric ID 10021, which looks odd; no such uid/gid exists on this system (it comes from the tarball), so change the owner.
[hadoop@localhost local]$ cd /usr/local
[hadoop@localhost local]$ ls -al
總計 52
drwxr-xr-x. 13 root  root  4096  7月 28 16:12 .
drwxr-xr-x. 13 root  root  4096  7月 24 18:14 ..
drwxr-xr-x.  2 root  root  4096  6月 10  2014 bin
drwxr-xr-x.  2 root  root  4096  6月 10  2014 etc
drwxr-xr-x.  2 root  root  4096  6月 10  2014 games
drwxr-xr-x.  9 10021 10021 4096  6月 29 14:15 hadoop
drwxr-xr-x.  2 root  root  4096  6月 10  2014 include
drwxr-xr-x.  2 root  root  4096  6月 10  2014 lib
drwxr-xr-x.  2 root  root  4096  6月 10  2014 lib64
drwxr-xr-x.  2 root  root  4096  6月 10  2014 libexec
drwxr-xr-x.  2 root  root  4096  6月 10  2014 sbin
drwxr-xr-x.  5 root  root  4096  7月 24 18:14 share
drwxr-xr-x.  2 root  root  4096  6月 10  2014 src
[hadoop@localhost local]$ 

-- Fix the owner of the /usr/local/hadoop folder
[hadoop@localhost local]$ sudo chown -R hadoop:hadoop hadoop
[sudo] password for hadoop: 
[hadoop@localhost local]$ ls
bin  etc  games  hadoop  include  lib  lib64  libexec  sbin  share  src
[hadoop@localhost local]$ ls -al
總計 52
drwxr-xr-x. 13 root   root   4096  7月 28 16:12 .
drwxr-xr-x. 13 root   root   4096  7月 24 18:14 ..
drwxr-xr-x.  2 root   root   4096  6月 10  2014 bin
drwxr-xr-x.  2 root   root   4096  6月 10  2014 etc
drwxr-xr-x.  2 root   root   4096  6月 10  2014 games
drwxr-xr-x.  9 hadoop hadoop 4096  6月 29 14:15 hadoop
drwxr-xr-x.  2 root   root   4096  6月 10  2014 include
drwxr-xr-x.  2 root   root   4096  6月 10  2014 lib
drwxr-xr-x.  2 root   root   4096  6月 10  2014 lib64
drwxr-xr-x.  2 root   root   4096  6月 10  2014 libexec
drwxr-xr-x.  2 root   root   4096  6月 10  2014 sbin
drwxr-xr-x.  5 root   root   4096  7月 24 18:14 share
drwxr-xr-x.  2 root   root   4096  6月 10  2014 src

3.2 Update the login and Hadoop environment scripts

  • Edit the hadoop user's login settings and add the following lines
This mainly changes the hadoop user's login script (.bashrc).
[hadoop@localhost ~]$ gedit .bashrc

########################### for Hadoop #############################

# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop

# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
# Note: set this according to the actual JDK path on your system
export JAVA_HOME=/usr/lib/jvm/java

# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
    hadoop fs -cat $1 | lzop -dc | head -1000 | less
}

## Add Hadoop bin/ directory to PATH
#export PATH=$PATH:$HADOOP_HOME/bin

# The following lines are from http://shaurong.blogspot.tw/2015/03/hadoop-260-single-cluster-centos-70.html
#export HADOOP_HOME=/usr/local/hadoop
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export YARN_CONF_DIR=$HADOOP_CONF_DIR

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
#export PATH=$PATH:$HADOOP_HOME/sbin
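
To have these settings take effect in the current shell without logging out and back in, reload the file (a quick sanity check of HADOOP_HOME included):
[hadoop@localhost ~]$ source ~/.bashrc
[hadoop@localhost ~]$ echo $HADOOP_HOME
/usr/local/hadoop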

  • Edit hadoop-env.sh
[hadoop@localhost hadoop]$ cd $HADOOP_HOME/etc/hadoop
[hadoop@localhost hadoop]$ ls
capacity-scheduler.xml      httpfs-env.sh            mapred-env.sh
configuration.xsl           httpfs-log4j.properties  mapred-queues.xml.template
container-executor.cfg      httpfs-signature.secret  mapred-site.xml.template
core-site.xml               httpfs-site.xml          slaves
hadoop-env.cmd              kms-acls.xml             ssl-client.xml.example
hadoop-env.sh               kms-env.sh               ssl-server.xml.example
hadoop-metrics2.properties  kms-log4j.properties     yarn-env.cmd
hadoop-metrics.properties   kms-site.xml             yarn-env.sh
hadoop-policy.xml           log4j.properties         yarn-site.xml
hdfs-site.xml               mapred-env.cmd
[hadoop@localhost hadoop]$ 
-- Disable IPv6 (add this line to hadoop-env.sh)
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true

  • Update the JAVA_HOME setting in several .sh files (see the sketch after the commands below)
Based on the article [研究] Hadoop 2.6.0 Single Cluster 安裝 (CentOS 7.0 x86_64).
[hadoop@localhost hadoop]$ gedit hadoop-env.sh
[hadoop@localhost hadoop]$ gedit httpfs-env.sh
[hadoop@localhost hadoop]$ gedit mapred-env.sh
[hadoop@localhost hadoop]$ gedit yarn-env.sh
[hadoop@localhost hadoop]$ 
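
What actually changes in those files is small; a sketch of the relevant lines (the JAVA_HOME path is an assumption, matching the one used in .bashrc above):
-- in hadoop-env.sh (and, where the variable appears, in httpfs-env.sh / mapred-env.sh / yarn-env.sh):
export JAVA_HOME=/usr/lib/jvm/java
-- in hadoop-env.sh, to prefer IPv4 (matching the sysctl change in section 2.1):
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true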


3.3 Take a breath and check the Hadoop version

  • Check the Hadoop version
[hadoop@localhost hadoop]$ hadoop version
Hadoop 2.7.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a
Compiled by jenkins on 2015-06-29T06:04Z
Compiled with protoc 2.5.0
From source with checksum fc0a1a23fc1868e4d5ee7fa2b28a58a
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.1.jar
[hadoop@localhost hadoop]$ 

3.4 Create the HDFS folder

  • Create the folder where HDFS will live
[hadoop@localhost ~]$ sudo mkdir -p /app/hadoop/tmp
[sudo] password for hadoop: 

  • Fix the owner and permissions of the HDFS folder
[hadoop@localhost ~]$ sudo chown hadoop:hadoop /app/hadoop/tmp
[hadoop@localhost ~]$ ls -l /app/hadoop
總計 4
drwxr-xr-x. 2 hadoop hadoop 4096  7月 29 09:25 tmp
[hadoop@localhost ~]$ sudo chmod 750 /app/hadoop/tmp
[hadoop@localhost ~]$ ls -l /app/hadoop
總計 4
drwxr-x---. 2 hadoop hadoop 4096  7月 29 09:25 tmp
[hadoop@localhost ~]$ 

  • Edit a few .xml configuration files
[hadoop@localhost ~]$ cd $HADOOP_CONF_DIR
[hadoop@localhost hadoop]$ pwd
/usr/local/hadoop/etc/hadoop
[hadoop@localhost hadoop]$ gedit core-site.xml
-- gedit mapred-site.xml reports that the file does not exist; it has to be created by copying mapred-site.xml.template
[hadoop@localhost hadoop]$ gedit mapred-site.xml
[hadoop@localhost hadoop]$ ls *.xml
capacity-scheduler.xml  hadoop-policy.xml  httpfs-site.xml  kms-site.xml
core-site.xml           hdfs-site.xml      kms-acls.xml     yarn-site.xml
[hadoop@localhost hadoop]$ ls *.template
mapred-queues.xml.template  mapred-site.xml.template
[hadoop@localhost hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@localhost hadoop]$ gedit mapred-site.xml
[hadoop@localhost hadoop]$ gedit hdfs-site.xml
* core-site.xml (add these properties inside the <configuration> element)

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/app/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>

    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:54310</value>
        <description>The name of the default file system.  A URI whose
        scheme and authority determine the FileSystem implementation.  The
        uri's scheme determines the config property (fs.SCHEME.impl) naming
        the FileSystem implementation class.  The uri's authority is used to
        determine the host, port, etc. for a filesystem.</description>
    </property>

* mapred-site.xml (must first be copied from mapred-site.xml.template)

    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:54311</value>
        <description>The host and port that the MapReduce job tracker runs
        at.  If "local", then jobs are run in-process as a single map
        and reduce task.
        </description>
    </property>

* hdfs-site.xml

    <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>Default block replication.
        The actual number of replications can be specified when the file is created.
        The default is used if replication is not specified in create time.
        </description>
    </property>

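The property names above come from the older Hadoop 1.x tutorials. fs.default.name is a deprecated alias for fs.defaultFS (still honoured, with a warning in the logs), and mapred.job.tracker is an MRv1 setting; to actually run MapReduce jobs on YARN in Hadoop 2.x, mapreduce.framework.name is the setting that matters. A sketch of the 2.x-style entries:

* core-site.xml (fs.defaultFS replaces fs.default.name)
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:54310</value>
    </property>

* mapred-site.xml (run MapReduce on YARN instead of the old JobTracker)
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>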

3.5 Format the HDFS folder

  • Format the HDFS folder
The command used in the first 3 reference documents is deprecated; use the new one instead.
* Old way
[hadoop@localhost hadoop]$ /usr/local/hadoop/bin/hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
* New way
[hadoop@localhost hadoop]$ /usr/local/hadoop/bin/hdfs namenode -format
15/07/29 09:57:31 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.7.1
STARTUP_MSG:   classpath = /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/jersey-json-1.9.jar:/usr/local/hadoop/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/usr/local/hadoop/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar:/usr/local/hadoop/share/hadoop/common/lib/junit-4.11.jar:/usr/local/hadoop/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/usr/local/hadoop/share/hadoop/common/lib/activation-1.1.jar:/usr/local/hadoop/share/hadoop/common/lib/curator-client-2.7.1.jar:/usr/local/hadoop/share/hadoop/common/lib/jetty-6.1.26.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-net-3.1.jar:/usr/local/hadoop/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/usr/local/hadoop/share/hadoop/common/lib/jetty-util-6.1.26.jar:/usr/local/hadoop/share/hadoop/common/lib/jets3t-0.9.0.jar:/usr/local/hadoop/share/hadoop/common/lib/jsch-0.1.42.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/usr/local/hadoop/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-io-2.4.jar:/usr/local/hadoop/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop/share/hadoop/common/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/local/hadoop/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/usr/local/hadoop/share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/local/hadoop/share/hadoop/common/lib/jersey-core-1.9.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-compress-1.4.1.jar:/usr/local/hadoop/share/hadoop/common/lib/hamcrest-core-1.3.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/local/hadoop/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/usr/local/hadoop/share/hadoop/common/lib/jettison-1.1.jar:/usr/local/hadoop/share/hadoop/common/lib/jsr305-3.0.0.jar:/usr/local/hadoop/share/hadoop/common/lib/hadoop-annotations-2.7.1.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-httpclient-3.1.jar:/usr/local/hadoop/share/hadoop/common/lib/gson-2.2.4.jar:/usr/local/hadoop/share/hadoop/common/lib/curator-framework-2.7.1.jar:/usr/local/hadoop/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/local/hadoop/share/hadoop/common/lib/zookeeper-3.4.6.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-codec-1.4.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/local/hadoop/share/hadoop/common/lib/servlet-api-2.5.jar:/usr/local/hadoop/share/hadoop/common/lib/avro-1.7.4.jar:/usr/local/hadoop/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/local/hadoop/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-math3-3.1.1.jar:/usr/local/hadoop/share/hadoop/common/lib/stax-api-1.0-2.jar:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:/usr/local/hadoop/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/usr/local/hadoop/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/local/hadoop/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/usr/local/hadoop/share/hadoop/common/lib/xz-1.0.jar:/usr/local/hadoop/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/local/hadoop/share/hadoop/common/lib/guava-11.0.2.jar:/usr/local/hadoop/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/local/hadoop/share/hadoop/common/lib/hadoop-auth-2.7.1.jar:/usr/local/hadoop/share/hadoop/common/lib/xmlenc-0.52.jar:/usr/local/hadoop/share/hadoop/common/lib
/paranamer-2.3.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/local/hadoop/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-collections-3.2.1.jar:/usr/local/hadoop/share/hadoop/common/lib/httpclient-4.2.5.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-digester-1.8.jar:/usr/local/hadoop/share/hadoop/common/lib/httpcore-4.2.5.jar:/usr/local/hadoop/share/hadoop/common/lib/asm-3.2.jar:/usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.1-tests.jar:/usr/local/hadoop/share/hadoop/common/hadoop-nfs-2.7.1.jar:/usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.1.jar:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/htrace-core-3.1.0-incubating.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-io-2.4.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/guava-11.0.2.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/asm-3.2.jar:/usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-nfs-2.7.1.jar:/usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.7.1-tests.jar:/usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jersey-json-1.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jersey-client-1.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/activation-1.1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jetty-6.1.26.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-io-2.4.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jersey-core-1.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jettison-1.1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jsr305-3.0.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/usr/local/hadoop/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/usr/local/had
oop/share/hadoop/yarn/lib/guice-3.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/usr/local/hadoop/share/hadoop/yarn/lib/aopalliance-1.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-codec-1.4.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-lang-2.6.jar:/usr/local/hadoop/share/hadoop/yarn/lib/servlet-api-2.5.jar:/usr/local/hadoop/share/hadoop/yarn/lib/log4j-1.2.17.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/usr/local/hadoop/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/usr/local/hadoop/share/hadoop/yarn/lib/xz-1.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/guava-11.0.2.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jersey-server-1.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-cli-1.2.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-collections-3.2.1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/zookeeper-3.4.6-tests.jar:/usr/local/hadoop/share/hadoop/yarn/lib/asm-3.2.jar:/usr/local/hadoop/share/hadoop/yarn/lib/javax.inject-1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-api-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-tests-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-registry-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-common-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-client-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-common-2.7.1.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/junit-4.11.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/hadoop-annotations-2.7.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/guice-3.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/xz-1.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/jerse
y-server-1.9.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/asm-3.2.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/javax.inject-1.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.7.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.1-tests.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.7.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.1.jar:/usr/local/hadoop/contrib/capacity-scheduler/*.jar
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a; compiled by 'jenkins' on 2015-06-29T06:04Z
STARTUP_MSG:   java = 1.7.0_85
************************************************************/
15/07/29 09:57:31 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/07/29 09:57:31 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-8d90d8f0-e802-4325-aa0e-87a4ff66902f
15/07/29 09:57:32 INFO namenode.FSNamesystem: No KeyProvider found.
15/07/29 09:57:32 INFO namenode.FSNamesystem: fsLock is fair:true
15/07/29 09:57:32 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
15/07/29 09:57:32 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
15/07/29 09:57:32 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
15/07/29 09:57:32 INFO blockmanagement.BlockManager: The block deletion will start around 2015 七月 29 09:57:32
15/07/29 09:57:32 INFO util.GSet: Computing capacity for map BlocksMap
15/07/29 09:57:32 INFO util.GSet: VM type       = 64-bit
15/07/29 09:57:32 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
15/07/29 09:57:32 INFO util.GSet: capacity      = 2^21 = 2097152 entries
15/07/29 09:57:32 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
15/07/29 09:57:32 INFO blockmanagement.BlockManager: defaultReplication         = 1
15/07/29 09:57:32 INFO blockmanagement.BlockManager: maxReplication             = 512
15/07/29 09:57:32 INFO blockmanagement.BlockManager: minReplication             = 1
15/07/29 09:57:32 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
15/07/29 09:57:32 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
15/07/29 09:57:32 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
15/07/29 09:57:32 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
15/07/29 09:57:32 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
15/07/29 09:57:32 INFO namenode.FSNamesystem: fsOwner             = hadoop (auth:SIMPLE)
15/07/29 09:57:32 INFO namenode.FSNamesystem: supergroup          = supergroup
15/07/29 09:57:32 INFO namenode.FSNamesystem: isPermissionEnabled = true
15/07/29 09:57:32 INFO namenode.FSNamesystem: HA Enabled: false
15/07/29 09:57:32 INFO namenode.FSNamesystem: Append Enabled: true
15/07/29 09:57:32 INFO util.GSet: Computing capacity for map INodeMap
15/07/29 09:57:32 INFO util.GSet: VM type       = 64-bit
15/07/29 09:57:32 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
15/07/29 09:57:32 INFO util.GSet: capacity      = 2^20 = 1048576 entries
15/07/29 09:57:32 INFO namenode.FSDirectory: ACLs enabled? false
15/07/29 09:57:32 INFO namenode.FSDirectory: XAttrs enabled? true
15/07/29 09:57:32 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
15/07/29 09:57:32 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/07/29 09:57:32 INFO util.GSet: Computing capacity for map cachedBlocks
15/07/29 09:57:32 INFO util.GSet: VM type       = 64-bit
15/07/29 09:57:32 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
15/07/29 09:57:32 INFO util.GSet: capacity      = 2^18 = 262144 entries
15/07/29 09:57:32 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
15/07/29 09:57:32 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
15/07/29 09:57:32 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
15/07/29 09:57:32 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
15/07/29 09:57:32 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
15/07/29 09:57:32 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
15/07/29 09:57:32 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
15/07/29 09:57:32 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
15/07/29 09:57:32 INFO util.GSet: Computing capacity for map NameNodeRetryCache
15/07/29 09:57:32 INFO util.GSet: VM type       = 64-bit
15/07/29 09:57:32 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
15/07/29 09:57:32 INFO util.GSet: capacity      = 2^15 = 32768 entries
Re-format filesystem in Storage Directory /app/hadoop/tmp/dfs/name ? (Y or N) Y
15/07/29 09:57:56 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1287759520-127.0.0.1-1438135076819
15/07/29 09:57:57 INFO common.Storage: Storage directory /app/hadoop/tmp/dfs/name has been successfully formatted.
15/07/29 09:57:57 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/07/29 09:57:57 INFO util.ExitUtil: Exiting with status 0
15/07/29 09:57:57 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
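
A quick look at the freshly formatted directory confirms the NameNode metadata was created (a sketch; exact file names may differ slightly):
[hadoop@localhost hadoop]$ ls /app/hadoop/tmp/dfs/name/current
fsimage_0000000000000000000  fsimage_0000000000000000000.md5  seen_txid  VERSION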

3.6 Start the Hadoop services

  • Start the Hadoop services
The start-all.sh script used in the first 3 reference documents is deprecated; use start-dfs.sh and start-yarn.sh instead. (The repeated password prompts below show that the passwordless ssh key from section 2.3 is not being picked up; fixing the ~/.ssh permissions as noted there avoids them.)
* Old way
[hadoop@localhost hadoop]$ /usr/local/hadoop/sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [localhost]
hadoop@localhost's password: 
* New way
[hadoop@localhost hadoop]$ /usr/local/hadoop/sbin/start-dfs.sh
Starting namenodes on [localhost]
hadoop@localhost's password: 
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-namenode-localhost.localdomain.out
hadoop@localhost's password: 
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-localhost.localdomain.out
Starting secondary namenodes [0.0.0.0]
hadoop@0.0.0.0's password: 
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-secondarynamenode-localhost.localdomain.out
[hadoop@localhost hadoop]$ 
[hadoop@localhost hadoop]$ /usr/local/hadoop/sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-resourcemanager-localhost.localdomain.out
hadoop@localhost's password: 
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-nodemanager-localhost.localdomain.out

  • Check which Java processes are running with jps (the Java Virtual Machine Process Status Tool)
[hadoop@localhost ~]$ jps
10394 NameNode
11816 Jps
11582 NodeManager
11007 SecondaryNameNode
10640 DataNode
11254 ResourceManager
[hadoop@localhost ~]$ 

  • Check the listening ports
Cross-referencing the jps output above: PID 10394 (NameNode) listens on 2 ports, 54310 and 50070; PID 10640 (DataNode) listens on 4 ports, 50020, 41806, 50010 and 50075.
[hadoop@localhost ~]$ sudo netstat -plten | grep java
[sudo] password for hadoop: 
tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN      1001       73631      10640/java          
tcp        0      0 127.0.0.1:54310         0.0.0.0:*               LISTEN      1001       72112      10394/java          
tcp        0      0 0.0.0.0:50090           0.0.0.0:*               LISTEN      1001       78302      11007/java          
tcp        0      0 127.0.0.1:41806         0.0.0.0:*               LISTEN      1001       74658      10640/java          
tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN      1001       72105      10394/java          
tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN      1001       74652      10640/java          
tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN      1001       73630      10640/java          
tcp6       0      0 :::8033                 :::*                    LISTEN      1001       79302      11254/java          
tcp6       0      0 :::8040                 :::*                    LISTEN      1001       85107      11582/java          
tcp6       0      0 :::8042                 :::*                    LISTEN      1001       85111      11582/java          
tcp6       0      0 :::35146                :::*                    LISTEN      1001       85102      11582/java          
tcp6       0      0 :::8088                 :::*                    LISTEN      1001       80182      11254/java          
tcp6       0      0 :::8030                 :::*                    LISTEN      1001       79293      11254/java          
tcp6       0      0 :::8031                 :::*                    LISTEN      1001       79286      11254/java          
tcp6       0      0 :::8032                 :::*                    LISTEN      1001       79298      11254/java          
[hadoop@localhost ~]$


3.7 Open the web management UIs

The Hadoop 1.x tutorials list three web UIs:
http://localhost:50070/ – web UI of the NameNode daemon
http://localhost:50030/ – web UI of the JobTracker daemon
http://localhost:50060/ – web UI of the TaskTracker daemon
In Hadoop 2.x, however, the JobTracker and TaskTracker daemons no longer exist; with YARN their roles are taken over by the ResourceManager (web UI at http://localhost:8088/) and the NodeManager (web UI at http://localhost:8042/), which match the listening ports shown by netstat above. So on this setup the NameNode UI on port 50070 and the two YARN UIs are the ones that respond.
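
A quick way to confirm from the shell that the UIs are answering (a sketch; each request is expected to return an HTTP/1.1 200 OK status line):
[hadoop@localhost ~]$ curl -sI http://localhost:50070/ | head -1
[hadoop@localhost ~]$ curl -sI http://localhost:8088/ | head -1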

(A few screenshots of the web UIs were captured here; images not reproduced in this text.)


3.8 Stop the Hadoop services

  • Stop the Hadoop services
Stop the YARN services (ResourceManager and NodeManager):
[hadoop@localhost ~]$ /usr/local/hadoop/sbin/stop-yarn.sh
stopping yarn daemons
stopping resourcemanager
hadoop@localhost's password: 
localhost: stopping nodemanager
no proxyserver to stop
Stop the HDFS services (NameNode and DataNode):
[hadoop@localhost ~]$ /usr/local/hadoop/sbin/stop-dfs.sh
Stopping namenodes on [localhost]
hadoop@localhost's password: 
localhost: stopping namenode
hadoop@localhost's password: 
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
hadoop@0.0.0.0's password: 

  • Run jps again to confirm the services have stopped (Java Virtual Machine Process Status Tool)
[hadoop@localhost ~]$ jps
15262 Jps
[hadoop@localhost ~]$ 


Summary

After a long series of fiddly steps, a single-node-cluster Hadoop VM is finally up. This is only the initial installation, though; no data has actually been loaded and tested yet. That part will be covered in a separate article.

References



