Ambari——大数据平台的搭建利器（4）

论坛元老

Rank: 8 Rank: 8

UID: 1066743

1^#

打印

字体大小: tT

look_w发表于 2018-6-23 10:56 | 只看该作者

Ambari——大数据平台的搭建利器（4）

Ambari 的架构和工作原理Ambari 基本的架构和工作原理如下图 17 所示。
图 17. Ambari 的基本架构

Ambari Server 会读取 Stack 和 Service 的配置文件。当用 Ambari 创建集群的时候，Ambari Server 传送 Stack 和 Service 的配置文件以及 Service 生命周期的控制脚本到 Ambari Agent。Agent 拿到配置文件后，会下载安装公共源里软件包（Redhat，就是使用 yum 服务）。安装完成后，Ambari Server 会通知 Agent 去启动 Service。之后 Ambari Server 会定期发送命令到 Agent 检查 Service 的状态，Agent 上报给 Server，并呈现在 Ambari 的 GUI 上。
Ambari Server 支持 Rest API，这样可以很容易的扩展和定制化 Ambari。甚至于不用登陆 Ambari 的 GUI，只需要在命令行通过 curl 就可以控制 Ambari，以及控制 Hadoop 的 cluster。具体的 API 可以参见 Apache Ambari 的官方网页 API reference。
对于安全方面要求比较苛刻的环境来说，Ambari 可以支持 Kerberos 认证的 Hadoop 集群。
扩展 Ambari 管理一个自定义的 Service首先，我们需要规划自定义的 Service 属于哪个 Stack（当然 Stack 也是可以自定义的）。这里为了快速创建一个新的 Service，而且我们已经安装了 HDP 2.2 的 Stack，所以就将自定义的 Service 放在 HDP 2.2 之下。
第一步，首先在 Ambari Service 机器上找到 HDP 2.2 Stack 的目录，如下图所示。
图 18. HDP 2.2 的目录

第二步，需要创建一个 Service 目录，我们这里用“SAMPLE”作为目录名。并在 SAMPLE 底下创建 metainfo.xml。示例代码如下。主要解释下 xml 代码中的两个字段 category 和 cardinality。category 指定了该模块（Component）的类别，可以是 MASTER、SLAVE、CLIENT。Cardinality 指的是所要安装的机器数，可以是固定数字 1，可以是一个范围比如 1-2，也可以是 1+，或者 ALL。如果是一个范围的时候，安装的时候会让用户选择机器。另外这里有关 Service 和 Component 的 name 配置要用大写，小写有时候会有问题。Displayname 可以随意设置。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41

<?xml version="1.0"?>
<metainfo>
<schemaVersion>2.0</schemaVersion>
<services>
<service>
<name>SAMPLE</name>
<displayName>My Sample</displayName>
<comment>My v1 Sample</comment>
<version>1.0</version>
<components>
<component>
<name>MYMASTER</name>
<displayName>My Master</displayName>
<category>MASTER</category>
<cardinality>1</cardinality>
<commandScript>
<script>scripts/master.py</script>
<scriptType>PYTHON</scriptType>
<timeout>5000</timeout>
</commandScript>
</component>
<component>
<name>MYSALVE</name>
<displayName>My Slave</displayName>
<category>SLAVE</category>
<cardinality>1+</cardinality>
<commandScript>
<script>scripts/slave.py</script>
<scriptType>PYTHON</scriptType>
<timeout>5000</timeout>
</commandScript>
</component>
</components>
<osSpecifics>
<osSpecific>
<osFamily>any</osFamily>
</osSpecific>
</osSpecifics>
</service>
</services>
</metainfo>

第三步，需要创建 Service 的控制脚本。这里我们需要在 SAMPLE 底下创建一个 package 目录，然后在 package 底下创建目录 scripts，进而创建 master.py 和 slave.py。这里需要保证脚本路径和上一步中 metainfo.xml 中的配置路径是一致的。这两个 Python 脚本是用来控制 Master 和 Slave 模块的生命周期。脚本中函数的含义也如其名字一样：install 就是安装调用的接口；start、stop 分别就是启停的调用；Status 是定期检查 component 状态的调用；Configure 是安装完成配置该模块的调用。示例目录结构如下图。
图 19. Sample Service 的目录结构

Python 脚本的示例代码：
Master.py：

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24

import sys, os
from resource_management import *
from resource_management.core.exceptions import ComponentIsNotRunning
from resource_management.core.environment import Environment
from resource_management.core.logger import Logger

class Master(Script):
def install(self, env):
print "Install My Master"

def configure(self, env):
print "Configure My Master"

def start(self, env):
print "Start My Master"

def stop(self, env):
print "Stop My Master"

def status(self, env):
print "Status..."

if __name__ == "__main__":
Master().execute()

Slave.py:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23

import sys, os
from resource_management import *
from resource_management.core.exceptions import ComponentIsNotRunning
from resource_management.core.environment import Environment
from resource_management.core.logger import Logger

class Slave(Script):
def install(self, env):
print "Install My Slave"

def configure(self, env):
print "Configure My Slave"

def start(self, env):
print "Start My Slave"

def stop(self, env):
print "Stop My Slave"
def status(self, env):
print "Status..."

if __name__ == "__main__":
Slave().execute()

第四步，需要重启 Ambari Server。因为 Ambari Server 只有在重启的时候才会读取 Service 和 Stack 的配置。命令行执行：

1	ambari-server restart

第五步，登录 Ambari 的 GUI，点击左下角的 Action，选择 Add Service。如下图：
图 20. Add Service 按钮

这时候就可以看到我们自定义的 Service：SAMPLE。如下图：
图 21. Sample Service 列表

选择左侧 My Sample 后，就可以一路 Next 了，这个过程其实和我们在搭建 Hadoop2.x 集群的时候是类似的。由于这个 Service 没有真的安装包，所以安装过程会非常的快，启动命令也没有真正的逻辑，所以启动过程也是很快的。等最后点击完 Complete，整个安装过程也就结束了。再回到 Ambari 的 Dashboard 的时候，我们就可以看到这个 My Sample 了，如下图：
图 22. My Sample 的 Service 页面

到此就可以和第四节中管理 Hadoop 集群一样管理我们的 My Sample。例如下图，Stop 我们的 My Sample。
图 23. Stop Sample 页面 1

图 24. Stop Sample 页面 2

图 25. Stop Sample 页面 3

进阶的篇幅中，将会探讨如何给我们的 My Sample 自定义一些 Actions，以及 Action 之间的依赖关系如何定义。篇幅有限，这里就先到此为止。希望以上的介绍能够燃起大家对 Ambari 的热情。

收藏分享评分

回复引用

订阅 TOP

返回列表