Ambari, a Powerful Tool for Building Big Data Platforms: The Road to Kerberos Integration (2)

Posted by look_w on 2018-6-24 14:08


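Ambari and Kerberos

In an Ambari environment, Kerberos itself exists as one of Ambari's Services. When a user enables Kerberos authentication through Ambari's Automated Kerberization mechanism, Ambari creates the corresponding Principal and Keytab file for each Service. On Red Hat 6 (CentOS 6), Ambari uses Kerberos 1.10.3 by default, while on Red Hat 7 (CentOS 7) it uses Kerberos 1.13.2 by default. Clusters that enable Kerberos authentication therefore need to watch for Kerberos version compatibility issues. The relationship between Ambari, the Stack, the Services, and the KDC is roughly as shown in the figure below:
Figure 2. Relationship between Ambari and the KDC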

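When a module or a user wants to authenticate itself to the KDC, it needs a Principal and that Principal's Key. When a Keytab file exists, the module (or user) can authenticate to the KDC directly with the Principal and the Keytab file that contains it. For Ambari, then, the job is to create a Principal and the matching Keytab file for each Service or Component. Note that a Principal is created in the KDC and stored in the KDC Server's database. A Keytab is a file that holds Principals together with their encrypted Keys; it must be placed on the machine where the Kerberos Client runs, with read-only permission granted to the relevant user. In an Ambari environment, Ambari performs all of these steps for the user. Interested readers may want to think about why read-only permission has to be granted to one specific user here.
Figure 3. Keytab file permissions on an Ambari machine with Kerberos enabled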
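One way to see why the permission matters: anyone who can read a Keytab can authenticate as the Principals stored in it. The following is a minimal sketch of checking and enforcing an owner-read-only mode, assuming a POSIX file system; the helper names `is_owner_read_only`, `lock_down`, and `demo` are hypothetical and are not part of Ambari, and a temporary file stands in for a real Keytab.

```python
import os
import stat
import tempfile


def is_owner_read_only(path):
    """Return True if the mode is exactly 0o400 (owner may read, nobody may write)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == stat.S_IRUSR


def lock_down(path):
    """Restrict a file to owner read-only (0o400)."""
    os.chmod(path, stat.S_IRUSR)


def demo():
    """Create a throwaway file standing in for a keytab and lock it down."""
    fd, path = tempfile.mkstemp(suffix=".keytab")
    os.close(fd)
    try:
        before = is_owner_read_only(path)  # mkstemp creates the file as 0o600
        lock_down(path)
        after = is_owner_read_only(path)
        return before, after
    finally:
        os.remove(path)
```

Calling `demo()` reports the state before and after the mode change. On a real cluster the same check could be pointed at the Keytab paths shown in Figure 3.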

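Ambari Server-side interfaces

Next, let us look briefly at how Ambari creates Principals and Keytab files. We first assume that Ambari Server already knows the name of the Principal to create and the path where the Keytab file will be stored (these parameters are defined in the Kerberos Descriptor, which is covered in detail later). Two Java source files are mainly involved: CreatePrincipalServerAction.java and CreateKeytabFilesServerAction.java. The interfaces are as follows:
Listing 1. Code that creates a Principal (Ambari Server)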


// CreatePrincipalServerAction.java
/**
 * Creates a principal in the relevant KDC
 *
 * @param principal                the principal name to create
 * @param isServicePrincipal       true if the principal is a service principal; false if the
 *                                 principal is a user principal
 * @param kerberosConfiguration    the kerberos-env configuration properties
 * @param kerberosOperationHandler the KerberosOperationHandler for the relevant KDC
 * @param actionLog                the logger (may be null if no logging is desired)
 * @return a CreatePrincipalResult containing the generated password and key number value
 */
public CreatePrincipalResult createPrincipal(String principal, boolean isServicePrincipal,
    Map<String, String> kerberosConfiguration, KerberosOperationHandler kerberosOperationHandler,
    ActionLog actionLog) {
  // ...
  // Generate a password string according to the rules configured by the user
  String password = securePasswordHelper.createSecurePassword(length, minLowercaseLetters,
      minUppercaseLetters, minDigits, minPunctuation, minWhitespace);
  // ...
  // ...
  // Call kerberosOperationHandler to create the Principal in the KDC
  Integer keyNumber = kerberosOperationHandler.createPrincipal(principal, password, isServicePrincipal);
  // ...
}

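The abridged code above shows that it is Ambari Server that calls the KDC interface to create the Principal. Interested readers can look at the complete open-source code; the Principal's Key here is the password variable. Ambari Server generates a temporary password string, following the password constraints the user configured for Kerberos, and uses it as the Principal's Key. It does not save this password to persistent storage (although it does keep it in an in-memory Map) or return it to the user. In other words, Ambari Server does not take on responsibility for managing Principal passwords. This follows from Ambari's design. Readers familiar with Ambari will know that Ambari Server persists all of its configuration in a Postgres database. By default the database is named ambari and its access password is bigdata. In its current implementation, Ambari does not encrypt the database password and related information; it is stored in plain text in "/etc/ambari-server/conf/password.dat". Anyone with bad intentions can therefore easily obtain the contents of the Ambari database, and can even modify them.
Figure 4. The Ambari database password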

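From this we can conclude that the data in the Ambari Server database is not very secure to begin with, so storing fixed Principal passwords in the Postgres database would be even less reasonable. Moreover, Kerberos authentication can drop passwords entirely and use Keytabs instead. Readers who want to hand a third-party service over to Ambari and use Ambari's Automated Kerberization mechanism should take care not to expose a Principal password in the Service's configuration items (it is not needed anyway).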
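The idea of a throwaway password that satisfies minimum character-class counts can be sketched as follows. This is not Ambari's `securePasswordHelper` implementation; it is an illustrative stand-in built on Python's standard `secrets` module, the function name and defaults are assumptions, and the whitespace constraint from Listing 1 is left out for brevity.

```python
import secrets
import string


def create_secure_password(length=18, min_lower=1, min_upper=1,
                           min_digits=1, min_punct=1):
    """Generate a random password that meets per-class minimum counts."""
    required = min_lower + min_upper + min_digits + min_punct
    if length < required:
        raise ValueError("length is smaller than the sum of the minimum counts")

    classes = [
        (string.ascii_lowercase, min_lower),
        (string.ascii_uppercase, min_upper),
        (string.digits, min_digits),
        (string.punctuation, min_punct),
    ]

    # First satisfy every minimum, then pad with characters from any class.
    chars = []
    for alphabet, count in classes:
        chars.extend(secrets.choice(alphabet) for _ in range(count))
    everything = "".join(alphabet for alphabet, _ in classes)
    chars.extend(secrets.choice(everything) for _ in range(length - required))

    # Shuffle so the required characters do not sit in predictable positions.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)
```

Because the password is only needed long enough to derive the Keytab, a caller in this sketch would keep the return value in memory and never write it to disk.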
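At this point we have a rough picture of how Ambari creates a Principal. Now let us look at how Ambari creates the corresponding Keytab file. The interface is as follows:
Listing 2. Entry-point code for creating a Keytab (Ambari Server)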


// CreateKeytabFilesServerAction.java
/**
 * For each identity, create a keytab and append to a new or existing keytab file.
 *
 * @param identityRecord           a Map containing the data for the current identity record
 * @param evaluatedPrincipal       a String indicating the relevant principal
 * @param operationHandler         a KerberosOperationHandler used to perform Kerberos-related
 *                                 tasks for specific Kerberos implementations
 *                                 (MIT, Active Directory, etc...)
 * @param kerberosConfiguration    a Map of configuration properties from kerberos-env
 * @param requestSharedDataContext a Map to be used a shared data among all ServerActions related
 *                                 to a given request
 * @return a CommandReport, indicating an error condition; or null, indicating a success condition
 * @throws AmbariException if an error occurs while processing the identity record
 */
@Override
protected CommandReport processIdentity(Map<String, String> identityRecord, String evaluatedPrincipal,
                                        KerberosOperationHandler operationHandler,
                                        Map<String, String> kerberosConfiguration,
                                        Map<String, Object> requestSharedDataContext)
    throws AmbariException {
  // ...
  // Build the content of the Keytab file, i.e. the Keytab data
  Keytab keytab = createKeytab(evaluatedPrincipal, password, keyNumber,
      operationHandler, visitedPrincipalKeys != null, canCache, actionLog);
  // ...
  // ...
  // Write the Keytab data to the Keytab file: create the file if it does not exist,
  // otherwise merge the new content into the existing file
  if (operationHandler.createKeytabFile(keytab, destinationKeytabFile)) {
    ensureAmbariOnlyAccess(destinationKeytabFile);
    // ...
  }
  // ...
}

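For this interface I have listed only the two most important statements: the first generates the Keytab content, and the second writes that content to the Keytab file. The password parameter seen in the code comes from the function that creates the Principal. As mentioned in the previous section, Ambari Server does not store the Principal's password in the database but does put it into a shared Map; here the password for the Principal is taken out of that Map and used to generate the Principal's Keytab data. Interested readers can study the complete Ambari Server code. One more variable deserves attention: destinationKeytabFile, the path where the Keytab file is stored, which may be remote (not on the Ambari Server). Fully understanding how Ambari Server creates Keytabs and Principals requires reading a good deal more code, such as the Keytab class and KerberosOperationHandler (the interface class that provides the concrete operations on Principals and Keytabs).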
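The hand-off through the shared Map can be illustrated with a small in-memory sketch. Everything here is hypothetical: the dictionary keys, the function names, and the SHA-256 digest, which only stands in for the real Kerberos string-to-key derivation and must not be taken for it. The point is simply that the password lives in a per-request structure, is consumed when the Keytab entry is built, and is never persisted.

```python
import hashlib
import secrets


def create_principal(principal, shared_context):
    """Pretend to create a principal and remember its password in memory only."""
    password = secrets.token_urlsafe(16)
    key_number = 1  # placeholder; a real KDC reports the key version number
    shared_context.setdefault("passwords", {})[principal] = password
    shared_context.setdefault("key_numbers", {})[principal] = key_number
    return key_number


def create_keytab_entry(principal, shared_context):
    """Build a keytab-like entry from the password cached by create_principal."""
    password = shared_context.get("passwords", {}).get(principal)
    if password is None:
        raise KeyError("no cached password for %s" % principal)
    # Stand-in for string-to-key: only a derived key is kept, not the password.
    key = hashlib.sha256((principal + password).encode("utf-8")).hexdigest()
    return {
        "principal": principal,
        "kvno": shared_context["key_numbers"][principal],
        "key": key,
    }
```

In this sketch `shared_context` would be created once per request and discarded afterwards, which mirrors the role the text gives to the shared Map.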
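Ambari Agent-side interfaces

To better explain Ambari Server's responsibilities in Automated Kerberization, we introduced two of its key functions. Next, let us look at the Kerberos-related methods available in the Ambari Agent. The Agent is implemented mainly in Python plus some Shell scripts, so the following material is also based on Python and Shell. As earlier articles explained, on the Agent side a Service must implement control functions such as install/start/stop/status; these form the most basic logic. For Automated Kerberization, a Service first needs to obtain the current cluster's Kerberos authentication state (enabled/disabled), which is stored in cluster-env. We start with how to obtain that state.
Earlier articles introduced the Script class and its get_config interface. Calling this interface on the Agent side returns the complete set of Ambari cluster configuration, which naturally includes the Kerberos state. A short Python example follows:
Listing 3. Python example code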


from resource_management.libraries.script.script import Script

config = Script.get_config()
security_enabled = config['configurations']['cluster-env']['security_enabled']
if security_enabled:
    # do something
    pass
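Listing 3 only runs inside an Agent where `resource_management` is available. For experimenting elsewhere, the same lookup can be sketched against a plain dictionary parsed from JSON. One caveat motivates the helper: in Python a non-empty string such as "false" is truthy, so if the flag ever arrives as a raw string rather than a boolean, `if security_enabled:` would give the wrong answer. The function name `is_security_enabled` and the sample document are assumptions made for illustration.

```python
import json


def is_security_enabled(config):
    """Read cluster-env/security_enabled, accepting booleans or 'true'/'false' strings."""
    cluster_env = config.get("configurations", {}).get("cluster-env", {})
    value = cluster_env.get("security_enabled", False)
    if isinstance(value, str):
        return value.strip().lower() == "true"
    return bool(value)


# A hand-written sample with the same nesting as the lookup in Listing 3.
sample = json.loads(
    '{"configurations": {"cluster-env": {"security_enabled": "true"}}}'
)
```

Missing keys are treated as "Kerberos disabled" in this sketch, which seems a reasonable default for a Service script that should keep working on unsecured clusters.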

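The Ambari Server installation directory includes a Shell script named configs.sh, which uses the curl command and the sed editor to fetch Ambari cluster configuration and format the output. We can therefore copy this script to an Agent machine and call it there directly to obtain the Kerberos state. Note that the script can not only read cluster configuration but also update or delete an individual configuration item. In essence, the script uses curl to call the REST API and operate on Ambari's Desired Configuration (one of Ambari's resource concepts); interested readers can explore its other features. An example command for obtaining the Kerberos state follows (the configs.sh script is located in /var/lib/ambari-server/resources/scripts/):
configs.sh get ambari-server cluster-name cluster-env
Figure 5. Result of running the configs.sh script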
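What configs.sh does with curl can be approximated in Python with the standard library. The sketch below rests on assumptions that should be checked against your Ambari version: that the cluster resource exposes `Clusters/desired_configs` with a `tag` per configuration type, that `/configurations?type=...&tag=...` returns an `items` list whose first entry holds `properties`, and that basic authentication is in use. The function names are hypothetical.

```python
import base64
import json
import urllib.request
from urllib.parse import quote, urlencode


def desired_configs_url(base_url, cluster):
    """URL that lists the currently desired tag for every configuration type."""
    return "%s/api/v1/clusters/%s?fields=Clusters/desired_configs" % (
        base_url.rstrip("/"), quote(cluster))


def config_url(base_url, cluster, config_type, tag):
    """URL of one configuration type at one specific tag."""
    query = urlencode({"type": config_type, "tag": tag})
    return "%s/api/v1/clusters/%s/configurations?%s" % (
        base_url.rstrip("/"), quote(cluster), query)


def fetch_json(url, user, password):
    """GET a URL with basic authentication and decode the JSON body."""
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    request = urllib.request.Request(url, headers={
        "Authorization": "Basic " + token.decode("ascii"),
        "X-Requested-By": "ambari",
    })
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def get_cluster_env(base_url, cluster, user, password):
    """Resolve the desired cluster-env tag, then fetch its properties."""
    desired = fetch_json(desired_configs_url(base_url, cluster), user, password)
    tag = desired["Clusters"]["desired_configs"]["cluster-env"]["tag"]
    body = fetch_json(config_url(base_url, cluster, "cluster-env", tag), user, password)
    return body["items"][0]["properties"]
```

With a reachable server, `get_cluster_env("http://ambari-server:8080", "cluster-name", user, password)` would be expected to return a dictionary that includes `security_enabled`; the host, port, and cluster name here are placeholders.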

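With the methods above we can obtain the cluster's Kerberos state on the Agent side, but that is only the first step. Ambari only helps a Service by creating the corresponding Principal and Keytab file; how to use them is for the Service to decide. Once a Service in Ambari has obtained the names of its Principal and Keytab, it needs to authenticate against the KDC, which produces the corresponding Kerberos credential cache file (the original text calls it a "Confidential" file). This file is generally valid for 8 hours and is stored in the /tmp directory by default. With it, the Service can communicate directly with other modules. A Service's control script therefore usually needs to generate the Kerberos credential cache file. Ambari provides a common method, "get_kinit_path", that returns the path of the kinit command, so the script can call kinit to generate the credential cache. To use this method, import the module in the Python script as follows: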
from resource_management.libraries.functions import get_kinit_path
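Once the kinit path is known, obtaining a ticket from a Keytab comes down to running `kinit -kt <keytab> <principal>`. The sketch below assembles that command outside Ambari: `shutil.which` stands in for `get_kinit_path`, and the helper names, the fallback path, and the example Keytab and Principal in the usage note are all assumptions.

```python
import shutil
import subprocess


def find_kinit(default="/usr/bin/kinit"):
    """Locate kinit on PATH; a stand-in for Ambari's get_kinit_path."""
    return shutil.which("kinit") or default


def build_kinit_command(kinit_path, keytab, principal, cache_file=None):
    """Assemble a kinit invocation that authenticates with a keytab."""
    command = [kinit_path]
    if cache_file:
        command += ["-c", cache_file]  # write the ticket to a specific cache
    command += ["-kt", keytab, principal]
    return command


def obtain_ticket(keytab, principal, cache_file=None):
    """Run kinit; raises CalledProcessError if authentication fails."""
    command = build_kinit_command(find_kinit(), keytab, principal, cache_file)
    subprocess.run(command, check=True)
```

A Service's start function might call something like `obtain_ticket("/etc/security/keytabs/svc.keytab", "svc/host1@EXAMPLE.COM")` before contacting other secured components. Passing the arguments as a list avoids shell quoting problems with Principal names.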
