Databases for Cloudera Manager and Managed Services
Published: 2019-05-26


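Background
To meet business needs, our big data platform has to run Spark for machine learning, data mining, and real-time computation, so we decided to use Cloudera Manager 5.2.0 with CDH 5.
I had previously set up Cloudera Manager 4.8.2 with CDH 4. While setting up Cloudera Manager 5.2.0, I found that the Host Monitor and Service Monitor roles could not be configured with an external database. At first I thought I had misconfigured something, but it turned out that the newer release changed how these roles store their data. After going through a lot of documentation, I confirmed that in the new version the Host Monitor and Service Monitor need no database configuration: they use a built-in datastore by default, and this cannot be changed.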

Overview

Cloudera Manager uses databases to store information about the Cloudera Manager configuration, as well as information such as the health of the system or task progress. For quick, simple installations, Cloudera Manager can install and configure an embedded PostgreSQL database as part of the Cloudera Manager installation process. In addition, some CDH services use databases and are automatically configured to use a default database. If you plan to use the embedded and default databases provided during the Cloudera Manager installation, see Installation Path A - Automated Installation by Cloudera Manager.

Although the embedded database is useful for getting started quickly, you can also use your own PostgreSQL, MySQL, or Oracle database for the Cloudera Manager Server and services that use databases.

Required Databases

The Cloudera Manager Server, Activity Monitor, Reports Manager, Hive Metastore, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server all require databases. The type of data contained in the databases and their estimated sizes are as follows:
  • Cloudera Manager - Contains all the information about services you have configured and their role assignments, all configuration history, commands, users, and running processes. This relatively small database (<100 MB) is the most important to back up.
  • Activity Monitor - Contains information about past activities. In large clusters, this database can grow large. Configuring an Activity Monitor database is only necessary if a MapReduce service is deployed.
  • Reports Manager - Tracks disk utilization and processing activities over time. Medium-sized.
  • Hive Metastore - Contains Hive metadata. Relatively small.
  • Sentry Server - Contains authorization metadata. Relatively small.
  • Cloudera Navigator Audit Server - Contains auditing information. In large clusters, this database can grow large.
  • Cloudera Navigator Metadata Server - Contains authorization, policies, and audit report metadata. Relatively small.

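The Cloudera Manager Service Host Monitor and Service Monitor roles have an internal datastore.
(Note: this is the passage stating that in CM 5 the Host Monitor and Service Monitor cannot be configured with an external database and can only use the internal datastore, which differs from CM 4.)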

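Cloudera Manager offers three installation paths. Path A is an automated installation, while Paths B and C are manual installations using RPM packages or tarballs: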
  • Path A automatically installs an embedded PostgreSQL database to meet the requirements of the services. This path reduces the number of installation tasks to complete and choices to make. In Path A you can optionally choose to create external databases for Activity Monitor, Reports Manager, Hive Metastore, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server.
  • Path B and Path C require you to create databases for the Cloudera Manager Server, Activity Monitor, Reports Manager, Hive Metastore, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server.
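
The rules above can be summarized in a small lookup. The following is only a sketch of the text's logic, not part of any Cloudera tool or API; the function names `databases_to_create` and `may_use_external_db` are made up for illustration.

```python
from typing import List

# Components that require a database, as listed in the text above.
DB_BACKED: List[str] = [
    "Cloudera Manager Server",
    "Activity Monitor",
    "Reports Manager",
    "Hive Metastore",
    "Sentry Server",
    "Cloudera Navigator Audit Server",
    "Cloudera Navigator Metadata Server",
]

# Roles that keep their data in an internal datastore in CM 5.
INTERNAL_DATASTORE: List[str] = ["Host Monitor", "Service Monitor"]


def databases_to_create(path: str) -> List[str]:
    """Return the components whose databases you must create yourself.

    Hypothetical helper: it only encodes the Path A/B/C rules quoted above.
    """
    path = path.upper()
    if path == "A":
        # Path A installs an embedded PostgreSQL database automatically;
        # creating external databases is optional, so nothing is mandatory.
        return []
    if path in ("B", "C"):
        return list(DB_BACKED)
    raise ValueError(f"unknown installation path: {path!r}")


def may_use_external_db(component: str) -> bool:
    """True if the component is database-backed rather than internal-only."""
    return component in DB_BACKED


if __name__ == "__main__":
    for p in ("A", "B", "C"):
        print(p, databases_to_create(p))
    print(may_use_external_db("Service Monitor"))
```

Keeping the Host Monitor and Service Monitor in a separate list makes the CM 5 change explicit: they never appear among the components for which a database has to be prepared.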

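Using external databases requires more input and setup work, but it gives you more compatibility and scalability, letting you choose and configure the database flexibly.
You can install several different database types in one system, but doing so introduces a lot of uncertainty, so Cloudera recommends consistently using the same database type.

In many cases, you should install a service and its database on the same host, which reduces network I/O and improves overall performance.
You can also install the service and database on separate hosts. Large deployments or database administrators may require this, for example when an Oracle DBA needs to manage the databases independently.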

For setting up the databases, refer to the official Cloudera documentation, which provides detailed configuration steps.
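
As a rough illustration of what preparing external databases can involve, here is a sketch that generates MySQL statements for a few of the components. It is not copied from Cloudera's documentation: the database names (`amon`, `rman`, `metastore`), the `utf8` character set, the `'%'` host wildcard, and the helper `mysql_statements` are all assumptions for the example, so check the official steps for your version before running anything.

```python
import re
from typing import Dict, List

# Hypothetical database names per component; choose your own in practice.
EXAMPLE_DATABASES: Dict[str, str] = {
    "Activity Monitor": "amon",
    "Reports Manager": "rman",
    "Hive Metastore": "metastore",
}

_SAFE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def mysql_statements(db_name: str, password: str) -> List[str]:
    """Build MySQL statements that create a database and a same-named user.

    Illustrative only: inputs are validated instead of escaped, so names
    and passwords containing unusual characters are simply rejected.
    """
    if not _SAFE_NAME.match(db_name):
        raise ValueError(f"unsafe database name: {db_name!r}")
    if "'" in password or "\\" in password:
        raise ValueError("password must not contain quotes or backslashes")
    return [
        f"CREATE DATABASE {db_name} DEFAULT CHARACTER SET utf8;",
        f"CREATE USER '{db_name}'@'%' IDENTIFIED BY '{password}';",
        f"GRANT ALL ON {db_name}.* TO '{db_name}'@'%';",
    ]


if __name__ == "__main__":
    for component, name in EXAMPLE_DATABASES.items():
        print(f"-- {component}")
        for statement in mysql_statements(name, "change_me"):
            print(statement)
```

Generating the statements rather than typing them by hand keeps naming consistent across components, which fits the recommendation to stay with a single database type.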

Reposted from: http://rccei.baihongyu.com/
