Apache HBase Logo

Preface

This is the official reference guide for the HBase version it ships with.

Herein you will find either the definitive documentation on an HBase topic as of its standing when the referenced HBase version shipped, or it will point to the location in Javadoc or JIRA where the pertinent information can be found.

About This Guide

This reference guide is a work in progress. The source for this guide can be found in the _src/main/asciidoc directory of the HBase source. This reference guide is marked up using AsciiDoc from which the finished guide is generated as part of the 'site' build target. Run

mvn site

to generate this documentation. Amendments and improvements to the documentation are welcomed. Click this link to file a new documentation bug against Apache HBase with some values pre-selected.

Contributing to the Documentation

For an overview of AsciiDoc and suggestions to get started contributing to the documentation, see the relevant section later in this documentation.

Heads-up if this is your first foray into the world of distributed computing…

If this is your first foray into the wonderful world of Distributed Computing, then you are in for some interesting times. First off, distributed systems are hard; making a distributed system hum requires a disparate skillset that spans systems (hardware and software) and networking.

Your cluster’s operation can hiccup because of any of a myriad set of reasons from bugs in HBase itself through misconfigurations — misconfiguration of HBase but also operating system misconfigurations — through to hardware problems whether it be a bug in your network card drivers or an underprovisioned RAM bus (to mention two recent examples of hardware issues that manifested as "HBase is slow"). You will also need to do a recalibration if up to this your computing has been bound to a single box. Here is one good starting point: Fallacies of Distributed Computing.

That said, you are welcome.
It’s a fun place to be.
Yours, the HBase Community.

Reporting Bugs

Please use JIRA to report non-security-related bugs.

To protect existing HBase installations from new vulnerabilities, please do not use JIRA to report security-related bugs. Instead, send your report to the mailing list private@apache.org, which allows anyone to send messages, but restricts who can read them. Someone on that list will contact you to follow up on your report.

Support and Testing Expectations

The phrases /supported/, /not supported/, /tested/, and /not tested/ occur several places throughout this guide. In the interest of clarity, here is a brief explanation of what is generally meant by these phrases, in the context of HBase.

Commercial technical support for Apache HBase is provided by many Hadoop vendors. This is not the sense in which the term /support/ is used in the context of the Apache HBase project. The Apache HBase team assumes no responsibility for your HBase clusters, your configuration, or your data.
Supported

In the context of Apache HBase, /supported/ means that HBase is designed to work in the way described, and deviation from the defined behavior or functionality should be reported as a bug.

Not Supported

In the context of Apache HBase, /not supported/ means that a use case or use pattern is not expected to work and should be considered an antipattern. If you think this designation should be reconsidered for a given feature or use pattern, file a JIRA or start a discussion on one of the mailing lists.

Tested

In the context of Apache HBase, /tested/ means that a feature is covered by unit or integration tests, and has been proven to work as expected.

Not Tested

In the context of Apache HBase, /not tested/ means that a feature or use pattern may or may not work in a given way, and may or may not corrupt your data or cause operational issues. It is an unknown, and there are no guarantees. If you can provide proof that a feature designated as /not tested/ does work in a given way, please submit the tests and/or the metrics so that other users can gain certainty about such features or use patterns.

Getting Started

1. Introduction

Quickstart will get you up and running on a single-node, standalone instance of HBase.

2. Quick Start - Standalone HBase

This section describes the setup of a single-node standalone HBase. A standalone instance has all HBase daemons — the Master, RegionServers, and ZooKeeper — running in a single JVM persisting to the local filesystem. It is our most basic deploy profile. We will show you how to create a table in HBase using the hbase shell CLI, insert rows into the table, perform put and scan operations against the table, enable or disable the table, and start and stop HBase.

Apart from downloading HBase, this procedure should take less than 10 minutes.

Prior to HBase 0.94.x, HBase expected the loopback IP address to be 127.0.0.1. Ubuntu and some other distributions default to 127.0.1.1 and this will cause problems for you. See Why does HBase care about /etc/hosts? for details.

The following /etc/hosts file works correctly for HBase 0.94.x and earlier, on Ubuntu. Use this as a template if you run into trouble.

127.0.0.1 localhost
127.0.0.1 ubuntu.ubuntu-domain ubuntu

This issue has been fixed in hbase-0.96.0 and beyond.

2.1. JDK Version Requirements

HBase requires that a JDK be installed. See Java for information about supported JDK versions.
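
As a quick check before moving on, you can confirm which Java installation will be picked up. This is an illustrative sketch only; the exact output depends on your JDK vendor and version:

$ java -version
$ echo $JAVA_HOME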

2.2. Get Started with HBase

Procedure: Download, Configure, and Start HBase in Standalone Mode
  1. Choose a download site from this list of Apache Download Mirrors. Click on the suggested top link. This will take you to a mirror of HBase Releases. Click on the folder named stable and then download the binary file that ends in .tar.gz to your local filesystem. Do not download the file ending in src.tar.gz for now.

  2. Extract the downloaded file, and change to the newly-created directory.

    $ tar xzvf hbase-2.0.0-beta-2-bin.tar.gz
    $ cd hbase-2.0.0-beta-2/
  3. You are required to set the JAVA_HOME environment variable before starting HBase. You can set the variable via your operating system’s usual mechanism, but HBase provides a central mechanism, conf/hbase-env.sh. Edit this file, uncomment the line starting with JAVA_HOME, and set it to the appropriate location for your operating system. The JAVA_HOME variable should be set to a directory which contains the executable file bin/java. Most modern Linux operating systems provide a mechanism, such as /usr/bin/alternatives on RHEL or CentOS, for transparently switching between versions of executables such as Java. In this case, you can set JAVA_HOME to the directory containing the symbolic link to bin/java, which is usually /usr.

    JAVA_HOME=/usr
  4. Edit conf/hbase-site.xml, which is the main HBase configuration file. At this time, you only need to specify the directory on the local filesystem where HBase and ZooKeeper write data. By default, a new directory is created under /tmp. Many servers are configured to delete the contents of /tmp upon reboot, so you should store the data elsewhere. The following configuration will store HBase’s data in the hbase directory, in the home directory of the user called testuser. Paste the <property> tags beneath the <configuration> tags, which should be empty in a new HBase install.

    Example 1. Example hbase-site.xml for Standalone HBase
    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>file:///home/testuser/hbase</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/home/testuser/zookeeper</value>
      </property>
    </configuration>

    You do not need to create the HBase data directory. HBase will do this for you. If you create the directory, HBase will attempt to do a migration, which is not what you want.

    The hbase.rootdir in the above example points to a directory in the local filesystem. The 'file:/' prefix is how we denote local filesystem. To home HBase on an existing instance of HDFS, set the hbase.rootdir to point at a directory up on your instance: e.g. hdfs://namenode.example.org:8020/hbase. For more on this variant, see the section below on Standalone HBase over HDFS.
  5. The bin/start-hbase.sh script is provided as a convenient way to start HBase. Issue the command, and if all goes well, a message is logged to standard output showing that HBase started successfully. You can use the jps command to verify that you have one running process called HMaster. In standalone mode HBase runs all daemons within this single JVM, i.e. the HMaster, a single HRegionServer, and the ZooKeeper daemon. Go to http://localhost:16010 to view the HBase Web UI.

    Java needs to be installed and available. If you get an error indicating that Java is not installed, but it is on your system, perhaps in a non-standard location, edit the conf/hbase-env.sh file and modify the JAVA_HOME setting to point to the directory that contains bin/java on your system.
Procedure: Use HBase For the First Time
  1. Connect to HBase.

    Connect to your running instance of HBase using the hbase shell command, located in the bin/ directory of your HBase install. In this example, some usage and version information that is printed when you start HBase Shell has been omitted. The HBase Shell prompt ends with a > character.

    $ ./bin/hbase shell
    hbase(main):001:0>
  2. Display HBase Shell Help Text.

    Type help and press Enter, to display some basic usage information for HBase Shell, as well as several example commands. Notice that table names, rows, columns all must be enclosed in quote characters.

  3. Create a table.

    Use the create command to create a new table. You must specify the table name and the ColumnFamily name.

    hbase(main):001:0> create 'test', 'cf'
    0 row(s) in 0.4170 seconds
    
    => Hbase::Table - test
  4. List Information About your Table

    Use the list command to confirm your table exists.

    hbase(main):002:0> list 'test'
    TABLE
    test
    1 row(s) in 0.0180 seconds
    
    => ["test"]
  5. Put data into your table.

    To put data into your table, use the put command.

    hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
    0 row(s) in 0.0850 seconds
    
    hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
    0 row(s) in 0.0110 seconds
    
    hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
    0 row(s) in 0.0100 seconds

    Here, we insert three values, one at a time. The first insert is at row1, column cf:a, with a value of value1. Columns in HBase are comprised of a column family prefix, cf in this example, followed by a colon and then a column qualifier suffix, a in this case.

  6. Scan the table for all data at once.

    One of the ways to get data from HBase is to scan. Use the scan command to scan the table for data. You can limit your scan, but for now, all data is fetched. (See the sketch after this procedure for examples of limiting a scan.)

    hbase(main):006:0> scan 'test'
    ROW                                      COLUMN+CELL
     row1                                    column=cf:a, timestamp=1421762485768, value=value1
     row2                                    column=cf:b, timestamp=1421762491785, value=value2
     row3                                    column=cf:c, timestamp=1421762496210, value=value3
    3 row(s) in 0.0230 seconds
  7. Get a single row of data.

    To get a single row of data at a time, use the get command.

    hbase(main):007:0> get 'test', 'row1'
    COLUMN                                   CELL
     cf:a                                    timestamp=1421762485768, value=value1
    1 row(s) in 0.0350 seconds
  8. Disable a table.

    If you want to delete a table or change its settings, as well as in some other situations, you need to disable the table first, using the disable command. You can re-enable it using the enable command.

    hbase(main):008:0> disable 'test'
    0 row(s) in 1.1820 seconds
    
    hbase(main):009:0> enable 'test'
    0 row(s) in 0.1770 seconds

    Disable the table again if you tested the enable command above:

    hbase(main):010:0> disable 'test'
    0 row(s) in 1.1820 seconds
  9. Drop the table.

    To drop (delete) a table, use the drop command.

    hbase(main):011:0> drop 'test'
    0 row(s) in 0.1370 seconds
  10. Exit the HBase Shell.

    To exit the HBase Shell and disconnect from your cluster, use the quit command. HBase is still running in the background.
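
As mentioned in step 6, the scan command also accepts options that narrow what is returned. The following lines are an illustrative sketch only (the row keys and limit are arbitrary examples); run them from the HBase Shell prompt:

scan 'test', {LIMIT => 2}
scan 'test', {STARTROW => 'row2', STOPROW => 'row3'}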

Procedure: Stop HBase
  1. In the same way that the bin/start-hbase.sh script is provided to conveniently start all HBase daemons, the bin/stop-hbase.sh script stops them.

    $ ./bin/stop-hbase.sh
    stopping hbase....................
    $
  2. After issuing the command, it can take several minutes for the processes to shut down. Use the jps to be sure that the HMaster and HRegionServer processes are shut down.

The above has shown you how to start and stop a standalone instance of HBase. In the next sections we give a quick overview of other modes of hbase deploy.

2.3. Pseudo-Distributed Local Install

After working your way through quickstart standalone mode, you can re-configure HBase to run in pseudo-distributed mode. Pseudo-distributed mode means that HBase still runs completely on a single host, but each HBase daemon (HMaster, HRegionServer, and ZooKeeper) runs as a separate process: in standalone mode all daemons ran in one jvm process/instance. By default, unless you configure the hbase.rootdir property as described in quickstart, your data is still stored in /tmp/. In this walk-through, we store your data in HDFS instead, assuming you have HDFS available. You can skip the HDFS configuration to continue storing your data in the local filesystem.

Hadoop Configuration

This procedure assumes that you have configured Hadoop and HDFS on your local system and/or a remote system, and that they are running and available. It also assumes you are using Hadoop 2. The guide on Setting up a Single Node Cluster in the Hadoop documentation is a good starting point.
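
Before continuing, it can help to confirm that HDFS is reachable. The following is a hedged sketch that assumes the Hadoop commands are on your PATH; adjust to your installation:

$ jps                # should list NameNode and DataNode among the running processes
$ hdfs dfs -ls /     # should return without errors if HDFS is up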

  1. Stop HBase if it is running.

    If you have just finished quickstart and HBase is still running, stop it. This procedure will create a totally new directory where HBase will store its data, so any databases you created before will be lost.

  2. Configure HBase.

    Edit the hbase-site.xml configuration. First, add the following property which directs HBase to run in distributed mode, with one JVM instance per daemon.

    <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
    </property>

    Next, change the hbase.rootdir from the local filesystem to the address of your HDFS instance, using the hdfs:// URI syntax. In this example, HDFS is running on the localhost at port 8020.

    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://localhost:8020/hbase</value>
    </property>

    You do not need to create the directory in HDFS. HBase will do this for you. If you create the directory, HBase will attempt to do a migration, which is not what you want.

  3. Start HBase.

    Use the bin/start-hbase.sh command to start HBase. If your system is configured correctly, the jps command should show the HMaster and HRegionServer processes running.

  4. Check the HBase directory in HDFS.

    If everything worked correctly, HBase created its directory in HDFS. In the configuration above, it is stored in /hbase/ on HDFS. You can use the hadoop fs command in Hadoop’s bin/ directory to list this directory.

    $ ./bin/hadoop fs -ls /hbase
    Found 7 items
    drwxr-xr-x   - hbase users          0 2014-06-25 18:58 /hbase/.tmp
    drwxr-xr-x   - hbase users          0 2014-06-25 21:49 /hbase/WALs
    drwxr-xr-x   - hbase users          0 2014-06-25 18:48 /hbase/corrupt
    drwxr-xr-x   - hbase users          0 2014-06-25 18:58 /hbase/data
    -rw-r--r--   3 hbase users         42 2014-06-25 18:41 /hbase/hbase.id
    -rw-r--r--   3 hbase users          7 2014-06-25 18:41 /hbase/hbase.version
    drwxr-xr-x   - hbase users          0 2014-06-25 21:49 /hbase/oldWALs
  5. Create a table and populate it with data.

    You can use the HBase Shell to create a table, populate it with data, scan and get values from it, using the same procedure as in shell exercises.

  6. Start and stop a backup HBase Master (HMaster) server.

    Running multiple HMaster instances on the same hardware does not make sense in a production environment, in the same way that running a pseudo-distributed cluster does not make sense for production. This step is offered for testing and learning purposes only.

    The HMaster server controls the HBase cluster. You can start up to 9 backup HMaster servers, which makes 10 total HMasters, counting the primary. To start a backup HMaster, use the local-master-backup.sh. For each backup master you want to start, add a parameter representing the port offset for that master. Each HMaster uses three ports (16010, 16020, and 16030 by default). The port offset is added to these ports, so using an offset of 2, the backup HMaster would use ports 16012, 16022, and 16032. The following command starts 3 backup servers using ports 16012/16022/16032, 16013/16023/16033, and 16015/16025/16035.

    $ ./bin/local-master-backup.sh 2 3 5

    To kill a backup master without killing the entire cluster, you need to find its process ID (PID). The PID is stored in a file with a name like /tmp/hbase-USER-X-master.pid. The only contents of the file is the PID. You can use the kill -9 command to kill that PID. The following command will kill the master with port offset 1, but leave the cluster running:

    $ cat /tmp/hbase-testuser-1-master.pid |xargs kill -9
  7. Start and stop additional RegionServers

    The HRegionServer manages the data in its StoreFiles as directed by the HMaster. Generally, one HRegionServer runs per node in the cluster. Running multiple HRegionServers on the same system can be useful for testing in pseudo-distributed mode. The local-regionservers.sh command allows you to run multiple RegionServers. It works in a similar way to the local-master-backup.sh command, in that each parameter you provide represents the port offset for an instance. Each RegionServer requires two ports, and the default ports are 16020 and 16030. However, the base ports for additional RegionServers are not the default ports since the default ports are used by the HMaster, which is also a RegionServer since HBase version 1.0.0. The base ports are 16200 and 16300 instead. You can run 99 additional RegionServers that are not a HMaster or backup HMaster, on a server. The following command starts four additional RegionServers, running on sequential ports starting at 16202/16302 (base ports 16200/16300 plus 2).

    $ ./bin/local-regionservers.sh start 2 3 4 5

    To stop a RegionServer manually, use the local-regionservers.sh command with the stop parameter and the offset of the server to stop.

    $ ./bin/local-regionservers.sh stop 3
  8. Stop HBase.

    You can stop HBase the same way as in the quickstart procedure, using the bin/stop-hbase.sh command.

2.4. Advanced - Fully Distributed

In reality, you need a fully-distributed configuration to fully test HBase and to use it in real-world scenarios. In a distributed configuration, the cluster contains multiple nodes, each of which runs one or more HBase daemon. These include primary and backup Master instances, multiple ZooKeeper nodes, and multiple RegionServer nodes.

This advanced quickstart adds two more nodes to your cluster. The architecture will be as follows:

Table 1. Distributed Cluster Demo Architecture

Node Name           Master  ZooKeeper  RegionServer
node-a.example.com  yes     yes        no
node-b.example.com  backup  yes        yes
node-c.example.com  no      yes        yes

This quickstart assumes that each node is a virtual machine and that they are all on the same network. It builds upon the previous quickstart, Pseudo-Distributed Local Install, assuming that the system you configured in that procedure is now node-a. Stop HBase on node-a before continuing.

Be sure that all the nodes have full access to communicate, and that no firewall rules are in place which could prevent them from talking to each other. If you see any errors like no route to host, check your firewall.
Procedure: Configure Passwordless SSH Access

node-a needs to be able to log into node-b and node-c (and to itself) in order to start the daemons. The easiest way to accomplish this is to use the same username on all hosts, and configure password-less SSH login from node-a to each of the others.

  1. On node-a, generate a key pair.

    While logged in as the user who will run HBase, generate a SSH key pair, using the following command:

    $ ssh-keygen -t rsa

    If the command succeeds, the location of the key pair is printed to standard output. The default name of the public key is id_rsa.pub.

  2. Create the directory that will hold the shared keys on the other nodes.

    On node-b and node-c, log in as the HBase user and create a .ssh/ directory in the user’s home directory, if it does not already exist. If it already exists, be aware that it may already contain other keys.

  3. Copy the public key to the other nodes.

    Securely copy the public key from node-a to each of the nodes, by using scp or some other secure means. On each of the other nodes, create a new file called .ssh/authorized_keys if it does not already exist, and append the contents of the id_rsa.pub file to the end of it. Note that you also need to do this for node-a itself. (See the sketch after this procedure for one way to do this with scp.)

    $ cat id_rsa.pub >> ~/.ssh/authorized_keys
  4. Test password-less login.

    If you performed the procedure correctly, you should not be prompted for a password when you SSH from node-a to either of the other nodes using the same username.

  5. Since node-b will run a backup Master, repeat the procedure above, substituting node-b everywhere you see node-a. Be sure not to overwrite your existing .ssh/authorized_keys files, but concatenate the new key onto the existing file using the >> operator rather than the > operator.
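
One possible way to carry out step 3 is sketched below. The hbuser account name and the /tmp staging path are illustrative assumptions; substitute your own user and a path of your choosing, and repeat for node-c.example.com and for node-a itself:

$ scp ~/.ssh/id_rsa.pub hbuser@node-b.example.com:/tmp/node-a.pub
$ ssh hbuser@node-b.example.com 'cat /tmp/node-a.pub >> ~/.ssh/authorized_keys'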

Procedure: Prepare node-a

node-a will run your primary master and ZooKeeper processes, but no RegionServers. Stop the RegionServer from starting on node-a.

  1. Edit conf/regionservers and remove the line which contains localhost. Add lines with the hostnames or IP addresses for node-b and node-c.

    Even if you did want to run a RegionServer on node-a, you should refer to it by the hostname the other servers would use to communicate with it. In this case, that would be node-a.example.com. This enables you to distribute the configuration to each node of your cluster without hostname conflicts. Save the file. (An illustrative version of this file is sketched after this procedure.)

  2. Configure HBase to use node-b as a backup master.

    配置HBase以使用node-b作为备份主。

    Create a new file in conf/ called backup-masters, and add a new line to it with the hostname for node-b. In this demonstration, the hostname is node-b.example.com.

  3. Configure ZooKeeper

    In reality, you should carefully consider your ZooKeeper configuration. You can find out more about configuring ZooKeeper in zookeeper section. This configuration will direct HBase to start and manage a ZooKeeper instance on each node of the cluster.

    On node-a, edit conf/hbase-site.xml and add the following properties.

    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>node-a.example.com,node-b.example.com,node-c.example.com</value>
    </property>
    <property>
      <name>hbase.zookeeper.property.dataDir</name>
      <value>/usr/local/zookeeper</value>
    </property>
  4. Everywhere in your configuration that you have referred to node-a as localhost, change the reference to point to the hostname that the other nodes will use to refer to node-a. In these examples, the hostname is node-a.example.com.
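
For reference, with the demo hostnames used above, the two files edited in steps 1 and 2 might look like the following sketch (illustrative content only, not generated output).

conf/regionservers:

node-b.example.com
node-c.example.com

conf/backup-masters:

node-b.example.com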

Procedure: Prepare node-b and node-c

node-b will run a backup master server and a ZooKeeper instance.

  1. Download and unpack HBase.

    Download and unpack HBase to node-b, just as you did for the standalone and pseudo-distributed quickstarts.

  2. Copy the configuration files from node-a to node-b and node-c.

    Each node of your cluster needs to have the same configuration information. Copy the contents of the conf/ directory to the conf/ directory on node-b and node-c.

Procedure: Start and Test Your Cluster
  1. Be sure HBase is not running on any node.

    If you forgot to stop HBase from previous testing, you will have errors. Check to see whether HBase is running on any of your nodes by using the jps command. Look for the processes HMaster, HRegionServer, and HQuorumPeer. If they exist, kill them.

  2. Start the cluster.

    On node-a, issue the start-hbase.sh command. Your output will be similar to that below.

    $ bin/start-hbase.sh
    node-c.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-c.example.com.out
    node-a.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-a.example.com.out
    node-b.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-b.example.com.out
    starting master, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-master-node-a.example.com.out
    node-c.example.com: starting regionserver, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-regionserver-node-c.example.com.out
    node-b.example.com: starting regionserver, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-regionserver-node-b.example.com.out
    node-b.example.com: starting master, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-master-nodeb.example.com.out

    ZooKeeper starts first, followed by the master, then the RegionServers, and finally the backup masters.

  3. Verify that the processes are running.

    On each node of the cluster, run the jps command and verify that the correct processes are running on each server. You may see additional Java processes running on your servers as well, if they are used for other purposes.

    Example 2. node-a jps Output
    $ jps
    20355 Jps
    20071 HQuorumPeer
    20137 HMaster
    Example 3. node-b jps Output
    $ jps
    15930 HRegionServer
    16194 Jps
    15838 HQuorumPeer
    16010 HMaster
    Example 4. node-c jps Output
    $ jps
    13901 Jps
    13639 HQuorumPeer
    13737 HRegionServer
    ZooKeeper Process Name

    The HQuorumPeer process is a ZooKeeper instance which is controlled and started by HBase. If you use ZooKeeper this way, it is limited to one instance per cluster node and is appropriate for testing only. If ZooKeeper is run outside of HBase, the process is called QuorumPeer. For more about ZooKeeper configuration, including using an external ZooKeeper instance with HBase, see zookeeper section.

  4. Browse to the Web UI.

    Web UI Port Changes

    In HBase newer than 0.98.x, the HTTP ports used by the HBase Web UI changed from 60010 for the Master and 60030 for each RegionServer to 16010 for the Master and 16030 for the RegionServer.

    If everything is set up correctly, you should be able to connect to the UI for the Master http://node-a.example.com:16010/ or the secondary master at http://node-b.example.com:16010/ using a web browser. If you can connect via localhost but not from another host, check your firewall rules. You can see the web UI for each of the RegionServers at port 16030 of their IP addresses, or by clicking their links in the web UI for the Master.

  5. Test what happens when nodes or services disappear.

    With a three-node cluster such as the one you have configured, things will not be very resilient. You can still test the behavior of the primary Master or a RegionServer by killing the associated processes and watching the logs.

2.5. Where to go next

The next chapter, configuration, gives more information about the different HBase run modes, system requirements for running HBase, and critical configuration areas for setting up a distributed HBase cluster.

Apache HBase Configuration

This chapter expands upon the Getting Started chapter to further explain configuration of Apache HBase. Please read this chapter carefully, especially the Basic Prerequisites to ensure that your HBase testing and deployment goes smoothly, and prevent data loss. Familiarize yourself with Support and Testing Expectations as well.

3. Configuration Files

Apache HBase uses the same configuration system as Apache Hadoop. All configuration files are located in the conf/ directory, which needs to be kept in sync for each node on your cluster.

HBase Configuration File Descriptions
backup-masters

Not present by default. A plain-text file which lists hosts on which the Master should start a backup Master process, one host per line.

hadoop-metrics2-hbase.properties

Used to connect HBase to Hadoop’s Metrics2 framework. See the Hadoop Wiki entry for more information on Metrics2. Contains only commented-out examples by default.

hbase-env.cmd and hbase-env.sh

Script for Windows and Linux / Unix environments to set up the working environment for HBase, including the location of Java, Java options, and other environment variables. The file contains many commented-out examples to provide guidance.

hbase-policy.xml

The default policy configuration file used by RPC servers to make authorization decisions on client requests. Only used if HBase security is enabled.

hbase-site.xml

The main HBase configuration file. This file specifies configuration options which override HBase’s default configuration. You can view (but do not edit) the default configuration file at docs/hbase-default.xml. You can also view the entire effective configuration for your cluster (defaults and overrides) in the HBase Configuration tab of the HBase Web UI.

log4j.properties

Configuration file for HBase logging via log4j.

regionservers

A plain-text file containing a list of hosts which should run a RegionServer in your HBase cluster. By default this file contains the single entry localhost. It should contain a list of hostnames or IP addresses, one per line, and should only contain localhost if each node in your cluster will run a RegionServer on its localhost interface.

Checking XML Validity

When you edit XML, it is a good idea to use an XML-aware editor to be sure that your syntax is correct and your XML is well-formed. You can also use the xmllint utility to check that your XML is well-formed. By default, xmllint re-flows and prints the XML to standard output. To check for well-formedness and only print output if errors exist, use the command xmllint -noout filename.xml.
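
For example, to check the main configuration file edited throughout this guide (no output means the file is well-formed):

$ xmllint -noout conf/hbase-site.xml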

Keep Configuration In Sync Across the Cluster

When running in distributed mode, after you make an edit to an HBase configuration, make sure you copy the contents of the conf/ directory to all nodes of the cluster. HBase will not do this for you. Use rsync, scp, or another secure mechanism for copying the configuration files to your nodes. For most configurations, a restart is needed for servers to pick up changes. Dynamic configuration is an exception to this, to be described later below.
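
As a hedged sketch of pushing the configuration from the Master to one other node with rsync (the user name and install path are illustrative assumptions), you might run something like:

$ rsync -av conf/ hbuser@node-b.example.com:/home/hbuser/hbase/conf/

Repeat for each node in the cluster, or drive it from a short loop over the hosts listed in conf/regionservers.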

4. Basic Prerequisites

This section lists required services and some required system configuration.

Table 2. Java

HBase Version  JDK 7          JDK 8
2.0            Not Supported  yes
1.3            yes            yes
1.2            yes            yes
1.1            yes            Running with JDK 8 will work but is not well tested.

HBase will neither build nor compile with Java 6.
You must set JAVA_HOME on each node of your cluster. hbase-env.sh provides a handy mechanism to do this.
Operating System Utilities
ssh

HBase uses the Secure Shell (ssh) command and utilities extensively to communicate between cluster nodes. Each server in the cluster must be running ssh so that the Hadoop and HBase daemons can be managed. You must be able to connect to all nodes via SSH, including the local node, from the Master as well as any backup Master, using a shared key rather than a password. You can see the basic methodology for such a set-up in Linux or Unix systems at "Procedure: Configure Passwordless SSH Access". If your cluster nodes use OS X, see the section, SSH: Setting up Remote Desktop and Enabling Self-Login on the Hadoop wiki.

DNS

HBase uses the local hostname to self-report its IP address. Both forward and reverse DNS resolving must work in versions of HBase previous to 0.92.0. The hadoop-dns-checker tool can be used to verify DNS is working correctly on the cluster. The project README file provides detailed instructions on usage.
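
A manual spot-check on a node might look like the following sketch (it assumes the dig utility is installed; substitute the node's real IP address for the placeholder):

$ hostname -f                      # the name this node will report
$ dig +short node-a.example.com    # forward lookup should return this node's IP
$ dig +short -x <ip-address>       # reverse lookup should return the hostname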

Loopback IP

Prior to hbase-0.96.0, HBase only used the IP address 127.0.0.1 to refer to localhost, and this was not configurable. See Loopback IP for more details.

NTP

The clocks on cluster nodes should be synchronized. A small amount of variation is acceptable, but larger amounts of skew can cause erratic and unexpected behavior. Time synchronization is one of the first things to check if you see unexplained problems in your cluster. It is recommended that you run a Network Time Protocol (NTP) service, or another time-synchronization mechanism on your cluster and that all nodes look to the same service for time synchronization. See the Basic NTP Configuration at The Linux Documentation Project (TLDP) to set up NTP.
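
As an illustrative check, assuming the classic ntpd tooling is installed (chrony and systemd-timesyncd have their own equivalents), you can inspect the synchronization status on each node:

$ ntpq -p    # lists the peers this node is synchronizing against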

Limits on Number of Files and Processes (ulimit)

Apache HBase is a database. It requires the ability to open a large number of files at once. Many Linux distributions limit the number of files a single user is allowed to open to 1024 (or 256 on older versions of OS X). You can check this limit on your servers by running the command ulimit -n when logged in as the user which runs HBase. See the Troubleshooting section for some of the problems you may experience if the limit is too low. You may also notice errors such as the following:

2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Exception increateBlockOutputStream java.io.EOFException
2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901

It is recommended to raise the ulimit to at least 10,000, but more likely 10,240, because the value is usually expressed in multiples of 1024. Each ColumnFamily has at least one StoreFile, and possibly more than six StoreFiles if the region is under load. The number of open files required depends upon the number of ColumnFamilies and the number of regions. The following is a rough formula for calculating the potential number of open files on a RegionServer.

Calculate the Potential Number of Open Files
(StoreFiles per ColumnFamily) x (regions per RegionServer)

For example, assuming that a schema had 3 ColumnFamilies per region with an average of 3 StoreFiles per ColumnFamily, and there are 100 regions per RegionServer, the JVM will open 3 * 3 * 100 = 900 file descriptors, not counting open JAR files, configuration files, and others. Opening a file does not take many resources, and the risk of allowing a user to open too many files is minimal.

Another related setting is the number of processes a user is allowed to run at once. In Linux and Unix, the number of processes is set using the ulimit -u command. This should not be confused with the nproc command, which controls the number of CPUs available to a given user. Under load, a ulimit -u that is too low can cause OutOfMemoryError exceptions. See Jack Levin’s major HDFS issues thread on the hbase-users mailing list, from 2011.
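
As a quick illustration, both limits can be inspected for the user that will run HBase:

$ ulimit -n    # maximum number of open file descriptors
$ ulimit -u    # maximum number of processes for this user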

Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user’s ulimit configuration, look at the first line of the HBase log for that instance. A useful read on setting configuration for your Hadoop cluster is Aaron Kimball’s Configuration Parameters: What can you just ignore?

Example 5. ulimit Settings on Ubuntu

To configure ulimit settings on Ubuntu, edit /etc/security/limits.conf, which is a space-delimited file with four columns. Refer to the man page for limits.conf for details about the format of this file. In the following example, the first line sets both soft and hard limits for the number of open files (nofile) to 32768 for the operating system user with the username hadoop. The second line sets the number of processes to 32000 for the same user.

hadoop  -       nofile  32768
hadoop  -       nproc   32000

The settings are only applied if the Pluggable Authentication Module (PAM) environment is directed to use them. To configure PAM to use these limits, be sure that the /etc/pam.d/common-session file contains the following line:

session required  pam_limits.so
Linux Shell

All of the shell scripts that come with HBase rely on the GNU Bash shell.

Windows

Prior to HBase 0.96, running HBase on Microsoft Windows was limited only for testing purposes. Running production systems on Windows machines is not recommended.

4.1. Hadoop

The following table summarizes the versions of Hadoop supported with each version of HBase. Based on the version of HBase, you should select the most appropriate version of Hadoop. You can use Apache Hadoop, or a vendor’s distribution of Hadoop. No distinction is made here. See the Hadoop wiki for information about vendors of Hadoop.

Hadoop 2.x is recommended.

Hadoop 2.x is faster and includes features, such as short-circuit reads, which will help improve your HBase random read profile. Hadoop 2.x also includes important bug fixes that will improve your overall HBase experience. HBase does not support running with earlier versions of Hadoop. See the table below for requirements specific to different HBase versions.

Hadoop 3.x is still in early access releases and has not yet been sufficiently tested by the HBase community for production use cases.

Use the following legend to interpret this table:

Hadoop version support matrix
  • "S" = supported
  • "X" = not supported
  • "NT" = Not tested

                     HBase-1.1.x  HBase-1.2.x  HBase-1.3.x  HBase-2.0.x
Hadoop-2.0.x-alpha   X            X            X            X
Hadoop-2.1.0-beta    X            X            X            X
Hadoop-2.2.0         NT           X            X            X
Hadoop-2.3.x         NT           X            X            X
Hadoop-2.4.x         S            S            S            X
Hadoop-2.5.x         S            S            S            X
Hadoop-2.6.0         X            X            X            X
Hadoop-2.6.1+        NT           S            S            S
Hadoop-2.7.0         X            X            X            X
Hadoop-2.7.1+        NT           S            S            S
Hadoop-2.8.0         X            X            X            X
Hadoop-2.8.1         X            X            X            X
Hadoop-3.0.0         NT           NT           NT           NT

Hadoop Pre-2.6.1 and JDK 1.8 Kerberos

When using pre-2.6.1 Hadoop versions and JDK 1.8 in a Kerberos environment, HBase server can fail and abort due to Kerberos keytab relogin error. Late version of JDK 1.7 (1.7.0_80) has the problem too. Refer to HADOOP-10786 for additional details. Consider upgrading to Hadoop 2.6.1+ in this case.

Hadoop 2.6.x

Hadoop distributions based on the 2.6.x line must have HADOOP-11710 applied if you plan to run HBase on top of an HDFS Encryption Zone. Failure to do so will result in cluster failure and data loss. This patch is present in Apache Hadoop releases 2.6.1+.

Hadoop 2.7.x

Hadoop version 2.7.0 is not tested or supported as the Hadoop PMC has explicitly labeled that release as not being stable. (reference the announcement of Apache Hadoop 2.7.0.)

Hadoop 2.8.x

Hadoop version 2.8.0 and 2.8.1 are not tested or supported as the Hadoop PMC has explicitly labeled that releases as not being stable. (reference the announcement of Apache Hadoop 2.8.0 and announcement of Apache Hadoop 2.8.1.)

Replace the Hadoop Bundled With HBase!

Because HBase depends on Hadoop, it bundles an instance of the Hadoop jar under its lib directory. The bundled jar is ONLY for use in standalone mode. In distributed mode, it is critical that the version of Hadoop that is out on your cluster match what is under HBase. Replace the hadoop jar found in the HBase lib directory with the hadoop jar you are running on your cluster to avoid version mismatch issues. Make sure you replace the jar in HBase across your whole cluster. Hadoop version mismatch issues have various manifestations but often all look like its hung.
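
A hedged sketch of comparing the two, run from the HBase install directory (exact jar names vary by release):

$ ls lib/ | grep '^hadoop-'    # Hadoop jars bundled with HBase
$ hadoop version               # Hadoop version actually running on the cluster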

4.1.1. dfs.datanode.max.transfer.threads

An HDFS DataNode has an upper bound on the number of files that it will serve at any one time. Before doing any loading, make sure you have configured Hadoop’s conf/hdfs-site.xml, setting the dfs.datanode.max.transfer.threads value to at least the following:

<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>4096</value>
</property>

Be sure to restart your HDFS after making the above configuration.

Not having this configuration in place makes for strange-looking failures. One manifestation is a complaint about missing blocks. For example:

10/12/08 20:10:31 INFO hdfs.DFSClient: Could not obtain block
          blk_XXXXXXXXXXXXXXXXXXXXXX_YYYYYYYY from any node: java.io.IOException: No live nodes
          contain current block. Will get new block locations from namenode and retry...

See also casestudies.max.transfer.threads and note that this property was previously known as dfs.datanode.max.xcievers (e.g. Hadoop HDFS: Deceived by Xciever).

4.2. ZooKeeper Requirements

ZooKeeper 3.4.x is required. HBase makes use of the multi functionality that is only available since Zookeeper 3.4.0. The hbase.zookeeper.useMulti configuration property defaults to true. Refer to HBASE-12241 (The crash of regionServer when taking deadserver’s replication queue breaks replication) and HBASE-6775 (Use ZK.multi when available for HBASE-6710 0.92/0.94 compatibility fix) for background. The property is deprecated and useMulti is always enabled in HBase 2.0.

5. HBase run modes: Standalone and Distributed

HBase has two run modes: standalone and distributed. Out of the box, HBase runs in standalone mode. Whatever your mode, you will need to configure HBase by editing files in the HBase conf directory. At a minimum, you must edit conf/hbase-env.sh to tell HBase which java to use. In this file you set HBase environment variables such as the heapsize and other options for the JVM, the preferred location for log files, etc. Set JAVA_HOME to point at the root of your java install.
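
A minimal sketch of such an edit is shown below; the JDK path is an illustrative assumption, so point it at your own install:

# conf/hbase-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
# export HBASE_HEAPSIZE=4G   # optional: uncomment to override the default JVM heap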

5.1. Standalone HBase

This is the default mode. Standalone mode is what is described in the quickstart section. In standalone mode, HBase does not use HDFS — it uses the local filesystem instead — and it runs all HBase daemons and a local ZooKeeper all up in the same JVM. ZooKeeper binds to a well known port so clients may talk to HBase.

5.1.1. Standalone HBase over HDFS

A sometimes useful variation on standalone hbase has all daemons running inside the one JVM but rather than persist to the local filesystem, instead they persist to an HDFS instance.

You might consider this profile when you are intent on a simple deploy profile, the loading is light, but the data must persist across node comings and goings. Writing to HDFS where data is replicated ensures the latter.

To configure this standalone variant, edit your hbase-site.xml setting hbase.rootdir to point at a directory in your HDFS instance but then set hbase.cluster.distributed to false. For example:

要配置这个独立的变体,编辑您的hbase站点。xml设置hbase。在HDFS实例中指向一个目录,然后设置hbase.cluster。分发给假。例如:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example.org:8020/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
  </property>
</configuration>

5.2. Distributed

5.2。分布式

Distributed mode can be subdivided into pseudo-distributed, where all daemons run on a single node, and fully-distributed, where the daemons are spread across all nodes in the cluster. The pseudo-distributed vs. fully-distributed nomenclature comes from Hadoop.

分布式模式可以被细分为分布式的,但是所有的守护进程都运行在一个节点上,即a.k.a。伪分布式和完全分布式,守护进程分布在集群中的所有节点上。伪分布式和完全分布的命名来自于Hadoop。

Pseudo-distributed mode can run against the local filesystem or it can run against an instance of the Hadoop Distributed File System (HDFS). Fully-distributed mode can ONLY run on HDFS. See the Hadoop documentation for how to set up HDFS. A good walk-through for setting up HDFS on Hadoop 2 can be found at http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide.

伪分布模式可以针对本地文件系统运行,也可以针对Hadoop分布式文件系统(HDFS)的实例运行。全分布式模式只能在HDFS上运行。参见Hadoop文档了解如何设置HDFS。可以在http://www.alexjf.net/blog/distributedsystems/hadoop -yarn- installationguide中找到一个用于在Hadoop 2上设置HDFS的好方法。

5.2.1. Pseudo-distributed

5.2.1。伪分布

Pseudo-Distributed Quickstart

A quickstart has been added to the quickstart chapter. See quickstart-pseudo. Some of the information that was originally in this section has been moved there.

快速启动已经添加到快速启动章节。看到quickstart-pseudo。本节中最初的一些信息已经被移到了那里。

A pseudo-distributed mode is simply a fully-distributed mode run on a single host. Use this HBase configuration for testing and prototyping purposes only. Do not use this configuration for production or for performance evaluation.

伪分布模式是在单个主机上运行的完全分布式模式。仅使用此HBase配置进行测试和原型设计。不要使用此配置用于生产或性能评估。

5.3. Fully-distributed

5.3。全分布

By default, HBase runs in standalone mode. Both standalone mode and pseudo-distributed mode are provided for the purposes of small-scale testing. For a production environment, distributed mode is advised. In distributed mode, multiple instances of HBase daemons run on multiple servers in the cluster.

默认情况下,HBase以独立模式运行。为小型测试的目的,提供了独立模式和伪分布式模式。对于生产环境,建议采用分布式模式。在分布式模式中,HBase守护进程的多个实例在集群中的多个服务器上运行。

Just as in pseudo-distributed mode, a fully distributed configuration requires that you set the hbase.cluster.distributed property to true. Typically, the hbase.rootdir is configured to point to a highly-available HDFS filesystem.

就像在伪分布模式中一样,一个完全分布式的配置要求您设置hbase.cluster。分布式属性为true。通常,hbase。rootdir配置为指向高可用的HDFS文件系统。

In addition, the cluster is configured so that multiple cluster nodes enlist as RegionServers, ZooKeeper QuorumPeers, and backup HMaster servers. These configuration basics are all demonstrated in quickstart-fully-distributed.

此外,还配置了集群,以使多个集群节点作为区域服务器、ZooKeeper quorumpeer和备份的HMaster服务器。这些配置基础都在quickstart-full -distributed中演示。

Distributed RegionServers

Typically, your cluster will contain multiple RegionServers all running on different servers, as well as primary and backup Master and ZooKeeper daemons. The conf/regionservers file on the master server contains a list of hosts whose RegionServers are associated with this cluster. Each host is on a separate line. All hosts listed in this file will have their RegionServer processes started and stopped when the master server starts or stops.

通常,您的集群将包含多个在不同服务器上运行的区域服务器,以及主服务器和备份主机和ZooKeeper守护进程。主服务器上的conf/区域服务器文件包含一个主机列表,其区域服务器与此集群关联。每个主机都在一个单独的行上。在该文件中列出的所有主机将在主服务器启动或停止时启动并停止区域服务器进程。

ZooKeeper and HBase

See the ZooKeeper section for ZooKeeper setup instructions for HBase.

请参见ZooKeeper部分的ZooKeeper设置说明。

Example 6. Example Distributed HBase Cluster

This is a bare-bones conf/hbase-site.xml for a distributed HBase cluster. A cluster that is used for real-world work would contain more custom configuration parameters. Most HBase configuration directives have default values, which are used unless the value is overridden in the hbase-site.xml. See "Configuration Files" for more information.

这是一个裸体的conf/hbase网站。用于分布式HBase集群的xml。用于实际工作的集群将包含更多自定义配置参数。大多数HBase配置指令都有默认值,除非在HBase -site.xml中被覆盖,否则将使用该值。有关更多信息,请参见“配置文件”。

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example.org:8020/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>node-a.example.com,node-b.example.com,node-c.example.com</value>
  </property>
</configuration>

This is an example conf/regionservers file, which contains a list of nodes that should run a RegionServer in the cluster. These nodes need HBase installed and they need to use the same contents of the conf/ directory as the Master server.

这是一个示例conf/ RegionServer文件,其中包含一个节点列表,该列表应该在集群中运行一个区域服务器。这些节点需要安装HBase,并且它们需要使用conf/目录的相同内容作为主服务器。

node-a.example.com
node-b.example.com
node-c.example.com

This is an example conf/backup-masters file, which contains a list of each node that should run a backup Master instance. The backup Master instances will sit idle unless the main Master becomes unavailable.

这是一个conf/backup-masters文件的示例,其中包含每个节点的列表,该列表应该运行一个备份主实例。除非主主变得不可用,否则备份主实例将处于空闲状态。

node-b.example.com
node-c.example.com
Distributed HBase Quickstart

See quickstart-fully-distributed for a walk-through of a simple three-node cluster configuration with multiple ZooKeeper, backup HMaster, and RegionServer instances.

请参阅quickstart-完全分布式,以完成一个简单的三节点集群配置,其中包含多个ZooKeeper、备份HMaster和区域服务器实例。

Procedure: HDFS Client Configuration
  1. Of note, if you have made HDFS client configuration changes on your Hadoop cluster, such as configuration directives for HDFS clients, as opposed to server-side configurations, you must use one of the following methods to enable HBase to see and use these configuration changes:

    值得注意的是,如果您在Hadoop集群上做了HDFS客户端配置更改,比如HDFS客户机的配置指令,而不是服务器端配置,那么您必须使用以下方法之一来启用HBase查看和使用这些配置更改:

    1. Add a pointer to your HADOOP_CONF_DIR to the HBASE_CLASSPATH environment variable in hbase-env.sh.

      在hbase-env.sh中添加一个指向您的hadoop op_conf_dir到HBASE_CLASSPATH环境变量的指针。

    2. Add a copy of hdfs-site.xml (or hadoop-site.xml) or, better, symlinks, under ${HBASE_HOME}/conf, or

      添加一个hdfs-site拷贝。在${HBASE_HOME}/conf下的xml(或hadoop-site.xml)或更好的符号链接。

    3. If only a small set of HDFS client configurations is needed, add them to hbase-site.xml.

      如果只有一小部分HDFS客户机配置,将它们添加到hbase-site.xml中。

An example of such an HDFS client configuration is dfs.replication. If for example, you want to run with a replication factor of 5, HBase will create files with the default of 3 unless you do the above to make the configuration available to HBase.

这种HDFS客户机配置的一个示例是dfs.replication。例如,如果您想要以5的复制因子运行,HBase将创建默认为3的文件,除非您执行上述操作,以使HBase可用配置。
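
Assuming you pick the third method above, a sketch of the corresponding hbase-site.xml fragment might look like the following; the replication factor of 5 simply follows the example in the text:

<property>
  <name>dfs.replication</name>
  <value>5</value>
</property>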

6. Running and Confirming Your Installation

6。运行和确认您的安装。

Make sure HDFS is running first. Start and stop the Hadoop HDFS daemons by running start-dfs.sh (in bin/ or sbin/, depending on your Hadoop version) over in the HADOOP_HOME directory. You can ensure it started properly by testing the put and get of files into the Hadoop filesystem. HBase does not normally use the MapReduce or YARN daemons. These do not need to be started.

确保HDFS首先运行。通过运行bin/ Start - HDFS启动和停止Hadoop HDFS守护进程。在hadoop - home目录中。您可以通过在Hadoop文件系统中测试put和get来确保它正确地启动。HBase通常不使用MapReduce或纱线守护进程。这些不需要启动。
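
One quick way to test the put and get of files, assuming the hadoop command is on your PATH, is a round trip such as the following; the file name is an arbitrary example:

$ echo "hello" > /tmp/smoke.txt
$ hadoop fs -put /tmp/smoke.txt /tmp/smoke.txt
$ hadoop fs -cat /tmp/smoke.txt
hello
$ hadoop fs -rm /tmp/smoke.txt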

If you are managing your own ZooKeeper, start it and confirm it’s running, else HBase will start up ZooKeeper for you as part of its start process.

如果你正在管理你自己的动物管理员,启动它并确认它正在运行,否则HBase将启动你作为它开始进程的一部分的动物管理员。

Start HBase with the following command:

使用以下命令启动HBase:

bin/start-hbase.sh

Run the above from the HBASE_HOME directory.

运行上面的HBASE_HOME目录。

You should now have a running HBase instance. HBase logs can be found in the logs subdirectory. Check them out especially if HBase had trouble starting.

现在应该有一个运行的HBase实例。可以在日志子目录中找到HBase日志。检查他们,特别是如果HBase有麻烦开始。

HBase also puts up a UI listing vital attributes. By default it’s deployed on the Master host at port 16010 (HBase RegionServers listen on port 16020 by default and put up an informational HTTP server at port 16030). If the Master is running on a host named master.example.org on the default port, point your browser at http://master.example.org:16010 to see the web interface.

HBase还设置了一个用于列出重要属性的UI。默认情况下,它被部署在16010端口的主主机上(HBase区域服务器默认监听16020端口,并在16030端口上安装一个信息HTTP服务器)。如果Master在默认端口上运行名为master.example.org的主机,请将浏览器指向http://master.example.org:16010查看web界面。

Once HBase has started, see the shell exercises section for how to create tables, add data, scan your insertions, and finally disable and drop your tables.

一旦HBase启动,请参见shell练习小节,了解如何创建表、添加数据、扫描插入,最后禁用和删除表。
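
As a preview of those exercises, a minimal session might look like the sketch below; the table and column family names are arbitrary examples:

$ ./bin/hbase shell
hbase> create 'test', 'cf'
hbase> put 'test', 'row1', 'cf:a', 'value1'
hbase> scan 'test'
hbase> disable 'test'
hbase> drop 'test'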

To stop HBase after exiting the HBase shell, enter

在退出HBase shell后停止HBase。

$ ./bin/stop-hbase.sh
stopping hbase...............

Shutdown can take a moment to complete. It can take longer if your cluster is comprised of many machines. If you are running a distributed operation, be sure to wait until HBase has shut down completely before stopping the Hadoop daemons.

关闭可能需要一段时间才能完成。如果您的集群由许多机器组成,则需要更长的时间。如果您正在运行一个分布式操作,一定要等到HBase完全关闭之后才停止Hadoop守护进程。

7. Default Configuration

7所示。默认配置

7.1. hbase-site.xml and hbase-default.xml

7.1。hbase-site。xml和hbase-default.xml

Just as in Hadoop where you add site-specific HDFS configuration to the hdfs-site.xml file, for HBase, site specific customizations go into the file conf/hbase-site.xml. For the list of configurable properties, see hbase default configurations below or view the raw hbase-default.xml source file in the HBase source code at src/main/resources.

就像在Hadoop中,将特定于站点的HDFS配置添加到HDFS站点。对于HBase,特定于站点的定制会进入文件conf/ HBase -site.xml。对于可配置属性的列表,请参见下面的hbase默认配置或查看原始hbase-default。xml源文件在src/main/resources的HBase源代码中。

Not all configuration options make it out to hbase-default.xml. Some configurations only appear in source code; the only way to identify these is through code review.

并不是所有的配置选项都显示为hbase-default.xml。一些配置只出现在源代码中;识别这些变化的唯一方法是通过代码复查。

Currently, changes here will require a cluster restart for HBase to notice the change.

当前,这里的更改将要求HBase重新启动集群以注意更改。

7.2. HBase Default Configuration

7.2。HBase默认配置

The documentation below is generated using the default hbase configuration file, hbase-default.xml, as source.

下面的文档是使用默认的hbase配置文件hbase-default生成的。xml源。

hbase.tmp.dir
Description

Temporary directory on the local filesystem. Change this setting to point to a location more permanent than '/tmp', the usual resolve for java.io.tmpdir, as the '/tmp' directory is cleared on machine restart.

本地文件系统上的临时目录。更改此设置以指向比“/tmp”更持久的位置,即java.io的通常解决方案。tmpdir,作为“/tmp”目录在机器重新启动时被清除。

Default

${java.io.tmpdir}/hbase-${user.name}

$ { java.io.tmpdir } / hbase - $ { user.name }
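
For example, to move the temporary directory off /tmp you might add something like the following to hbase-site.xml; the path shown is an arbitrary example:

<property>
  <name>hbase.tmp.dir</name>
  <value>/data/hbase/tmp</value>
</property>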

hbase.rootdir
Description

The directory shared by region servers and into which HBase persists. The URL should be 'fully-qualified' to include the filesystem scheme. For example, to specify the HDFS directory '/hbase' where the HDFS instance’s namenode is running at namenode.example.org on port 9000, set this value to: hdfs://namenode.example.org:9000/hbase. By default, we write to whatever ${hbase.tmp.dir} is set to, usually /tmp, so change this configuration or else all data will be lost on machine restart.

该目录由区域服务器共享,并且HBase继续存在。URL应该是“完全限定的”,以包括文件系统方案。例如,要指定HDFS目录“/hbase”,其中HDFS实例的namenode在namenode.example.org上运行,请将该值设置为:HDFS://namenode.example.org:9000/hbase。默认情况下,我们写入任何${hbase.tmp。设置too - usually /tmp—因此更改此配置,否则将在机器重新启动时丢失所有数据。

Default

${hbase.tmp.dir}/hbase

$ { hbase.tmp.dir } / hbase

hbase.cluster.distributed
Description

The mode the cluster will be in. Possible values are false for standalone mode and true for distributed mode. If false, startup will run all HBase and ZooKeeper daemons together in the one JVM.

集群将进入的模式。可能的值对于独立模式来说是错误的,对于分布式模式是正确的。如果false,启动将在一个JVM中运行所有HBase和ZooKeeper守护进程。

Default

false

hbase.zookeeper.quorum
Description

Comma separated list of servers in the ZooKeeper ensemble (This config. should have been named hbase.zookeeper.ensemble). For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper ensemble servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which hbase will start/stop ZooKeeper on as part of cluster start/stop. Client-side, we will take this list of ensemble members and put it together with the hbase.zookeeper.property.clientPort config. and pass it into zookeeper constructor as the connectString parameter.

在ZooKeeper组中,逗号分隔的服务器列表(此配置)。应该被命名为hbase.zookeeper.。例如,host1.mydomain.com,host2.mydomain.com,host3.mydomain.com。默认情况下,这将设置为本地和伪分布式操作的本地主机。对于完全分布的设置,应该将其设置为ZooKeeper集成服务器的完整列表。如果HBASE_MANAGES_ZK设置在hbase-env中。这是hbase将在集群开始/停止时启动/停止ZooKeeper的服务器列表。客户端,我们将把这个集合成员的列表和hbase.zookeeper.property一起放在一起。clientPort配置。并将其作为connectString参数传递给zookeeper构造函数。

Default

localhost

本地主机

zookeeper.recovery.retry.maxsleeptime
Description

Maximum sleep time in milliseconds before retrying ZooKeeper operations. A maximum is needed here so that the sleep time does not grow unboundedly.

在重新尝试zookeeper操作几毫秒之前,最大的睡眠时间在这里是需要的,这样睡眠时间就不会无限制地增长。

Default

60000

60000年

hbase.local.dir
Description

Directory on the local filesystem to be used as a local storage.

本地文件系统上的目录,用作本地存储。

Default

${hbase.tmp.dir}/local/

$ { hbase.tmp.dir } /地方/

hbase.master.port
Description

The port the HBase Master should bind to.

HBase主机的端口应该绑定到。

Default

16000

16000年

hbase.master.info.port
Description

The port for the HBase Master web UI. Set to -1 if you do not want a UI instance run.

HBase主web UI的端口。如果您不想运行UI实例,则设置为-1。

Default

16010

16010年

hbase.master.info.bindAddress
Description

The bind address for the HBase Master web UI

HBase主web UI的绑定地址。

Default

0.0.0.0

0.0.0.0

hbase.master.logcleaner.plugins
Description

A comma-separated list of BaseLogCleanerDelegate invoked by the LogsCleaner service. These WAL cleaners are called in order, so put the cleaner that prunes the most files in front. To implement your own BaseLogCleanerDelegate, just put it in HBase’s classpath and add the fully qualified class name here. Always add the above default log cleaners in the list.

LogsCleaner服务调用的一个逗号分隔的BaseLogCleanerDelegate列表。这些WAL -清洁工是按顺序被调用的,所以把最前面的文件清除干净。要实现您自己的BaseLogCleanerDelegate,只需将它放在HBase的类路径中,并在这里添加完全限定的类名。总是在列表中添加上面的默认日志清除器。

Default

org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveProcedureWALCleaner

org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveProcedureWALCleaner

hbase.master.logcleaner.ttl
Description

How long a WAL remains in the archive ({hbase.rootdir}/oldWALs) directory, after which it will be cleaned by a Master thread. The value is in milliseconds.

在归档({hbase.rootdir}/oldWALs)目录中,一个WAL保持多长时间,之后将由一个主线程进行清理。这个值是以毫秒为单位的。

Default

600000

600000年

hbase.master.procedurewalcleaner.ttl
Description

How long a Procedure WAL will remain in the archive directory, after which it will be cleaned by a Master thread. The value is in milliseconds.

一个过程会在存档目录中保留多长时间,之后将由一个主线程进行清理。这个值是以毫秒为单位的。

Default

604800000

604800000

hbase.master.hfilecleaner.plugins
Description

A comma-separated list of BaseHFileCleanerDelegate invoked by the HFileCleaner service. These HFile cleaners are called in order, so put the cleaner that prunes the most files in front. To implement your own BaseHFileCleanerDelegate, just put it in HBase’s classpath and add the fully qualified class name here. Always add the above default log cleaners in the list as they will be overwritten in hbase-site.xml.

由HFileCleaner服务调用的以逗号分隔的BaseHFileCleanerDelegate列表。这些HFiles清洗器是按顺序被调用的,所以请将最前面的文件清除干净。要实现您自己的BaseHFileCleanerDelegate,只需将它放在HBase的类路径中,并在这里添加完全限定的类名。总是在列表中添加上面的默认日志清除器,因为它们将被覆盖在hbase-site.xml中。

Default

org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner

org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner

hbase.master.infoserver.redirect
Description

Whether or not the Master listens to the Master web UI port (hbase.master.info.port) and redirects requests to the web UI server shared by the Master and RegionServer. Config. makes sense when Master is serving Regions (not the default).

无论主服务器是否侦听主web UI端口(hbase.master.info.port),并将请求重定向到主服务器和区域服务器共享的web UI服务器。配置。当Master是服务区域(而不是默认区域)时,这是有意义的。

Default

true

真正的

hbase.master.fileSplitTimeout
Description

Splitting a region, how long to wait on the file-splitting step before aborting the attempt. Default: 600000. This setting used to be known as hbase.regionserver.fileSplitTimeout in hbase-1.x. Split is now run master-side, hence the rename. (If a 'hbase.master.fileSplitTimeout' setting is found, it will be used to prime the current 'hbase.master.fileSplitTimeout' configuration.)

分割一个区域,在中止尝试之前等待文件拆分步骤需要多长时间。默认值:600000。该设置以前称为hbase.区域性服务器。在hbase fileSplitTimeout - 1. x。Split现在运行主端,因此重命名(如果是“hbase.master”。fileSplitTimeout的设置,将使用它来启动当前的hbase.master。fileSplitTimeout的配置。

Default

600000

600000年

hbase.regionserver.port
Description

The port the HBase RegionServer binds to.

HBase区域服务器绑定到的端口。

Default

16020

16020年

hbase.regionserver.info.port
Description

The port for the HBase RegionServer web UI. Set to -1 if you do not want the RegionServer UI to run.

如果您不希望区域服务器UI运行,那么HBase区域服务器web UI的端口设置为-1。

Default

16030

16030年

hbase.regionserver.info.bindAddress
Description

The address for the HBase RegionServer web UI

HBase区域服务器web UI的地址。

Default

0.0.0.0

0.0.0.0

hbase.regionserver.info.port.auto
Description

Whether or not the Master or RegionServer UI should search for a port to bind to. Enables automatic port search if hbase.regionserver.info.port is already in use. Useful for testing, turned off by default.

主服务器或区域服务器UI是否应该搜索一个端口来绑定。支持自动端口搜索,如果hbase.org .info.port已经在使用中。用于测试,默认关闭。

Default

false

hbase.regionserver.handler.count
Description

Count of RPC Listener instances spun up on RegionServers. The same property is used by the Master for the count of master handlers. Too many handlers can be counter-productive. Make it a multiple of the CPU count. If mostly read-only, a handler count close to the CPU count does well. Start with twice the CPU count and tune from there.

在区域服务器上旋转的RPC侦听器实例的计数。主处理程序的主机使用相同的属性。太多的处理程序可能会适得其反。使它成为一个多CPU计数。如果大多数都是只读的,那么处理程序数接近于cpu的数就很好了。从CPU数量的两倍开始。

Default

30

30.
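
For example, on a 16-core RegionServer the guidance above suggests starting at roughly 2 x 16 = 32 handlers and tuning from there; this is only a starting point, not a recommendation.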

hbase.ipc.server.callqueue.handler.factor
Description

Factor to determine the number of call queues. A value of 0 means a single queue shared between all the handlers. A value of 1 means that each handler has its own queue.

决定调用队列数量的因素。值0表示所有处理程序之间共享一个队列。值1意味着每个处理程序都有自己的队列。

Default

0.1

0.1

hbase.ipc.server.callqueue.read.ratio
Description

Split the call queues into read and write queues. The specified interval (which should be between 0.0 and 1.0) will be multiplied by the number of call queues. A value of 0 indicate to not split the call queues, meaning that both read and write requests will be pushed to the same set of queues. A value lower than 0.5 means that there will be less read queues than write queues. A value of 0.5 means there will be the same number of read and write queues. A value greater than 0.5 means that there will be more read queues than write queues. A value of 1.0 means that all the queues except one are used to dispatch read requests. Example: Given the total number of call queues being 10 a read.ratio of 0 means that: the 10 queues will contain both read/write requests. a read.ratio of 0.3 means that: 3 queues will contain only read requests and 7 queues will contain only write requests. a read.ratio of 0.5 means that: 5 queues will contain only read requests and 5 queues will contain only write requests. a read.ratio of 0.8 means that: 8 queues will contain only read requests and 2 queues will contain only write requests. a read.ratio of 1 means that: 9 queues will contain only read requests and 1 queues will contain only write requests.

将调用队列拆分为读和写队列。指定的间隔(应该在0.0到1.0之间)将被调用队列的数量乘以。值0表示不拆分调用队列,这意味着读和写请求将被推到同一组队列。低于0.5的值意味着要比写队列少读取队列。值0.5意味着会有相同数量的读写队列。大于0.5的值意味着会有更多的读取队列,而不是写队列。值为1.0意味着除了一个队列以外的所有队列都用于分派读取请求。示例:给定读取队列的总数为10。0的比率意味着:10个队列将包含读/写请求。一个阅读。0.3的比率意味着:3个队列只包含读请求,7个队列只包含写请求。一个阅读。比率0.5表示:5个队列只包含读请求,5个队列只包含写请求。一个阅读。0.8表示:8个队列只包含读请求,2个队列只包含写请求。一个阅读。比率1表示:9个队列只包含读请求,1个队列只包含写请求。

Default

0

0

hbase.ipc.server.callqueue.scan.ratio
Description

Given the number of read call queues, calculated from the total number of call queues multiplied by the callqueue.read.ratio, the scan.ratio property will split the read call queues into small-read and long-read queues. A value lower than 0.5 means that there will be less long-read queues than short-read queues. A value of 0.5 means that there will be the same number of short-read and long-read queues. A value greater than 0.5 means that there will be more long-read queues than short-read queues A value of 0 or 1 indicate to use the same set of queues for gets and scans. Example: Given the total number of read call queues being 8 a scan.ratio of 0 or 1 means that: 8 queues will contain both long and short read requests. a scan.ratio of 0.3 means that: 2 queues will contain only long-read requests and 6 queues will contain only short-read requests. a scan.ratio of 0.5 means that: 4 queues will contain only long-read requests and 4 queues will contain only short-read requests. a scan.ratio of 0.8 means that: 6 queues will contain only long-read requests and 2 queues will contain only short-read requests.

给定读调用队列的数量,计算从调用队列的总数乘以callqueue。read。比,扫描。ratio属性将read调用队列拆分为小读和长读队列。低于0.5的值意味着,长读队列比短读队列少。值为0.5意味着有相同数量的短读和长读队列。一个大于0.5的值意味着,将会有比短读队列更多的长读队列,一个0或1的值表示使用相同的队列来获取和扫描。示例:给定读取调用队列的总数为8次扫描。0或1的比率意味着:8个队列将包含长时间和短读请求。扫描。0.3的比率意味着:2个队列只包含长读请求,6个队列只包含短读请求。扫描。比率0.5表示:4个队列只包含长读请求,4个队列只包含短读请求。扫描。0.8的比率意味着:6个队列只包含长读请求,2个队列只包含短读请求。

Default

0

0

hbase.regionserver.msginterval
Description

Interval between messages from the RegionServer to Master in milliseconds.

区域服务器之间的消息间隔以毫秒为单位。

Default

3000

3000年

hbase.regionserver.logroll.period
Description

Period at which we will roll the commit log regardless of how many edits it has.

在此期间,我们将滚动提交日志,而不管它有多少编辑器。

Default

3600000

3600000

hbase.regionserver.logroll.errors.tolerated
Description

The number of consecutive WAL close errors we will allow before triggering a server abort. A setting of 0 will cause the region server to abort if closing the current WAL writer fails during log rolling. Even a small value (2 or 3) will allow a region server to ride over transient HDFS errors.

在触发服务器中止之前,我们将允许连续的WAL - close错误数。如果在日志滚动期间关闭当前的WAL writer失败,那么设置0将导致该区域服务器中止。即使是很小的值(2或3),也会允许一个区域服务器通过短暂的HDFS错误。

Default

2

2

hbase.regionserver.hlog.reader.impl
Description

The WAL file reader implementation.

WAL - file阅读器实现。

Default

org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader

org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader

hbase.regionserver.hlog.writer.impl
Description

The WAL file writer implementation.

WAL - file writer的实现。

Default

org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter

org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter

hbase.regionserver.global.memstore.size
Description

Maximum size of all memstores in a region server before new updates are blocked and flushes are forced. Defaults to 40% of heap (0.4). Updates are blocked and flushes are forced until size of all memstores in a region server hits hbase.regionserver.global.memstore.size.lower.limit. The default value in this configuration has been intentionally left empty in order to honor the old hbase.regionserver.global.memstore.upperLimit property if present.

在新更新被阻塞和刷新之前,区域服务器中所有memstore的最大大小。默认值为堆的40%(0.4)。更新被阻塞和刷新,直到一个区域服务器的所有memstore的大小到达hbase.regionserver. global.memstore.lower .limit。这个配置中的默认值是为了纪念旧的hbase.org .global.memstore而故意留下的。如果存在upperLimit属性。

Default

none

没有一个

hbase.regionserver.global.memstore.size.lower.limit
Description

Maximum size of all memstores in a region server before flushes are forced. Defaults to 95% of hbase.regionserver.global.memstore.size (0.95). A 100% value for this value causes the minimum possible flushing to occur when updates are blocked due to memstore limiting. The default value in this configuration has been intentionally left empty in order to honor the old hbase.regionserver.global.memstore.lowerLimit property if present.

在刷新之前,区域服务器中所有memstore的最大大小都是强制的。默认值为95%的hbase.org .global.memstore。大小(0.95)。此值的100%值会导致当更新由于memstore限制而阻塞时可能发生的最小刷新。这个配置中的默认值是为了纪念旧的hbase.org .global.memstore而故意留下的。如果存在lowerLimit属性。

Default

none

没有一个

hbase.systemtables.compacting.memstore.type
Description

Determines the type of memstore to be used for system tables like META, namespace tables etc. By default NONE is the type and hence we use the default memstore for all the system tables. If we need to use compacting memstore for system tables then set this property to BASIC/EAGER

确定用于系统表的memstore类型,如元、名称空间表等。缺省情况下,NONE是类型,因此我们为所有系统表使用默认的memstore。如果需要对系统表使用压缩memstore,则将此属性设置为BASIC/EAGER。

Default

NONE

没有一个

hbase.regionserver.optionalcacheflushinterval
Description

Maximum amount of time an edit lives in memory before being automatically flushed. Default 1 hour. Set it to 0 to disable automatic flushing.

在自动刷新之前,在内存中编辑生命的最长时间。默认的1小时。设置为0以禁用自动刷新。

Default

3600000

3600000

hbase.regionserver.dns.interface
Description

The name of the Network Interface from which a region server should report its IP address.

一个区域服务器应该报告其IP地址的网络接口的名称。

Default

default

默认的

hbase.regionserver.dns.nameserver
Description

The host name or IP address of the name server (DNS) which a region server should use to determine the host name used by the master for communication and display purposes.

名称服务器(DNS)的主机名或IP地址,区域服务器应该使用它来确定主用于通信和显示目的使用的主机名。

Default

default

默认的

hbase.regionserver.region.split.policy
Description

A split policy determines when a region should be split. The various other split policies that are available currently are BusyRegionSplitPolicy, ConstantSizeRegionSplitPolicy, DisabledRegionSplitPolicy, DelimitedKeyPrefixRegionSplitPolicy, KeyPrefixRegionSplitPolicy, and SteppingSplitPolicy. DisabledRegionSplitPolicy blocks manual region splitting.

分割策略决定一个区域何时应该被分割。当前可用的各种其他分裂策略包括busysplitpolicy、ConstantSizeRegionSplitPolicy、DisabledRegionSplitPolicy、delimitedkeyprefixsplitpolicy、keyprefixsplitpolicy和SteppingSplitPolicy。DisabledRegionSplitPolicy阻止手动区域分割。

Default

org.apache.hadoop.hbase.regionserver.SteppingSplitPolicy

org.apache.hadoop.hbase.regionserver.SteppingSplitPolicy

hbase.regionserver.regionSplitLimit
Description

Limit for the number of regions after which no more region splitting should take place. This is not a hard limit for the number of regions but acts as a guideline for the regionserver to stop splitting after a certain limit. Default is set to 1000.

在没有更多区域分裂的区域的数量限制应该发生。这对于区域的数量并不是严格的限制,而是作为区域服务器在一定限度后停止分裂的指南。默认设置为1000。

Default

1000

1000年

zookeeper.session.timeout
Description

ZooKeeper session timeout in milliseconds. It is used in two different ways. First, this value is used in the ZK client that HBase uses to connect to the ensemble. It is also used by HBase when it starts a ZK server and it is passed as the 'maxSessionTimeout'. See http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions. For example, if an HBase region server connects to a ZK ensemble that’s also managed by HBase, then the session timeout will be the one specified by this configuration. But, a region server that connects to an ensemble managed with a different configuration will be subject to that ensemble’s maxSessionTimeout. So, even though HBase might propose using 90 seconds, the ensemble can have a max timeout lower than this and it will take precedence. The current default that ZK ships with is 40 seconds, which is lower than HBase’s.

ZooKeeper会话超时以毫秒为单位。它有两种用途。首先,该值用于HBase用于连接集成的ZK客户端。HBase在启动ZK服务器时也使用它,并作为“maxSessionTimeout”传递。见http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html # ch_zkSessions。例如,如果HBase区域服务器连接到由HBase管理的ZK集成,那么会话超时将由该配置指定。但是,连接到一个由不同配置管理的集成的区域服务器将会受到集成的maxSessionTimeout的影响。因此,尽管HBase可能建议使用90秒,但是集成可以有一个比这个更低的超时值,而且它将优先。ZK船现在的默认值是40秒,低于HBase。

Default

90000

90000年

zookeeper.znode.parent
Description

Root ZNode for HBase in ZooKeeper. All of HBase’s ZooKeeper files that are configured with a relative path will go under this node. By default, all of HBase’s ZooKeeper file paths are configured with a relative path, so they will all go under this directory unless changed.

在ZooKeeper中HBase的根ZNode。所有配置了相对路径的HBase的ZooKeeper文件都将在这个节点下运行。默认情况下,所有HBase的ZooKeeper文件路径都配置了相对路径,所以它们都将在这个目录下运行,除非发生更改。

Default

/hbase

/ hbase

zookeeper.znode.acl.parent
Description

Root ZNode for access control lists.

访问控制列表的根ZNode。

Default

acl

acl

hbase.zookeeper.dns.interface
Description

The name of the Network Interface from which a ZooKeeper server should report its IP address.

一个ZooKeeper服务器应该报告其IP地址的网络接口的名称。

Default

default

默认的

hbase.zookeeper.dns.nameserver
Description

The host name or IP address of the name server (DNS) which a ZooKeeper server should use to determine the host name used by the master for communication and display purposes.

名称服务器(DNS)的主机名或IP地址,由ZooKeeper服务器使用,以确定主用于通信和显示目的使用的主机名。

Default

default

默认的

hbase.zookeeper.peerport
Description

Port used by ZooKeeper peers to talk to each other. See http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperStarted.html#sc_RunningReplicatedZooKeeper for more information.

ZooKeeper的同伴使用的端口,可以互相交谈。查看http://hadoop. apache.org/zookeeper/docs/r3.1.1/zookeeperstar.html #sc_RunningReplicatedZooKeeper获取更多信息。

Default

2888

2888年

hbase.zookeeper.leaderport
Description

Port used by ZooKeeper for leader election. See http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperStarted.html#sc_RunningReplicatedZooKeeper for more information.

用于领导选举的动物管理员使用的港口。查看http://hadoop. apache.org/zookeeper/docs/r3.1.1/zookeeperstar.html #sc_RunningReplicatedZooKeeper获取更多信息。

Default

3888

3888年

hbase.zookeeper.property.initLimit
Description

Property from ZooKeeper’s config zoo.cfg. The number of ticks that the initial synchronization phase can take.

ZooKeeper的配置zoo.cfg的属性。初始同步阶段可以使用的滴答数。

Default

10

10

hbase.zookeeper.property.syncLimit
Description

Property from ZooKeeper’s config zoo.cfg. The number of ticks that can pass between sending a request and getting an acknowledgment.

ZooKeeper的配置zoo.cfg的属性。可以在发送请求和得到确认之间传递的滴答数。

Default

5

5

hbase.zookeeper.property.dataDir
Description

Property from ZooKeeper’s config zoo.cfg. The directory where the snapshot is stored.

ZooKeeper的配置zoo.cfg的属性。存储快照的目录。

Default

${hbase.tmp.dir}/zookeeper

$ { hbase.tmp.dir } /动物园管理员

hbase.zookeeper.property.clientPort
Description

Property from ZooKeeper’s config zoo.cfg. The port at which the clients will connect.

ZooKeeper的配置zoo.cfg的属性。客户端连接的端口。

Default

2181

2181年

hbase.zookeeper.property.maxClientCnxns
Description

Property from ZooKeeper’s config zoo.cfg. Limit on number of concurrent connections (at the socket level) that a single client, identified by IP address, may make to a single member of the ZooKeeper ensemble. Set high to avoid zk connection issues running standalone and pseudo-distributed.

ZooKeeper的配置zoo.cfg的属性。对单个客户端(由IP地址标识)的并发连接数量的限制,可以向ZooKeeper集成的单个成员发送。设置高,以避免zk连接问题运行独立和伪分布。

Default

300

300年

hbase.client.write.buffer
Description

Default size of the BufferedMutator write buffer in bytes. A bigger buffer takes more memory — on both the client and server side since server instantiates the passed write buffer to process it — but a larger buffer size reduces the number of RPCs made. For an estimate of server-side memory-used, evaluate hbase.client.write.buffer * hbase.regionserver.handler.count

BufferedMutator的默认大小以字节为单位编写缓冲区。一个更大的缓冲区会占用更多的内存——因为服务器实例化了已传递的写缓冲区来处理它——但是更大的缓冲区大小减少了RPCs的数量。对于服务器端内存使用的估计,评估hbase.client.write。缓冲* hbase.regionserver.handler.count

Default

2097152

2097152
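
As a rough worked example using the defaults shown here, 2097152 bytes (2 MB) x 30 handlers is about 60 MB of potential write-buffer memory per RegionServer; adjust the estimate if you change either setting.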

hbase.client.pause
Description

General client pause value. Used mostly as value to wait before running a retry of a failed get, region lookup, etc. See hbase.client.retries.number for description of how we backoff from this initial pause amount and how this pause works w/ retries.

一般客户暂停价值。在运行一个失败的get、区域查找等重试之前,主要使用的是等待的值。描述我们如何从最初的暂停数量和暂停如何工作w/重试的描述。

Default

100

One hundred.

hbase.client.pause.cqtbe
Description

Whether or not to use a special client pause for CallQueueTooBigException (cqtbe). Set this property to a higher value than hbase.client.pause if you observe frequent CQTBE from the same RegionServer and the call queue there keeps full

是否使用一个特殊的客户端暂停调用CallQueueTooBigException (cqtbe)。将此属性设置为高于hbase.client的值。如果您在同一区域服务器上观察频繁的CQTBE,并且在那里的调用队列保持完整,请暂停。

Default

none

没有一个

hbase.client.retries.number
Description

Maximum retries. Used as maximum for all retryable operations such as the getting of a cell’s value, starting a row update, etc. Retry interval is a rough function based on hbase.client.pause. At first we retry at this interval but then with backoff, we pretty quickly reach retrying every ten seconds. See HConstants#RETRY_BACKOFF for how the backup ramps up. Change this setting and hbase.client.pause to suit your workload.

最大重试。对于所有可重新尝试的操作,如获取单元的值、开始行更新等,使用的最大。重试间隔是基于hbase.client的粗糙函数。一开始我们在这个时间间隔重试,但之后我们会很快的重新尝试每十秒。请参阅hconstant #RETRY_BACKOFF,以了解备份是如何提高的。更改此设置和hbase.client。暂停以适应你的工作量。

Default

15

15

hbase.client.max.total.tasks
Description

The maximum number of concurrent mutation tasks a single HTable instance will send to the cluster.

单个HTable实例将发送到集群的并发突变任务的最大数量。

Default

100

One hundred.

hbase.client.max.perserver.tasks
Description

The maximum number of concurrent mutation tasks a single HTable instance will send to a single region server.

单个HTable实例将发送到单个区域服务器的并发突变任务的最大数量。

Default

2

2

hbase.client.max.perregion.tasks
Description

The maximum number of concurrent mutation tasks the client will maintain to a single Region. That is, if there are already hbase.client.max.perregion.tasks writes in progress for this region, new puts won’t be sent to this region until some writes finish.

客户机将维护到单个区域的并发突变任务的最大数量。也就是说,如果已经有hbase.client.max.perregion。在这个区域的任务写入过程中,新的put将不会被发送到这个区域,直到一些写完为止。

Default

1

1

hbase.client.perserver.requests.threshold
Description

The max number of concurrent pending requests for one server in all client threads (process level). Requests exceeding this limit will immediately fail with a ServerTooBusyException, to prevent a user’s threads from being occupied and blocked by a single slow region server. If you use a fixed number of threads to access HBase in a synchronous way, setting this to a suitable value related to the number of threads will help you. See https://issues.apache.org/jira/browse/HBASE-16388 for details.

在所有客户端线程(流程级别)中,一个服务器的并发挂起请求的最大数量。超过请求将立即抛出ServerTooBusyException,以防止用户的线程被一个缓慢的区域服务器占用和阻塞。如果您使用固定数量的线程以同步的方式访问HBase,将其设置为与线程数相关的适当值将帮助您。有关详细信息,请参阅https://issues.apache.org/jira/browse/hbase - 16388。

Default

2147483647

2147483647

hbase.client.scanner.caching
Description

Number of rows that we try to fetch when calling next on a scanner if it is not served from (local, client) memory. This configuration works together with hbase.client.scanner.max.result.size to try and use the network efficiently. The default value is Integer.MAX_VALUE, so that the network will fill the chunk size defined by hbase.client.scanner.max.result.size rather than be limited by a particular number of rows, since the size of rows varies table to table. If you know ahead of time that you will not require more than a certain number of rows from a scan, this configuration should be set to that row limit via Scan#setCaching. Higher caching values will enable faster scanners but will eat up more memory, and some calls of next may take longer and longer times when the cache is empty. Do not set this value such that the time between invocations is greater than the scanner timeout, i.e. hbase.client.scanner.timeout.period.

如果没有从(本地、客户端)内存中提供服务,那么我们在调用next时尝试获取的行数。该配置与hbase.client.scanner.max.result一起工作。大小尝试并有效地使用网络。默认值是整数。默认的MAX_VALUE,这样网络将填充由hbase.client.scanner.max.result定义的块大小。大小,而不是受特定的行数限制,因为行的大小因表而异。如果您提前知道,您将不需要从扫描中获得一定数量的行,那么该配置应该通过scan # setcache设置为该行限制。更高的缓存值将启用更快的扫描器,但会消耗更多的内存,而当缓存为空时,接下来的一些调用可能会花费更长的时间。不要设置此值,因为调用之间的时间大于扫描超时;即hbase.client.scanner.timeout.period

Default

2147483647

2147483647

hbase.client.keyvalue.maxsize
Description

Specifies the combined maximum allowed size of a KeyValue instance. This sets an upper boundary for a single entry saved in a storage file. Since KeyValues cannot be split, this helps avoid a situation where a region cannot be split any further because the data is too large. It seems wise to set this to a fraction of the maximum region size. Setting it to zero or less disables the check.

指定键值实例的最大允许大小。这是为保存在存储文件中的单个条目设置一个上限。因为它们不能被分割,所以避免了一个区域不能被分割,因为数据太大了。将其设置为最大区域大小的一小部分似乎是明智的。将其设置为零或更少的禁用该检查。

Default

10485760

10485760

hbase.server.keyvalue.maxsize
Description

Maximum allowed size of an individual cell, inclusive of value and all key components. A value of 0 or less disables the check. The default value is 10MB. This is a safety setting to protect the server from OOM situations.

允许单个单元的最大允许大小,包括值和所有关键组件。值为0或更小的值将会禁用该检查。默认值是10MB。这是一个安全设置,以保护服务器不受OOM情况的影响。

Default

10485760

10485760

hbase.client.scanner.timeout.period
Description

Client scanner lease period in milliseconds.

客户端扫描仪租期以毫秒为单位。

Default

60000

60000年

hbase.client.localityCheck.threadPoolSize
Default

2

2

hbase.bulkload.retries.number
Description

Maximum retries. This is the maximum number of iterations of atomic bulk loads that are attempted in the face of splitting operations; 0 means never give up.

最大重试。这是对原子批量负载的最大迭代次数的尝试,在分裂操作0意味着永不放弃。

Default

10

10

hbase.master.balancer.maxRitPercent
Description

The max percent of regions in transition when balancing. The default value is 1.0, so there is no balancer throttling. If this config is set to 0.01, it means that at most 1% of regions are in transition when balancing. Then the cluster’s availability is at least 99% when balancing.

在平衡时,过渡区域的最大百分比。默认值是1.0。所以没有平衡。如果将此配置设置为0.01,则意味着在平衡时,在转换过程中最多有1%的区域。然后,在平衡时,集群的可用性至少达到99%。

Default

1.0

1.0

hbase.balancer.period
Description

Period at which the region balancer runs in the Master.

区域平衡器在主内运行的周期。

Default

300000

300000年

hbase.normalizer.period
Description

Period at which the region normalizer runs in the Master.

在主程序中,区域正常程序运行的时间。

Default

300000

300000年

hbase.regions.slop
Description

Rebalance if any regionserver has average + (average * slop) regions. The default value of this parameter is 0.001 in StochasticLoadBalancer (the default load balancer), while the default is 0.2 in other load balancers (i.e., SimpleLoadBalancer).

如果任何区域服务器有平均+(平均* slop)区域,则重新平衡。这个参数的默认值是StochasticLoadBalancer(默认负载均衡器)的0.001,而其他负载均衡器的默认值为0.2。SimpleLoadBalancer)。

Default

0.001

0.001

hbase.server.thread.wakefrequency
Description

Time to sleep in between searches for work (in milliseconds). Used as sleep interval by service threads such as log roller.

在搜索工作(以毫秒为单位)之间的时间。以服务线程(如日志滚轮)作为睡眠时间间隔。

Default

10000

10000年

hbase.server.versionfile.writeattempts
Description

How many times to retry attempting to write a version file before just aborting. Each attempt is separated by the hbase.server.thread.wakefrequency milliseconds.

有多少次尝试在放弃之前尝试写一个版本文件。每个尝试都由hbase.server.thread分隔。wakefrequency毫秒。

Default

3

3

hbase.hregion.memstore.flush.size
Description

Memstore will be flushed to disk if size of the memstore exceeds this number of bytes. Value is checked by a thread that runs every hbase.server.thread.wakefrequency.

如果Memstore的大小超过这个字节数,Memstore将被刷新到磁盘。值是由运行每个hbase.server. server.thread.wakefrequency的线程来检查的。

Default

134217728

134217728

hbase.hregion.percolumnfamilyflush.size.lower.bound.min
Description

If FlushLargeStoresPolicy is used and there are multiple column families, then every time that we hit the total memstore limit, we find out all the column families whose memstores exceed a "lower bound" and only flush them while retaining the others in memory. The "lower bound" will be "hbase.hregion.memstore.flush.size / column_family_number" by default unless value of this property is larger than that. If none of the families have their memstore size more than lower bound, all the memstores will be flushed (just as usual).

如果使用FlushLargeStoresPolicy,并且有多个列族,那么每次我们达到内存存储限制时,我们就会发现所有的列家庭,它们的memstore超过了一个“下界”,只在内存中保留其他的时候刷新它们。“下界”将是“hbase.hzone .memstore.flush”。默认情况下,除非这个属性的值大于这个值,否则默认值为“size / column_family_number”。如果没有一个家庭的memstore大小超过下界,那么所有的memstore都会被刷新(就像往常一样)。

Default

16777216

16777216

hbase.hregion.preclose.flush.size
Description

If the memstores in a region are this size or larger when we go to close, run a "pre-flush" to clear out memstores before we put up the region closed flag and take the region offline. On close, a flush is run under the close flag to empty memory. During this time the region is offline and we are not taking on any writes. If the memstore content is large, this flush could take a long time to complete. The preflush is meant to clean out the bulk of the memstore before putting up the close flag and taking the region offline so the flush that runs under the close flag has little to do.

如果一个区域的memstore在我们关闭的时候是这个大小或者更大,那么在我们挂起该区域关闭标志并将该区域关闭之前,运行一个“预刷新”来清空memstores。在关闭的情况下,一个刷新在关闭的标志下运行到空内存。在此期间,该区域处于脱机状态,我们不承担任何写入操作。如果memstore内容很大,那么这个刷新可能需要很长时间才能完成。preflush的意思是在挂起关闭的标志和关闭该区域之前清除内存中的大部分,所以在关闭标志下运行的刷新没有什么作用。

Default

5242880

5242880

hbase.hregion.memstore.block.multiplier
Description

Block updates if a memstore has hbase.hregion.memstore.block.multiplier times hbase.hregion.memstore.flush.size bytes. Useful for preventing runaway memstore growth during spikes in update traffic. Without an upper bound, the memstore fills such that when it flushes, the resultant flush files take a long time to compact or split, or worse, we OOME.

如果memstore有hbase. hzone .memstore.block,就可以对其进行块更新。hbase.hregion.memstore.flush乘数倍。字节大小。在更新流量的峰值期间防止失控的memstore。如果没有上界,memstore就会填充,这样当它刷新合成的刷新文件时,就需要很长的时间来压缩或拆分,或者更糟,我们OOME。

Default

4

4
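
With the defaults, this works out to 4 x 134217728 bytes (128 MB) = 512 MB in a single region’s memstore before its updates are blocked.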

hbase.hregion.memstore.mslab.enabled
Description

Enables the MemStore-Local Allocation Buffer, a feature which works to prevent heap fragmentation under heavy write loads. This can reduce the frequency of stop-the-world GC pauses on large heaps.

启用MemStore-Local分配缓冲区,该特性可以防止在重写负载下的堆碎片。这可以减少在大型堆上停止世界GC暂停的频率。

Default

true

真正的

hbase.hregion.max.filesize
Description

Maximum HFile size. If the sum of the sizes of a region’s HFiles has grown to exceed this value, the region is split in two.

最大HFile大小。如果一个区域的HFiles的大小总和已经超过这个值,那么该区域将被一分为二。

Default

10737418240

10737418240

hbase.hregion.majorcompaction
Description

Time between major compactions, expressed in milliseconds. Set to 0 to disable time-based automatic major compactions. User-requested and size-based major compactions will still run. This value is multiplied by hbase.hregion.majorcompaction.jitter to cause compaction to start at a somewhat-random time during a given window of time. The default value is 7 days, expressed in milliseconds. If major compactions are causing disruption in your environment, you can configure them to run at off-peak times for your deployment, or disable time-based major compactions by setting this parameter to 0, and run major compactions in a cron job or by another external mechanism.

主要压实之间的时间,以毫秒表示。设置为0,以禁用基于时间的自动主要压缩。用户请求和基于大小的主要压缩仍然会运行。这个值乘以hbase。hbase。在给定时间窗口内的某个随机时间,抖动导致压缩。默认值为7天,以毫秒表示。如果主要的压缩在您的环境中造成了破坏,您可以配置它们在非高峰时间运行您的部署,或者通过将此参数设置为0来禁用基于时间的主要压缩,并在cron作业或其他外部机制中运行主要的压缩。

Default

604800000

604800000
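
As one possible sketch of the cron-based approach, you might disable time-based major compactions in hbase-site.xml and schedule them yourself; the table name, schedule, and install path below are placeholders:

<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>

# example crontab entry: major-compact 'mytable' every Sunday at 03:00
0 3 * * 0  echo "major_compact 'mytable'" | /opt/hbase/bin/hbase shell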

hbase.hregion.majorcompaction.jitter
Description

A multiplier applied to hbase.hregion.majorcompaction to cause compaction to occur a given amount of time either side of hbase.hregion.majorcompaction. The smaller the number, the closer the compactions will happen to the hbase.hregion.majorcompaction interval.

一个用于hbase.hregion的乘数。主要压实作用,使压实发生在某一给定的时间内,即hbase.h。数字越小,hbase.hregion就会越紧密。majorcompaction区间。

Default

0.50

0.50

hbase.hstore.compactionThreshold
Description

If more than this number of StoreFiles exist in any one Store (one StoreFile is written per flush of MemStore), a compaction is run to rewrite all StoreFiles into a single StoreFile. Larger values delay compaction, but when compaction does occur, it takes longer to complete.

如果在任何一个商店中存在超过此数量的存储文件(每个存储文件都有一个存储文件),那么就会运行一个compaction来将所有的存储文件重写为一个单一的StoreFile。较大的值延迟了压缩,但是当压缩确实发生时,需要更长的时间才能完成。

Default

3

3

hbase.hstore.flusher.count
Description

The number of flush threads. With fewer threads, the MemStore flushes will be queued. With more threads, the flushes will be executed in parallel, increasing the load on HDFS, and potentially causing more compactions.

刷新线程的数量。使用更少的线程,MemStore刷新将被排队。有了更多的线程,这些刷新将并行执行,增加了对HDFS的负载,并可能导致更多的压缩。

Default

2

2

hbase.hstore.blockingStoreFiles
Description

If more than this number of StoreFiles exist in any one Store (one StoreFile is written per flush of MemStore), updates are blocked for this region until a compaction is completed, or until hbase.hstore.blockingWaitTime has been exceeded.

如果在任何一个存储库中存在超过此数量的存储文件(每个存储文件都有一个存储文件),那么该区域的更新将被阻塞,直到完成压缩,或者直到hbase.hstore。blockingWaitTime已经超过了。

Default

16

16

hbase.hstore.blockingWaitTime
Description

The time for which a region will block updates after reaching the StoreFile limit defined by hbase.hstore.blockingStoreFiles. After this time has elapsed, the region will stop blocking updates even if a compaction has not been completed.

在到达由hbase.hstore.blockingStoreFiles定义的存储文件限制后,该区域将阻塞更新的时间。在这段时间过后,即使没有完成压缩,该区域也将停止阻塞更新。

Default

90000

90000年

hbase.hstore.compaction.min
Description

The minimum number of StoreFiles which must be eligible for compaction before compaction can run. The goal of tuning hbase.hstore.compaction.min is to avoid ending up with too many tiny StoreFiles to compact. Setting this value to 2 would cause a minor compaction each time you have two StoreFiles in a Store, and this is probably not appropriate. If you set this value too high, all the other values will need to be adjusted accordingly. For most cases, the default value is appropriate. In previous versions of HBase, the parameter hbase.hstore.compaction.min was named hbase.hstore.compactionThreshold.

在compaction运行之前,必须符合压缩条件的最小存储文件数。优化hbase.hstore.compaction.min的目的是避免使用太多的小存储文件来压缩。将此值设置为2会导致每次在存储中有两个storefile时都会产生一个小的压缩,而这可能是不合适的。如果将此值设置得过高,则需要相应地调整所有其他值。对于大多数情况,默认值是合适的。在HBase的以前版本中,HBase .hstore.compaction.min被命名为hbase.hstore.compactionThreshold。

Default

3

3

hbase.hstore.compaction.max
Description

The maximum number of StoreFiles which will be selected for a single minor compaction, regardless of the number of eligible StoreFiles. Effectively, the value of hbase.hstore.compaction.max controls the length of time it takes a single compaction to complete. Setting it larger means that more StoreFiles are included in a compaction. For most cases, the default value is appropriate.

将为单个小型压缩而选择的存储文件的最大数量,而不考虑合格的存储文件的数量。实际上,hbase.hstore.compaction.max控制了单个压缩完成的时间长度。更大的设置意味着在压缩中包含更多的存储文件。对于大多数情况,默认值是合适的。

Default

10

10

hbase.hstore.compaction.min.size
Description

A StoreFile (or a selection of StoreFiles, when using ExploringCompactionPolicy) smaller than this size will always be eligible for minor compaction. HFiles this size or larger are evaluated by hbase.hstore.compaction.ratio to determine if they are eligible. Because this limit represents the "automatic include" limit for all StoreFiles smaller than this value, this value may need to be reduced in write-heavy environments where many StoreFiles in the 1-2 MB range are being flushed, because every StoreFile will be targeted for compaction and the resulting StoreFiles may still be under the minimum size and require further compaction. If this parameter is lowered, the ratio check is triggered more quickly. This addressed some issues seen in earlier versions of HBase but changing this parameter is no longer necessary in most situations. Default: 128 MB expressed in bytes.

小于这个大小的StoreFile(或使用ExploringCompactionPolicy的存储文件的选择)总是符合小型压缩的条件。这个大小的HFiles由hbase.hstore.compaction.ratio来计算,以确定它们是否符合条件。因为这个限制代表了“自动包括“限制StoreFiles小于这个值,这个值可能需要减少write-heavy环境中1 - 2 MB的许多StoreFiles范围被刷新,因为每个StoreFile将针对压实和由此产生的StoreFiles仍可能受到最小的大小和需要进一步压实。如果这个参数被降低,比率检查被触发的更快。这解决了在早期版本的HBase中所看到的一些问题,但是在大多数情况下更改此参数不再是必要的。默认值:128 MB以字节表示。

Default

134217728

134217728

hbase.hstore.compaction.max.size
Description

A StoreFile (or a selection of StoreFiles, when using ExploringCompactionPolicy) larger than this size will be excluded from compaction. The effect of raising hbase.hstore.compaction.max.size is fewer, larger StoreFiles that do not get compacted often. If you feel that compaction is happening too often without much benefit, you can try raising this value. Default: the value of LONG.MAX_VALUE, expressed in bytes.

大于这个大小的StoreFile(或使用ExploringCompactionPolicy的存储文件的选择)将被排除在compaction之外。增大hbase.hstore.compaction.max.size的效果更小,更大的存储文件不经常被压缩。如果你觉得压实经常发生,没有太多好处,你可以试着提高这个值。默认值:LONG的值。MAX_VALUE,表示字节。

Default

9223372036854775807

9223372036854775807

hbase.hstore.compaction.ratio
Description

For minor compaction, this ratio is used to determine whether a given StoreFile which is larger than hbase.hstore.compaction.min.size is eligible for compaction. Its effect is to limit compaction of large StoreFiles. The value of hbase.hstore.compaction.ratio is expressed as a floating-point decimal. A large ratio, such as 10, will produce a single giant StoreFile. Conversely, a low value, such as .25, will produce behavior similar to the BigTable compaction algorithm, producing four StoreFiles. A moderate value of between 1.0 and 1.4 is recommended. When tuning this value, you are balancing write costs with read costs. Raising the value (to something like 1.4) will have more write costs, because you will compact larger StoreFiles. However, during reads, HBase will need to seek through fewer StoreFiles to accomplish the read. Consider this approach if you cannot take advantage of Bloom filters. Otherwise, you can lower this value to something like 1.0 to reduce the background cost of writes, and use Bloom filters to control the number of StoreFiles touched during reads. For most cases, the default value is appropriate.

对于较小的压缩,此比率用于确定一个给定的存储文件是否大于hbase.hstore.hstore.comaction .min.size具有压缩的条件。它的作用是限制大型存储文件的压缩。hbase.hstore.compaction.ratio被表示为浮点小数。一个大的比率,例如10,将产生一个单一的巨大的存储文件。相反,像.25这样的低值将产生类似于BigTable compaction算法的行为,生成4个StoreFiles。建议在1.0和1.4之间有一个适中的值。在调优这个值时,您需要平衡写成本和读取成本。提高值(大约1.4)会有更多的写成本,因为你会压缩更大的存储文件。但是,在读取过程中,HBase需要通过更少的存储文件来完成读取操作。如果不能利用Bloom过滤器,请考虑这种方法。否则,您可以将这个值降低到类似于1.0这样的东西,以降低写的后台成本,并使用Bloom过滤器来控制在读取过程中被触摸的存储文件的数量。对于大多数情况,默认值是合适的。

Default

1.2F

1.2度

hbase.hstore.compaction.ratio.offpeak
Description

Allows you to set a different (by default, more aggressive) ratio for determining whether larger StoreFiles are included in compactions during off-peak hours. Works in the same way as hbase.hstore.compaction.ratio. Only applies if hbase.offpeak.start.hour and hbase.offpeak.end.hour are also enabled.

允许您设置不同的(默认的,更积极的)比率,以确定在非繁忙时间内是否包含更大的存储文件。与hbase.hstore.c .compaction.ratio相同。仅适用于如果hbase.offpeak.start。小时,hbase.offpeak.end。小时也启用。

Default

5.0F

5.0度

hbase.hstore.time.to.purge.deletes
Description

The amount of time to delay purging of delete markers with future timestamps. If unset, or set to 0, all delete markers, including those with future timestamps, are purged during the next major compaction. Otherwise, a delete marker is kept until the major compaction which occurs after the marker’s timestamp plus the value of this setting, in milliseconds.

在未来时间戳中延迟删除标记的时间。如果未设置或设置为0,所有删除标记,包括那些具有未来时间戳的标记,将在接下来的主要压缩过程中被清除。否则,在标记的时间戳加上该设置的值(以毫秒为单位)后,将保留一个删除标记,直到主压缩。

Default

0

0

hbase.offpeak.start.hour
Description

The start of off-peak hours, expressed as an integer between 0 and 23, inclusive. Set to -1 to disable off-peak.

非高峰时间的开始,表示为0到23之间的整数,包括。设置为-1,以禁用非峰值。

Default

-1

1

hbase.offpeak.end.hour
Description

The end of off-peak hours, expressed as an integer between 0 and 23, inclusive. Set to -1 to disable off-peak.

非高峰时间的结束,表示为0到23之间的整数,包括。设置为-1,以禁用非峰值。

Default

-1

1
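
As a sketch, the following would declare an off-peak window from 22:00 to 06:00, during which the hbase.hstore.compaction.ratio.offpeak value above applies; the hours are illustrative:

<property>
  <name>hbase.offpeak.start.hour</name>
  <value>22</value>
</property>
<property>
  <name>hbase.offpeak.end.hour</name>
  <value>6</value>
</property>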

hbase.regionserver.thread.compaction.throttle
Description

There are two different thread pools for compactions, one for large compactions and the other for small compactions. This helps to keep compaction of lean tables (such as hbase:meta) fast. If a compaction is larger than this threshold, it goes into the large compaction pool. In most cases, the default value is appropriate. Default: 2 x hbase.hstore.compaction.max x hbase.hregion.memstore.flush.size (which defaults to 128MB). The value field assumes that the value of hbase.hregion.memstore.flush.size is unchanged from the default.

compaction有两个不同的线程池,一个用于大的压缩,另一个用于小的压缩。这有助于保持瘦表(如hbase:meta)的紧凑。如果一个压缩比这个阈值大,它就会进入大的压缩池。在大多数情况下,默认值是适当的。默认值:2 x hbase. hstore.m。hbase。大小(默认为128MB)。值字段假定hbase.harea .memstore.flush的值。大小与默认值保持不变。

Default

2684354560

2684354560
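
That default follows from the formula above: 2 x 10 (hbase.hstore.compaction.max) x 134217728 bytes (128 MB) = 2684354560 bytes, roughly 2.5 GB.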

hbase.regionserver.majorcompaction.pagecache.drop
Description

Specifies whether to drop pages read/written into the system page cache by major compactions. Setting it to true helps prevent major compactions from polluting the page cache, which is almost always required, especially for clusters with low/moderate memory to storage ratio.

指定是否通过主要的压缩文件将读/写入到系统页面缓存中。将其设置为true有助于防止主要的压缩行为污染页面缓存,这几乎总是必需的,特别是对于低/中等内存的集群来说。

Default

true

真正的

hbase.regionserver.minorcompaction.pagecache.drop
Description

Specifies whether to drop pages read/written into the system page cache by minor compactions. Setting it to true helps prevent minor compactions from polluting the page cache, which is most beneficial on clusters with low memory to storage ratio or very write heavy clusters. You may want to set it to false under moderate to low write workload when bulk of the reads are on the most recently written data.

指定是否通过较小的压缩来将读/写进系统页面缓存。将其设置为true可以帮助防止轻微的压缩导致页面缓存的污染,这对低内存的集群非常有益,也可以很好地编写集群。当大部分读操作都在最近的书面数据上时,您可能希望将其设置为在中等到低的写工作负载下。

Default

true

真正的

hbase.hstore.compaction.kv.max
Description

The maximum number of KeyValues to read and then write in a batch when flushing or compacting. Set this lower if you have big KeyValues and problems with Out Of Memory Exceptions. Set this higher if you have wide, small rows.

当刷新或压缩时,要读取的键值的最大数量,然后在批处理中写入。如果你有大的键值,如果你有很大的内存不足,那么设置这个值,如果你有宽的小的行,就设置这个值。

Default

10

10

hbase.storescanner.parallel.seek.enable
Description

Enables StoreFileScanner parallel-seeking in StoreScanner, a feature which can reduce response latency under special conditions.

在StoreScanner中启用StoreFileScanner并行搜索,这一特性可以在特殊情况下减少响应延迟。

Default

false

hbase.storescanner.parallel.seek.threads
Description

The default thread pool size if the parallel-seeking feature is enabled.

如果启用了并行查找功能,则默认的线程池大小。

Default

10

10

hfile.block.cache.size
Description

Percentage of maximum heap (-Xmx setting) to allocate to block cache used by a StoreFile. Default of 0.4 means allocate 40%. Set to 0 to disable but it’s not recommended; you need at least enough cache to hold the storefile indices.

最大堆(-Xmx设置)的百分比,用于分配存储文件使用的块缓存。默认的0.4意味着分配40%。设置为0禁用,但不建议使用;您需要至少足够的缓存来保存storefile索引。

Default

0.4

0.4
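
For example, with a 16 GB heap (-Xmx16g), the default of 0.4 reserves roughly 0.4 x 16 GB = 6.4 GB for the block cache.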

hfile.block.index.cacheonwrite
Description

This allows non-root multi-level index blocks to be put into the block cache at the time the index is being written.

这允许在编写索引时将非根的多级索引块放到块缓存中。

Default

false

hfile.index.block.max.size
Description

When the size of a leaf-level, intermediate-level, or root-level index block in a multi-level block index grows to this size, the block is written out and a new block is started.

当一个多层块索引的叶级、中间层或根级索引块的大小增长到这个大小时,块就被写出来了,一个新的块就开始了。

Default

131072

131072年

hbase.bucketcache.ioengine
Description

Where to store the contents of the bucketcache. One of: offheap, file, files or mmap. If a file or files, set it to file(s):PATH_TO_FILE. mmap means the content will be in an mmaped file. Use mmap:PATH_TO_FILE. See http://hbase.apache.org/book.html#offheap.blockcache for more information.

在哪里存储bucketcache的内容。一个:offheap, file, files或mmap。如果文件或文件,将其设置为file(s):PATH_TO_FILE。mmap意味着内容将在mmaped文件中。用mmap:PATH_TO_FILE。参见http://hbase.apache.org/book.html#offheap.blockcache获取更多信息。

Default

none

没有一个

hbase.bucketcache.size
Description

A float that EITHER represents a percentage of total heap memory size to give to the cache (if < 1.0) OR, it is the total capacity in megabytes of BucketCache. Default: 0.0

一个浮点数,它代表总堆内存大小的百分比给缓存(如果< 1.0),或者,它是总容量的兆字节的BucketCache。默认值:0.0

Default

none

没有一个

hbase.bucketcache.bucket.sizes
Description

A comma-separated list of sizes for buckets for the bucketcache. Can be multiple sizes. List block sizes in order from smallest to largest. The sizes you use will depend on your data access patterns. Must be a multiple of 256 else you will run into 'java.io.IOException: Invalid HFile block magic' when you go to read from cache. If you specify no values here, then you pick up the default bucketsizes set in code (See BucketAllocator#DEFAULT_BUCKET_SIZES).

一个逗号分隔的桶的大小列表,用于桶缓存。可以多个大小。列表块大小从最小到最大。您使用的大小将取决于您的数据访问模式。必须是256的倍数,否则你会遇到“java.io”。当你从缓存中读取时,IOException:无效的HFile块魔法。如果您在这里指定了没有值,那么您可以选择代码中设置的默认大小(参见BucketAllocator# default_bucket_size)。

Default

none

没有一个
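As an illustration only, a minimal hbase-site.xml sketch that combines the three BucketCache properties above might look as follows; the 8192-megabyte capacity and the bucket sizes are example values, not recommendations, so adjust them to your own hardware and access patterns.

<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>8192</value>
</property>
<property>
  <name>hbase.bucketcache.bucket.sizes</name>
  <!-- example sizes, smallest to largest; each is a multiple of 256 -->
  <value>5120,9216,17408,33792,66560</value>
</property>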

hfile.format.version
Description

The HFile format version to use for new files. Version 3 adds support for tags in hfiles (See http://hbase.apache.org/book.html#hbase.tags). Also see the configuration 'hbase.replication.rpc.codec'.

用于新文件的HFile格式版本。版本3在hfile中添加了对标签(tag)的支持(参见http://hbase.apache.org/book.html#hbase.tags)。另请参见配置'hbase.replication.rpc.codec'。

Default

3

3

hfile.block.bloom.cacheonwrite
Description

Enables cache-on-write for inline blocks of a compound Bloom filter.

启用复合Bloom filter的内联块的缓存。

Default

false

io.storefile.bloom.block.size
Description

The size in bytes of a single block ("chunk") of a compound Bloom filter. This size is approximate, because Bloom blocks can only be inserted at data block boundaries, and the number of keys per data block varies.

一个复合Bloom filter的一个块(“块”)的字节大小。这个大小是近似的,因为Bloom块只能插入到数据块边界,每个数据块的键数也不同。

Default

131072

131072年

hbase.rs.cacheblocksonwrite
Description

Whether an HFile block should be added to the block cache when the block is finished.

当块完成时,是否应该将HFile块添加到块缓存中。

Default

false

hbase.rpc.timeout
Description

This is for the RPC layer to define how long (millisecond) HBase client applications take for a remote call to time out. It uses pings to check connections but will eventually throw a TimeoutException.

这是RPC层的配置,用于定义HBase客户端应用程序的远程调用经过多长时间(毫秒)后超时。它使用ping来检查连接,但最终会抛出TimeoutException。

Default

60000

60000年

hbase.client.operation.timeout
Description

Operation timeout is a top-level restriction (millisecond) that makes sure a blocking operation in Table will not be blocked more than this. In each operation, if rpc request fails because of timeout or other reason, it will retry until success or throw RetriesExhaustedException. But if the total time being blocking reach the operation timeout before retries exhausted, it will break early and throw SocketTimeoutException.

操作超时是一个顶级限制(毫秒),确保Table中的阻塞操作被阻塞的时间不会超过该值。在每个操作中,如果rpc请求由于超时或其他原因失败,它将重试直到成功或抛出RetriesExhaustedException。但如果在重试耗尽之前,被阻塞的总时间已达到操作超时,则会提前中断并抛出SocketTimeoutException。

Default

1200000

1200000
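For example (values are illustrative only, not recommendations), a client-side hbase-site.xml could shorten both budgets described above so that a misbehaving call fails faster:

<property>
  <name>hbase.rpc.timeout</name>
  <!-- each RPC times out after 30 seconds -->
  <value>30000</value>
</property>
<property>
  <name>hbase.client.operation.timeout</name>
  <!-- the whole retried operation gives up after 2 minutes -->
  <value>120000</value>
</property>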

hbase.cells.scanned.per.heartbeat.check
Description

The number of cells scanned in between heartbeat checks. Heartbeat checks occur during the processing of scans to determine whether or not the server should stop scanning in order to send back a heartbeat message to the client. Heartbeat messages are used to keep the client-server connection alive during long running scans. Small values mean that the heartbeat checks will occur more often and thus will provide a tighter bound on the execution time of the scan. Larger values mean that the heartbeat checks occur less frequently

在心跳检查之间扫描的细胞数量。在扫描过程中发生心跳检查,以确定服务器是否应该停止扫描,以便将心跳消息发送给客户端。Heartbeat消息用于在长时间运行扫描期间保持客户机-服务器连接。小的值意味着心跳检查会更频繁地发生,从而在扫描的执行时间上提供更严格的绑定。更大的值意味着心跳检查频率降低。

Default

10000

10000年

hbase.rpc.shortoperation.timeout
Description

This is another version of "hbase.rpc.timeout". For those RPC operation within cluster, we rely on this configuration to set a short timeout limitation for short operation. For example, short rpc timeout for region server’s trying to report to active master can benefit quicker master failover process.

这是另一个版本的“hbase.rpc.timeout”。对于集群内的RPC操作,我们依赖于此配置来为短操作设置短的超时限制。例如,区域服务器试图向active master报告的短rpc超时可以受益更快的主故障转移过程。

Default

10000

10000年

hbase.ipc.client.tcpnodelay
Description

Set no delay on rpc socket connections. See http://docs.oracle.com/javase/1.5.0/docs/api/java/net/Socket.html#getTcpNoDelay()

不要延迟rpc套接字连接。看到http://docs.oracle.com/javase/1.5.0/docs/api/java/net/Socket.html getTcpNoDelay()

Default

true

真正的

hbase.regionserver.hostname
Description

This config is for experts: don’t set its value unless you really know what you are doing. When set to a non-empty value, this represents the (external facing) hostname for the underlying server. See https://issues.apache.org/jira/browse/HBASE-12954 for details.

这个配置是针对专家的:除非你真的知道你在做什么,否则不要设置它的值。当设置为非空值时,它表示底层服务器(面向外部)的主机名。有关详细信息,请参阅https://issues.apache.org/jira/browse/HBASE-12954。

Default

none

没有一个

hbase.regionserver.hostname.disable.master.reversedns
Description

This config is for experts: don’t set its value unless you really know what you are doing. When set to true, regionserver will use the current node hostname for the servername and HMaster will skip reverse DNS lookup and use the hostname sent by regionserver instead. Note that this config and hbase.regionserver.hostname are mutually exclusive. See https://issues.apache.org/jira/browse/HBASE-18226 for more details.

这个配置是针对专家的:除非你真的知道你在做什么,否则不要设置它的值。当设置为true时,regionserver将使用当前节点的主机名作为servername,而HMaster将跳过反向DNS查找,改用regionserver发送的主机名。注意,这个配置和hbase.regionserver.hostname是互斥的。有关更多细节,请参见https://issues.apache.org/jira/browse/HBASE-18226。

Default

false

hbase.master.keytab.file
Description

Full path to the kerberos keytab file to use for logging in the configured HMaster server principal.

在配置的HMaster服务器主体中使用kerberos keytab文件的完整路径。

Default

none

没有一个

hbase.master.kerberos.principal
Description

Ex. "hbase/_HOST@EXAMPLE.COM". The kerberos principal name that should be used to run the HMaster process. The principal name should be in the form: user/hostname@DOMAIN. If "_HOST" is used as the hostname portion, it will be replaced with the actual hostname of the running instance.

例如:"hbase/_HOST@EXAMPLE.COM"。应该用于运行HMaster进程的kerberos主体名称。主体名称的格式应该是:user/hostname@DOMAIN。如果"_HOST"用作主机名部分,则将用运行实例的实际主机名替换它。

Default

none

没有一个

hbase.regionserver.keytab.file
Description

Full path to the kerberos keytab file to use for logging in the configured HRegionServer server principal.

完整路径到kerberos keytab文件,用于在已配置的hlocationserver服务器主体中进行日志记录。

Default

none

没有一个

hbase.regionserver.kerberos.principal
Description

Ex. "hbase/_HOST@EXAMPLE.COM". The kerberos principal name that should be used to run the HRegionServer process. The principal name should be in the form: user/hostname@DOMAIN. If "_HOST" is used as the hostname portion, it will be replaced with the actual hostname of the running instance. An entry for this principal must exist in the file specified in hbase.regionserver.keytab.file

例如:"hbase/_HOST@EXAMPLE.COM"。应该用于运行HRegionServer进程的kerberos主体名称。主体名称的格式应该是:user/hostname@DOMAIN。如果"_HOST"用作主机名部分,则将用运行实例的实际主机名替换它。该主体的条目必须存在于hbase.regionserver.keytab.file所指定的文件中。

Default

none

没有一个
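Putting the four Kerberos-related properties above together, a hedged sketch of the server-side hbase-site.xml entries might look like the following; the keytab paths and the EXAMPLE.COM realm are placeholders you would replace with your own.

<property>
  <name>hbase.master.keytab.file</name>
  <value>/etc/security/keytabs/hbase.service.keytab</value>
</property>
<property>
  <name>hbase.master.kerberos.principal</name>
  <value>hbase/_HOST@EXAMPLE.COM</value>
</property>
<property>
  <name>hbase.regionserver.keytab.file</name>
  <value>/etc/security/keytabs/hbase.service.keytab</value>
</property>
<property>
  <name>hbase.regionserver.kerberos.principal</name>
  <value>hbase/_HOST@EXAMPLE.COM</value>
</property>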

hadoop.policy.file
Description

The policy configuration file used by RPC servers to make authorization decisions on client requests. Only used when HBase security is enabled.

RPC服务器使用的策略配置文件,用于在客户端请求上进行授权决策。仅在启用HBase安全时使用。

Default

hbase-policy.xml

hbase-policy.xml

hbase.superuser
Description

List of users or groups (comma-separated), who are allowed full privileges, regardless of stored ACLs, across the cluster. Only used when HBase security is enabled.

用户或组的列表(逗号分隔),允许在集群中不考虑存储acl的所有特权。仅在启用HBase安全时使用。

Default

none

没有一个

hbase.auth.key.update.interval
Description

The update interval for master key for authentication tokens in servers in milliseconds. Only used when HBase security is enabled.

用于在服务器中以毫秒为单位的认证标志的主密钥的更新间隔。仅在启用HBase安全时使用。

Default

86400000

86400000

hbase.auth.token.max.lifetime
Description

The maximum lifetime in milliseconds after which an authentication token expires. Only used when HBase security is enabled.

授权令牌过期后的毫秒级的最大生命周期。仅在启用HBase安全时使用。

Default

604800000

604800000

hbase.ipc.client.fallback-to-simple-auth-allowed
Description

When a client is configured to attempt a secure connection, but attempts to connect to an insecure server, that server may instruct the client to switch to SASL SIMPLE (unsecure) authentication. This setting controls whether or not the client will accept this instruction from the server. When false (the default), the client will not allow the fallback to SIMPLE authentication, and will abort the connection.

当一个客户端被配置为尝试一个安全连接,但是试图连接到一个不安全的服务器时,该服务器可能指示客户端切换到SASL简单(不安全)身份验证。该设置控制客户端是否接受来自服务器的指令。当false(缺省值)时,客户端将不允许回退到简单的身份验证,并且将中止连接。

Default

false

hbase.ipc.server.fallback-to-simple-auth-allowed
Description

When a server is configured to require secure connections, it will reject connection attempts from clients using SASL SIMPLE (unsecure) authentication. This setting allows secure servers to accept SASL SIMPLE connections from clients when the client requests. When false (the default), the server will not allow the fallback to SIMPLE authentication, and will reject the connection. WARNING: This setting should ONLY be used as a temporary measure while converting clients over to secure authentication. It MUST BE DISABLED for secure operation.

当服务器配置为需要安全连接时,它将拒绝使用SASL简单(不安全)身份验证的客户端连接尝试。此设置允许安全服务器在客户端请求时从客户端接受SASL简单连接。当false(缺省值)时,服务器将不允许回退到简单的身份验证,并且将拒绝连接。警告:此设置仅在将客户端转换为安全身份验证时作为临时措施使用。必须为安全操作禁用它。

Default

false

hbase.display.keys
Description

When this is set to true the webUI and such will display all start/end keys as part of the table details, region names, etc. When this is set to false, the keys are hidden.

当这个设置为true时,webUI将显示所有的开始/结束键作为表细节的一部分,区域名称等。当这被设置为false时,密钥将被隐藏。

Default

true

真正的

hbase.coprocessor.enabled
Description

Enables or disables coprocessor loading. If 'false' (disabled), any other coprocessor related configuration will be ignored.

启用或禁用协处理器加载。如果“false”(禁用),任何其他协处理器相关配置将被忽略。

Default

true

真正的

hbase.coprocessor.user.enabled
Description

Enables or disables user (aka. table) coprocessor loading. If 'false' (disabled), any table coprocessor attributes in table descriptors will be ignored. If "hbase.coprocessor.enabled" is 'false' this setting has no effect.

启用或禁用用户(即表)协处理器加载。如果为'false'(禁用),表描述符中的任何表协处理器属性都将被忽略。如果"hbase.coprocessor.enabled"为'false',则此设置没有效果。

Default

true

真正的

hbase.coprocessor.region.classes
Description

A comma-separated list of Coprocessors that are loaded by default on all tables. For any override coprocessor method, these classes will be called in order. After implementing your own Coprocessor, just put it in HBase’s classpath and add the fully qualified class name here. A coprocessor can also be loaded on demand by setting HTableDescriptor.

在所有表上默认加载的、由逗号分隔的协处理器列表。对于任何被覆盖的协处理器方法,这些类将按顺序被调用。在实现了自己的协处理器之后,只需将它放在HBase的类路径中,并在这里添加完全限定的类名。协处理器也可以通过设置HTableDescriptor按需加载。

Default

none

没有一个
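A brief, hypothetical example: if you had packaged a RegionObserver named com.example.MyRegionObserver (a made-up class used only for illustration) and placed its jar on the RegionServer classpath, the hbase-site.xml entry could look as follows.

<property>
  <name>hbase.coprocessor.region.classes</name>
  <!-- com.example.MyRegionObserver is a placeholder class name -->
  <value>com.example.MyRegionObserver</value>
</property>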

hbase.coprocessor.master.classes
Description

A comma-separated list of org.apache.hadoop.hbase.coprocessor.MasterObserver coprocessors that are loaded by default on the active HMaster process. For any implemented coprocessor methods, the listed classes will be called in order. After implementing your own MasterObserver, just put it in HBase’s classpath and add the fully qualified class name here.

一个逗号分隔的org.apache.hadoop.hbase.coprocessor。在活动的HMaster进程中默认加载的MasterObserver协处理器。对于任何实现的协处理器方法,将按顺序调用所列出的类。在实现了自己的MasterObserver之后,只需将它放在HBase的类路径中,并在这里添加完全限定的类名。

Default

none

没有一个

hbase.coprocessor.abortonerror
Description

Set to true to cause the hosting server (master or regionserver) to abort if a coprocessor fails to load, fails to initialize, or throws an unexpected Throwable object. Setting this to false will allow the server to continue execution but the system wide state of the coprocessor in question will become inconsistent as it will be properly executing in only a subset of servers, so this is most useful for debugging only.

设置为true,以使托管服务器(主服务器或区域服务器)在协处理器未能加载时终止,无法初始化,或抛出一个意外的可抛出对象。将此设置为false将允许服务器继续执行,但是由于将在服务器的一个子集中正确地执行,因此该协处理器的系统范围将变得不一致,因此这对于调试非常有用。

Default

true

真正的

hbase.rest.port
Description

The port for the HBase REST server.

HBase REST服务器的端口。

Default

8080

8080年

hbase.rest.readonly
Description

Defines the mode the REST server will be started in. Possible values are: false: All HTTP methods are permitted - GET/PUT/POST/DELETE. true: Only the GET method is permitted.

定义REST服务器将启动的模式。可能的值是:false:所有HTTP方法都是允许的- GET/PUT/POST/DELETE。正确:只有GET方法是允许的。

Default

false

hbase.rest.threads.max
Description

The maximum number of threads of the REST server thread pool. Threads in the pool are reused to process REST requests. This controls the maximum number of requests processed concurrently. It may help to control the memory used by the REST server to avoid OOM issues. If the thread pool is full, incoming requests will be queued up and wait for some free threads.

REST服务器线程池的最大线程数。池中的线程被重用以处理REST请求。这将控制并发处理的请求的最大数量。它可以帮助控制REST服务器使用的内存,以避免OOM问题。如果线程池是满的,传入的请求将排队等待一些空闲线程。

Default

100

100

hbase.rest.threads.min
Description

The minimum number of threads of the REST server thread pool. The thread pool always has at least these number of threads so the REST server is ready to serve incoming requests.

REST服务器线程池的最小线程数。线程池至少有这些线程数,因此REST服务器准备好为传入请求提供服务。

Default

2

2

hbase.rest.support.proxyuser
Description

Enables running the REST server to support proxy-user mode.

支持运行REST服务器以支持代理用户模式。

Default

false

hbase.defaults.for.version.skip
Description

Set to true to skip the 'hbase.defaults.for.version' check. Setting this to true can be useful in contexts other than the other side of a maven generation; i.e. running in an IDE. You’ll want to set this boolean to true to avoid seeing the RuntimeException complaint: "hbase-default.xml file seems to be for an old version of HBase (\${hbase.version}), this version is X.X.X-SNAPSHOT"

设置为true,以跳过“hbase.default .for.version”检查。将其设置为true可以在maven生成的另一端的上下文中有用;即在IDE中运行。您需要将这个布尔值设置为true,以避免看到RuntimeException的抱怨:“hbase-default。xml文件似乎是HBase的旧版本(\${HBase .version}),这个版本是X.X.X-SNAPSHOT。

Default

false

hbase.table.lock.enable
Description

Set to true to enable locking the table in zookeeper for schema change operations. Table locking from master prevents concurrent schema modifications to corrupt table state.

设置为true,以便在zookeeper中锁定表,用于模式更改操作。从主表锁定可以防止并发模式修改到损坏的表状态。

Default

true

真正的

hbase.table.max.rowsize
Description

Maximum size of single row in bytes (default is 1 Gb) for Get’ting or Scan’ning without in-row scan flag set. If row size exceeds this limit RowTooBigException is thrown to client.

在未设置行内(in-row)扫描标志的情况下,Get或Scan单行的最大字节数(默认为1 GB)。如果行大小超过此限制,则会向客户端抛出RowTooBigException。

Default

1073741824

1073741824

hbase.thrift.minWorkerThreads
Description

The "core size" of the thread pool. New threads are created on every connection until this many threads are created.

线程池的“核心大小”。在创建多个线程之前,将在每个连接上创建新的线程。

Default

16

16

hbase.thrift.maxWorkerThreads
Description

The maximum size of the thread pool. When the pending request queue overflows, new threads are created until their number reaches this number. After that, the server starts dropping connections.

线程池的最大大小。当挂起的请求队列溢出时,将创建新的线程,直到它们的数量达到这个数字。在此之后,服务器开始删除连接。

Default

1000

1000年

hbase.thrift.maxQueuedRequests
Description

The maximum number of pending Thrift connections waiting in the queue. If there are no idle threads in the pool, the server queues requests. Only when the queue overflows, new threads are added, up to hbase.thrift.maxQueuedRequests threads.

在队列中等待的等待的节俭连接的最大数量。如果池中没有空闲线程,则服务器队列请求。只有当队列溢出时,才会添加新的线程,直到hbase.thrift。maxQueuedRequests线程。

Default

1000

1000年

hbase.regionserver.thrift.framed
Description

Use Thrift TFramedTransport on the server side. This is the recommended transport for thrift servers and requires a similar setting on the client side. Changing this to false will select the default transport, vulnerable to DoS when malformed requests are issued due to THRIFT-601.

在服务器端使用节俭的TFramedTransport。这是为节俭服务器推荐的传输方式,并且在客户端需要类似的设置。将此更改为false将选择缺省传输,当由于THRIFT-601发出错误请求时,将容易受到DoS攻击。

Default

false

hbase.regionserver.thrift.framed.max_frame_size_in_mb
Description

Default frame size when using framed transport, in MB

使用框架传输时的默认帧大小,以MB为单位。

Default

2

2

hbase.regionserver.thrift.compact
Description

Use Thrift TCompactProtocol binary serialization protocol.

使用节约TCompactProtocol二进制序列化协议。

Default

false

hbase.rootdir.perms
Description

FS Permissions for the root data subdirectory in a secure (kerberos) setup. When master starts, it creates the rootdir with this permissions or sets the permissions if it does not match.

在安全(kerberos)设置中对根数据子目录的FS权限。当主启动时,它将使用此权限创建rootdir,或设置不匹配的权限。

Default

700

700年

hbase.wal.dir.perms
Description

FS Permissions for the root WAL directory in a secure(kerberos) setup. When master starts, it creates the WAL dir with this permissions or sets the permissions if it does not match.

在一个安全的(kerberos)设置中,对根WAL目录的权限。当master启动时,它会使用该权限创建WAL dir,或者如果它不匹配,则设置权限。

Default

700

700年

hbase.data.umask.enable
Description

Enable, if true, that file permissions should be assigned to the files written by the regionserver

如果是true,则应该将该文件权限分配给区域服务器所写的文件。

Default

false

hbase.data.umask
Description

File permissions that should be used to write data files when hbase.data.umask.enable is true

当hbase.data.umask.enable为true时,用于写入数据文件的文件权限。

Default

000

000年

hbase.snapshot.enabled
Description

Set to true to allow snapshots to be taken / restored / cloned.

设置为true,以允许进行快照/恢复/克隆。

Default

true

真正的

hbase.snapshot.restore.take.failsafe.snapshot
Description

Set to true to take a snapshot before the restore operation. The snapshot taken will be used in case of failure, to restore the previous state. At the end of the restore operation this snapshot will be deleted

设置为true,在恢复操作之前获取快照。如果出现故障,将使用快照,以恢复以前的状态。在还原操作结束时,此快照将被删除。

Default

true

真正的

hbase.snapshot.restore.failsafe.name
Description

Name of the failsafe snapshot taken by the restore operation. You can use the {snapshot.name}, {table.name} and {restore.timestamp} variables to create a name based on what you are restoring.

恢复操作所创建的故障安全(failsafe)快照的名称。您可以使用{snapshot.name}、{table.name}和{restore.timestamp}变量,根据您正在恢复的内容创建名称。

Default

hbase-failsafe-{snapshot.name}-{restore.timestamp}

hbase-failsafe-{snapshot.name}-{restore.timestamp}

hbase.server.compactchecker.interval.multiplier
Description

The number that determines how often we scan to see if compaction is necessary. Normally, compactions are done after some events (such as memstore flush), but if region didn’t receive a lot of writes for some time, or due to different compaction policies, it may be necessary to check it periodically. The interval between checks is hbase.server.compactchecker.interval.multiplier multiplied by hbase.server.thread.wakefrequency.

这个数字决定了我们多久扫描一次,看看是否需要压缩。通常情况下,压缩是在某些事件(比如memstore刷新)之后进行的,但是如果区域在一段时间内没有收到大量写入,或者由于不同的压缩策略,可能需要定期检查它。检查的间隔是hbase.server.compactchecker.interval.multiplier乘以hbase.server.thread.wakefrequency。

Default

1000

1000年

hbase.lease.recovery.timeout
Description

How long we wait on dfs lease recovery in total before giving up.

在放弃之前,我们在dfs租约上总共等待了多长时间。

Default

900000

900000年

hbase.lease.recovery.dfs.timeout
Description

How long between dfs recover lease invocations. Should be larger than the sum of the time it takes for the namenode to issue a block recovery command as part of datanode; dfs.heartbeat.interval and the time it takes for the primary datanode, performing block recovery to timeout on a dead datanode; usually dfs.client.socket-timeout. See the end of HBASE-8389 for more.

dfs恢复租约调用之间的时间。应该大于namenode发出块恢复命令作为datanode的一部分所需的时间之和;dfs.heartbeat.interval和主datanode的时间,执行阻塞恢复到死datanode的超时;通常dfs.client.socket-timeout。请参阅HBASE-8389的结尾。

Default

64000

64000年

hbase.column.max.version
Description

New column family descriptors will use this value as the default number of versions to keep.

新的列家族描述符将使用这个值作为保留的默认版本号。

Default

1

1

dfs.client.read.shortcircuit
Description

If set to true, this configuration parameter enables short-circuit local reads.

如果设置为true,则此配置参数启用了短路本地读取。

Default

false

dfs.domain.socket.path
Description

This is a path to a UNIX domain socket that will be used for communication between the DataNode and local HDFS clients, if dfs.client.read.shortcircuit is set to true. If the string "_PORT" is present in this path, it will be replaced by the TCP port of the DataNode. Be careful about permissions for the directory that hosts the shared domain socket; dfsclient will complain if open to other users than the HBase user.

这是一个UNIX域套接字的路径,如果dfs.client.read.shortcircuit设置为true,它将用于DataNode和本地HDFS客户端之间的通信。如果该路径中存在字符串"_PORT",它将被替换为DataNode的TCP端口。要注意存放共享域套接字的目录的权限;如果该目录对HBase用户以外的用户开放,dfsclient将会发出警告。

Default

none

没有一个
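As a sketch, enabling short-circuit reads therefore usually means setting both of the properties above, in hbase-site.xml and in the HDFS configuration; the socket path below is only an example and must match what the DataNodes are configured with.

<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <!-- example path; use the path configured on your DataNodes -->
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>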

hbase.dfs.client.read.shortcircuit.buffer.size
Description

If the DFSClient configuration dfs.client.read.shortcircuit.buffer.size is unset, we will use what is configured here as the short circuit read default direct byte buffer size. DFSClient native default is 1MB; HBase keeps its HDFS files open so number of file blocks * 1MB soon starts to add up and threaten OOME because of a shortage of direct memory. So, we set it down from the default. Make it > the default hbase block size set in the HColumnDescriptor which is usually 64k.

如果DFSClient配置dfs.client.read.shortcircuit.buffer.size未设置,我们将使用这里配置的值作为短路读取的默认直接字节缓冲区大小。DFSClient原生默认值为1MB;HBase会保持其HDFS文件打开,因此"文件块数 x 1MB"很快会累积起来,并因直接内存不足而威胁引发OOME。因此,我们将其从默认值调低。应使其大于HColumnDescriptor中设置的默认hbase块大小(通常为64k)。

Default

131072

131072年

hbase.regionserver.checksum.verify
Description

If set to true (the default), HBase verifies the checksums for hfile blocks. HBase writes checksums inline with the data when it writes out hfiles. HDFS (as of this writing) writes checksums to a separate file than the data file necessitating extra seeks. Setting this flag saves some on i/o. Checksum verification by HDFS will be internally disabled on hfile streams when this flag is set. If the hbase-checksum verification fails, we will switch back to using HDFS checksums (so do not disable HDFS checksums! And besides this feature applies to hfiles only, not to WALs). If this parameter is set to false, then hbase will not verify any checksums, instead it will depend on checksum verification being done in the HDFS client.

如果设置为true(默认值),HBase将验证hfile块的校验和。HBase在写入hfiles时与数据内联地编写校验和。HDFS(在撰写本文时)将校验和写入一个单独的文件,而不是需要额外查找的数据文件。设置此标志可以节省一些i/o。当设置此标志时,HDFS的校验和将在hfile流中被内部禁用。如果hbase-checksum验证失败,我们将切换回使用HDFS校验和(所以不要禁用HDFS校验和!此外,这个特性只适用于hfiles,而不适用于WALs。如果该参数被设置为false,那么hbase将不验证任何校验和,相反,它将依赖于在HDFS客户机中进行的校验和验证。

Default

true

真正的

hbase.hstore.bytes.per.checksum
Description

Number of bytes in a newly created checksum chunk for HBase-level checksums in hfile blocks.

在hfile块中,新创建的用于hbase级校验和的校验和块中的字节数。

Default

16384

16384年

hbase.hstore.checksum.algorithm
Description

Name of an algorithm that is used to compute checksums. Possible values are NULL, CRC32, CRC32C.

用于计算校验和的算法的名称。可能的值为NULL, CRC32, CRC32C。

Default

CRC32C

CRC32C

hbase.client.scanner.max.result.size
Description

Maximum number of bytes returned when calling a scanner’s next method. Note that when a single row is larger than this limit the row is still returned completely. The default value is 2MB, which is good for 1ge networks. With faster and/or high latency networks this value should be increased.

当调用扫描器的下一个方法时返回的最大字节数。注意,当单个行大于这个限制时,行仍然完全返回。默认值为2MB,这对1ge网络很好。随着更快和/或高延迟网络,这个值应该增加。

Default

2097152

2097152

hbase.server.scanner.max.result.size
Description

Maximum number of bytes returned when calling a scanner’s next method. Note that when a single row is larger than this limit the row is still returned completely. The default value is 100MB. This is a safety setting to protect the server from OOM situations.

当调用扫描器的下一个方法时返回的最大字节数。注意,当单个行大于这个限制时,行仍然完全返回。默认值是100MB。这是一个安全设置,以保护服务器不受OOM情况的影响。

Default

104857600

104857600

hbase.status.published
Description

This setting activates the publication by the master of the status of the region server. When a region server dies and its recovery starts, the master will push this information to the client application, to let them cut the connection immediately instead of waiting for a timeout.

此设置将激活该区域服务器状态的主发布。当一个区域服务器死亡并开始恢复时,主服务器将把这个信息推送到客户端应用程序,让他们立即切断连接,而不是等待超时。

Default

false

hbase.status.publisher.class
Description

Implementation of the status publication with a multicast message.

使用多播消息实现状态发布。

Default

org.apache.hadoop.hbase.master.ClusterStatusPublisher$MulticastPublisher

org.apache.hadoop.hbase.master.ClusterStatusPublisher$MulticastPublisher

hbase.status.listener.class
Description

Implementation of the status listener with a multicast message.

使用多播消息实现状态监听器。

Default

org.apache.hadoop.hbase.client.ClusterStatusListener$MulticastListener

org.apache.hadoop.hbase.client.ClusterStatusListener$MulticastListener

hbase.status.multicast.address.ip
Description

Multicast address to use for the status publication by multicast.

多播地址用于状态发布的多点广播。

Default

226.1.1.3

226.1.1.3

hbase.status.multicast.address.port
Description

Multicast port to use for the status publication by multicast.

多播端口用于状态发布的多播。

Default

16100

16100年

hbase.dynamic.jars.dir
Description

The directory from which the custom filter JARs can be loaded dynamically by the region server without the need to restart. However, an already loaded filter/co-processor class would not be un-loaded. See HBASE-1936 for more details. Does not apply to coprocessors.

自定义筛选器jar可以由区域服务器动态加载的目录,而无需重新启动。但是,已经加载的过滤器/协同处理器类不会被卸载。参见HBASE-1936,了解更多细节。不适用于协处理器。

Default

${hbase.rootdir}/lib

${hbase.rootdir}/lib

hbase.security.authentication
Description

Controls whether or not secure authentication is enabled for HBase. Possible values are 'simple' (no authentication), and 'kerberos'.

控制是否为HBase启用了安全身份验证。可能的值是“简单的”(没有身份验证)和“kerberos”。

Default

simple

简单的

hbase.rest.filter.classes
Description

Servlet filters for REST service.

Servlet过滤器用于REST服务。

Default

org.apache.hadoop.hbase.rest.filter.GzipFilter

org.apache.hadoop.hbase.rest.filter.GzipFilter

hbase.master.loadbalancer.class
Description

Class used to execute the regions balancing when the period occurs. See the class comment for more on how it works http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.html It replaces the DefaultLoadBalancer as the default (since renamed as the SimpleLoadBalancer).

当周期发生时,用于执行区域平衡的类。请参阅类评论,以了解它如何工作的:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/hbase/master/balancer/stochasticloadbalancer.html以默认的方式替换DefaultLoadBalancer(因为它被重新命名为SimpleLoadBalancer)。

Default

org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer

org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer

hbase.master.loadbalance.bytable
Description

Factor Table name when the balancer runs. Default: false.

当平衡器运行时,元素表名。默认值:false。

Default

false

hbase.master.normalizer.class
Description

Class used to execute the region normalization when the period occurs. See the class comment for more on how it works http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.html

当周期发生时,用于执行区域标准化的类。请参阅类评论,了解更多关于它如何工作的信息。

Default

org.apache.hadoop.hbase.master.normalizer.SimpleRegionNormalizer

org.apache.hadoop.hbase.master.normalizer.SimpleRegionNormalizer

hbase.rest.csrf.enabled
Description

Set to true to enable protection against cross-site request forgery (CSRF)

设置为真,以防止跨站请求伪造(CSRF)

Default

false

hbase.rest-csrf.browser-useragents-regex
Description

A comma-separated list of regular expressions used to match against an HTTP request’s User-Agent header when protection against cross-site request forgery (CSRF) is enabled for REST server by setting hbase.rest.csrf.enabled to true. If the incoming User-Agent matches any of these regular expressions, then the request is considered to be sent by a browser, and therefore CSRF prevention is enforced. If the request’s User-Agent does not match any of these regular expressions, then the request is considered to be sent by something other than a browser, such as scripted automation. In this case, CSRF is not a potential attack vector, so the prevention is not enforced. This helps achieve backwards-compatibility with existing automation that has not been updated to send the CSRF prevention header.

用于与HTTP请求的用户代理头匹配的正则表达式列表,当保护针对跨站请求(CSRF)时,通过设置hbase.rest.csrf为REST服务器启用。启用为true。如果传入的用户代理与这些正则表达式相匹配,则会认为该请求是由浏览器发送的,因此将执行CSRF预防。如果请求的用户代理不匹配这些正则表达式,那么请求将被认为是由浏览器以外的其他东西发送的,比如脚本化的自动化。在这种情况下,CSRF并不是一个潜在的攻击向量,因此不强制执行。这有助于实现向后兼容现有的自动化,而目前的自动化还没有更新来发送CSRF预防报头。

Default

Mozilla.,Opera.

Mozilla.,Opera.

hbase.security.exec.permission.checks
Description

If this setting is enabled and ACL based access control is active (the AccessController coprocessor is installed either as a system coprocessor or on a table as a table coprocessor) then you must grant all relevant users EXEC privilege if they require the ability to execute coprocessor endpoint calls. EXEC privilege, like any other permission, can be granted globally to a user, or to a user on a per table or per namespace basis. For more information on coprocessor endpoints, see the coprocessor section of the HBase online manual. For more information on granting or revoking permissions using the AccessController, see the security section of the HBase online manual.

如果启用了这个设置,并且基于ACL的访问控制是活动的(AccessController协处理器是作为一个系统协处理器或作为表协处理器安装的),那么如果它们需要执行协处理器端点调用的能力,则必须授予所有相关用户EXEC特权。与任何其他权限一样,EXEC特权可以在全局范围内授予用户,也可以在每个表或每个名称空间的基础上授予用户。有关协处理器端点的更多信息,请参阅HBase在线手册的协处理器部分。有关使用AccessController授予或撤销权限的更多信息,请参见HBase在线手册的安全部分。

Default

false

hbase.procedure.regionserver.classes
Description

A comma-separated list of org.apache.hadoop.hbase.procedure.RegionServerProcedureManager procedure managers that are loaded by default on the active HRegionServer process. The lifecycle methods (init/start/stop) will be called by the active HRegionServer process to perform the specific globally barriered procedure. After implementing your own RegionServerProcedureManager, just put it in HBase’s classpath and add the fully qualified class name here.

一个逗号分隔的org.apache.hadoop.hbase.procedure。在活动的h分区服务器进程中默认加载的区域服务器过程管理程序管理器。生命周期方法(init/start/stop)将由活动的h分区服务器进程调用,以执行特定的全局隔离过程。在实现您自己的区域服务器过程管理器之后,只需将它放入HBase的类路径中,并在这里添加完全限定的类名。

Default

none

没有一个

hbase.procedure.master.classes
Description

A comma-separated list of org.apache.hadoop.hbase.procedure.MasterProcedureManager procedure managers that are loaded by default on the active HMaster process. A procedure is identified by its signature and users can use the signature and an instant name to trigger an execution of a globally barriered procedure. After implementing your own MasterProcedureManager, just put it in HBase’s classpath and add the fully qualified class name here.

一个逗号分隔的org.apache.hadoop.hbase.procedure。在活动的HMaster进程中默认加载的主进程管理程序管理器。程序由其签名标识,用户可以使用签名和即时名称来触发一个全局隔离过程的执行。在实现了自己的masterprocessduremanager之后,只需将它放入HBase的类路径中,并在这里添加完全限定的类名。

Default

none

没有一个

hbase.coordinated.state.manager.class
Description

Fully qualified name of class implementing coordinated state manager.

执行协调状态管理器类的全限定名。

Default

org.apache.hadoop.hbase.coordination.ZkCoordinatedStateManager

org.apache.hadoop.hbase.coordination.ZkCoordinatedStateManager

hbase.regionserver.storefile.refresh.period
Description

The period (in milliseconds) for refreshing the store files for the secondary regions. 0 means this feature is disabled. Secondary regions sees new files (from flushes and compactions) from primary once the secondary region refreshes the list of files in the region (there is no notification mechanism). But too frequent refreshes might cause extra Namenode pressure. If the files cannot be refreshed for longer than HFile TTL (hbase.master.hfilecleaner.ttl) the requests are rejected. Configuring HFile TTL to a larger value is also recommended with this setting.

用于刷新次要区域的存储文件的周期(以毫秒为单位)。0表示此功能被禁用。次要区域在次要区域刷新该区域的文件列表(没有通知机制)时,会从主区域看到新的文件(从刷新和压缩)。但是频繁的刷新可能会造成额外的Namenode压力。如果文件不能刷新的时间超过HFile TTL (hbase.master.hfilecleaner.ttl),请求就会被拒绝。将HFile TTL配置为更大的值也建议使用此设置。

Default

0

0

hbase.region.replica.replication.enabled
Description

Whether asynchronous WAL replication to the secondary region replicas is enabled or not. If this is enabled, a replication peer named "region_replica_replication" will be created which will tail the logs and replicate the mutations to region replicas for tables that have region replication > 1. If this is enabled once, disabling this replication also requires disabling the replication peer using shell or Admin java class. Replication to secondary region replicas works over standard inter-cluster replication.

是否启用了对辅助区域副本的异步复制。如果启用了这个功能,将创建一个名为“region_replica_replication”的复制节点,它将跟踪日志,并将这些突变复制到具有区域复制> 1的表的区域副本中。如果启用此功能,禁用此复制还需要使用shell或Admin java类禁用复制对等项。复制到次要区域副本的工作超过标准的集群复制。

Default

false

hbase.http.filter.initializers
Description

A comma separated list of class names. Each class in the list must extend org.apache.hadoop.hbase.http.FilterInitializer. The corresponding Filter will be initialized. Then, the Filter will be applied to all user facing jsp and servlet web pages. The ordering of the list defines the ordering of the filters. The default StaticUserWebFilter add a user principal as defined by the hbase.http.staticuser.user property.

一个逗号分隔的类名列表。列表中的每个类都必须扩展org.apache.hadoop.hbase.http.FilterInitializer。相应的过滤器将被初始化。然后,该过滤器将应用于所有面向用户的jsp和servlet网页。列表的顺序定义了过滤器的顺序。默认的StaticUserWebFilter会添加一个由hbase.http.staticuser.user属性定义的用户主体。

Default

org.apache.hadoop.hbase.http.lib.StaticUserWebFilter

org.apache.hadoop.hbase.http.lib.StaticUserWebFilter

hbase.security.visibility.mutations.checkauths
Description

This property if enabled, will check whether the labels in the visibility expression are associated with the user issuing the mutation

如果启用此属性,将检查可见性表达式中的标签是否与发出该突变的用户关联。

Default

false

hbase.http.max.threads
Description

The maximum number of threads that the HTTP Server will create in its ThreadPool.

HTTP服务器将在其ThreadPool中创建的最大线程数。

Default

16

16

hbase.replication.rpc.codec
Description

The codec that is to be used when replication is enabled so that the tags are also replicated. This is used along with HFileV3 which supports tags in them. If tags are not used or if the hfile version used is HFileV2 then KeyValueCodec can be used as the replication codec. Note that using KeyValueCodecWithTags for replication when there are no tags causes no harm.

在启用复制时要使用的编解码器,以便复制标记。这与HFileV3一起使用,HFileV3支持这些标签。如果没有使用标记,或者使用的hfile版本是HFileV2,那么KeyValueCodec可以作为复制编解码器。注意,在没有标记的情况下,使用KeyValueCodecWithTags进行复制不会造成任何损害。

Default

org.apache.hadoop.hbase.codec.KeyValueCodecWithTags

org.apache.hadoop.hbase.codec.KeyValueCodecWithTags

hbase.replication.source.maxthreads
Description

The maximum number of threads any replication source will use for shipping edits to the sinks in parallel. This also limits the number of chunks each replication batch is broken into. Larger values can improve the replication throughput between the master and slave clusters. The default of 10 will rarely need to be changed.

任何复制源的最大线程数将用于并行地对接收器进行编辑。这也限制了每个复制批被分解成的块的数量。更大的值可以提高主集群和从属集群之间的复制吞吐量。默认的10将很少需要更改。

Default

10

10

hbase.http.staticuser.user
Description

The user name to filter as, on static web filters while rendering content. An example use is the HDFS web UI (user to be used for browsing files).

在呈现内容时,将用户名过滤为静态web过滤器。示例使用的是HDFS web UI(用于浏览文件的用户)。

Default

dr.stack

dr.stack

hbase.regionserver.handler.abort.on.error.percent
Description

The percent of region server RPC threads that must have failed before the RS aborts. -1 Disable aborting; 0 Abort if even a single handler has died; 0.x Abort only when this percent of handlers have died; 1 Abort only when all of the handlers have died.

导致RS中止的区域服务器RPC线程失败的百分比。-1 禁用中止;0 只要有一个处理程序死亡就中止;0.x 只有当这个百分比的处理程序死亡时才中止;1 只有当所有处理程序都死亡时才中止。

Default

0.5

0.5

hbase.mob.file.cache.size
Description

Number of opened file handlers to cache. A larger value will benefit reads by providing more file handlers per mob file cache and would reduce frequent file opening and closing. However, if this is set too high, this could lead to a "too many opened file handlers" The default value is 1000.

打开的文件处理程序的数量。一个更大的值将通过为每个mob文件缓存提供更多的文件处理程序而受益,并减少频繁的文件打开和关闭。但是,如果设置得太高,这可能导致“太多打开的文件处理程序”,默认值是1000。

Default

1000

1000年

hbase.mob.cache.evict.period
Description

The amount of time in seconds before the mob cache evicts cached mob files. The default value is 3600 seconds.

在暴民缓存清除缓存的mob文件之前的几秒钟时间。默认值为3600秒。

Default

3600

3600年

hbase.mob.cache.evict.remain.ratio
Description

The ratio (between 0.0 and 1.0) of files that remains cached after an eviction is triggered when the number of cached mob files exceeds the hbase.mob.file.cache.size. The default value is 0.5f.

当缓存的mob文件的数量超过hbase.mob.file.cache.size时,在被驱逐后缓存的文件的比率(介于0.0和1.0之间)将被触发。默认值是0。5f。

Default

0.5f

0.5f

hbase.master.mob.ttl.cleaner.period
Description

The period that ExpiredMobFileCleanerChore runs. The unit is second. The default value is one day. The MOB file name uses only the date part of the file creation time in it. We use this time for deciding TTL expiry of the files. So the removal of TTL expired files might be delayed. The max delay might be 24 hrs.

结束mobfilecleanerchore运行的期间。单位是秒。默认值是一天。MOB文件名称只使用文件创建时间的日期部分。我们使用这个时间来决定文件的TTL过期。因此,删除TTL过期的文件可能会被延迟。最大延迟可能是24小时。

Default

86400

86400年

hbase.mob.compaction.mergeable.threshold
Description

If the size of a mob file is less than this value, it’s regarded as a small file and needs to be merged in mob compaction. The default value is 1280MB.

如果一个mob文件的大小小于这个值,它就被认为是一个小文件,需要合并到mob compaction中。默认值是1280MB。

Default

1342177280

1342177280

hbase.mob.delfile.max.count
Description

The max number of del files that is allowed in the mob compaction. In the mob compaction, when the number of existing del files is larger than this value, they are merged until number of del files is not larger this value. The default value is 3.

在暴民压缩中允许的del文件的最大数量。在mob compaction中,当现有del文件的数量大于这个值时,它们会被合并,直到del文件的数量不大于这个值。默认值是3。

Default

3

3

hbase.mob.compaction.batch.size
Description

The max number of the mob files that is allowed in a batch of the mob compaction. The mob compaction merges the small mob files to bigger ones. If the number of the small files is very large, it could lead to a "too many opened file handlers" in the merge. And the merge has to be split into batches. This value limits the number of mob files that are selected in a batch of the mob compaction. The default value is 100.

mob文件的最大数量允许在一组暴民压缩。暴徒的密实把小暴徒的文件合并成大的。如果小文件的数量很大,那么在合并中可能会导致“太多打开的文件处理程序”。合并必须分批进行。这个值限制了mob文件中被选中的mob文件的数量。默认值是100。

Default

100

100

hbase.mob.compaction.chore.period
Description

The period that MobCompactionChore runs. The unit is second. The default value is one week.

MobCompactionChore运行的周期。单位是秒。默认值是一个星期。

Default

604800

604800年

hbase.mob.compactor.class
Description

Implementation of mob compactor, the default one is PartitionedMobCompactor.

实现了mob compactor,默认的是PartitionedMobCompactor。

Default

org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactor

org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactor

hbase.mob.compaction.threads.max
Description

The max number of threads used in MobCompactor.

MobCompactor中使用的最大线程数。

Default

1

1

hbase.snapshot.master.timeout.millis
Description

Timeout for master for the snapshot procedure execution.

用于快照过程执行的主超时。

Default

300000

300000年

hbase.snapshot.region.timeout
Description

Timeout for regionservers to keep threads in snapshot request pool waiting.

区域服务器的超时,以便在快照请求池中保持线程等待。

Default

300000

300000年

hbase.rpc.rows.warning.threshold
Description

Number of rows in a batch operation above which a warning will be logged.

在上面的批处理操作中,将记录一个警告的行数。

Default

5000

5000年

hbase.master.wait.on.service.seconds
Description

Default is 5 minutes. Make it 30 seconds for tests. See HBASE-19794 for some context.

默认是5分钟。做30秒的测试。参见HBASE-19794,了解一些上下文。

Default

30

30.

7.3. hbase-env.sh

7.3。hbase-env.sh

Set HBase environment variables in this file. Examples include options to pass the JVM on start of an HBase daemon such as heap size and garbage collector configs. You can also set configurations for HBase configuration, log directories, niceness, ssh options, where to locate process pid files, etc. Open the file at conf/hbase-env.sh and peruse its content. Each option is fairly well documented. Add your own environment variables here if you want them read by HBase daemons on startup.

在此文件中设置HBase环境变量。示例包括在HBase守护进程启动时传递给JVM的选项,如堆大小和垃圾收集器配置。您还可以设置HBase配置、日志目录、niceness、ssh选项、进程pid文件的存放位置等。打开conf/hbase-env.sh文件并仔细阅读其内容。每个选项都有相当详细的文档。如果您想让HBase守护进程在启动时读取您自己的环境变量,请在这里添加它们。

Changes here will require a cluster restart for HBase to notice the change.

这里的更改将要求HBase重新启动集群,以注意更改。

7.4. log4j.properties

7.4。log4j . properties

Edit this file to change rate at which HBase files are rolled and to change the level at which HBase logs messages.

编辑此文件以更改HBase文件的滚动速度,并更改HBase日志消息的级别。

Changes here will require a cluster restart for HBase to notice the change though log levels can be changed for particular daemons via the HBase UI.

这里的更改将要求HBase重新启动集群,以注意更改,但是可以通过HBase UI为特定的守护进程更改日志级别。

7.5. Client configuration and dependencies connecting to an HBase cluster

7.5。连接到HBase集群的客户端配置和依赖关系。

If you are running HBase in standalone mode, you don’t need to configure anything for your client to work provided that they are all on the same machine.

如果在独立模式下运行HBase,则不需要为客户机配置任何东西,前提是它们都在同一台机器上。

Since the HBase Master may move around, clients bootstrap by looking to ZooKeeper for current critical locations. ZooKeeper is where all these values are kept. Thus clients require the location of the ZooKeeper ensemble before they can do anything else. Usually this ensemble location is kept out in the hbase-site.xml and is picked up by the client from the CLASSPATH.

由于HBase Master可能会移动,客户端通过向ZooKeeper查询当前的关键位置来进行引导。ZooKeeper是保存所有这些值的地方。因此,客户端在做任何其他事情之前,都需要知道ZooKeeper集合(ensemble)的位置。通常这个ensemble位置保存在hbase-site.xml中,并由客户端从CLASSPATH中读取。

If you are configuring an IDE to run an HBase client, you should include the conf/ directory on your classpath so hbase-site.xml settings can be found (or add src/test/resources to pick up the hbase-site.xml used by tests).

如果您正在配置一个IDE来运行一个HBase客户端,那么您应该在类路径上包含conf/目录,这样HBase -site就可以了。可以找到xml设置(或者添加src/test/resources来获取hbase站点。测试所使用的xml)。

Minimally, an HBase client needs hbase-client module in its dependencies when connecting to a cluster:

在连接到集群时,HBase客户端需要HBase -client模块。

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>1.2.4</version>
</dependency>

A basic example hbase-site.xml for client only may look as follows:

一个基本的例子hbase-site。客户端的xml可能如下图所示:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>example1,example2,example3</value>
    <description>The directory shared by region servers.
    </description>
  </property>
</configuration>

7.5.1. Java client configuration

7.5.1。Java客户端配置

The configuration used by a Java client is kept in an HBaseConfiguration instance.

Java客户端使用的配置保存在HBaseConfiguration实例中。

The factory method on HBaseConfiguration, HBaseConfiguration.create();, on invocation, will read in the content of the first hbase-site.xml found on the client’s CLASSPATH, if one is present (Invocation will also factor in any hbase-default.xml found; an hbase-default.xml ships inside the hbase.X.X.X.jar). It is also possible to specify configuration directly without having to read from a hbase-site.xml. For example, to set the ZooKeeper ensemble for the cluster programmatically do as follows:

HBaseConfiguration上的工厂方法HBaseConfiguration.create();在调用时,将读取客户端CLASSPATH上找到的第一个hbase-site.xml的内容(如果存在的话)(调用时还会考虑找到的任何hbase-default.xml;hbase-default.xml随hbase.X.X.X.jar一起提供)。也可以直接指定配置,而不必从hbase-site.xml读取。例如,要以编程方式设置集群的ZooKeeper ensemble,可以如下操作:

Configuration config = HBaseConfiguration.create();
config.set("hbase.zookeeper.quorum", "localhost");  // Here we are running zookeeper locally

If multiple ZooKeeper instances make up your ZooKeeper ensemble, they may be specified in a comma-separated list (just as in the hbase-site.xml file). This populated Configuration instance can then be passed to an Table, and so on.

如果多个ZooKeeper实例组成了您的ZooKeeper集合,它们可以在逗号分隔的列表中指定(就像在hbase站点中一样)。xml文件)。然后可以将这个填充的配置实例传递给一个表,等等。

8. Example Configurations

8。示例配置

8.1. Basic Distributed HBase Install

8.1。基本的分布式HBase安装

Here is a basic configuration example for a distributed ten node cluster:

* The nodes are named example0, example1, etc., through node example9 in this example.
* The HBase Master and the HDFS NameNode are running on the node example0.
* RegionServers run on nodes example1-example9.
* A 3-node ZooKeeper ensemble runs on example1, example2, and example3 on the default ports.
* ZooKeeper data is persisted to the directory /export/zookeeper.

下面是分布式10节点集群的一个基本配置示例:*节点被命名为example0, example1,等等,在本例中通过节点example9。* HBase主机和HDFS NameNode在节点example0上运行。*区域服务器运行在节点example1-example9。*一个3节点的ZooKeeper集合在默认端口上运行在example1、example2和example3上。* ZooKeeper数据保存到目录/导出/ ZooKeeper。

Below we show what the main configuration files — hbase-site.xml, regionservers, and hbase-env.sh — found in the HBase conf directory might look like.

下面我们将展示主配置文件- hbase-site。xml、regionservers和hbase-env。在HBase conf目录中找到的sh可能是这样的。

8.1.1. hbase-site.xml

8.1.1。hbase-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>example1,example2,example3</value>
    <description>The directory shared by RegionServers.
    </description>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/export/zookeeper</value>
    <description>Property from ZooKeeper config zoo.cfg.
    The directory where the snapshot is stored.
    </description>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://example0:8020/hbase</value>
    <description>The directory shared by RegionServers.
    </description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode the cluster will be in. Possible values are
      false: standalone and pseudo-distributed setups with managed ZooKeeper
      true: fully-distributed with unmanaged ZooKeeper Quorum (see hbase-env.sh)
    </description>
  </property>
</configuration>

8.1.2. regionservers

8.1.2。regionservers

In this file you list the nodes that will run RegionServers. In our case, these nodes are example1-example9.

在这个文件中,您将列出将运行区域服务器的节点。在我们的例子中,这些节点是example1-example9。

example1
example2
example3
example4
example5
example6
example7
example8
example9

8.1.3. hbase-env.sh

8.1.3。hbase-env.sh

The following lines in the hbase-env.sh file show how to set the JAVA_HOME environment variable (required for HBase) and set the heap to 4 GB (rather than the default value of 1 GB). If you copy and paste this example, be sure to adjust the JAVA_HOME to suit your environment.

hbase-env中的以下几行。sh文件显示了如何设置JAVA_HOME环境变量(HBase所需),并将堆设置为4 GB(而不是默认值为1 GB)。如果您复制并粘贴这个示例,请确保调整JAVA_HOME以适应您的环境。

# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0/

# The maximum amount of heap to use. Default is left to JVM default.
export HBASE_HEAPSIZE=4G

Use rsync to copy the content of the conf directory to all nodes of the cluster.

使用rsync将conf目录的内容复制到集群的所有节点。

9. The Important Configurations

9。重要的配置

Below we list some important configurations. We’ve divided this section into required configuration and worth-a-look recommended configs.

下面列出一些重要的配置。我们将此部分划分为所需的配置和值得推荐的配置。

9.1. Required Configurations

9.1。需要配置

Review the os and hadoop sections.

检查操作系统和hadoop部分。

9.1.1. Big Cluster Configurations

9.1.1。大集群配置

If you have a cluster with a lot of regions, it is possible that a Regionserver checks in briefly after the Master starts while all the remaining RegionServers lag behind. This first server to check in will be assigned all regions which is not optimal. To prevent the above scenario from happening, up the hbase.master.wait.on.regionservers.mintostart property from its default value of 1. See HBASE-6389 Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments for more detail.

如果您的集群有很多区域,可能出现这样的情况:Master启动后不久只有一台Regionserver完成签入,而其余的RegionServer都滞后。第一个签入的服务器将被分配所有区域,这并不是最优的。为了防止上述情况发生,请将hbase.master.wait.on.regionservers.mintostart属性从其默认值1调高。更多细节请参阅HBASE-6389"修改条件以确保Master在开始区域分配之前等待足够数量的Region Server"。
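For instance, on the ten-node example cluster from the previous chapter you might require most RegionServers to have checked in before assignment starts; the value below is illustrative, not a recommendation:

<property>
  <name>hbase.master.wait.on.regionservers.mintostart</name>
  <value>7</value>
</property>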

zookeeper.session.timeout
zookeeper.session.timeout

The default timeout is three minutes (specified in milliseconds). This means that if a server crashes, it will be three minutes before the Master notices the crash and starts recovery. You might need to tune the timeout down to a minute or even less so the Master notices failures sooner. Before changing this value, be sure you have your JVM garbage collection configuration under control, otherwise, a long garbage collection that lasts beyond the ZooKeeper session timeout will take out your RegionServer. (You might be fine with this — you probably want recovery to start on the server if a RegionServer has been in GC for a long period of time).

默认超时为3分钟(以毫秒为单位)。这意味着,如果服务器崩溃,将在主人注意到崩溃并开始恢复之前三分钟。您可能需要将超时时间调到一分钟或更短,这样大师就会更快地注意到故障。在更改此值之前,请确保您的JVM垃圾收集配置处于控制之下,否则,长时间的垃圾收集将超出ZooKeeper会话超时,将会占用您的区域服务器。(您可能对此很满意——如果一个区域性服务器长期处于GC状态,那么您可能希望从服务器开始恢复)。

To change this configuration, edit hbase-site.xml, copy the changed file across the cluster and restart.

要更改此配置,请编辑hbase-site.xml,将更改后的文件复制到集群中的所有节点并重新启动。
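As an example only, lowering the session timeout to one minute would be expressed in hbase-site.xml like this (keep the caveat above about GC pauses in mind before copying it):

<property>
  <name>zookeeper.session.timeout</name>
  <!-- 1 minute, in milliseconds -->
  <value>60000</value>
</property>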

We set this value high to save our having to field questions up on the mailing lists asking why a RegionServer went down during a massive import. The usual cause is that their JVM is untuned and they are running into long GC pauses. Our thinking is that while users are getting familiar with HBase, we’d save them having to know all of its intricacies. Later when they’ve built some confidence, then they can play with configuration such as this.

我们将这个值设为高,以节省我们在邮件列表上的问题,询问为什么在大量的导入过程中,区域服务器会崩溃。通常的原因是它们的JVM没有调优,它们正在运行长GC暂停。我们的想法是,虽然用户对HBase越来越熟悉,但我们可以让他们知道所有的复杂情况。当他们建立了一些信心之后,他们就可以玩这样的配置了。

Number of ZooKeeper Instances
动物园管理员实例

See zookeeper.

看到动物园管理员。

9.2.2. HDFS Configurations

9.2.2。HDFS配置

dfs.datanode.failed.volumes.tolerated
dfs.datanode.failed.volumes.tolerated

This is the "…​number of volumes that are allowed to fail before a DataNode stops offering service. By default any volume failure will cause a datanode to shutdown" from the hdfs-default.xml description. You might want to set this to about half the amount of your available disks.

这是“……在DataNode停止提供服务之前允许失败的卷的数量”。默认情况下,任何容量失败都会导致datanode关闭“从hdfs-default”。xml描述。您可能希望将其设置为可用磁盘的一半。
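For example, a DataNode with 12 data disks might tolerate the failure of half of them before shutting down; this is an hdfs-site.xml setting and the value is purely illustrative:

<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>6</value>
</property>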

hbase.regionserver.handler.count
hbase.regionserver.handler.count

This setting defines the number of threads that are kept open to answer incoming requests to user tables. The rule of thumb is to keep this number low when the payload per request approaches the MB (big puts, scans using a large cache) and high when the payload is small (gets, small puts, ICVs, deletes). The total size of the queries in progress is limited by the setting hbase.ipc.server.max.callqueue.size.

该设置定义了保持打开以响应用户表传入请求的线程数。经验法则是:当每个请求的有效负载接近MB级别(大的put、使用大缓存的scan)时,保持这个数字较低;当有效负载较小(get、小的put、ICV、delete)时,保持这个数字较高。正在进行的查询的总大小受hbase.ipc.server.max.callqueue.size设置的限制。

It is safe to set that number to the maximum number of incoming clients if their payload is small, the typical example being a cluster that serves a website since puts aren’t typically buffered and most of the operations are gets.

如果它们的有效负载很小,那么将这个数字设置为最大的传入客户端是安全的,典型的例子是一个服务于一个网站的集群,因为它不是典型的缓冲,而且大多数操作都是得到的。

The reason why it is dangerous to keep this setting high is that the aggregate size of all the puts that are currently happening in a region server may impose too much pressure on its memory, or even trigger an OutOfMemoryError. A RegionServer running on low memory will trigger its JVM’s garbage collector to run more frequently up to a point where GC pauses become noticeable (the reason being that all the memory used to keep all the requests' payloads cannot be trashed, no matter how hard the garbage collector tries). After some time, the overall cluster throughput is affected since every request that hits that RegionServer will take longer, which exacerbates the problem even more.

保持这个设置的高度危险的原因是,当前在区域服务器上发生的所有put的聚合大小可能会对其内存施加太大的压力,甚至引发OutOfMemoryError错误。运行在低内存上的区域服务器将触发JVM的垃圾收集器,以便更频繁地运行GC暂停变得明显的点(原因是,无论垃圾收集器如何努力,所有用于保存所有请求的内存的内存都不能被丢弃)。经过一段时间之后,整个集群的吞吐量都会受到影响,因为每个对该区域服务器的请求都将花费更长的时间,这将使问题更加严重。

You can get a sense of whether you have too little or too many handlers by rpc.logging on an individual RegionServer then tailing its logs (Queued requests consume memory).

您可以通过在单个RegionServer上启用rpc.logging并跟踪其日志,来了解处理程序是太少还是太多(排队的请求会消耗内存)。
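A hedged example for a read-mostly cluster with small payloads; the value is only a starting point to test against your own workload, not a recommendation:

<property>
  <name>hbase.regionserver.handler.count</name>
  <value>60</value>
</property>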

9.2.3. Configuration for large memory machines

9.2.3。大型内存机器的配置。

HBase ships with a reasonable, conservative configuration that will work on nearly all machine types that people might want to test with. If you have larger machines — HBase has 8G and larger heap — you might find the following configuration options helpful. TODO.

HBase有一个合理的、保守的配置,它将在几乎所有人们想要测试的机器类型上工作。如果您有更大的机器——HBase有8G和更大的堆——您可能会发现以下配置选项是有帮助的。待办事项。

9.2.4. Compression

9.2.4。压缩

You should consider enabling ColumnFamily compression. There are several options that are near-frictionless and in most all cases boost performance by reducing the size of StoreFiles and thus reducing I/O.

您应该考虑启用ColumnFamily压缩。有几种几乎无摩擦的选项,在大多数情况下,通过减小StoreFiles的大小来提高性能,从而减少I/O。

See compression for more information.

有关更多信息,请参见压缩。

9.2.5. Configuring the size and number of WAL files

9.2.5。配置WAL - files的大小和数量。

HBase uses wal to recover the memstore data that has not been flushed to disk in case of an RS failure. These WAL files should be configured to be slightly smaller than HDFS block (by default a HDFS block is 64Mb and a WAL file is ~60Mb).

HBase使用wal来恢复在RS失败时没有刷新到磁盘的memstore数据。这些WAL - file应该配置为比HDFS块稍微小一点(默认情况下,HDFS块是64Mb,而一个WAL - file是~60Mb)。

HBase also has a limit on the number of WAL files, designed to ensure there’s never too much data that needs to be replayed during recovery. This limit needs to be set according to memstore configuration, so that all the necessary data would fit. It is recommended to allocate enough WAL files to store at least that much data (when all memstores are close to full). For example, with 16Gb RS heap, default memstore settings (0.4), and default WAL file size (~60Mb), 16Gb*0.4/60, the starting point for WAL file count is ~109. However, as all memstores are not expected to be full all the time, less WAL files can be allocated.

HBase对WAL文件的数量也有限制,这是为了确保在恢复过程中永远不会有太多数据需要重放。这个限制需要根据memstore配置来设置,以便所有必要的数据都能容纳进去。建议分配足够的WAL文件来存储至少这么多的数据(当所有memstore都接近满的时候)。例如,对于16Gb的RS堆、默认的memstore设置(0.4)和默认的WAL文件大小(约60Mb),按16Gb*0.4/60计算,WAL文件数量的起点约为109。然而,由于并非所有memstore都会一直处于满的状态,所以可以分配更少的WAL文件。
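If you want to cap the number of WAL files explicitly, the usual knob is hbase.regionserver.maxlogs; the sketch below assumes the ~109-file arithmetic from the paragraph above and rounds it down a little because memstores are rarely all full at once. Treat the value as an example, not a recommendation, and check the property against your HBase version before relying on it.

<property>
  <name>hbase.regionserver.maxlogs</name>
  <value>96</value>
</property>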

9.2.6. Managed Splitting

9.2.6。管理的分裂

HBase generally handles splitting of your regions based upon the settings in your hbase-default.xml and hbase-site.xml configuration files. Important settings include hbase.regionserver.region.split.policy, hbase.hregion.max.filesize, hbase.regionserver.regionSplitLimit. A simplistic view of splitting is that when a region grows to hbase.hregion.max.filesize, it is split. For most usage patterns, you should use automatic splitting. See manual region splitting decisions for more information about manual region splitting.

HBase通常根据hbase-default.xml和hbase-site.xml配置文件中的设置来处理区域的分割。重要的设置包括hbase.regionserver.region.split.policy、hbase.hregion.max.filesize和hbase.regionserver.regionSplitLimit。对分割的一个简单理解是:当一个区域增长到hbase.hregion.max.filesize时,它就会被分割。对于大多数使用模式,您应该使用自动分割。有关手动区域分割的更多信息,请参见手动区域分割决策。

Instead of allowing HBase to split your regions automatically, you can choose to manage the splitting yourself. This feature was added in HBase 0.90.0. Manually managing splits works if you know your keyspace well, otherwise let HBase figure where to split for you. Manual splitting can mitigate region creation and movement under load. It also makes it so region boundaries are known and invariant (if you disable region splitting). If you use manual splits, it is easier doing staggered, time-based major compactions to spread out your network IO load.

不允许HBase自动分割区域,您可以选择管理您自己。该特性在HBase 0.90.0中添加。如果你知道你的关键空间,手动管理分拆工作,否则让HBase数据为你分配。手动拆分可以减轻区域的创建和负载下的移动。它也使得区域边界是已知的和不变的(如果你禁用区域分割)。如果您使用手动分割,则更容易进行交错的、基于时间的主要压缩来扩展您的网络IO负载。

Disable Automatic Splitting

To disable automatic splitting, you can set the region split policy in either the cluster configuration or the table configuration to org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy.

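For example, a cluster-wide setting in hbase-site.xml might look like the following sketch (table-level configuration of the split policy is also possible via the table descriptor):

<property>
  <name>hbase.regionserver.region.split.policy</name>
  <value>org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy</value>
</property>
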
Automatic Splitting Is Recommended

If you disable automatic splits to diagnose a problem or during a period of fast data growth, it is recommended to re-enable them when your situation becomes more stable. The potential benefits of managing region splits yourself are not undisputed.

Determine the Optimal Number of Pre-Split Regions

The optimal number of pre-split regions depends on your application and environment. A good rule of thumb is to start with 10 pre-split regions per server and watch as data grows over time. It is better to err on the side of too few regions and perform rolling splits later. The optimal number of regions depends upon the largest StoreFile in your region. The size of the largest StoreFile will increase with time if the amount of data grows. The goal is for the largest region to be just large enough that the compaction selection algorithm only compacts it during a timed major compaction. Otherwise, the cluster can be prone to compaction storms with a large number of regions under compaction at the same time. It is important to understand that the data growth causes compaction storms and not the manual split decision.

If the regions are split into too many large regions, you can increase the major compaction interval by configuring HConstants.MAJOR_COMPACTION_PERIOD. HBase 0.90 introduced org.apache.hadoop.hbase.util.RegionSplitter, which provides a network-IO-safe rolling split of all regions.

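A hedged example of RegionSplitter usage for pre-splitting a table; the table name, column family, and region count are illustrative, and the rolling-split options are described in the RegionSplitter javadoc:

$ bin/hbase org.apache.hadoop.hbase.util.RegionSplitter my_table HexStringSplit -c 60 -f cf1

This pre-creates 'my_table' with 60 regions whose boundaries come from the HexStringSplit algorithm.
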
9.2.7. Managed Compactions

By default, major compactions are scheduled to run once in a 7-day period. Prior to HBase 0.96.x, major compactions were scheduled to happen once per day by default.

If you need to control exactly when and how often major compaction runs, you can disable managed major compactions. See the entry for hbase.hregion.majorcompaction in the compaction.parameters table for details.

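As a sketch, time-based major compactions can be turned off in hbase-site.xml by setting the period to 0; you would then trigger major compactions yourself, for example from the HBase shell or the Admin API:

<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
  <description>Disable time-based major compactions; 604800000 (7 days) is the usual default.</description>
</property>
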
Do Not Disable Major Compactions

Major compactions are absolutely necessary for StoreFile clean-up. Do not disable them altogether. You can run major compactions manually via the HBase shell or via the Admin API.

For more information about compactions and the compaction file selection process, see compaction.

9.2.8. Speculative Execution

Speculative Execution of MapReduce tasks is on by default, and for HBase clusters it is generally advised to turn off Speculative Execution at a system-level unless you need it for a specific case, where it can be configured per-job. Set the properties mapreduce.map.speculative and mapreduce.reduce.speculative to false.

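A minimal sketch of the corresponding Hadoop configuration (for example in mapred-site.xml, or set per-job on the job configuration):

<property>
  <name>mapreduce.map.speculative</name>
  <value>false</value>
</property>
<property>
  <name>mapreduce.reduce.speculative</name>
  <value>false</value>
</property>
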
9.3. Other Configurations

9.3.1. Balancer

The balancer is a periodic operation which is run on the master to redistribute regions on the cluster. It is configured via hbase.balancer.period and defaults to 300000 (5 minutes).

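For example, to run the balancer every 10 minutes instead of the default 5, a sketch of the hbase-site.xml entry would be:

<property>
  <name>hbase.balancer.period</name>
  <value>600000</value>
  <description>Period at which the region balancer runs in the Master, in milliseconds.</description>
</property>
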
See master.processes.loadbalancer for more information on the LoadBalancer.

9.3.2. Disabling Blockcache

Do not turn off the block cache (you’d do it by setting hfile.block.cache.size to zero). Currently we do not do well if you do this, because the RegionServer will spend all its time loading HFile indices over and over again. If your working set is such that the block cache does you no good, at least size the block cache so that HFile indices will stay up in the cache (you can get a rough idea of the size you need by surveying RegionServer UIs; you’ll see the index block size accounted for near the top of the webpage).

9.3.3. Nagle’s or the small package problem

If a big 40ms or so occasional delay is seen in operations against HBase, try the Nagle’s setting. For example, see the user mailing list thread, Inconsistent scan performance with caching set to 1, and the issue cited therein where setting notcpdelay improved scan speeds. You might also see the graphs at the tail of HBASE-7008 Set scanner caching to a better default, where our Lars Hofhansl tries various data sizes with Nagle’s on and off, measuring the effect.

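A hedged sketch of the related hbase-site.xml settings; the property names have varied across versions, so confirm them against the hbase-default.xml shipped with your release:

<property>
  <name>hbase.ipc.client.tcpnodelay</name>
  <value>true</value>
</property>
<property>
  <name>hbase.ipc.server.tcpnodelay</name>
  <value>true</value>
</property>
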
9.3.4. Better Mean Time to Recover (MTTR)

This section is about configurations that will make servers come back faster after a failure. See the Deveraj Das and Nicolas Liochon blog post Introduction to HBase Mean Time to Recover (MTTR) for a brief introduction.

The issue HBASE-8354 forces Namenode into loop with lease recovery requests is messy but has a bunch of good discussion toward the end on low timeouts and how to cause faster recovery including citation of fixes added to HDFS. Read the Varun Sharma comments. The below suggested configurations are Varun’s suggestions distilled and tested. Make sure you are running on a late-version HDFS so you have the fixes he refers to and himself adds to HDFS that help HBase MTTR (e.g. HDFS-3703, HDFS-3712, and HDFS-4791 — Hadoop 2 for sure has them and late Hadoop 1 has some). Set the following in the RegionServer.

<property>
  <name>hbase.lease.recovery.dfs.timeout</name>
  <value>23000</value>
  <description>How much time we allow elapse between calls to recover lease.
  Should be larger than the dfs timeout.</description>
</property>
<property>
  <name>dfs.client.socket-timeout</name>
  <value>10000</value>
  <description>Down the DFS timeout from 60 to 10 seconds.</description>
</property>

And on the NameNode/DataNode side, set the following to enable 'staleness' introduced in HDFS-3703, HDFS-3912.

<property>
  <name>dfs.client.socket-timeout</name>
  <value>10000</value>
  <description>Down the DFS timeout from 60 to 10 seconds.</description>
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>10000</value>
  <description>Down the DFS timeout from 8 * 60 to 10 seconds.</description>
</property>
<property>
  <name>ipc.client.connect.timeout</name>
  <value>3000</value>
  <description>Down from 60 seconds to 3.</description>
</property>
<property>
  <name>ipc.client.connect.max.retries.on.timeouts</name>
  <value>2</value>
  <description>Down from 45 seconds to 3 (2 == 3 retries).</description>
</property>
<property>
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>true</value>
  <description>Enable stale state in hdfs</description>
</property>
<property>
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>20000</value>
  <description>Down from default 30 seconds</description>
</property>
<property>
  <name>dfs.namenode.avoid.write.stale.datanode</name>
  <value>true</value>
  <description>Enable stale state in hdfs</description>
</property>

9.3.5. JMX

JMX (Java Management Extensions) provides built-in instrumentation that enables you to monitor and manage the Java VM. To enable monitoring and management from remote systems, you need to set the system property com.sun.management.jmxremote.port (the port number through which you want to enable JMX RMI connections) when you start the Java VM. See the official documentation for more information. Historically, besides the port mentioned above, JMX opens two additional random TCP listening ports, which could lead to port conflict problems. (See HBASE-10289 for details.)

As an alternative, you can use the coprocessor-based JMX implementation provided by HBase. To enable it in 0.99 or above, add the below property in hbase-site.xml:

<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.JMXListener</value>
</property>
DO NOT set com.sun.management.jmxremote.port for Java VM at the same time.

Currently it supports the Master and RegionServer Java VMs. By default, JMX listens on TCP port 10102; you can further configure the port using the below properties:

<property>
  <name>regionserver.rmi.registry.port</name>
  <value>61130</value>
</property>
<property>
  <name>regionserver.rmi.connector.port</name>
  <value>61140</value>
</property>

The registry port can be shared with connector port in most cases, so you only need to configure regionserver.rmi.registry.port. However if you want to use SSL communication, the 2 ports must be configured to different values.

By default, password authentication and SSL communication are disabled. To enable password authentication, you need to update hbase-env.sh as below:

export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.authenticate=true                  \
                       -Dcom.sun.management.jmxremote.password.file=your_password_file   \
                       -Dcom.sun.management.jmxremote.access.file=your_access_file"

export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE "
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE "

See example password/access file under $JRE_HOME/lib/management.

To enable SSL communication with password authentication, follow the below steps:

#1. generate a key pair, stored in myKeyStore
keytool -genkey -alias jconsole -keystore myKeyStore

#2. export it to file jconsole.cert
keytool -export -alias jconsole -keystore myKeyStore -file jconsole.cert

#3. copy jconsole.cert to jconsole client machine, import it to jconsoleKeyStore
keytool -import -alias jconsole -keystore jconsoleKeyStore -file jconsole.cert

And then update hbase-env.sh as below:

export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=true                         \
                       -Djavax.net.ssl.keyStore=/home/tianq/myKeyStore                 \
                       -Djavax.net.ssl.keyStorePassword=your_password_in_step_1       \
                       -Dcom.sun.management.jmxremote.authenticate=true                \
                       -Dcom.sun.management.jmxremote.password.file=your_password_file \
                       -Dcom.sun.management.jmxremote.access.file=your_access_file"

export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE "
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE "

Finally start jconsole on the client using the key store:

jconsole -J-Djavax.net.ssl.trustStore=/home/tianq/jconsoleKeyStore
To enable the HBase JMX implementation on the Master, you also need to add the below property in hbase-site.xml:
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.JMXListener</value>
</property>

The corresponding properties for port configuration are master.rmi.registry.port (by default 10101) and master.rmi.connector.port (by default the same as registry.port).

10. Dynamic Configuration

Since HBase 1.0.0, it is possible to change a subset of the configuration without requiring a server restart. In the HBase shell, there are new operators, update_config and update_all_config, that will prompt a server or all servers to reload configuration.

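A hedged shell sketch; the server name is illustrative and follows the 'hostname,port,startcode' form shown in the RegionServer list of the Master UI:

hbase> update_config 'server1.example.com,16020,1500000000000'
hbase> update_all_config
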
Only a subset of all configurations can currently be changed in the running server. Here are those configurations:

Table 3. Configurations that support dynamic change
Key

hbase.ipc.server.fallback-to-simple-auth-allowed
hbase.cleaner.scan.dir.concurrent.size
hbase.regionserver.thread.compaction.large
hbase.regionserver.thread.compaction.small
hbase.regionserver.thread.split
hbase.regionserver.throughput.controller
hbase.regionserver.thread.hfilecleaner.throttle
hbase.regionserver.hfilecleaner.large.queue.size
hbase.regionserver.hfilecleaner.small.queue.size
hbase.regionserver.hfilecleaner.large.thread.count
hbase.regionserver.hfilecleaner.small.thread.count
hbase.regionserver.flush.throughput.controller
hbase.hstore.compaction.max.size
hbase.hstore.compaction.max.size.offpeak
hbase.hstore.compaction.min.size
hbase.hstore.compaction.min
hbase.hstore.compaction.max
hbase.hstore.compaction.ratio
hbase.hstore.compaction.ratio.offpeak
hbase.regionserver.thread.compaction.throttle
hbase.hregion.majorcompaction
hbase.hregion.majorcompaction.jitter
hbase.hstore.min.locality.to.skip.major.compact
hbase.hstore.compaction.date.tiered.max.storefile.age.millis
hbase.hstore.compaction.date.tiered.incoming.window.min
hbase.hstore.compaction.date.tiered.window.policy.class
hbase.hstore.compaction.date.tiered.single.output.for.minor.compaction
hbase.hstore.compaction.date.tiered.window.factory.class
hbase.offpeak.start.hour
hbase.offpeak.end.hour
hbase.oldwals.cleaner.thread.size
hbase.procedure.worker.keep.alive.time.msec
hbase.procedure.worker.add.stuck.percentage
hbase.procedure.worker.monitor.interval.msec
hbase.procedure.worker.stuck.threshold.msec
hbase.regions.slop
hbase.regions.overallSlop
hbase.balancer.tablesOnMaster
hbase.balancer.tablesOnMaster.systemTablesOnly
hbase.util.ip.to.rack.determiner
hbase.ipc.server.max.callqueue.length
hbase.ipc.server.priority.max.callqueue.length
hbase.ipc.server.callqueue.type
hbase.ipc.server.callqueue.codel.target.delay
hbase.ipc.server.callqueue.codel.interval
hbase.ipc.server.callqueue.codel.lifo.threshold
hbase.master.balancer.stochastic.maxSteps
hbase.master.balancer.stochastic.stepsPerRegion
hbase.master.balancer.stochastic.maxRunningTime
hbase.master.balancer.stochastic.runMaxSteps
hbase.master.balancer.stochastic.numRegionLoadsToRemember
hbase.master.loadbalance.bytable
hbase.master.balancer.stochastic.minCostNeedBalance
hbase.master.balancer.stochastic.localityCost
hbase.master.balancer.stochastic.rackLocalityCost
hbase.master.balancer.stochastic.readRequestCost
hbase.master.balancer.stochastic.writeRequestCost
hbase.master.balancer.stochastic.memstoreSizeCost
hbase.master.balancer.stochastic.storefileSizeCost
hbase.master.balancer.stochastic.regionReplicaHostCostKey
hbase.master.balancer.stochastic.regionReplicaRackCostKey
hbase.master.balancer.stochastic.regionCountCost
hbase.master.balancer.stochastic.primaryRegionCountCost
hbase.master.balancer.stochastic.moveCost
hbase.master.balancer.stochastic.maxMovePercent
hbase.master.balancer.stochastic.tableSkewCost

Upgrading

You cannot skip major versions when upgrading. If you are upgrading from version 0.90.x to 0.94.x, you must first go from 0.90.x to 0.92.x and then go from 0.92.x to 0.94.x.

It may be possible to skip across versions — for example go from 0.92.2 straight to 0.98.0 just following the 0.96.x upgrade instructions — but these scenarios are untested.

Review Apache HBase Configuration, in particular Hadoop. Familiarize yourself with Support and Testing Expectations.

11. HBase version number and compatibility

HBase has two versioning schemes, pre-1.0 and post-1.0. Both are detailed below.

11.1. Post 1.0 versions

Starting with the 1.0.0 release, HBase is working towards Semantic Versioning for its release versioning. In summary:

Given a version number MAJOR.MINOR.PATCH, increment the:
  • MAJOR version when you make incompatible API changes,

  • MINOR version when you add functionality in a backwards-compatible manner, and

  • PATCH version when you make backwards-compatible bug fixes.

  • Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.

Compatibility Dimensions

In addition to the usual API versioning considerations HBase has other compatibility dimensions that we need to consider.

Client-Server wire protocol compatibility
  • Allows updating client and server out of sync.

  • We could only allow upgrading the server first, i.e. the server would be backward compatible to an old client; that way new APIs are OK.

  • Example: A user should be able to use an old client to connect to an upgraded cluster.

Server-Server protocol compatibility
  • Servers of different versions can co-exist in the same cluster.

  • The wire protocol between servers is compatible.

  • Workers for distributed tasks, such as replication and log splitting, can co-exist in the same cluster.

  • Dependent protocols (such as using ZK for coordination) will also not be changed.

  • Example: A user can perform a rolling upgrade.

File format compatibility
  • Support file formats backward and forward compatible

  • Example: File, ZK encoding, directory layout is upgraded automatically as part of an HBase upgrade. User can downgrade to the older version and everything will continue to work.

Client API compatibility
  • Allow changing or removing existing client APIs.

  • An API needs to be deprecated for a major version before we will change/remove it.

  • APIs available in a patch version will be available in all later patch versions. However, new APIs may be added which will not be available in earlier patch versions.

  • New APIs introduced in a patch version will only be added in a source compatible way [1]: i.e. code that implements public APIs will continue to compile.

    • Example: A user using a newly deprecated API does not need to modify application code with HBase API calls until the next major version.

Client Binary compatibility
  • Client code written to APIs available in a given patch release can run unchanged (no recompilation needed) against the new jars of later patch versions.

  • Client code written to APIs available in a given patch release might not run against the old jars from an earlier patch version.

    • Example: Old compiled client code will work unchanged with the new jars.

  • If a Client implements an HBase Interface, a recompile MAY be required when upgrading to a newer minor version (see the release notes for warnings about incompatible changes). All effort will be made to provide a default implementation so this case should not arise.

Server-Side Limited API compatibility (taken from Hadoop)
  • Internal APIs are marked as Stable, Evolving, or Unstable

  • This implies binary compatibility for coprocessors and plugins (pluggable classes, including replication) as long as these are only using marked interfaces/classes.

  • Example: Old compiled Coprocessor, Filter, or Plugin code will work unchanged with the new jars.

Dependency Compatibility
  • An upgrade of HBase will not require an incompatible upgrade of a dependent project, including the Java runtime.

  • Example: An upgrade of Hadoop will not invalidate any of the compatibility guarantees we made.

Operational Compatibility
  • Metric changes

  • Behavioral changes of services

  • JMX APIs exposed via the /jmx/ endpoint

Summary
  • A patch upgrade is a drop-in replacement. Any change that is not Java binary and source compatible would not be allowed.[2] Downgrading versions within patch releases may not be compatible.

  • A minor upgrade requires no application/client code modification. Ideally it would be a drop-in replacement but client code, coprocessors, filters, etc might have to be recompiled if new jars are used.

  • A major upgrade allows the HBase community to make breaking changes.

Table 4. Compatibility Matrix [3]

                                          Major   Minor   Patch
Client-Server wire Compatibility          N       Y       Y
Server-Server Compatibility               N       Y       Y
File Format Compatibility                 N [4]   Y       Y
Client API Compatibility                  N       Y       Y
Client Binary Compatibility               N       N       Y
Server-Side Limited API Compatibility
  Stable                                  N       Y       Y
  Evolving                                N       N       Y
  Unstable                                N       N       N
Dependency Compatibility                  N       Y       Y
Operational Compatibility                 N       N       Y

11.1.1. HBase API Surface

HBase has a lot of API points, but for the compatibility matrix above, we differentiate between Client API, Limited Private API, and Private API. HBase uses Apache Yetus Audience Annotations to guide downstream expectations for stability.

  • InterfaceAudience (javadocs): captures the intended audience, possible values include:

    • Public: safe for end users and external projects

    • LimitedPrivate: used for internals we expect to be pluggable, such as coprocessors

    • Private: strictly for use within HBase itself. Classes which are defined as IA.Private may be used as parameters or return values for interfaces which are declared IA.LimitedPrivate. Treat the IA.Private object as opaque; do not try to access its methods or fields directly.

  • InterfaceStability (javadocs): describes what types of interface changes are permitted. Possible values include:

    • Stable: the interface is fixed and is not expected to change

    • Evolving: the interface may change in future minor versions

    • Unstable: the interface may change at any time

Please keep in mind the following interactions between the InterfaceAudience and InterfaceStability annotations within the HBase project:

  • IA.Public classes are inherently stable and adhere to our stability guarantees relating to the type of upgrade (major, minor, or patch).

  • IA.LimitedPrivate classes should always be annotated with one of the given InterfaceStability values. If they are not, you should presume they are IS.Unstable.

  • IA.Private classes should be considered implicitly unstable, with no guarantee of stability between releases.

HBase Client API

HBase Client API consists of all the classes or methods that are marked with InterfaceAudience.Public interface. All main classes in hbase-client and dependent modules have either InterfaceAudience.Public, InterfaceAudience.LimitedPrivate, or InterfaceAudience.Private marker. Not all classes in other modules (hbase-server, etc) have the marker. If a class is not annotated with one of these, it is assumed to be a InterfaceAudience.Private class.

HBase LimitedPrivate API

The LimitedPrivate annotation comes with a set of target consumers for the interfaces. Those consumers are coprocessors, Phoenix, replication endpoint implementations, or similar. At this point, HBase only guarantees source and binary compatibility for these interfaces between patch versions.

HBase Private API

All classes annotated with InterfaceAudience.Private or all classes that do not have the annotation are for HBase internal use only. The interfaces and method signatures can change at any point in time. If you are relying on a particular interface that is marked Private, you should open a jira to propose changing the interface to be Public or LimitedPrivate, or an interface exposed for this purpose.

11.2. Pre 1.0 versions

HBase Pre-1.0 versions are all EOM
For new installations, do not deploy 0.94.y, 0.96.y, or 0.98.y. Deploy our stable version. See EOL 0.96, clean up of EOM releases, and the header of our downloads.

Before the semantic versioning scheme pre-1.0, HBase tracked either Hadoop’s versions (0.2x) or 0.9x versions. If you are into the arcane, check out our old wiki page on HBase Versioning, which tries to connect the HBase version dots. The below sections cover ONLY the releases before 1.0.

Odd/Even Versioning or "Development" Series Releases

Ahead of big releases, we have been putting up preview versions to start the feedback cycle turning over earlier. These "Development" Series releases, always odd-numbered, come with no guarantees, not even regarding being able to upgrade between two sequential releases (we reserve the right to break compatibility across "Development" Series releases). Needless to say, these releases are not for production deploys. They are a preview of what is coming in the hope that interested parties will take the release for a test drive and flag us early if there are issues we’ve missed ahead of our rolling a production-worthy release.

Our first "Development" Series was the 0.89 set that came out ahead of HBase 0.90.0. HBase 0.95 is another "Development" Series that portends HBase 0.96.0. 0.99.x is the last series in "developer preview" mode before 1.0. Afterwards, we will be using semantic versioning naming scheme (see above).

Binary Compatibility

When we say two HBase versions are compatible, we mean that the versions are wire and binary compatible. Compatible HBase versions means that clients can talk to compatible but differently versioned servers. It means too that you can just swap out the jars of one version and replace them with the jars of another, compatible version and all will just work. Unless otherwise specified, HBase point versions are (mostly) binary compatible. You can safely do rolling upgrades between binary compatible versions; i.e. across point versions: e.g. from 0.94.5 to 0.94.6. See the discussion Does compatibility between versions also mean binary compatibility? on the HBase dev mailing list.

11.3. Rolling Upgrades

A rolling upgrade is the process by which you update the servers in your cluster a server at a time. You can rolling upgrade across HBase versions if they are binary or wire compatible. See Rolling Upgrade Between Versions that are Binary/Wire Compatible for more on what this means. Coarsely, a rolling upgrade is a graceful stop each server, update the software, and then restart. You do this for each server in the cluster. Usually you upgrade the Master first and then the RegionServers. See Rolling Restart for tools that can help use the rolling upgrade process.

For example, in the below, HBase was symlinked to the actual HBase install. On upgrade, before running a rolling restart over the cluster, we changed the symlink to point at the new HBase software version and then ran

$ HADOOP_HOME=~/hadoop-2.6.0-CRC-SNAPSHOT ~/hbase/bin/rolling-restart.sh --config ~/conf_hbase

The rolling-restart script will first gracefully stop and restart the master, and then each of the RegionServers in turn. Because the symlink was changed, on restart the server will come up using the new HBase version. Check logs for errors as the rolling upgrade proceeds.

Rolling Upgrade Between Versions that are Binary/Wire Compatible

Unless otherwise specified, HBase point versions are binary compatible. You can do a Rolling Upgrades between HBase point versions. For example, you can go to 0.94.6 from 0.94.5 by doing a rolling upgrade across the cluster replacing the 0.94.5 binary with a 0.94.6 binary.

In the minor version-particular sections below, we call out where the versions are wire/protocol compatible and in this case, it is also possible to do a Rolling Upgrades. For example, in Rolling upgrade from 0.98.x to HBase 1.0.0, we state that it is possible to do a rolling upgrade between hbase-0.98.x and hbase-1.0.0.

12. Rollback

Sometimes things don’t go as planned when attempting an upgrade. This section explains how to perform a rollback to an earlier HBase release. Note that this should only be needed between Major and some Minor releases. You should always be able to downgrade between HBase Patch releases within the same Minor version. These instructions may require you to take steps before you start the upgrade process, so be sure to read through this section beforehand.

12.1. Caveats

Rollback vs Downgrade

This section describes how to perform a rollback on an upgrade between HBase minor and major versions. In this document, rollback refers to the process of taking an upgraded cluster and restoring it to the old version while losing all changes that have occurred since upgrade. By contrast, a cluster downgrade would restore an upgraded cluster to the old version while maintaining any data written since the upgrade. We currently only offer instructions to rollback HBase clusters. Further, rollback only works when these instructions are followed prior to performing the upgrade.

When these instructions talk about rollback vs downgrade of prerequisite cluster services (i.e. HDFS), you should treat leaving the service version the same as a degenerate case of downgrade.

Replication

Unless you are doing an all-service rollback, the HBase cluster will lose any configured peers for HBase replication. If your cluster is configured for HBase replication, then prior to following these instructions you should document all replication peers. After performing the rollback you should then add each documented peer back to the cluster. For more information on enabling HBase replication, listing peers, and adding a peer see Managing and Configuring Cluster Replication. Note also that data written to the cluster since the upgrade may or may not have already been replicated to any peers. Determining which, if any, peers have seen replication data as well as rolling back the data in those peers is out of the scope of this guide.

Data Locality

Unless you are doing an all-service rollback, going through a rollback procedure will likely destroy all locality for Region Servers. You should expect degraded performance until after the cluster has had time to go through compactions to restore data locality. Optionally, you can force a compaction to speed this process up at the cost of generating cluster load.

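For example, a major compaction can be requested from the HBase shell (the table name is illustrative):

hbase> major_compact 'my_table'
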
Configurable Locations

The instructions below assume default locations for the HBase data directory and the HBase znode. Both of these locations are configurable and you should verify the value used in your cluster before proceeding. In the event that you have a different value, just replace the default with the one found in your configuration:

  • The HBase data directory is configured via the key 'hbase.rootdir' and has a default value of '/hbase'.

  • The HBase znode is configured via the key 'zookeeper.znode.parent' and has a default value of '/hbase'.

12.2. All service rollback

If you will be performing a rollback of both the HDFS and ZooKeeper services, then HBase’s data will be rolled back in the process.

Requirements
  • Ability to rollback HDFS and ZooKeeper

Before upgrade

No additional steps are needed pre-upgrade. As an extra precautionary measure, you may wish to use distcp to back up the HBase data off of the cluster to be upgraded. To do so, follow the steps in the 'Before upgrade' section of 'Rollback after HDFS downgrade' but copy to another HDFS instance instead of within the same instance.

Performing a rollback
  1. Stop HBase

  2. Perform a rollback for HDFS and ZooKeeper (HBase should remain stopped)

  3. Change the installed version of HBase to the previous version

  4. Start HBase

  5. Verify HBase contents—use the HBase shell to list tables and scan some known values.

12.3. Rollback after HDFS rollback and ZooKeeper downgrade

If you will be rolling back HDFS but going through a ZooKeeper downgrade, then HBase will be in an inconsistent state. You must ensure the cluster is not started until you complete this process.

Requirements
  • Ability to rollback HDFS

  • Ability to downgrade ZooKeeper

Before upgrade

No additional steps are needed pre-upgrade. As an extra precautionary measure, you may wish to use distcp to back up the HBase data off of the cluster to be upgraded. To do so, follow the steps in the 'Before upgrade' section of 'Rollback after HDFS downgrade' but copy to another HDFS instance instead of within the same instance.

Performing a rollback
  1. Stop HBase

  2. Perform a rollback for HDFS and a downgrade for ZooKeeper (HBase should remain stopped)

  3. Change the installed version of HBase to the previous version

  4. Clean out ZooKeeper information related to HBase. WARNING: This step will permanently destroy all replication peers. Please see the section on HBase Replication under Caveats for more information.

    Clean HBase information out of ZooKeeper
    [hpnewton@gateway_node.example.com ~]$ zookeeper-client -server zookeeper1.example.com:2181,zookeeper2.example.com:2181,zookeeper3.example.com:2181
    Welcome to ZooKeeper!
    JLine support is disabled
    rmr /hbase
    quit
    Quitting...
  5. Start HBase

  6. Verify HBase contents—use the HBase shell to list tables and scan some known values.

12.4. Rollback after HDFS downgrade

If you will be performing an HDFS downgrade, then you’ll need to follow these instructions regardless of whether ZooKeeper goes through rollback, downgrade, or reinstallation.

Requirements
  • Ability to downgrade HDFS

  • Pre-upgrade cluster must be able to run MapReduce jobs

  • HDFS super user access

  • Sufficient space in HDFS for at least two copies of the HBase data directory

Before upgrade

Before beginning the upgrade process, you must take a complete backup of HBase’s backing data. The following instructions cover backing up the data within the current HDFS instance. Alternatively, you can use the distcp command to copy the data to another HDFS cluster.

  1. Stop the HBase cluster

  2. Copy the HBase data directory to a backup location using the distcp command as the HDFS super user (shown below on a security enabled cluster)

    Using distcp to backup the HBase data directory
    [hpnewton@gateway_node.example.com ~]$ kinit -k -t hdfs.keytab hdfs@EXAMPLE.COM
    [hpnewton@gateway_node.example.com ~]$ hadoop distcp /hbase /hbase-pre-upgrade-backup
  3. Distcp will launch a mapreduce job to handle copying the files in a distributed fashion. Check the output of the distcp command to ensure this job completed successfully.

Performing a rollback
  1. Stop HBase

  2. Perform a downgrade for HDFS and a downgrade/rollback for ZooKeeper (HBase should remain stopped)

  3. Change the installed version of HBase to the previous version

  4. Restore the HBase data directory from prior to the upgrade as the HDFS super user (shown below on a security enabled cluster). If you backed up your data on another HDFS cluster instead of locally, you will need to use the distcp command to copy it back to the current HDFS cluster.

    Restore the HBase data directory
    [hpnewton@gateway_node.example.com ~]$ kinit -k -t hdfs.keytab hdfs@EXAMPLE.COM
    [hpnewton@gateway_node.example.com ~]$ hdfs dfs -mv /hbase /hbase-upgrade-rollback
    [hpnewton@gateway_node.example.com ~]$ hdfs dfs -mv /hbase-pre-upgrade-backup /hbase
  5. Clean out ZooKeeper information related to HBase. WARNING: This step will permanently destroy all replication peers. Please see the section on HBase Replication under Caveats for more information.

    Clean HBase information out of ZooKeeper
    [hpnewton@gateway_node.example.com ~]$ zookeeper-client -server zookeeper1.example.com:2181,zookeeper2.example.com:2181,zookeeper3.example.com:2181
    Welcome to ZooKeeper!
    JLine support is disabled
    rmr /hbase
    quit
    Quitting...
  6. Start HBase

  7. Verify HBase contents–use the HBase shell to list tables and scan some known values.

13. Upgrade Paths

13.1. Upgrading from 0.98.x to 1.x

In this section we first note the significant changes that come in with 1.0.0+ HBase and then we go over the upgrade process. Be sure to read the significant changes section with care so you avoid surprises.

13.1.1. Changes of Note!

Here we list important changes that are in 1.0.0+ since 0.98.x, changes you should be aware of that will go into effect once you upgrade.

ZooKeeper 3.4 is required in HBase 1.0.0+

See ZooKeeper Requirements.

HBase Default Ports Changed

The ports used by HBase changed. They used to be in the 600XX range. In HBase 1.0.0 they have been moved up out of the ephemeral port range and are 160XX instead (Master web UI was 60010 and is now 16010; the RegionServer web UI was 60030 and is now 16030, etc.). If you want to keep the old port locations, copy the port setting configs from hbase-default.xml into hbase-site.xml, change them back to the old values from the HBase 0.98.x era, and ensure you’ve distributed your configurations before you restart.

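As a sketch, restoring just the two web UI ports mentioned above would mean carrying entries like the following into hbase-site.xml; the other 160XX ports (such as the RPC ports) have analogous properties listed in hbase-default.xml:

<property>
  <name>hbase.master.info.port</name>
  <value>60010</value>
</property>
<property>
  <name>hbase.regionserver.info.port</name>
  <value>60030</value>
</property>
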
HBase Master Port Binding Change

In HBase 1.0.x, the HBase Master binds the RegionServer ports as well as the Master ports. This behavior is changed from HBase versions prior to 1.0. In HBase 1.1 and 2.0 branches, this behavior is reverted to the pre-1.0 behavior of the HBase master not binding the RegionServer ports.

hbase.bucketcache.percentage.in.combinedcache configuration has been REMOVED

You may have made use of this configuration if you are using BucketCache. If NOT using BucketCache, this change does not affect you. Its removal means that your L1 LruBlockCache is now sized using hfile.block.cache.size — i.e. the way you would size the on-heap L1 LruBlockCache if you were NOT doing BucketCache — and the BucketCache size is not whatever the setting for hbase.bucketcache.size is. You may need to adjust configs to get the LruBlockCache and BucketCache sizes set to what they were in 0.98.x and previous. If you did not set this config., its default value was 0.9. If you do nothing, your BucketCache will increase in size by 10%. Your L1 LruBlockCache will become hfile.block.cache.size times your java heap size (hfile.block.cache.size is a float between 0.0 and 1.0). To read more, see HBASE-11520 Simplify offheap cache config by removing the confusing "hbase.bucketcache.percentage.in.combinedcache".

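A hedged sketch of sizing the two caches explicitly after the upgrade. The values are illustrative; in this era hbase.bucketcache.size is interpreted as megabytes when the value is greater than 1.0, so check the hbase-default.xml for your exact version:

<property>
  <name>hfile.block.cache.size</name>
  <value>0.4</value>
  <description>Fraction of heap for the on-heap L1 LruBlockCache.</description>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>4096</value>
  <description>BucketCache capacity; here 4096 MB, chosen to match the pre-upgrade cache budget.</description>
</property>
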
If you have your own custom filters.

See the release notes on the issue HBASE-12068 [Branch-1] Avoid need to always do KeyValueUtil#ensureKeyValue for Filter transformCell; be sure to follow the recommendations therein.

Mismatch Of hbase.client.scanner.max.result.size Between Client and Server

If either the client or server version is lower than 0.98.11/1.0.0 and the server has a smaller value for hbase.client.scanner.max.result.size than the client, scan requests that reach the server’s hbase.client.scanner.max.result.size are likely to miss data. In particular, 0.98.11 defaults hbase.client.scanner.max.result.size to 2 MB but other versions default to larger values. For this reason, be very careful using 0.98.11 servers with any other client version.

Availability of Date Tiered Compaction.

The Date Tiered Compaction feature available as of 0.98.19 is available in the 1.y release line starting in release 1.3.0. If you have enabled this feature for any tables you must upgrade to version 1.3.0 or later. If you attempt to use an earlier 1.y release, any tables configured to use date tiered compaction will fail to have their regions open.

13.1.2. Rolling upgrade from 0.98.x to HBase 1.0.0

From 0.96.x to 1.0.0
You cannot do a rolling upgrade from 0.96.x to 1.0.0 without first doing a rolling upgrade to 0.98.x. See the comment in HBASE-11164 Document and test rolling updates from 0.98 → 1.0 for the why. Also, because HBase 1.0.0 enables HFile v3 by default (HBASE-9801 Change the default HFile version to V3) and support for HFile v3 only arrives in 0.98, this is another reason you cannot do a rolling upgrade from HBase 0.96.x; if the rolling upgrade stalls, the 0.96.x servers cannot open files written by the servers running the newer HBase 1.0.0 with HFiles of version 3.

There are no known issues running a rolling upgrade from HBase 0.98.x to HBase 1.0.0.

13.1.3. Scanner Caching has Changed

From 0.98.x to 1.x

In hbase-1.x, the default Scan caching 'number of rows' changed. Where in 0.98.x it defaulted to 100, in later HBase versions the default became Integer.MAX_VALUE. Not setting a cache size can make for Scans that run for a long time server-side, especially if they are running with stringent filtering. See Revisiting default value for hbase.client.scanner.caching for further discussion.

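A hedged sketch of pinning the old behavior cluster-wide; you can also set caching per Scan in application code, and the value 100 simply mirrors the old 0.98.x default:

<property>
  <name>hbase.client.scanner.caching</name>
  <value>100</value>
</property>
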
13.1.4. Upgrading to 1.0 from 0.94

You cannot rolling upgrade from 0.94.x to 1.x.x. You must stop your cluster, install the 1.x.x software, run the migration described at Executing the 0.96 Upgrade (substituting 1.x.x. wherever we make mention of 0.96.x in the section below), and then restart. Be sure to upgrade your ZooKeeper if it is a version less than the required 3.4.x.

13.2. Upgrading from 0.96.x to 0.98.x

A rolling upgrade from 0.96.x to 0.98.x works. The two versions are not binary compatible.

Additional steps are required to take advantage of some of the new features of 0.98.x, including cell visibility labels, cell ACLs, and transparent server side encryption. See Securing Apache HBase for more information. Significant performance improvements include a change to the write ahead log threading model that provides higher transaction throughput under high load, reverse scanners, MapReduce over snapshot files, and striped compaction.

Clients and servers can run with 0.98.x and 0.96.x versions. However, applications may need to be recompiled due to changes in the Java API.

13.3. Upgrading from 0.94.x to 0.98.x

A rolling upgrade from 0.94.x directly to 0.98.x does not work. The upgrade path follows the same procedures as Upgrading from 0.94.x to 0.96.x. Additional steps are required to use some of the new features of 0.98.x. See Upgrading from 0.96.x to 0.98.x for an abbreviated list of these features.

13.4. Upgrading from 0.94.x to 0.96.x

13.4.1. The "Singularity"

You will have to stop your old 0.94.x cluster completely to upgrade. If you are replicating between clusters, both clusters will have to go down to upgrade. Make sure it is a clean shutdown. The fewer WAL files around, the faster the upgrade will run (the upgrade will split any log files it finds in the filesystem as part of the upgrade process). All clients must be upgraded to 0.96 too.

The API has changed. You will need to recompile your code against 0.96 and you may need to adjust applications to go against new APIs (TODO: List of changes).

13.4.2. Executing the 0.96 Upgrade

HDFS and ZooKeeper must be up!
HDFS and ZooKeeper should be up and running during the upgrade process.

HBase 0.96.0 comes with an upgrade script. Run

$ bin/hbase upgrade

to see its usage. The script has two main modes: -check, and -execute.

check

The check step is run against a running 0.94 cluster. Run it from a downloaded 0.96.x binary. The check step is looking for the presence of HFile v1 files. These are unsupported in HBase 0.96.0. To have them rewritten as HFile v2 you must run a compaction.

The check step prints stats at the end of its run (grep for “Result:” in the log) printing absolute path of the tables it scanned, any HFile v1 files found, the regions containing said files (these regions will need a major compaction), and any corrupted files if found. A corrupt file is unreadable, and so is undefined (neither HFile v1 nor HFile v2).

To run the check step, run

$ bin/hbase upgrade -check

Here is sample output:

Tables Processed:
hdfs://localhost:41020/myHBase/.META.
hdfs://localhost:41020/myHBase/usertable
hdfs://localhost:41020/myHBase/TestTable
hdfs://localhost:41020/myHBase/t

Count of HFileV1: 2
HFileV1:
hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/249450144068442524
hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af/family/249450144068442512

Count of corrupted files: 1
Corrupted Files:
hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/1
Count of Regions with HFileV1: 2
Regions to Major Compact:
hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812
hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af

There are some HFileV1, or corrupt files (files with incorrect major version)

In the above sample output, there are two HFile v1 files in two regions, and one corrupt file. Corrupt files should probably be removed. The regions that have HFile v1s need to be major compacted. To major compact, start up the hbase shell and review how to compact an individual region. After the major compaction is done, rerun the check step and the HFile v1 files should be gone, replaced by HFile v2 instances.

在上面的示例输出中,在两个区域中有两个HFile v1文件,以及一个损坏的文件。应该删除损坏的文件。具有HFile v1s的区域需要进行大压缩。对于主要的契约,启动hbase shell并审查如何压缩单个区域。在完成主要的压缩之后,重新运行检查步骤,HFile v1文件应该被删除,替换为HFile v2实例。

By default, the check step scans the HBase root directory (defined as hbase.rootdir in the configuration). To scan a specific directory only, pass the -dir option.

默认情况下,检查步骤扫描HBase根目录(定义为HBase)。rootdir配置)。要扫描特定的目录,请通过-dir选项。

$ bin/hbase upgrade -check -dir /myHBase/testTable

The above command would detect HFile v1 files in the /myHBase/testTable directory.

上面的命令将在/myHBase/testTable目录中检测HFile v1文件。

Once the check step reports all the HFile v1 files have been rewritten, it is safe to proceed with the upgrade.

一旦检查步骤报告所有HFile v1文件被重写,就可以安全地进行升级了。

execute

After the check step shows the cluster is free of HFile v1, it is safe to proceed with the upgrade. Next is the execute step. You must SHUTDOWN YOUR 0.94.x CLUSTER before you can run the execute step. The execute step will not run if it detects running HBase masters or RegionServers.

在检查步骤显示集群没有HFile v1之后,继续进行升级是安全的。接下来是执行步骤。你必须关闭你的0.94。在运行执行步骤之前,请先进行x集群。如果它检测到运行的HBase主机或区域服务器,则执行步骤将不会运行。

HDFS and ZooKeeper should be up and running during the upgrade process. If zookeeper is managed by HBase, then you can start zookeeper so it is available to the upgrade by running

在升级过程中,HDFS和ZooKeeper应该启动和运行。如果zookeeper由HBase管理,那么您可以启动zookeeper,这样它就可以通过运行升级。

$ ./hbase/bin/hbase-daemon.sh start zookeeper

The execute upgrade step is made of three substeps.

执行升级步骤由三个子步骤组成。

  • Namespaces: HBase 0.96.0 has support for namespaces. The upgrade needs to reorder directories in the filesystem for namespaces to work.

    名称空间:HBase 0.96.0支持名称空间。升级需要对文件系统中的目录进行重新排序,以便使用名称空间工作。

  • ZNodes: All znodes are purged so that new ones can be written in their place using a new protobuf’ed format and a few are migrated in place: e.g. replication and table state znodes

    znode:所有的znode都被清除了,这样新的原型就可以在它们的位置上使用新的原始格式,并且有一些被迁移到适当的地方:例如复制和表状态znode。

  • WAL Log Splitting: If the 0.94.x cluster shutdown was not clean, we’ll split WAL logs as part of migration before we startup on 0.96.0. This WAL splitting runs slower than the native distributed WAL splitting because it is all inside the single upgrade process (so try and get a clean shutdown of the 0.94.0 cluster if you can).

    如果是0。94。x集群关闭不干净,我们将在启动0.96.0之前将WAL - log分割为迁移的一部分。这个WAL - fi的运行速度比本地的分布式WAL要慢,因为它都在单一的升级过程中(所以如果可以的话,试着彻底关闭0.94.0集群)。

To run the execute step, make sure that you have first copied the HBase 0.96.0 binaries everywhere, on servers and on clients. Make sure the 0.94.0 cluster is down. Then do as follows:

要运行执行步骤,请确保首先在服务器和客户机下复制HBase 0.96.0二进制文件。确保0.94.0集群已经关闭。然后做如下:

$ bin/hbase upgrade -execute

Here is some sample output.

这是一些样本输出。

Starting Namespace upgrade
Created version file at hdfs://localhost:41020/myHBase with version=7
Migrating table testTable to hdfs://localhost:41020/myHBase/.data/default/testTable
.....
Created version file at hdfs://localhost:41020/myHBase with version=8
Successfully completed NameSpace upgrade.
Starting Znode upgrade
.....
Successfully completed Znode upgrade

Starting Log splitting
...
Successfully completed Log splitting

If the output from the execute step looks good, stop the zookeeper instance you started to do the upgrade:

如果执行步骤的输出看起来很好,请停止您开始进行升级的zookeeper实例:

$ ./hbase/bin/hbase-daemon.sh stop zookeeper

Now start up hbase-0.96.0.

现在启动hbase-0.96.0。

13.5. Troubleshooting

13.5。故障排除

Old Client connecting to 0.96 cluster

It will fail with an exception like the below. Upgrade.

它将以如下的异常失败。升级。

17:22:15  Exception in thread "main" java.lang.IllegalArgumentException: Not a host:port pair: PBUF
17:22:15  *
17:22:15   api-compat-8.ent.cloudera.com ��  ���(
17:22:15    at org.apache.hadoop.hbase.util.Addressing.parseHostname(Addressing.java:60)
17:22:15    at org.apache.hadoop.hbase.ServerName.&init>(ServerName.java:101)
17:22:15    at org.apache.hadoop.hbase.ServerName.parseVersionedServerName(ServerName.java:283)
17:22:15    at org.apache.hadoop.hbase.MasterAddressTracker.bytesToServerName(MasterAddressTracker.java:77)
17:22:15    at org.apache.hadoop.hbase.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:61)
17:22:15    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:703)
17:22:15    at org.apache.hadoop.hbase.client.HBaseAdmin.&init>(HBaseAdmin.java:126)
17:22:15    at Client_4_3_0.setup(Client_4_3_0.java:716)
17:22:15    at Client_4_3_0.main(Client_4_3_0.java:63)

13.5.1. Upgrading META to use Protocol Buffers (Protobuf)

13.5.1。升级META以使用协议缓冲区(Protobuf)

When you upgrade from versions prior to 0.96, META needs to be converted to use protocol buffers. This is controlled by the configuration option hbase.MetaMigrationConvertingToPB, which is set to true by default. Therefore, by default, no action is required on your part.

当您在0.96之前从版本升级时,需要将META转换为使用协议缓冲区。这是由配置选项hbase控制的。MetaMigrationConvertingToPB,默认设置为true。因此,默认情况下,不需要任何操作。

The migration is a one-time event. However, every time your cluster starts, META is scanned to ensure that it does not need to be converted. If you have a very large number of regions, this scan can take a long time. Starting in 0.98.5, you can set hbase.MetaMigrationConvertingToPB to false in hbase-site.xml, to disable this start-up scan. This should be considered an expert-level setting.

迁移是一次性事件。但是,每次集群启动时,都会扫描元数据以确保不需要转换。如果你有大量的区域,这个扫描可能需要很长时间。从0.98.5开始,您可以设置hbase。在hbase站点中,MetaMigrationConvertingToPB为false。xml,禁用此启动扫描。这应该被视为专家级别的设置。

13.6. Upgrading from 0.92.x to 0.94.x

13.6。从0.92升级。x 0.94.x

We used to think that 0.92 and 0.94 were interface compatible and that you could do a rolling upgrade between these versions, but then we found that HBASE-5357 (Use builder pattern in HColumnDescriptor) changed method signatures so that, rather than returning void, they instead return HColumnDescriptor. This throws java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HColumnDescriptor.setMaxVersions(I)V, so 0.92 and 0.94 are NOT compatible. You cannot do a rolling upgrade between them.

我们曾经认为0.92和0.94是接口兼容的,您可以在这些版本之间进行滚动升级,但是我们发现HBASE-5357在HColumnDescriptor中使用builder模式更改了方法签名,而不是返回void,而是返回HColumnDescriptor。这将把. lang。(I)V . 0.92和0.94是不兼容的。你不能在他们之间进行滚动升级。

13.7. Upgrading from 0.90.x to 0.92.x

13.7。从0.90升级。x 0.92.x

13.7.1. Upgrade Guide

13.7.1。升级指南

You will find that 0.92.0 runs a little differently from 0.90.x releases. Here are a few things to watch out for when upgrading from 0.90.x to 0.92.0.

你会发现0。92.0和0。90有点不同。x版本。这里有几点需要注意,从0.90升级。x 0.92.0。

tl;dr

These are the important things to know before upgrading. Once you upgrade, you can’t go back.

在升级之前,这些都是重要的事情。一旦你升级了,你就不能回去了。

  1. MSLAB is on by default. Watch that heap usage if you have a lot of regions.

    MSLAB默认是on。如果有很多区域,请注意堆使用。

  2. Distributed Log Splitting is on by default. It should make RegionServer failover faster.

    在默认情况下,分布式日志分裂是打开的。它应该使区域性服务器故障转移更快。

  3. There’s a separate tarball for security.

    安全还有一个单独的tarball。

  4. If -XX:MaxDirectMemorySize is set in your hbase-env.sh, it’s going to enable the experimental off-heap cache (You may not want this).

    如果-XX:MaxDirectMemorySize设置在您的hbase-env中。sh,它将启用实验性的堆外缓存(您可能不想要这个)。

You can’t go back!

To move to 0.92.0, all you need to do is shutdown your cluster, replace your HBase 0.90.x with HBase 0.92.0 binaries (be sure you clear out all 0.90.x instances) and restart (You cannot do a rolling restart from 0.90.x to 0.92.x — you must restart). On startup, the .META. table content is rewritten removing the table schema from the info:regioninfo column. Also, any flushes done post first startup will write out data in the new 0.92.0 file format, HBase file format with inline blocks (version 2). This means you cannot go back to 0.90.x once you’ve started HBase 0.92.0 over your HBase data directory.

要移动到0.92.0,需要做的就是关闭集群,替换HBase 0.90。使用HBase 0.92.0二进制文件(请确保您清除了所有0.90)。并重启(您不能从0.90开始滚动重新启动)。0.92 x。x -你必须重启)。在启动时,.META。表内容重写了从info:区域信息列中删除表模式。此外,任何完成后的第一次启动,都将以新的0.92.0文件格式、HBase文件格式和内联块(版本2)来写数据,这意味着您不能回到0.90。一旦您在HBase数据目录上启动了HBase 0.92.0。

MSLAB is ON by default

In 0.92.0, the hbase.hregion.memstore.mslab.enabled flag is set to true (See Long GC pauses). In 0.90.x it was false. When it is enabled, memstores will step allocate memory in MSLAB 2MB chunks even if the memstore has zero or just a few small elements. This is fine usually but if you had lots of regions per RegionServer in a 0.90.x cluster (and MSLAB was off), you may find yourself OOME’ing on upgrade because the thousands of regions * number of column families * 2MB MSLAB (at a minimum) puts your heap over the top. Set hbase.hregion.memstore.mslab.enabled to false or set the MSLAB size down from 2MB by setting hbase.hregion.memstore.mslab.chunksize to something less.

在0.92.0 hbase.hregion.memstore.mslab。enabled标志被设置为true(请参阅长GC暂停)。在0.90。x是假的。当启用它时,memstores将会在MSLAB 2MB内存块中分配内存,即使memstore只有零个或只是几个小的元素。这通常是可以的,但是如果每个区域服务器有很多区域在0.90。x集群(和MSLAB关闭),您可能会发现自己正在进行升级,因为成千上万的区域* * * * * * MSLAB(至少是)将您的堆放置在顶部。设置hbase.hregion.memstore.mslab。通过设置hbase. h区域性.memstore.mslab,可以将MSLAB大小从2MB设置为false或设置。chunksize的东西更少。

Distributed Log Splitting is on by default

Previously, WAL logs on crash were split by the Master alone. In 0.92.0, log splitting is done by the cluster (see HBASE-1364 [performance] Distributed splitting of regionserver commit logs, or see the blog post Apache HBase Log Splitting). This should cut down significantly on the amount of time it takes to split logs and get regions back online again.

此前,在《撞车》(crash)中,沃斯(WAL - log)被主人单独分开。在0.92.0中,日志拆分是由集群完成的(参见HBase -1364[性能]分布式分区服务器提交日志或查看博客Apache HBase日志拆分)。这应该会大大减少分割日志和恢复区域的时间。

Memory accounting is different now

In 0.92.0, the indices and bloom filters of the HBase file format with inline blocks (version 2) take up residence in the same LRU used for caching blocks that come from the filesystem. In 0.90.x, the HFile v1 indices lived outside of the LRU, so they took up space even if the index was on a ‘cold’ file, one that wasn’t being actively used. With the indices now in the LRU, you may find you have less space for block caching. Adjust your block cache accordingly. See the Block Cache for more detail. The block cache default size has been changed in 0.92.0 from 0.2 (20 percent of heap) to 0.25.

在0.92.0中,带有内联块(版本2)索引和bloom过滤器的HBase文件格式在相同的LRU中占用了来自文件系统的缓存块。在0.90。x, HFile v1指数在LRU之外,所以它们占据了空间,即使索引是在“冷”文件上,也没有被积极使用。使用LRU中的索引,您可能会发现阻塞缓存的空间更小。相应地调整块缓存。更多细节请参见块缓存。块大小的默认大小从0.2(20%的堆)更改为0.92.0。

On the Hadoop version to use

Run 0.92.0 on Hadoop 1.0.x (or CDH3u3). The performance benefits are worth making the move. Otherwise, our Hadoop prescription is as it has been; you need an Hadoop that supports a working sync. See Hadoop.

在Hadoop 1.0上运行0.92.0。x(或CDH3u3)。性能上的好处是值得采取行动的。否则,我们的Hadoop处方就像以前一样;您需要一个支持工作同步的Hadoop。参见Hadoop。

If running on Hadoop 1.0.x (or CDH3u3), enable local read. See Practical Caching presentation for ruminations on the performance benefits ‘going local’ (and for how to enable local reads).

如果在Hadoop 1.0上运行。x(或CDH3u3),允许本地读取。请参阅实用的缓存演示,以了解关于性能好处“本地化”(以及如何启用本地读取)的思考。

HBase 0.92.0 ships with ZooKeeper 3.4.2

If you can, upgrade your ZooKeeper. If you can’t, 3.4.2 clients should work against 3.3.X ensembles (HBase makes use of 3.4.2 API).

如果可以的话,升级你的动物园管理员。如果不能,3.4.2客户端应该针对3.3。X套件(HBase使用3.4.2 API)。

Online alter is off by default

In 0.92.0, we’ve added an experimental online schema alter facility (see hbase.online.schema.update.enable). It’s off by default. Enable it at your own risk. Online alter and splitting tables do not play well together, so be sure your cluster is quiescent while using this feature (for now).

在0.92.0中,我们添加了一个实验性的在线模式更改工具(见hbase.online.schema.update.enable)。这是默认关闭的。你可以自行承担风险。在线修改和拆分表不能很好地组合在一起,所以要确保您的集群休眠使用这个特性(现在)。

WebUI

The web UI has had a few additions made in 0.92.0. It now shows a list of the regions currently transitioning, recent compactions/flushes, and a process list of running processes (usually empty if all is well and requests are being handled promptly). Other additions include requests by region, a debugging servlet dump, etc.

web UI在0.92.0中添加了一些内容。它现在显示了当前正在转换的区域列表,最近的压缩/刷新,以及正在运行的进程的进程列表(如果一切正常,通常是空的,并且请求正在迅速处理)。其他添加包括区域请求、调试servlet转储等。

Security tarball

We now ship with two tarballs: secure and insecure HBase. Documentation on how to set up a secure HBase is on the way.

我们现在用两个tarball;安全的,不安全的HBase。关于如何设置安全HBase的文档正在进行中。

Changes in HBase replication

0.92.0 adds two new features: multi-slave and multi-master replication. The way to enable this is the same as adding a new peer, so in order to have multi-master you would just run add_peer for each cluster that acts as a master to the other slave clusters. Collisions are handled at the timestamp level, which may or may not be what you want; this needs to be evaluated on a per-use-case basis. Replication is still experimental in 0.92 and is disabled by default; run it at your own risk.

0.92.0添加了两个新特性:多奴隶和多主复制。启用这一功能的方法与添加一个新的对等点是一样的,因此,为了拥有多主机,您只需要为每个集群运行add_peer,以充当其他从属集群的主服务器。冲突是在时间戳级别处理的,它可能是您想要的,也可能不是您想要的,这需要在每个用例的基础上进行评估。复制在0.92中仍然是实验性的,在默认情况下是禁用的,在您自己的风险下运行它。

RegionServer now aborts if OOME

On OOME, we now have the JVM kill -9 the RegionServer process so it goes down fast. Previously, a RegionServer might stick around after incurring an OOME, limping along in some wounded state. To disable this facility (we recommend you leave it in place), you’d need to edit the bin/hbase file. Look for the addition of the -XX:OnOutOfMemoryError="kill -9 %p" arguments (see HBASE-4769 - ‘Abort RegionServer Immediately on OOME’).

如果一个OOME,我们现在有JVM kill -9区域服务器进程,所以它会快速下降。以前,在一些受伤的州,一个地区服务器可能会在一瘸一拐的行进中徘徊。要禁用此功能,并建议您将其保留,您需要编辑bin/hbase文件。查找添加的- xx:OnOutOfMemoryError="kill - 9% p"参数(参见HBASE-4769 -“立即在OOME上中止区域服务器”)。

HFile v2 and the “Bigger, Fewer” Tendency

0.92.0 stores data in a new format, the HBase file format with inline blocks (version 2). As HBase runs, it will move all your data from HFile v1 to HFile v2 format. This auto-migration will run in the background as flushes and compactions run. HFile v2 allows HBase to run with larger regions/files. In fact, we encourage all HBasers going forward to tend toward Facebook axiom #1: run with larger, fewer regions. If you have lots of regions now — more than 100s per host — you should look into setting your region size up after you move to 0.92.0 (in 0.92.0, the default size is now 1G, up from 256M), and then running the online merge tool (see HBASE-1621 merge tool should work on online cluster, but disabled table).

0.92.0以新的格式存储数据,HBase文件格式与内联块(版本2)。当HBase运行时,它将把所有数据从HFile v1移动到HFile v2格式。这个自动迁移将在后台运行,因为它会运行。HFile v2允许HBase运行较大的区域/文件。事实上,我们鼓励所有的hbaser都倾向于Facebook axiom #1,使用更大、更少的区域。如果你现在有很多地区——超过100年代每个主机,你应该考虑设置区域大小后搬到0.92.0(现在在0.92.0,默认大小是1克,256),然后运行在线合并工具(见hbase - 1621合并工具应该在线集群,但禁用表)。

13.8. Upgrading to HBase 0.90.x from 0.20.x or 0.89.x

13.8。升级到0.90 HBase。从0.20 x。x或0.89.x

This version of 0.90.x HBase can be started on data written by HBase 0.20.x or HBase 0.89.x. There is no need for a migration step. HBase 0.89.x and 0.90.x do write out the names of region directories differently — they name them with an MD5 hash of the region name rather than a Jenkins hash — so this means that once started, there is no going back to HBase 0.20.x.

这个版本为0.90。x HBase可以从HBase 0.20的数据开始。x或HBase 0.89.x。不需要迁移步骤。HBase 0.89。0.90 x和。x确实以不同的方式写出了区域目录的名称——它用区域名称的md5哈希来命名它们,而不是jenkins哈希——所以这意味着一旦开始,就不会返回到HBase 0.20 x。

Be sure to remove the hbase-default.xml from your conf directory on upgrade. A 0.20.x version of this file will have sub-optimal configurations for 0.90.x HBase. The hbase-default.xml file is now bundled into the HBase jar and read from there. If you would like to review the content of this file, see it in the src tree at src/main/resources/hbase-default.xml or see HBase Default Configuration.

一定要删除hbase-default。从您的conf目录中的xml升级。0.20。该文件的x版本将有0.90的次优化配置。x HBase。hbase-default。xml文件现在被绑定到HBase jar中并从那里读取。如果您想查看该文件的内容,请参见src/main/resources/hbase-default中的src树。xml或查看HBase默认配置。

Finally, if upgrading from 0.20.x, check your .META. schema in the shell. In the past we would recommend that users run with a 16kb MEMSTORE_FLUSHSIZE. Run

最后,如果从0.20升级。x,检查你的.META。模式的壳。在过去,我们建议用户使用16kb的MEMSTORE_FLUSHSIZE运行。运行

hbase> scan '-ROOT-'

in the shell. This will output the current .META. schema. Check MEMSTORE_FLUSHSIZE size. Is it 16kb (16384)? If so, you will need to change this (The 'normal'/default value is 64MB (67108864)). Run the script bin/set_meta_memstore_size.rb. This will make the necessary edit to your .META. schema. Failure to run this change will make for a slow cluster. See HBASE-3499 Users upgrading to 0.90.0 need to have their .META. table updated with the right MEMSTORE_SIZE.

带壳的。这将输出当前的. meta。模式。检查MEMSTORE_FLUSHSIZE大小。这是16 kb(16384)吗?如果是,您将需要更改这个(“正常”/默认值是64MB(67108864))。bin / set_meta_memstore_size.rb运行脚本。这将对您的. meta进行必要的编辑。模式。未能运行此更改将导致集群的缓慢。看到HBASE-3499用户升级到0.90.0需要有他们的. meta。表更新了正确的MEMSTORE_SIZE。

The Apache HBase Shell

Apache HBase壳

The Apache HBase Shell is (J)Ruby's IRB with some HBase particular commands added. Anything you can do in IRB, you should be able to do in the HBase Shell.

Apache HBase Shell是(J)Ruby的IRB,添加了一些HBase特定命令。在IRB中可以做的任何事情,都可以在HBase Shell中进行。

To run the HBase shell, do as follows:

要运行HBase shell,请执行以下操作:

$ ./bin/hbase shell

Type help and then <RETURN> to see a listing of shell commands and options. Browse at least the paragraphs at the end of the help output for the gist of how variables and command arguments are entered into the HBase shell; in particular note how table names, rows, and columns, etc., must be quoted.

类型帮助,然后 <返回> 查看shell命令和选项的列表。至少浏览帮助输出末尾的段落,以了解如何将变量和命令参数输入到HBase shell中;特别要注意,必须引用表名、行和列等等。

See shell exercises for example basic shell operation.

参见shell练习,例如基本的shell操作。

Here is a nicely formatted listing of all shell commands by Rajeshbabu Chintaguntla.

下面是Rajeshbabu Chintaguntla对所有shell命令的良好格式化的列表。

14. Scripting with Ruby

14。使用Ruby脚本

For examples of scripting Apache HBase, look in the HBase bin directory. Look at the files that end in *.rb. To run one of these files, do as follows:

例如,使用HBase bin目录来编写Apache HBase脚本。看看在*.rb中结束的文件。要运行其中一个文件,请执行以下操作:

$ ./bin/hbase org.jruby.Main PATH_TO_SCRIPT

15. Running the Shell in Non-Interactive Mode

15。在非交互模式下运行Shell。

A new non-interactive mode has been added to the HBase Shell (HBASE-11658). Non-interactive mode captures the exit status (success or failure) of HBase Shell commands and passes that status back to the command interpreter. If you use the normal interactive mode, the HBase Shell will only ever return its own exit status, which will nearly always be 0 for success.

在HBase Shell中添加了一个新的非交互模式(HBase -11658)。非交互模式捕获HBase Shell命令的退出状态(成功或失败),并将该状态传递回命令解释器。如果使用正常的交互模式,HBase Shell将只返回它自己的退出状态,这几乎总是为0。

To invoke non-interactive mode, pass the -n or --non-interactive option to HBase Shell.

要调用非交互模式,将-n或-非交互式选项传递给HBase Shell。

16. HBase Shell in OS Scripts

16。操作系统脚本中的HBase Shell。

You can use the HBase shell from within operating system script interpreters like the Bash shell which is the default command interpreter for most Linux and UNIX distributions. The following guidelines use Bash syntax, but could be adjusted to work with C-style shells such as csh or tcsh, and could probably be modified to work with the Microsoft Windows script interpreter as well. Submissions are welcome.

您可以在操作系统脚本解释器中使用HBase shell,如Bash shell,它是大多数Linux和UNIX发行版的缺省命令解释器。下面的指导方针使用Bash语法,但是可以调整为使用c风格的shell,例如csh或tcsh,也可以修改为与Microsoft Windows脚本解释器一起工作。欢迎提交。

Spawning HBase Shell commands in this way is slow, so keep that in mind when you are deciding whether combining HBase operations with the operating system command line is appropriate.
Example 7. Passing Commands to the HBase Shell

You can pass commands to the HBase Shell in non-interactive mode (see hbase.shell.noninteractive) using the echo command and the | (pipe) operator. Be sure to escape characters in the HBase commands which would otherwise be interpreted by the shell. Some debug-level output has been truncated from the example below.

您可以使用echo命令和|(管道)操作符,以非交互模式将命令传递给HBase Shell(参见hbase.shell.noninteractive)。一定要在HBase命令中转义字符,否则将被shell解释。一些调试级别的输出已经从下面的示例中截断。

$ echo "describe 'test1'" | ./hbase shell -n

Version 0.98.3-hadoop2, rd5e65a9144e315bb0a964e7730871af32f5018d5, Sat May 31 19:56:09 PDT 2014

describe 'test1'

DESCRIPTION                                          ENABLED
 'test1', {NAME => 'cf', DATA_BLOCK_ENCODING => 'NON true
 E', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0',
  VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIO
 NS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS =>
 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false'
 , BLOCKCACHE => 'true'}
1 row(s) in 3.2410 seconds

To suppress all output, echo it to /dev/null:

为了抑制所有输出,将其echo到/dev/null:

$ echo "describe 'test'" | ./hbase shell -n > /dev/null 2>&1
Example 8. Checking the Result of a Scripted Command

Since scripts are not designed to be run interactively, you need a way to check whether your command failed or succeeded. The HBase shell uses the standard convention of returning a value of 0 for successful commands, and some non-zero value for failed commands. Bash stores a command’s return value in a special environment variable called $?. Because that variable is overwritten each time the shell runs any command, you should store the result in a different, script-defined variable.

由于脚本不是用来交互运行的,所以您需要一种方法来检查您的命令是否失败或成功。HBase shell使用标准约定,返回值为0的成功命令,以及一些失败命令的非零值。Bash在一个名为$?的特殊环境变量中存储一个命令的返回值。因为每次shell运行任何命令时,该变量都被覆盖,所以您应该将结果存储在一个不同的、脚本定义的变量中。

This is a naive script that shows one way to store the return value and make a decision based upon it.

这是一个简单的脚本,它显示了一种存储返回值的方法,并基于它做出决策。

#!/bin/bash

echo "describe 'test'" | ./hbase shell -n > /dev/null 2>&1
status=$?
echo "The status was " $status
if [ $status -eq 0 ]; then
    echo "The command succeeded"
else
    echo "The command may have failed."
fi
exit $status

16.1. Checking for Success or Failure In Scripts

16.1。检查脚本中的成功或失败。

Getting an exit code of 0 means that the command you scripted definitely succeeded. However, getting a non-zero exit code does not necessarily mean the command failed. The command could have succeeded, but the client lost connectivity, or some other event obscured its success. This is because RPC commands are stateless. The only way to be sure of the status of an operation is to check. For instance, if your script creates a table, but returns a non-zero exit value, you should check whether the table was actually created before trying again to create it.

获取0的退出代码意味着您所编写的命令一定成功。然而,获取非零的退出代码并不一定意味着命令失败。该命令本来可以成功,但是客户端失去了连接,或者其他一些事件掩盖了它的成功。这是因为RPC命令是无状态的。唯一确定操作状态的方法是检查。例如,如果您的脚本创建了一个表,但是返回一个非零的退出值,那么您应该检查表是否在再次尝试创建它之前被创建。

17. Read HBase Shell Commands from a Command File

17所示。从命令文件读取HBase Shell命令。

You can enter HBase Shell commands into a text file, one command per line, and pass that file to the HBase Shell.

您可以将HBase Shell命令输入到一个文本文件中,每一行一个命令,并将该文件传递给HBase Shell。

Example 9. Example Command File
create 'test', 'cf'
list 'test'
put 'test', 'row1', 'cf:a', 'value1'
put 'test', 'row2', 'cf:b', 'value2'
put 'test', 'row3', 'cf:c', 'value3'
put 'test', 'row4', 'cf:d', 'value4'
scan 'test'
get 'test', 'row1'
disable 'test'
enable 'test'
Example 10. Directing HBase Shell to Execute the Commands

Pass the path to the command file as the only argument to the hbase shell command. Each command is executed and its output is shown. If you do not include the exit command in your script, you are returned to the HBase shell prompt. There is no way to programmatically check each individual command for success or failure. Also, though you see the output for each command, the commands themselves are not echoed to the screen so it can be difficult to line up the command with its output.

将路径传递到命令文件,作为hbase shell命令的惟一参数。执行每个命令并显示其输出。如果在脚本中不包含exit命令,则返回到HBase shell提示符。对于成功或失败,没有办法以编程方式检查每个单独的命令。另外,虽然您可以看到每个命令的输出,但是命令本身并没有响应到屏幕上,因此很难将命令与输出连接起来。

$ ./hbase shell ./sample_commands.txt
0 row(s) in 3.4170 seconds

TABLE
test
1 row(s) in 0.0590 seconds

0 row(s) in 0.1540 seconds

0 row(s) in 0.0080 seconds

0 row(s) in 0.0060 seconds

0 row(s) in 0.0060 seconds

ROW                   COLUMN+CELL
 row1                 column=cf:a, timestamp=1407130286968, value=value1
 row2                 column=cf:b, timestamp=1407130286997, value=value2
 row3                 column=cf:c, timestamp=1407130287007, value=value3
 row4                 column=cf:d, timestamp=1407130287015, value=value4
4 row(s) in 0.0420 seconds

COLUMN                CELL
 cf:a                 timestamp=1407130286968, value=value1
1 row(s) in 0.0110 seconds

0 row(s) in 1.5630 seconds

0 row(s) in 0.4360 seconds

18. Passing VM Options to the Shell

18岁。将VM选项传递给Shell。

You can pass VM options to the HBase Shell using the HBASE_SHELL_OPTS environment variable. You can set this in your environment, for instance by editing ~/.bashrc, or set it as part of the command to launch HBase Shell. The following example sets several garbage-collection-related variables, just for the lifetime of the VM running the HBase Shell. The command should be run all on a single line, but is broken by the \ character, for readability.

您可以使用HBASE_SHELL_OPTS环境变量将VM选项传递给HBase Shell。您可以在您的环境中设置这个,例如通过编辑~/。bashrc,或将其设置为启动HBase Shell的命令的一部分。下面的示例设置了几个垃圾收集相关的变量,仅用于运行HBase Shell的VM的生命周期。该命令应该在一行上运行,但是被\字符破坏,为了可读性。

$ HBASE_SHELL_OPTS="-verbose:gc -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps \
  -XX:+PrintGCDetails -Xloggc:$HBASE_HOME/logs/gc-hbase.log" ./bin/hbase shell

19. Shell Tricks

19所示。壳牌的技巧

19.1. Table variables

19.1。表变量

HBase 0.95 adds shell commands that provide jruby-style object-oriented references for tables. Previously, all of the shell commands that act upon a table had a procedural style that always took the name of the table as an argument. HBase 0.95 introduces the ability to assign a table to a jruby variable. The table reference can be used to perform data read/write operations such as puts, scans, and gets, as well as admin functionality such as disabling, dropping, and describing tables.

HBase 0.95添加了shell命令,它为表提供了jruby风格的面向对象引用。以前,在表上执行的所有shell命令都有一个过程式的样式,它总是以表的名称作为参数。HBase 0.95引入了将表分配给jruby变量的能力。表引用可用于执行数据读写操作,如put、扫描和获得良好的管理功能,如禁用、删除、描述表。

For example, previously you would always specify a table name:

例如,以前您总是指定一个表名:

hbase(main):000:0> create 't', 'f'
0 row(s) in 1.0970 seconds
hbase(main):001:0> put 't', 'rold', 'f', 'v'
0 row(s) in 0.0080 seconds

hbase(main):002:0> scan 't'
ROW                                COLUMN+CELL
 rold                              column=f:, timestamp=1378473207660, value=v
1 row(s) in 0.0130 seconds

hbase(main):003:0> describe 't'
DESCRIPTION                                                                           ENABLED
 't', {NAME => 'f', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_ true
 SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2
 147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false
 ', BLOCKCACHE => 'true'}
1 row(s) in 1.4430 seconds

hbase(main):004:0> disable 't'
0 row(s) in 14.8700 seconds

hbase(main):005:0> drop 't'
0 row(s) in 23.1670 seconds

hbase(main):006:0>

Now you can assign the table to a variable and use the results in jruby shell code.

现在可以将表分配给一个变量,并使用jruby shell代码中的结果。

hbase(main):007 > t = create 't', 'f'
0 row(s) in 1.0970 seconds

=> Hbase::Table - t
hbase(main):008 > t.put 'r', 'f', 'v'
0 row(s) in 0.0640 seconds
hbase(main):009 > t.scan
ROW                           COLUMN+CELL
 r                            column=f:, timestamp=1331865816290, value=v
1 row(s) in 0.0110 seconds
hbase(main):010:0> t.describe
DESCRIPTION                                                                           ENABLED
 't', {NAME => 'f', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_ true
 SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2
 147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false
 ', BLOCKCACHE => 'true'}
1 row(s) in 0.0210 seconds
hbase(main):038:0> t.disable
0 row(s) in 6.2350 seconds
hbase(main):039:0> t.drop
0 row(s) in 0.2340 seconds

If the table has already been created, you can assign a Table to a variable by using the get_table method:

如果已经创建了表,您可以使用get_table方法将一个表分配给一个变量:

hbase(main):011 > create 't','f'
0 row(s) in 1.2500 seconds

=> Hbase::Table - t
hbase(main):012:0> tab = get_table 't'
0 row(s) in 0.0010 seconds

=> Hbase::Table - t
hbase(main):013:0> tab.put 'r1', 'f', 'v'
0 row(s) in 0.0100 seconds
hbase(main):014:0> tab.scan
ROW                                COLUMN+CELL
 r1                                column=f:, timestamp=1378473876949, value=v
1 row(s) in 0.0240 seconds
hbase(main):015:0>

The list functionality has also been extended so that it returns a list of table names as strings. You can then use jruby to script table operations based on these names. The list_snapshots command also acts similarly.

列表功能也被扩展,因此它返回一个表名称列表作为字符串。然后,您可以根据这些名称使用jruby编写脚本表操作。list_snapshot命令也类似。

hbase(main):016 > tables = list('t.*')
TABLE
t
1 row(s) in 0.1040 seconds

=> #<#<Class:0x7677ce29>:0x21d377a4>
hbase(main):017:0> tables.map { |t| disable t ; drop  t}
0 row(s) in 2.2510 seconds

=> [nil]
hbase(main):018:0>

19.2. irbrc

19.2。irbrc

Create an .irbrc file for yourself in your home directory. Add customizations. A useful one is command history, so commands are saved across Shell invocations:

在您的主目录中为自己创建一个.irbrc文件。添加自定义。一个有用的命令是命令历史,所以命令可以通过Shell调用保存:

$ more .irbrc
require 'irb/ext/save-history'
IRB.conf[:SAVE_HISTORY] = 100
IRB.conf[:HISTORY_FILE] = "#{ENV['HOME']}/.irb-save-history"

See the ruby documentation of .irbrc to learn about other possible configurations.

请参阅.irbrc的ruby文档了解其他可能的配置。

19.3. LOG data to timestamp

19.3。日志数据的时间戳

To convert the date '08/08/16 20:56:29' from an hbase log into a timestamp, do:

要将日期“08/08/16 20:56:29”从hbase日志转换为时间戳,请执行:

hbase(main):021:0> import java.text.SimpleDateFormat
hbase(main):022:0> import java.text.ParsePosition
hbase(main):023:0> SimpleDateFormat.new("yy/MM/dd HH:mm:ss").parse("08/08/16 20:56:29", ParsePosition.new(0)).getTime() => 1218920189000

To go the other direction:

走向另一个方向:

hbase(main):021:0> import java.util.Date
hbase(main):022:0> Date.new(1218920189000).toString() => "Sat Aug 16 20:56:29 UTC 2008"

To output in a format that is exactly like that of the HBase log format will take a little messing with SimpleDateFormat.

要以与HBase日志格式完全相同的格式输出,将会使用SimpleDateFormat来进行一些干扰。
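
As a starting point, something like the following Java sketch formats a timestamp in a layout close to the one used by HBase's default log4j configuration. The pattern string here is an assumption; match it against the ConversionPattern in your own log4j.properties.

import java.text.SimpleDateFormat;
import java.util.Date;

public class LogTimestampFormat {
  public static void main(String[] args) {
    long ts = 1218920189000L;  // the timestamp obtained in the example above
    // Assumed pattern; adjust it if your log4j ConversionPattern differs.
    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");
    System.out.println(fmt.format(new Date(ts)));
  }
}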

19.4. Query Shell Configuration

19.4。查询外壳配置

hbase(main):001:0> @shell.hbase.configuration.get("hbase.rpc.timeout")
=> "60000"

To set a config in the shell:

在shell中设置一个配置:

hbase(main):005:0> @shell.hbase.configuration.setInt("hbase.rpc.timeout", 61010)
hbase(main):006:0> @shell.hbase.configuration.get("hbase.rpc.timeout")
=> "61010"

19.5. Pre-splitting tables with the HBase Shell

19.5。与HBase Shell的预分解表。

You can use a variety of options to pre-split tables when creating them via the HBase Shell create command.

在通过HBase Shell创建命令时,可以使用各种选项来预分解表。

The simplest approach is to specify an array of split points when creating the table. Note that when specifying string literals as split points, these will create split points based on the underlying byte representation of the string. So when specifying a split point of '10', we are actually specifying the byte split point '\x31\x30'.

最简单的方法是在创建表时指定分割点的数组。注意,当将字符串文本指定为分割点时,它们将基于字符串的底层字节表示创建分叉点。因此,当指定“10”的分叉点时,我们实际上指定了字节分割点“\x31\30”。

The split points will define n+1 regions where n is the number of split points. The lowest region will contain all keys from the lowest possible key up to but not including the first split point key. The next region will contain keys from the first split point up to, but not including the next split point key. This will continue for all split points up to the last. The last region will be defined from the last split point up to the maximum possible key.

分割点将定义n+1个区域,其中n为分裂点的个数。最低的区域将包含所有键,从最低的可能的关键到但不包括第一个分裂点的关键。下一个区域将包含从第一个分裂点到,但不包括下一个拆分点键的键。这将会持续到最后。最后一个区域将从最后一个分割点定义到最大可能的密钥。

hbase>create 't1','f',SPLITS => ['10','20','30']

In the above example, the table 't1' will be created with column family 'f', pre-split to four regions. Note the first region will contain all keys from '\x00' up to '\x30' (as '\x31' is the ASCII code for '1').

在上面的例子中,表“t1”将用列族“f”创建,并预先划分为四个区域。注意,第一个区域将包含“\x00”到“\x30”的所有键(“\x31”是“1”的ASCII码)。

You can pass the split points in a file using following variation. In this example, the splits are read from a file corresponding to the local path on the local filesystem. Each line in the file specifies a split point key.

您可以使用以下变体在文件中传递分割点。在本例中,将从与本地文件系统上的本地路径对应的文件中读取分割。文件中的每一行都指定了一个拆分点键。

hbase>create 't14','f',SPLITS_FILE=>'splits.txt'

The other options are to automatically compute splits based on a desired number of regions and a splitting algorithm. HBase supplies algorithms for splitting the key range based on uniform splits or based on hexadecimal keys, but you can provide your own splitting algorithm to subdivide the key range.

其他选项是根据所需的区域数量和分割算法自动计算分割。HBase提供了基于均匀分割或基于十六进制键来分割密钥范围的算法,但是您可以提供自己的分裂算法来细分密钥范围。

# create table with four regions based on random bytes keys
hbase>create 't2','f1', { NUMREGIONS => 4 , SPLITALGO => 'UniformSplit' }

# create table with five regions based on hex keys
hbase>create 't3','f1', { NUMREGIONS => 5, SPLITALGO => 'HexStringSplit' }

As the HBase Shell is effectively a Ruby environment, you can use simple Ruby scripts to compute splits algorithmically.

由于HBase Shell实际上是一个Ruby环境,所以您可以使用简单的Ruby脚本来计算分割算法。

# generate splits for long (Ruby fixnum) key range from start to end key
hbase(main):070:0> def gen_splits(start_key,end_key,num_regions)
hbase(main):071:1>   results=[]
hbase(main):072:1>   range=end_key-start_key
hbase(main):073:1>   incr=(range/num_regions).floor
hbase(main):074:1>   for i in 1 .. num_regions-1
hbase(main):075:2>     results.push([i*incr+start_key].pack("N"))
hbase(main):076:2>   end
hbase(main):077:1>   return results
hbase(main):078:1> end
hbase(main):079:0>
hbase(main):080:0> splits=gen_splits(1,2000000,10)
=> ["\000\003\r@", "\000\006\032\177", "\000\t'\276", "\000\f4\375", "\000\017B<", "\000\022O{", "\000\025\\\272", "\000\030i\371", "\000\ew8"]
hbase(main):081:0> create 'test_splits','f',SPLITS=>splits
0 row(s) in 0.2670 seconds

=> Hbase::Table - test_splits

Note that the HBase Shell command truncate effectively drops and recreates the table with default options which will discard any pre-splitting. If you need to truncate a pre-split table, you must drop and recreate the table explicitly to re-specify custom split options.

注意,HBase Shell命令截断了有效的删除,并使用默认选项重新创建表,该选项将丢弃任何预分解。如果您需要截断一个预分割表,您必须删除并重新创建表,以重新指定自定义的分割选项。

19.6. Debug

19.6。调试

19.6.1. Shell debug switch

19.6.1。壳牌调试开关

You can set a debug switch in the shell to see more output — e.g. more of the stack trace on exception — when you run a command:

您可以在shell中设置一个调试开关以查看更多的输出。当您运行一个命令时,更多的堆栈跟踪是异常的:

hbase> debug <RETURN>

19.6.2. DEBUG log level

19.6.2。调试日志级别

To enable DEBUG level logging in the shell, launch it with the -d option.

要在shell中启用调试级别的日志记录,可以使用-d选项启动它。

$ ./bin/hbase shell -d

19.7. Commands

19.7。命令

19.7.1. count

19.7.1。数

The count command returns the number of rows in a table. It’s quite fast when configured with the right CACHE:

Count命令返回表中的行数。配置正确的缓存时速度非常快。

hbase> count '<tablename>', CACHE => 1000

The above count fetches 1000 rows at a time. Set CACHE lower if your rows are big. Default is to fetch one row at a time.

以上计数一次取1000行。如果行很大,则设置缓存更低。默认是一次取一行。

Data Model

数据模型

In HBase, data is stored in tables, which have rows and columns. This is a terminology overlap with relational databases (RDBMSs), but this is not a helpful analogy. Instead, it can be helpful to think of an HBase table as a multi-dimensional map.

在HBase中,数据存储在表中,表中有行和列。这是一个与关系数据库(RDBMSs)重叠的术语,但这不是一个有用的类比。相反,可以将HBase表看作多维映射。

HBase Data Model Terminology
Table

An HBase table consists of multiple rows.

一个HBase表由多个行组成。

Row

A row in HBase consists of a row key and one or more columns with values associated with them. Rows are sorted alphabetically by the row key as they are stored. For this reason, the design of the row key is very important. The goal is to store data in such a way that related rows are near each other. A common row key pattern is a website domain. If your row keys are domains, you should probably store them in reverse (org.apache.www, org.apache.mail, org.apache.jira). This way, all of the Apache domains are near each other in the table, rather than being spread out based on the first letter of the subdomain.

HBase中的一行包含一行键和一个或多个带有与之关联的值的列。行按字母顺序排序,按行键存储。因此,行键的设计非常重要。其目标是存储数据,以使相关的行彼此相邻。公共行键模式是一个网站域。如果您的行键是域,则应该将它们存储在反向(org.apache)中。www,表示。邮件,org.apache.jira)。这样,所有的Apache域都在表中彼此相邻,而不是基于子域的第一个字母展开。

Column

A column in HBase consists of a column family and a column qualifier, which are delimited by a : (colon) character.

HBase中的一个列由一个列族和一个列限定符组成,它由一个:(冒号)字符分隔。

Column Family

Column families physically colocate a set of columns and their values, often for performance reasons. Each column family has a set of storage properties, such as whether its values should be cached in memory, how its data is compressed or its row keys are encoded, and others. Each row in a table has the same column families, though a given row might not store anything in a given column family.

列家庭在物理上对一组列和它们的值进行物理colocate,通常是出于性能方面的原因。每个列家族都有一组存储属性,比如它的值是否应该缓存在内存中,它的数据是如何被压缩的,或者它的行键是编码的,等等。表中的每一行都有相同的列家族,尽管给定的行可能不会存储在给定列家族中的任何东西。

Column Qualifier

A column qualifier is added to a column family to provide the index for a given piece of data. Given a column family content, a column qualifier might be content:html, and another might be content:pdf. Though column families are fixed at table creation, column qualifiers are mutable and may differ greatly between rows.

将列限定符添加到列家族中,以提供给定数据块的索引。给定一个列的家庭内容,一个列限定符可能是内容:html,另一个可能是内容:pdf。虽然列族在表创建时是固定的,但是列限定符是可变的,在行之间可能有很大的不同。

Cell

A cell is a combination of row, column family, and column qualifier, and contains a value and a timestamp, which represents the value’s version.

单元格是行、列族和列限定符的组合,包含一个值和一个时间戳,它表示值的版本。

Timestamp

A timestamp is written alongside each value, and is the identifier for a given version of a value. By default, the timestamp represents the time on the RegionServer when the data was written, but you can specify a different timestamp value when you put data into the cell.

时间戳是在每个值旁边写的,是一个给定版本的值的标识符。默认情况下,时间戳表示数据写入时区域服务器上的时间,但您可以在将数据放入单元格时指定不同的时间戳值。

20. Conceptual View

20.概念视图

You can read a very understandable explanation of the HBase data model in the blog post Understanding HBase and BigTable by Jim R. Wilson. Another good explanation is available in the PDF Introduction to Basic Schema Design by Amandeep Khurana.

您可以通过Jim R. Wilson的博客文章了解HBase和BigTable中HBase数据模型的一个非常容易理解的解释。另一个很好的解释是,Amandeep Khurana的基本模式设计的PDF介绍。

It may help to read different perspectives to get a solid understanding of HBase schema design. The linked articles cover the same ground as the information in this section.

它可能有助于阅读不同的透视图以获得对HBase模式设计的坚实理解。链接的文章与本节中的信息相同。

The following example is a slightly modified form of the one on page 2 of the BigTable paper. There is a table called webtable that contains two rows (com.cnn.www and com.example.www) and three column families named contents, anchor, and people. In this example, for the first row (com.cnn.www), anchor contains two columns (anchor:cssnsi.com, anchor:my.look.ca) and contents contains one column (contents:html). This example contains 5 versions of the row with the row key com.cnn.www, and one version of the row with the row key com.example.www. The contents:html column qualifier contains the entire HTML of a given website. Qualifiers of the anchor column family each contain the external site which links to the site represented by the row, along with the text it used in the anchor of its link. The people column family represents people associated with the site.

下面的示例是BigTable文件第2页上的一个稍微修改过的表单。有一个名为webtable的表,它包含两行(com.cn .www和com.example.www)和三个列的名为contents、anchor和people的列。在本例中,对于第一行(com.cn .www),锚包含两列(anchor:cssnsi.com, anchor:my.look.ca),内容包含一个列(内容:html)。这个示例包含5个版本的行和行键com.cn .www,以及一行与行键com.example.www的一个版本。内容:html列限定符包含给定网站的整个html。锚列族的限定符每个包含外部站点,该站点链接到由行表示的站点,以及它在链接的锚中使用的文本。people栏目组代表与网站相关的人。

Column Names

By convention, a column name is made of its column family prefix and a qualifier. For example, the column contents:html is made up of the column family contents and the html qualifier. The colon character (:) delimits the column family from the column family qualifier.

按惯例,列名称由其列族前缀和限定符组成。例如,列内容:html是由列家族内容和html限定符组成的。冒号(:)将列族从列族限定符中分离出来。

Table 5. Table webtable
Row Key             Time Stamp   ColumnFamily contents       ColumnFamily anchor             ColumnFamily people
"com.cnn.www"       t9                                       anchor:cnnsi.com = "CNN"
"com.cnn.www"       t8                                       anchor:my.look.ca = "CNN.com"
"com.cnn.www"       t6           contents:html = "<html>…"
"com.cnn.www"       t5           contents:html = "<html>…"
"com.cnn.www"       t3           contents:html = "<html>…"
"com.example.www"   t5           contents:html = "<html>…"                                   people:author = "John Doe"

Cells in this table that appear to be empty do not take space, or in fact exist, in HBase. This is what makes HBase "sparse." A tabular view is not the only possible way to look at data in HBase, or even the most accurate. The following represents the same information as a multi-dimensional map. This is only a mock-up for illustrative purposes and may not be strictly accurate.

这个表中显示为空的单元格不占用空间,或者实际上存在于HBase中。这就是为什么HBase是“稀疏的”。表格视图并不是查看HBase中数据的唯一方法,甚至是最准确的数据。下面是与多维映射相同的信息。这只是为说明目的而做的模型,而且可能不是严格的精确。

{
  "com.cnn.www": {
    contents: {
      t6: contents:html: "<html>..."
      t5: contents:html: "<html>..."
      t3: contents:html: "<html>..."
    }
    anchor: {
      t9: anchor:cnnsi.com = "CNN"
      t8: anchor:my.look.ca = "CNN.com"
    }
    people: {}
  }
  "com.example.www": {
    contents: {
      t5: contents:html: "<html>..."
    }
    anchor: {}
    people: {
      t5: people:author: "John Doe"
    }
  }
}

21. Physical View

21。物理视图

Although at a conceptual level tables may be viewed as a sparse set of rows, they are physically stored by column family. A new column qualifier (column_family:column_qualifier) can be added to an existing column family at any time.

尽管在概念级别的表中可以看作是一组稀疏的行,但它们实际上是由列族存储的。一个新的列限定符(column_family:column_qualifier)可以在任何时候添加到现有的列家族中。

Table 6. ColumnFamily anchor
Row Key          Time Stamp   Column Family anchor
"com.cnn.www"    t9           anchor:cnnsi.com = "CNN"
"com.cnn.www"    t8           anchor:my.look.ca = "CNN.com"

Table 7. ColumnFamily contents
Row Key          Time Stamp   ColumnFamily contents:
"com.cnn.www"    t6           contents:html = "<html>…"
"com.cnn.www"    t5           contents:html = "<html>…"
"com.cnn.www"    t3           contents:html = "<html>…"

The empty cells shown in the conceptual view are not stored at all. Thus a request for the value of the contents:html column at time stamp t8 would return no value. Similarly, a request for an anchor:my.look.ca value at time stamp t9 would return no value. However, if no timestamp is supplied, the most recent value for a particular column would be returned. Given multiple versions, the most recent is also the first one found, since timestamps are stored in descending order. Thus a request for the values of all columns in the row com.cnn.www if no timestamp is specified would be: the value of contents:html from timestamp t6, the value of anchor:cnnsi.com from timestamp t9, the value of anchor:my.look.ca from timestamp t8.

在概念视图中显示的空单元根本没有存储。因此,对内容的值的请求:在时间戳t8上的html列将返回无值。类似地,对锚的请求:my.look。时间戳t9的ca值不会返回任何值。但是,如果没有提供时间戳,则返回特定列的最新值。给定多个版本,最近的版本也是第一个发现的,因为时间戳是按降序存储的。因此,如果没有指定时间戳,则请求行com.cn .cn .www中的所有列的值:内容的值:时间戳t6的html值,锚的值:时间戳t9的cnnsi.com,锚的值:my.look。从时间戳t8 ca。
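
To make this concrete in the client API, here is a minimal Java sketch against the webtable example. The family/qualifier byte arrays and the t8 placeholder are illustrative assumptions standing in for the timestamps shown in the tables above.

public static final byte[] CONTENTS = "contents".getBytes();
public static final byte[] HTML = "html".getBytes();
...
long t8 = ...;  // the time stamp labelled t8 above
Get pinned = new Get(Bytes.toBytes("com.cnn.www"));
pinned.addColumn(CONTENTS, HTML);
pinned.setTimeRange(t8, t8 + 1);   // only cells written at exactly t8
Result r1 = table.get(pinned);     // empty: contents:html has no cell at t8

Get latest = new Get(Bytes.toBytes("com.cnn.www"));
latest.addColumn(CONTENTS, HTML);
Result r2 = table.get(latest);     // returns the newest version, written at t6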

For more information about the internals of how Apache HBase stores data, see regions.arch.

有关Apache HBase存储数据的内部情况的更多信息,请参见区域。arch。

22. Namespace

22。名称空间

A namespace is a logical grouping of tables analogous to a database in relation database systems. This abstraction lays the groundwork for upcoming multi-tenancy related features:

名称空间是表的逻辑分组,类似于数据库系统中的数据库。这种抽象为即将到来的多租户相关特性奠定了基础:

  • Quota Management (HBASE-8410) - Restrict the amount of resources (i.e. regions, tables) a namespace can consume.

    配额管理(HBASE-8410)限制了名称空间可以使用的资源数量(即区域、表)。

  • Namespace Security Administration (HBASE-9206) - Provide another level of security administration for tenants.

    名称空间安全管理(HBASE-9206)——为租户提供另一个级别的安全管理。

  • Region server groups (HBASE-6721) - A namespace/table can be pinned onto a subset of RegionServers thus guaranteeing a coarse level of isolation.

    区域服务器组(HBASE-6721) -一个名称空间/表可以被固定在区域服务器的子集上,从而保证了一个粗糙的隔离级别。

22.1. Namespace management

22.1。命名空间管理

A namespace can be created, removed or altered. Namespace membership is determined during table creation by specifying a fully-qualified table name of the form:

可以创建、删除或修改名称空间。在表创建期间,通过指定表单的完全限定表名来确定名称空间成员:

<table namespace>:<table qualifier>
Example 11. Examples
#Create a namespace
create_namespace 'my_ns'
#create my_table in my_ns namespace
create 'my_ns:my_table', 'fam'
#drop namespace
drop_namespace 'my_ns'
#alter namespace
alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
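
The same operations are available from the Java client. A rough sketch, assuming an HBase 1.0-style Connection named connection is already open (the namespace, table, and family names mirror the shell example):

Admin admin = connection.getAdmin();

// Create a namespace, then a table inside it.
admin.createNamespace(NamespaceDescriptor.create("my_ns").build());
HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("my_ns", "my_table"));
desc.addFamily(new HColumnDescriptor("fam"));
admin.createTable(desc);

// Drop the table, then the (now empty) namespace.
admin.disableTable(desc.getTableName());
admin.deleteTable(desc.getTableName());
admin.deleteNamespace("my_ns");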

22.2. Predefined namespaces

22.2。预定义的名称空间

There are two predefined special namespaces:

有两个预定义的特殊名称空间:

  • hbase - system namespace, used to contain HBase internal tables

    hbase -系统名称空间,用于包含hbase内部表。

  • default - tables with no explicit specified namespace will automatically fall into this namespace

    没有显式指定名称空间的默认表将自动归入这个名称空间。

Example 12. Examples
#namespace=foo and table qualifier=bar
create 'foo:bar', 'fam'

#namespace=default and table qualifier=bar
create 'bar', 'fam'

23. Table

23。表

Tables are declared up front at schema definition time.

表在模式定义时间内被声明。

24. Row

24。行

Row keys are uninterpreted bytes. Rows are lexicographically sorted with the lowest order appearing first in a table. The empty byte array is used to denote both the start and end of a tables' namespace.

行键是未解释的字节。行是按字母顺序排序的,顺序是表中出现的最低顺序。空字节数组用于表示表名称空间的开始和结束。
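
One practical consequence of lexicographic, byte-wise ordering is that numeric row keys do not sort numerically unless they are padded or binary-encoded. A small self-contained Java illustration (the key values are made up for the example):

import org.apache.hadoop.hbase.util.Bytes;

public class RowKeyOrdering {
  public static void main(String[] args) {
    // "row10" sorts before "row2" because comparison is byte by byte, not numeric.
    System.out.println(Bytes.compareTo(Bytes.toBytes("row10"), Bytes.toBytes("row2")) < 0);  // true
    // Zero-padding the numeric part restores the intended order.
    System.out.println(Bytes.compareTo(Bytes.toBytes("row02"), Bytes.toBytes("row10")) < 0); // true
  }
}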

25. Column Family

25。列族

Columns in Apache HBase are grouped into column families. All column members of a column family have the same prefix. For example, the columns courses:history and courses:math are both members of the courses column family. The colon character (:) delimits the column family from the column family qualifier. The column family prefix must be composed of printable characters. The qualifying tail, the column family qualifier, can be made of any arbitrary bytes. Column families must be declared up front at schema definition time whereas columns do not need to be defined at schema time but can be conjured on the fly while the table is up and running.

Apache HBase中的列被分组为列族。列家族的所有列成员都具有相同的前缀。例如,列课程:历史和课程:数学是课程列家庭的成员。冒号(:)将列族从列族限定符中分离出来。列族前缀必须由可打印字符组成。符合条件的尾部,列家庭限定符,可以由任意的字节组成。列族必须在模式定义时间内声明,而列不需要在模式时间内定义,但是可以在表启动和运行时动态地转换。

Physically, all column family members are stored together on the filesystem. Because tunings and storage specifications are done at the column family level, it is advised that all column family members have the same general access pattern and size characteristics.

物理上,所有列家族成员都存储在文件系统中。由于在列家族级别上完成了调优和存储规范,因此建议所有列家庭成员都具有相同的通用访问模式和大小特性。
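
A brief Java sketch of both points, using made-up table, family, and qualifier names and assuming an Admin and a Connection are already open: the column family is declared at schema definition time, while new qualifiers are simply written when first used.

// Declare the column family up front, at schema definition time.
HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("courses_table"));
desc.addFamily(new HColumnDescriptor("courses"));
admin.createTable(desc);

// Column qualifiers need no declaration; they come into being on first write.
Table table = connection.getTable(TableName.valueOf("courses_table"));
table.put(new Put(Bytes.toBytes("student1"))
    .add(Bytes.toBytes("courses"), Bytes.toBytes("history"), Bytes.toBytes("A")));
table.put(new Put(Bytes.toBytes("student1"))
    .add(Bytes.toBytes("courses"), Bytes.toBytes("math"), Bytes.toBytes("B+")));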

26. Cells

26岁。细胞

A {row, column, version} tuple exactly specifies a cell in HBase. Cell content is uninterpreted bytes

{行,列,版本}元组精确地指定了HBase中的单元格。单元格内容是未解释的字节。

27. Data Model Operations

27。数据模型操作

The four primary data model operations are Get, Put, Scan, and Delete. Operations are applied via Table instances.

四个主要的数据模型操作是Get、Put、Scan和Delete。操作通过表实例应用。
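
The code examples in the rest of this chapter assume a Table instance has already been obtained. With the HBase 1.0-style client API that typically looks roughly like the following sketch; the table name "test" is a placeholder, and earlier releases use HTable/HConnectionManager instead.

Configuration conf = HBaseConfiguration.create();
Connection connection = ConnectionFactory.createConnection(conf);
try {
  Table table = connection.getTable(TableName.valueOf("test"));
  try {
    // Get, Put, Scan, and Delete operations go here.
  } finally {
    table.close();
  }
} finally {
  connection.close();
}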

27.1. Get

27.1。得到

Get returns attributes for a specified row. Gets are executed via Table.get

获取指定行的返回属性。get通过表执行。

27.2. Put

27.2。把

Put either adds new rows to a table (if the key is new) or can update existing rows (if the key already exists). Puts are executed via Table.put (non-writeBuffer) or Table.batch (non-writeBuffer)

将新行添加到表中(如果键是新的),或者可以更新现有的行(如果键已经存在)。put是通过表执行的。把(non-writeBuffer)或表。批处理(non-writeBuffer)
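
For sending several mutations in one round trip via Table.batch, here is a rough sketch; the row keys and values are illustrative, and CF and ATTR are the byte arrays used in the other examples.

List<Row> actions = new ArrayList<Row>();
actions.add(new Put(Bytes.toBytes("row1")).add(CF, ATTR, Bytes.toBytes("value1")));
actions.add(new Put(Bytes.toBytes("row2")).add(CF, ATTR, Bytes.toBytes("value2")));

Object[] results = new Object[actions.size()];
table.batch(actions, results);   // results[i] holds the outcome of actions.get(i)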

27.3. Scans

27.3。扫描

Scan allows iteration over multiple rows for specified attributes.

扫描允许对指定属性的多行进行迭代。

The following is an example of a Scan on a Table instance. Assume that a table is populated with rows with keys "row1", "row2", "row3", and then another set of rows with the keys "abc1", "abc2", and "abc3". The following example shows how to set a Scan instance to return the rows beginning with "row".

下面是对表实例进行扫描的示例。假设一个表中包含有键“row1”、“row2”、“row3”,然后是另一组带有“abc1”、“abc2”和“abc3”的行。下面的示例演示如何设置扫描实例以返回以“row”开头的行。

public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...

Table table = ...      // instantiate a Table instance

Scan scan = new Scan();
scan.addColumn(CF, ATTR);
scan.setRowPrefixFilter(Bytes.toBytes("row"));
ResultScanner rs = table.getScanner(scan);
try {
  for (Result r = rs.next(); r != null; r = rs.next()) {
    // process result...
  }
} finally {
  rs.close();  // always close the ResultScanner!
}

Note that generally the easiest way to specify a specific stop point for a scan is by using the InclusiveStopFilter class.

请注意,通常为扫描指定特定的停止点的最简单方法是使用InclusiveStopFilter类。
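
For example, the following sketch scans from "row1" up to and including "row5" (the row keys are illustrative):

Scan scan = new Scan(Bytes.toBytes("row1"));
scan.setFilter(new InclusiveStopFilter(Bytes.toBytes("row5")));  // stop row is included
ResultScanner rs = table.getScanner(scan);
try {
  for (Result r = rs.next(); r != null; r = rs.next()) {
    // process result...
  }
} finally {
  rs.close();
}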

27.4. Delete

27.4。删除

Delete removes a row from a table. Deletes are executed via Table.delete.

从表中删除一行。删除是通过表执行的。
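
For example, a minimal sketch that deletes a whole row and then all versions of a single column, with CF and ATTR as in the earlier examples and illustrative row keys:

// Delete an entire row.
Delete rowDelete = new Delete(Bytes.toBytes("row1"));
table.delete(rowDelete);

// Delete all versions of one column only.
Delete columnDelete = new Delete(Bytes.toBytes("row2"));
columnDelete.deleteColumns(CF, ATTR);   // newer client versions name this addColumns
table.delete(columnDelete);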

HBase does not modify data in place, and so deletes are handled by creating new markers called tombstones. These tombstones, along with the dead values, are cleaned up on major compactions.

HBase不修改数据,因此删除是通过创建称为tombstone的新标记来处理的。这些墓碑,连同死去的价值观,都被清理干净了。

See version.delete for more information on deleting versions of columns, and see compaction for more information on compactions.

请参阅版本。删除更多关于删除列版本的信息,并查看compaction以获得关于compaction的更多信息。

28. Versions

28。版本

A {row, column, version} tuple exactly specifies a cell in HBase. It’s possible to have an unbounded number of cells where the row and column are the same but the cell address differs only in its version dimension.

{行,列,版本}元组精确地指定了HBase中的单元格。在行和列相同的情况下,可能有一个无限数量的单元格,但单元地址仅在其版本维度上有所不同。

While rows and column keys are expressed as bytes, the version is specified using a long integer. Typically this long contains time instances such as those returned by java.util.Date.getTime() or System.currentTimeMillis(), that is: the difference, measured in milliseconds, between the current time and midnight, January 1, 1970 UTC.

虽然行和列键表示为字节,但使用长整数指定版本。通常,这段时间包含一些时间实例,例如java.util.Date.getTime()或System.currentTimeMillis()所返回的时间实例,即:在当前时间和午夜、1970年1月1日和1970年1月1日之间,以毫秒计算的差异。

The HBase version dimension is stored in decreasing order, so that when reading from a store file, the most recent values are found first.

HBase版本维度存储在递减顺序中,因此当从存储文件读取时,首先会发现最近的值。

There is a lot of confusion over the semantics of cell versions, in HBase. In particular:

在HBase中,对单元格的语义有很多混淆。特别是:

  • If multiple writes to a cell have the same version, only the last written is fetchable.

    如果多个写入到一个单元格具有相同的版本,则只有最后一个写的是fetchable。

  • It is OK to write cells in a non-increasing version order.

    在不增加的版本中编写单元格是可以的。

Below we describe how the version dimension in HBase currently works. See HBASE-2406 for discussion of HBase versions. Bending time in HBase makes for a good read on the version, or time, dimension in HBase. It has more detail on versioning than is provided here. As of this writing, the limitation Overwriting values at existing timestamps mentioned in the article no longer holds in HBase. This section is basically a synopsis of this article by Bruno Dumon.

下面我们将介绍HBase中的版本维度是如何工作的。有关HBase版本的讨论,请参见HBase -2406。在HBase中的弯曲时间可以在HBase中对版本或时间维度进行良好的读取。它比这里提供的版本更详细。在撰写本文时,限制在文章中提到的现有时间戳中覆盖的值在HBase中不再有效。这一节基本上是布鲁诺·杜蒙的这篇文章的梗概。

28.1. Specifying the Number of Versions to Store

28.1。指定要存储的版本的数量。

The maximum number of versions to store for a given column is part of the column schema and is specified at table creation, or via an alter command, via HColumnDescriptor.DEFAULT_VERSIONS. Prior to HBase 0.96, the default number of versions kept was 3, but in 0.96 and newer has been changed to 1.

为给定列存储的最大版本数是列模式的一部分,并且是在表创建中指定的,或者通过hcolumndescriptor.default_version通过alter命令指定。在HBase 0.96之前,保留的版本的默认数量是3,但是在0.96和更新的版本中被更改为1。

Example 13. Modify the Maximum Number of Versions for a Column Family

This example uses HBase Shell to keep a maximum of 5 versions of all columns in column family f1. You could also use HColumnDescriptor.

这个示例使用HBase Shell在列家族f1中保留最多5个版本的所有列。也可以使用HColumnDescriptor。

hbase> alter 't1', NAME => 'f1', VERSIONS => 5
Example 14. Modify the Minimum Number of Versions for a Column Family

You can also specify the minimum number of versions to store per column family. By default, this is set to 0, which means the feature is disabled. The following example sets the minimum number of versions on all columns in column family f1 to 2, via HBase Shell. You could also use HColumnDescriptor.

您还可以指定每个列家族存储的版本的最小数量。默认情况下,这个值设置为0,这意味着该特性是禁用的。下面的示例通过HBase Shell将列族f1到2的所有列的最小版本数设置为2。也可以使用HColumnDescriptor。

hbase> alter 't1', NAME => 'f1', MIN_VERSIONS => 2
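
The equivalent change through the Java client API is roughly the following sketch, assuming an Admin instance is available; in practice you would usually fetch the table's existing HColumnDescriptor and modify it, rather than build a fresh one, so that the family's other attributes are preserved.

TableName tn = TableName.valueOf("t1");
HColumnDescriptor family = new HColumnDescriptor("f1");
family.setMaxVersions(5);   // keep at most 5 versions
family.setMinVersions(2);   // never age below 2 versions
admin.modifyColumn(tn, family);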

Starting with HBase 0.98.2, you can specify a global default for the maximum number of versions kept for all newly-created columns, by setting hbase.column.max.version in hbase-site.xml. See hbase.column.max.version.

从HBase 0.98.2开始,通过设置HBase . columnmax,您可以为所有新建列保留的最大版本数指定一个全局缺省值。在hbase-site.xml版本。看到hbase.column.max.version。

28.2. Versions and HBase Operations

28.2。版本和HBase操作

In this section we look at the behavior of the version dimension for each of the core HBase operations.

在本节中,我们将查看每个核心HBase操作的版本维度的行为。

28.2.1. Get/Scan

28.2.1。Get /扫描

Gets are implemented on top of Scans. The below discussion of Get applies equally to Scans.

获取是在扫描之上实现的。下面的讨论同样适用于扫描。

By default, i.e. if you specify no explicit version, when doing a get, the cell whose version has the largest value is returned (which may or may not be the latest one written, see later). The default behavior can be modified in the following ways:

默认情况下,即如果您指定没有显式的版本,当执行get时,其版本具有最大的值的单元格会返回(这可能是最新的一个,也可能不是最近的一个)。默认行为可以通过以下方式进行修改:

  • to return more than one version, see Get.setMaxVersions()

    要返回多个版本,请参阅Get.setMaxVersions()

  • to return versions other than the latest, see Get.setTimeRange()

    要返回最新的版本,请参阅Get.setTimeRange()

    To retrieve the latest version that is less than or equal to a given value, thus giving the 'latest' state of the record at a certain point in time, just use a range from 0 to the desired version and set the max versions to 1, as sketched after this list.

    要检索小于或等于给定值的最新版本,因此在某个时间点上给出记录的“最新”状态,只需使用从0到所需版本的范围,并将最大版本设置为1。
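
A minimal Java sketch of that recipe, reusing CF and ATTR from the earlier examples; the variable T is a placeholder for the point in time you care about.

long T = ...;                      // timestamp of interest
Get get = new Get(Bytes.toBytes("row1"));
get.addColumn(CF, ATTR);
get.setTimeRange(0, T + 1);        // the upper bound of a time range is exclusive
get.setMaxVersions(1);             // only the newest version inside that range
Result r = table.get(get);         // the 'latest' state as of time T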

28.2.2. Default Get Example

28.2.2。默认有例子

The following Get will only retrieve the current version of the row

下面的Get将只检索该行的当前版本。

public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
Get get = new Get(Bytes.toBytes("row1"));
Result r = table.get(get);
byte[] b = r.getValue(CF, ATTR);  // returns current version of value

28.2.3. Versioned Get Example

28.2.3。版本化得到的例子

The following Get will return the last 3 versions of the row.

下面的Get将返回该行的最后3个版本。

public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
Get get = new Get(Bytes.toBytes("row1"));
get.setMaxVersions(3);  // will return last 3 versions of row
Result r = table.get(get);
byte[] b = r.getValue(CF, ATTR);  // returns current version of value
List<KeyValue> kv = r.getColumn(CF, ATTR);  // returns all versions of this column

28.2.4. Put

28.2.4。把

Doing a put always creates a new version of a cell, at a certain timestamp. By default the system uses the server’s currentTimeMillis, but you can specify the version (= the long integer) yourself, on a per-column level. This means you could assign a time in the past or the future, or use the long value for non-time purposes.

在某个时间戳中,执行put总是会创建一个新版本的单元格。默认情况下,系统使用服务器的currentTimeMillis,但是您可以在每列的级别上指定您自己的版本(=长整数)。这意味着您可以在过去或将来分配一个时间,或者将长值用于非时间目的。

To overwrite an existing value, do a put at exactly the same row, column, and version as that of the cell you want to overwrite.

要覆盖现有的值,请执行与要覆盖的单元格相同的行、列和版本。

Implicit Version Example
隐式版本的例子

The following Put will be implicitly versioned by HBase with the current time.

下面的Put将由HBase在当前时间内隐式地版本化。

public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
Put put = new Put(Bytes.toBytes(row));
put.add(CF, ATTR, Bytes.toBytes( data));
table.put(put);
Explicit Version Example
明确的版本的例子

The following Put has the version timestamp explicitly set.

下面的Put是显式设置的版本时间戳。

public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
Put put = new Put( Bytes.toBytes(row));
long explicitTimeInMs = 555;  // just an example
put.add(CF, ATTR, explicitTimeInMs, Bytes.toBytes(data));
table.put(put);

Caution: the version timestamp is used internally by HBase for things like time-to-live calculations. It’s usually best to avoid setting this timestamp yourself. Prefer using a separate timestamp attribute of the row, or have the timestamp as a part of the row key, or both.

警告:版本时间戳是由HBase在内部使用的,用于计算实时计算。通常最好不要自己设置这个时间戳。更喜欢使用行的单独的时间戳属性,或者将时间戳作为行键的一部分,或者两者都使用。

28.2.5. Delete

28.2.5。删除

There are three different types of internal delete markers. See Lars Hofhansl’s blog for discussion of his attempt at adding another, Scanning in HBase: Prefix Delete Marker.

有三种不同类型的内部删除标记。请参阅Lars Hofhansl的博客,讨论他的尝试添加另一个,在HBase中扫描:前缀删除标记。

  • Delete: for a specific version of a column.

    删除:针对某一列的特定版本。

  • Delete column: for all versions of a column.

    删除列:对于所有版本的列。

  • Delete family: for all columns of a particular ColumnFamily

    删除家庭:适用于特定列的所有列。

When deleting an entire row, HBase will internally create a tombstone for each ColumnFamily (i.e., not each individual column).

当删除整个行时,HBase将在内部为每个ColumnFamily创建一个tombstone(即:,不是每一栏。
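
A rough sketch of these delete types follows. The method names are from the older client API used elsewhere in this chapter (newer clients use Delete.addColumn, addColumns, and addFamily instead), and the row, family, qualifier, and timestamp values are only examples:

public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
Delete d = new Delete(Bytes.toBytes("row1"));
d.deleteColumn(CF, ATTR, 555L);   // Delete: one specific version of a column
d.deleteColumns(CF, ATTR);        // Delete column: all versions of a column
d.deleteFamily(CF);               // Delete family: all columns of the ColumnFamily
table.delete(d);

// Deleting a whole row writes one family tombstone per ColumnFamily:
table.delete(new Delete(Bytes.toBytes("row2")));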

Deletes work by creating tombstone markers. For example, let’s suppose we want to delete a row. For this you can specify a version, or else by default the currentTimeMillis is used. What this means is delete all cells where the version is less than or equal to this version. HBase never modifies data in place, so for example a delete will not immediately delete (or mark as deleted) the entries in the storage file that correspond to the delete condition. Rather, a so-called tombstone is written, which will mask the deleted values. When HBase does a major compaction, the tombstones are processed to actually remove the dead values, together with the tombstones themselves. If the version you specified when deleting a row is larger than the version of any value in the row, then you can consider the complete row to be deleted.

通过创建墓碑标记来删除工作。例如,假设我们想删除一行。为此,您可以指定一个版本,或者默认使用currentTimeMillis。这意味着删除所有版本小于或等于这个版本的单元格。HBase永远不会修改数据,例如,删除不会立即删除(或标记为已删除)存储文件中对应于删除条件的条目。相反,所谓的墓碑是写出来的,它会掩盖被删除的值。当HBase做一个主要的压缩时,tombstone会被处理,以实际移除死值,连同墓碑本身。如果在删除行时指定的版本大于行中任何值的版本,则可以考虑删除完整行。

For an informative discussion on how deletes and versioning interact, see the thread Put w/timestamp → Deleteall → Put w/ timestamp fails up on the user mailing list.

的信息讨论如何删除和版本进行交互,查看线程把w /时间戳→Deleteall→把w /时间戳失败用户邮件列表。

Also see keyvalue for more information on the internal KeyValue format.

还可以看到关于内部keyvalue格式的更多信息的keyvalue。

Delete markers are purged during the next major compaction of the store, unless the KEEP_DELETED_CELLS option is set in the column family (See Keeping Deleted Cells). To keep the deletes for a configurable amount of time, you can set the delete TTL via the hbase.hstore.time.to.purge.deletes property in hbase-site.xml. If hbase.hstore.time.to.purge.deletes is not set, or set to 0, all delete markers, including those with timestamps in the future, are purged during the next major compaction. Otherwise, a delete marker with a timestamp in the future is kept until the major compaction which occurs after the time represented by the marker’s timestamp plus the value of hbase.hstore.time.to.purge.deletes, in milliseconds.

删除标记在该存储的下一个主要压缩过程中被清除,除非在列家族中设置KEEP_DELETED_CELLS选项(参见保留删除的单元格)。为了保持删除的时间,你可以通过hbase.hstore.h .time.to.purge.delete来设置删除TTL。如果hbase.hstore.time.to.purge.deletes没有设置或设置为0,所有删除标记,包括将来的时间戳,都将在接下来的主要压缩过程中被清除。否则,在未来的时间戳中会保留一个带有时间戳的删除标记,直到标记的时间戳和hbase. hstore.c . hstore.o .purge.delete以毫秒为单位表示的时间戳后出现的主压缩。

This behavior represents a fix for an unexpected change that was introduced in HBase 0.94, and was fixed in HBASE-10118. The change has been backported to HBase 0.94 and newer branches.

28.3. Current Limitations

28.3。当前的限制

28.3.1. Deletes mask Puts

28.3.1。删除面具了

Deletes mask puts, even puts that happened after the delete was entered. See HBASE-2256. Remember that a delete writes a tombstone, which only disappears after the next major compaction has run. Suppose you do a delete of everything ≤ T. After this you do a new put with a timestamp ≤ T. This put, even if it happened after the delete, will be masked by the delete tombstone. Performing the put will not fail, but when you do a get you will notice the put had no effect. It will start working again after the major compaction has run. These issues should not be a problem if you use always-increasing versions for new puts to a row. But they can occur even if you do not care about time: just do delete and put immediately after each other, and there is some chance they happen within the same millisecond.

删除掩码设置,甚至在输入删除后发生。看到hbase - 2256。记住,一个删除写了一个墓碑,它只会在接下来的主要压实运行之后消失。假设你做删除一切⇐t后你做一个新的时间戳⇐t .这把,即使它发生在删除后,将删除蒙面的墓碑。执行put不会失败,但是当你做一个get时,你会注意到put没有效果。它将在主压缩运行后重新开始工作。如果您使用总是递增的新版本,这些问题不应该成为问题。但是,即使你不关心时间,它们也会发生:只需要立即删除和放置,并且在相同的毫秒内就有可能发生。

28.3.2. Major compactions change query results

28.3.2。主要的压缩会改变查询结果。

…​create three cell versions at t1, t2 and t3, with a maximum-versions setting of 2. So when getting all versions, only the values at t2 and t3 will be returned. But if you delete the version at t2 or t3, the one at t1 will appear again. Obviously, once a major compaction has run, such behavior will not be the case anymore…​ (See Garbage Collection in Bending time in HBase.)

在t1、t2和t3中创建三个单元版本,最大版本设置为2。所以当得到所有的版本时,只有t2和t3的值会被返回。但是如果你在t2或t3中删除这个版本,t1的那个将会再次出现。很明显,一旦一个主要的压缩运行,这样的行为就不再是这样了…(在HBase的弯曲时间里看到垃圾收集)。

29. Sort Order

29。排序顺序

All data model operations in HBase return data in sorted order. First by row, then by ColumnFamily, followed by column qualifier, and finally timestamp (sorted in reverse, so newest records are returned first).

所有数据模型操作HBase返回数据的排序顺序。首先是行,然后是ColumnFamily,然后是列限定符,最后是时间戳(以反向排序,所以最新的记录是先返回的)。

30. Column Metadata

30.列元数据

There is no store of column metadata outside of the internal KeyValue instances for a ColumnFamily. Thus, while HBase can support not only a wide number of columns per row, but a heterogeneous set of columns between rows as well, it is your responsibility to keep track of the column names.

在ColumnFamily的内部KeyValue实例之外,不存在列元数据存储。因此,虽然HBase不仅可以支持每行中大量的列,而且还可以支持行之间的异构列,但您有责任跟踪列名。

The only way to get a complete set of columns that exist for a ColumnFamily is to process all the rows. For more information about how HBase stores data internally, see keyvalue.

获得一个列的完整的列的唯一方法是处理所有的行。有关HBase在内部如何存储数据的更多信息,请参见keyvalue。
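
If you do need to discover which qualifiers exist, the only option is a full scan along the following lines, which is expensive on large tables. The family name cf and the use of a TreeSet are illustrative choices, not requirements:

Scan scan = new Scan();
scan.addFamily(Bytes.toBytes("cf"));
Set<String> qualifiers = new TreeSet<String>();
ResultScanner scanner = table.getScanner(scan);
try {
  for (Result result : scanner) {
    // getFamilyMap returns qualifier -> value for the requested family
    for (byte[] qualifier : result.getFamilyMap(Bytes.toBytes("cf")).keySet()) {
      qualifiers.add(Bytes.toString(qualifier));
    }
  }
} finally {
  scanner.close();
}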

31. Joins

31日。连接

Whether HBase supports joins is a common question on the dist-list, and there is a simple answer: it doesn’t, at least not in the way that RDBMSs support them (e.g., with equi-joins or outer-joins in SQL). As has been illustrated in this chapter, the read data model operations in HBase are Get and Scan.

HBase是否支持连接在列表中是一个常见的问题,并且有一个简单的答案:它没有,至少在RDBMS支持它们的方式上(例如,在SQL中使用equijoin或outer连接)。如本章所述,HBase中的读取数据模型操作是Get和Scan。

However, that doesn’t mean that equivalent join functionality can’t be supported in your application, but you have to do it yourself. The two primary strategies are either denormalizing the data upon writing to HBase, or to have lookup tables and do the join between HBase tables in your application or MapReduce code (and as RDBMS' demonstrate, there are several strategies for this depending on the size of the tables, e.g., nested loops vs. hash-joins). So which is the best approach? It depends on what you are trying to do, and as such there isn’t a single answer that works for every use case.

但是,这并不意味着在您的应用程序中不能支持等价的连接功能,但是您必须自己完成它。两个主要策略是denormalizing数据写入HBase,或查找表和HBase表之间的连接应用程序或MapReduce代码(如RDBMS的演示,有几种策略取决于表的大小,例如,嵌套循环比散列连接)。那么哪种方法才是最好的呢?这取决于你想要做什么,因此,对于每个用例都没有一个有效的答案。

32. ACID

32。酸

See ACID Semantics. Lars Hofhansl has also written a note on ACID in HBase.

看到酸语义。Lars Hofhansl也在HBase上写了关于酸的注释。

HBase and Schema Design

HBase和模式设计

A good introduction to the strengths and weaknesses of modelling on the various non-RDBMS datastores can be found in Ian Varley’s Master thesis, No Relation: The Mixed Blessings of Non-Relational Databases. It is a little dated now, but it is a good background read if you have a moment, on how HBase schema modeling differs from how it is done in an RDBMS. Also, read keyvalue for how HBase stores data internally, and the section on schema.casestudies.

在Ian Varley的硕士论文中,我们可以很好地介绍各种非关系型数据库的优点和缺点,没有关系:非关系数据库的混合的好处。它现在有点过时了,但是如果您有一个关于HBase模式建模与在RDBMS中是如何不同的时间,那么它将是一个很好的背景。另外,请阅读keyvalue,以了解HBase如何在内部存储数据,以及schema.casestudies的部分。

The documentation on the Cloud Bigtable website, Designing Your Schema, is pertinent and nicely done and lessons learned there equally apply here in HBase land; just divide any quoted values by ~10 to get what works for HBase: e.g. where it says individual values can be ~10MBs in size, HBase can do similar — perhaps best to go smaller if you can — and where it says a maximum of 100 column families in Cloud Bigtable, think ~10 when modeling on HBase.

云Bigtable网站上的文档,设计您的模式,是恰当的,很好地完成了,并且在HBase中也同样适用于这里的经验教训;只是任何引用值除以~ 10为HBase工作:例如,它说可以~ 10个人值大小mbs,HBase最好做类似的——也许可以更小的如果你可以和它说最多100列家庭云Bigtable,认为~ 10 HBase建模时。

See also Robert Yokota’s HBase Application Archetypes (an update on work done by other HBasers), for a helpful categorization of use cases that do well on top of the HBase model.

请参阅Robert Yokota的HBase应用程序原型(其他HBasers所做的工作更新),以帮助对在HBase模型之上做得更好的用例进行分类。

33. Schema Creation

33。创建模式

HBase schemas can be created or updated using the The Apache HBase Shell or by using Admin in the Java API.

可以使用Apache HBase Shell或在Java API中使用Admin来创建或更新HBase模式。

Tables must be disabled when making ColumnFamily modifications, for example:

在制作ColumnFamily的修改时必须禁用表,例如:

Configuration config = HBaseConfiguration.create();
Connection connection = ConnectionFactory.createConnection(config);
Admin admin = connection.getAdmin();
TableName table = TableName.valueOf("myTable");

admin.disableTable(table);

HColumnDescriptor cf1 = ...;
admin.addColumn(table, cf1);      // adding new ColumnFamily
HColumnDescriptor cf2 = ...;
admin.modifyColumn(table, cf2);    // modifying existing ColumnFamily

admin.enableTable(table);

See client dependencies for more information about configuring client connections.

有关配置客户端连接的更多信息,请参见客户机依赖关系。

Online schema changes are supported in the 0.92.x codebase, but the 0.90.x codebase requires the table to be disabled.

33.1. Schema Updates

33.1。模式更新

When changes are made to either Tables or ColumnFamilies (e.g. region size, block size), these changes take effect the next time there is a major compaction and the StoreFiles get re-written.

当对表或columnfamily(例如区域大小、块大小)进行更改时,这些更改将在下一次发生重大压缩时生效,并重新编写存储文件。

See store for more information on StoreFiles.

有关存储文件的更多信息,请参见存储。

34. Table Schema Rules Of Thumb

34。表模式的经验法则。

There are many different data sets, with different access patterns and service-level expectations. Therefore, these rules of thumb are only an overview. Read the rest of this chapter to get more details after you have gone through this list.

有许多不同的数据集,有不同的访问模式和服务级别的期望。因此,这些经验法则只是一个概述。阅读本章的其余部分,以获得更多的细节。

  • Aim to have regions sized between 10 and 50 GB.

    目标区域大小在10到50 GB之间。

  • Aim to have cells no larger than 10 MB, or 50 MB if you use mob. Otherwise, consider storing your cell data in HDFS and store a pointer to the data in HBase.

    如果您使用mob,目标是没有大于10 MB的单元,或者50 MB。否则,考虑将您的单元数据存储在HDFS中,并在HBase中存储一个指向数据的指针。

  • A typical schema has between 1 and 3 column families per table. HBase tables should not be designed to mimic RDBMS tables.

    一个典型的模式在每个表中有1到3个列。HBase表不应该被设计成模拟RDBMS表。

  • Around 50-100 regions is a good number for a table with 1 or 2 column families. Remember that a region is a contiguous segment of a column family.

    大约50-100个区域对于有1或2个列族的表来说是一个很好的数字。记住,一个区域是一个列族的连续部分。

  • Keep your column family names as short as possible. The column family names are stored for every value (ignoring prefix encoding). They should not be self-documenting and descriptive like in a typical RDBMS.

    尽量缩短你的列姓。列家族名存储为每个值(忽略前缀编码)。它们不应该像典型的RDBMS那样自我记录和描述。

  • If you are storing time-based machine data or logging information, and the row key is based on device ID or service ID plus time, you can end up with a pattern where older data regions never have additional writes beyond a certain age. In this type of situation, you end up with a small number of active regions and a large number of older regions which have no new writes. For these situations, you can tolerate a larger number of regions because your resource consumption is driven by the active regions only.

    如果您正在存储基于时间的机器数据或日志信息,而行键是基于设备ID或服务ID加上时间的,那么您就可以使用一种模式,在这个模式中,较老的数据区域在一定的年龄之外不会有额外的写入。在这种情况下,你会得到少数活跃的区域和大量没有新写的旧区域。对于这些情况,您可以容忍更多的区域,因为您的资源消耗仅由活动区域驱动。

  • If only one column family is busy with writes, only that column family accumulates memory. Be aware of write patterns when allocating resources.

    如果只有一个列家庭忙于编写,那么只有这个列家庭容纳内存。在分配资源时要注意编写模式。

RegionServer Sizing Rules of Thumb

区域服务器的大小规则。

Lars Hofhansl wrote a great blog post about RegionServer memory sizing. The upshot is that you probably need more memory than you think you need. He goes into the impact of region size, memstore size, HDFS replication factor, and other things to check.

Lars Hofhansl写了一篇关于区域服务器内存大小的博文。结果是,你可能需要比你认为你需要的更多的记忆。他深入研究了区域大小、memstore大小、HDFS复制因子以及其他需要检查的东西。

Personally I would place the maximum disk space per machine that can be served exclusively with HBase around 6T, unless you have a very read-heavy workload. In that case the Java heap should be 32GB (20G regions, 128M memstores, the rest defaults).

我个人认为,除非你有非常繁重的工作,否则每台机器的最大磁盘空间只能在6T左右。在这种情况下,Java堆应该是32GB (20G区域,128M memstores,其余的缺省值)。

— Lars Hofhansl
http://hadoop-hbase.blogspot.com/2013/01/hbase-region-server-memory-sizing.html

35. On the number of column families

35。关于列族的数量。

HBase currently does not do well with anything above two or three column families so keep the number of column families in your schema low. Currently, flushing and compactions are done on a per Region basis so if one column family is carrying the bulk of the data bringing on flushes, the adjacent families will also be flushed even though the amount of data they carry is small. When many column families exist the flushing and compaction interaction can make for a bunch of needless i/o (To be addressed by changing flushing and compaction to work on a per column family basis). For more information on compactions, see Compaction.

HBase目前在两三个列的家庭中都不太好,所以在您的模式中保持列族的数量很低。目前,冲洗和压实是在每个区域的基础上进行的,因此,如果一个列族携带大量的数据带来了冲洗,即使他们携带的数据量很小,相邻的家庭也会被刷新。当许多列家族存在时,刷新和压实交互可以使一堆不必要的i/o(通过更改刷新和压缩以在每个列的家庭基础上工作)。有关Compaction的更多信息,请参见Compaction。

Try to make do with one column family if you can in your schemas. Only introduce a second and third column family in the case where data access is usually column scoped; i.e. you query one column family or the other but usually not both at the one time.

如果您可以在模式中使用一个列族,那么就尝试使用它。在数据访问通常为列范围的情况下,只引入第二和第三列家族;也就是说,你查询一个列的家庭或另一个,但通常不是两个都在同一时间。

35.1. Cardinality of ColumnFamilies

35.1。基数的ColumnFamilies

Where multiple ColumnFamilies exist in a single table, be aware of the cardinality (i.e., number of rows). If ColumnFamilyA has 1 million rows and ColumnFamilyB has 1 billion rows, ColumnFamilyA’s data will likely be spread across many, many regions (and RegionServers). This makes mass scans for ColumnFamilyA less efficient.

在单个表中存在多个ColumnFamilies时,请注意基数(即:的行数)。如果ColumnFamilyA有100万行,而ColumnFamilyB有10亿行,那么ColumnFamilyA的数据很可能会分布在许多区域(和区域服务器)。这使得对ColumnFamilyA的大规模扫描效率降低。

36. Rowkey Design

36。Rowkey设计

36.1. Hotspotting

36.1。热点

Rows in HBase are sorted lexicographically by row key. This design optimizes for scans, allowing you to store related rows, or rows that will be read together, near each other. However, poorly designed row keys are a common source of hotspotting. Hotspotting occurs when a large amount of client traffic is directed at one node, or only a few nodes, of a cluster. This traffic may represent reads, writes, or other operations. The traffic overwhelms the single machine responsible for hosting that region, causing performance degradation and potentially leading to region unavailability. This can also have adverse effects on other regions hosted by the same region server as that host is unable to service the requested load. It is important to design data access patterns such that the cluster is fully and evenly utilized.

HBase中的行由行键按字母顺序排序。该设计优化了扫描,允许您存储相关的行,或将一起阅读的行,彼此相邻。然而,设计糟糕的行键是热点的常见来源。当大量的客户端流量定向到一个节点(或仅几个节点)时,就会出现“热点识别”。此流量可以表示读、写或其他操作。流量超过了负责托管该区域的单一机器,导致性能下降,并可能导致区域不可用。这也会对同一区域服务器承载的其他区域产生不利影响,因为主机无法服务请求的负载。设计数据访问模式是很重要的,这样集群就可以得到充分和均匀的利用。

To prevent hotspotting on writes, design your row keys such that rows that truly do need to be in the same region are, but in the bigger picture, data is being written to multiple regions across the cluster, rather than one at a time. Some common techniques for avoiding hotspotting are described below, along with some of their advantages and drawbacks.

为了防止在编写时出现hotspotting,请设计行键,以便在同一区域中真正需要的行是相同的,但是在更大的情况下,数据将被写到集群中的多个区域,而不是一次一个。下面将介绍一些避免热识别的常用技术,以及它们的一些优点和缺点。

Salting

Salting in this sense has nothing to do with cryptography, but refers to adding random data to the start of a row key. In this case, salting refers to adding a randomly-assigned prefix to the row key to cause it to sort differently than it otherwise would. The number of possible prefixes corresponds to the number of regions you want to spread the data across. Salting can be helpful if you have a few "hot" row key patterns which come up over and over amongst other more evenly-distributed rows. Consider the following example, which shows that salting can spread write load across multiple RegionServers, and illustrates some of the negative implications for reads.

在这个意义上,Salting与加密无关,而是指将随机数据添加到行键的开头。在这种情况下,salting指的是在行键中添加一个随机分配的前缀,以使它以不同于其他方式的方式排序。可能的前缀的数量与您希望将数据传播的区域数量相对应。如果您有几个“热”行键模式,这些模式在其他更均匀分布的行中反复出现,那么Salting可能会有帮助。请考虑下面的示例,它显示了salting可以跨多个区域服务器传播写负载,并演示了一些对读取的负面影响。

Example 15. Salting Example

Suppose you have the following list of row keys, and your table is split such that there is one region for each letter of the alphabet. Prefix 'a' is one region, prefix 'b' is another. In this table, all rows starting with 'f' are in the same region. This example focuses on rows with keys like the following:

假设您有下面的行键列表,并且您的表被拆分,这样每个字母表的每个字母都有一个区域。前缀“a”是一个区域,前缀“b”是另一个区域。在该表中,以“f”开头的所有行位于同一区域。这个例子关注的是带键的行:

foo0001
foo0002
foo0003
foo0004

Now, imagine that you would like to spread these across four different regions. You decide to use four different salts: a, b, c, and d. In this scenario, each of these letter prefixes will be on a different region. After applying the salts, you have the following rowkeys instead. Since you can now write to four separate regions, you theoretically have four times the throughput when writing that you would have if all the writes were going to the same region.

现在,想象一下,你想要把它们分散到四个不同的区域。您决定使用四种不同的盐:a、b、c和d。在这种情况下,每个字母前缀将位于不同的区域。在应用了这些盐之后,您将得到以下的rowkeys。既然你现在可以写四个不同的区域,理论上你写的时候你会有4倍的吞吐量,如果所有的写都是在同一个区域。

a-foo0003
b-foo0001
c-foo0004
d-foo0002

Then, if you add another row, it will randomly be assigned one of the four possible salt values and end up near one of the existing rows.

然后,如果您添加另一行,它将随机分配四种可能的盐值之一,并在现有的行中结束。

a-foo0003
b-foo0001
c-foo0003
c-foo0004
d-foo0002

Since this assignment will be random, you will need to do more work if you want to retrieve the rows in lexicographic order. In this way, salting attempts to increase throughput on writes, but has a cost during reads.

由于这个任务是随机的,如果你想要在字典顺序中检索行,你需要做更多的工作。通过这种方式,salting试图增加写操作的吞吐量,但是在读取过程中有成本。
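
A minimal sketch of the salting idea, not an HBase API: the four salts, the dash separator, and the use of java.util.Random are all illustrative choices.

private static final String[] SALTS = { "a", "b", "c", "d" };
private static final Random RANDOM = new Random();

// Randomly assign one of four prefixes so writes spread across four regions.
// Reading back in key order then requires scanning all four salt ranges and merging.
public static String saltedKey(String originalKey) {
  return SALTS[RANDOM.nextInt(SALTS.length)] + "-" + originalKey;   // e.g. "b-foo0001"
}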

Hashing

Instead of a random assignment, you could use a one-way hash that would cause a given row to always be "salted" with the same prefix, in a way that would spread the load across the RegionServers, but allow for predictability during reads. Using a deterministic hash allows the client to reconstruct the complete rowkey and use a Get operation to retrieve that row as normal.

您可以使用单向散列,而不是随机分配,这样可以使给定的行始终以相同的前缀“加盐”,这样可以在区域服务器上分散负载,但是在读取期间允许可预测性。使用确定性哈希允许客户端重构完整的rowkey,并使用Get操作来恢复正常的行。

Example 16. Hashing Example
Given the same situation in the salting example above, you could instead apply a one-way hash that would cause the row with key foo0003 to always, and predictably, receive the a prefix. Then, to retrieve that row, you would already know the key. You could also optimize things so that certain pairs of keys were always in the same region, for instance.
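
A sketch of that deterministic variant. MD5 is only one reasonable choice of one-way hash, and the one-byte (two hex character) prefix is an arbitrary bucket width:

// The same key always produces the same prefix, so a client that knows the key
// can rebuild the full rowkey and issue a normal Get.
public static String hashedKey(String originalKey) throws NoSuchAlgorithmException {
  MessageDigest md = MessageDigest.getInstance("MD5");
  byte[] digest = md.digest(Bytes.toBytes(originalKey));
  String prefix = String.format("%02x", digest[0] & 0xFF);   // up to 256 buckets
  return prefix + "-" + originalKey;
}
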
Reversing the Key

A third common trick for preventing hotspotting is to reverse a fixed-width or numeric row key so that the part that changes the most often (the least significant digit) is first. This effectively randomizes row keys, but sacrifices row ordering properties.

防止热斑的第三个常见的技巧是,反转固定宽度或数字行键,使最常发生变化的部分(最不重要的数字)是第一个。这有效地随机化了行键,但牺牲了行排序属性。

See https://communities.intel.com/community/itpeernetwork/datastack/blog/2013/11/10/discussion-on-designing-hbase-tables, the article on Salted Tables from the Phoenix project, and the discussion in the comments of HBASE-11682 for more information about avoiding hotspotting.

请参阅https://community.intel.com/community/itpeernetwork/datastack/blog/2013/11/10/discussion -on-design -hbase- Tables,并在“凤凰”项目中对盐表进行讨论,并在HBASE-11682的评论中进行讨论,以获得更多关于避免热点的信息。

36.2. Monotonically Increasing Row Keys/Timeseries Data

36.2。单调递增的行键/Timeseries数据。

In the HBase chapter of Tom White’s book Hadoop: The Definitive Guide (O’Reilly) there is an optimization note on watching out for a phenomenon where an import process walks in lock-step with all clients in concert pounding one of the table’s regions (and thus, a single node), then moving onto the next region, etc. With monotonically increasing row-keys (i.e., using a timestamp), this will happen. See this comic by IKai Lan on why monotonically increasing row keys are problematic in BigTable-like datastores: monotonically increasing values are bad. The pile-up on a single region brought on by monotonically increasing keys can be mitigated by randomizing the input records to not be in sorted order, but in general it’s best to avoid using a timestamp or a sequence (e.g. 1, 2, 3) as the row-key.

汤姆的HBase章白的书Hadoop:明确的指南(O ' reilly)有一个优化注意看了一个现象,一个导入过程走在同步与所有客户共同冲击表的地区之一(因此,单个节点),然后移动到下一个区域,与单调递增的行键(即等等。。(使用时间戳),这将发生。看一看IKai Lan的漫画,为什么单调递增的行键在bigtable类的数据存储中是有问题的:单调递增的值是不好的。通过对输入记录进行随机化,可以减轻单个区域上单调递增的键所带来的堆积,但一般来说,最好避免使用时间戳或序列(例如,1、2、3)作为行键。

If you do need to upload time series data into HBase, you should study OpenTSDB as a successful example. It has a page describing the schema it uses in HBase. The key format in OpenTSDB is effectively [metric_type][event_timestamp], which would appear at first glance to contradict the previous advice about not using a timestamp as the key. However, the difference is that the timestamp is not in the lead position of the key, and the design assumption is that there are dozens or hundreds (or more) of different metric types. Thus, even with a continual stream of input data with a mix of metric types, the Puts are distributed across various points of regions in the table.

如果您确实需要将时间序列数据上传到HBase中,那么您应该将OpenTSDB作为一个成功的例子。它有一个描述它在HBase中使用的模式的页面。OpenTSDB中的关键格式是有效的[metric_type][event_timestamp],它会在第一眼看上去与之前的建议相矛盾,即不使用时间戳作为键。但是,不同之处在于时间戳不在键的主要位置,而设计假设是有几十个或数百个(或更多)不同的度量类型。因此,即使输入数据的连续流与度量类型混合,也会分布在表中各个区域的位置上。

See schema.casestudies for some rowkey design examples.

看到模式。对一些rowkey设计示例的案例研究。

36.3. Try to minimize row and column sizes

36.3。尽量减少行和列的大小。

In HBase, values are always freighted with their coordinates; as a cell value passes through the system, it’ll be accompanied by its row, column name, and timestamp - always. If your rows and column names are large, especially compared to the size of the cell value, then you may run up against some interesting scenarios. One such is the case described by Marc Limotte at the tail of HBASE-3551 (recommended!). Therein, the indices that are kept on HBase storefiles (StoreFile (HFile)) to facilitate random access may end up occupying large chunks of the HBase allotted RAM because the cell value coordinates are large. Marc in the above cited comment suggests upping the block size so entries in the store file index happen at a larger interval, or modifying the table schema so it makes for smaller rows and column names. Compression will also make for larger indices. See the thread a question storefileIndexSize on the user mailing list.

在HBase中,值总是与它们的坐标相匹配;当一个单元格值通过该系统时,它将伴随它的行、列名称和时间戳——始终。如果您的行和列名很大,特别是与单元格值的大小相比,那么您可能会遇到一些有趣的情况。其中一个例子是Marc Limotte在HBASE-3551(推荐!)的尾巴上描述的。其中,保存在HBase storefiles (StoreFile (HFile))上的索引可以方便随机访问,最终可能占用大量的HBase分配RAM,因为单元值坐标很大。在上面引用的注释中,Mark建议增加块大小,这样,存储文件索引中的条目就会发生在更大的间隔中,或者修改表模式,这样它就可以为较小的行和列名进行修改。压缩也将为更大的指标。在用户邮件列表上查看一个问题storefileIndexSize。

Most of the time small inefficiencies don’t matter all that much. Unfortunately, this is a case where they do. Whatever patterns are selected for ColumnFamilies, attributes, and rowkeys they could be repeated several billion times in your data.

大多数时候,小的低效率并不重要。不幸的是,这是他们的一个例子。无论对ColumnFamilies、属性和rowkeys选择何种模式,它们都可以在您的数据中重复几十亿次。

See keyvalue for more information on HBase stores data internally to see why this is important.

有关HBase存储数据的更多信息,请参见keyvalue,以了解其重要性。

36.3.1. Column Families

36.3.1。列的家庭

Try to keep the ColumnFamily names as small as possible, preferably one character (e.g. "d" for data/default).

尽量使列族名尽可能小,最好是一个字符(例如:数据/默认的“d”)。

See KeyValue for more information on HBase stores data internally to see why this is important.

有关HBase存储数据的更多信息,请参见KeyValue,以了解其重要性。

36.3.2. Attributes

36.3.2。属性

Although verbose attribute names (e.g., "myVeryImportantAttribute") are easier to read, prefer shorter attribute names (e.g., "via") to store in HBase.

尽管详细的属性名称(例如,“myVeryImportantAttribute”)更容易阅读,但更喜欢较短的属性名称(例如,“via”)来存储在HBase中。

See keyvalue for more information on HBase stores data internally to see why this is important.

有关HBase存储数据的更多信息,请参见keyvalue,以了解其重要性。

36.3.3. Rowkey Length

36.3.3。Rowkey长度

Keep them as short as is reasonable such that they can still be useful for required data access (e.g. Get vs. Scan). A short key that is useless for data access is not better than a longer key with better get/scan properties. Expect tradeoffs when designing rowkeys.

让它们尽可能短,这样它们仍然可以用于需要的数据访问(例如,Get和Scan)。一个对数据访问无用的短键并不比一个更长的键更好的获取/扫描属性好。在设计行键时,需要权衡。

36.3.4. Byte Patterns

36.3.4。字节模式

A long is 8 bytes. You can store an unsigned number up to 18,446,744,073,709,551,615 in those eight bytes. If you stored this number as a String — presuming a byte per character — you need nearly 3x the bytes.

一个长是8个字节。在这8个字节中,您可以将未签名的数字存储为18,446,744,073,709,551,615。如果将这个数字存储为字符串,假设每个字符为一个字节,那么您需要的字节数接近3倍。

Not convinced? Below is some sample code that you can run on your own.

不相信吗?下面是一些您可以自己运行的示例代码。

// long
//
long l = 1234567890L;
byte[] lb = Bytes.toBytes(l);
System.out.println("long bytes length: " + lb.length);   // returns 8

String s = String.valueOf(l);
byte[] sb = Bytes.toBytes(s);
System.out.println("long as string length: " + sb.length);    // returns 10

// hash
//
MessageDigest md = MessageDigest.getInstance("MD5");
byte[] digest = md.digest(Bytes.toBytes(s));
System.out.println("md5 digest bytes length: " + digest.length);    // returns 16

String sDigest = new String(digest);
byte[] sbDigest = Bytes.toBytes(sDigest);
System.out.println("md5 digest as string length: " + sbDigest.length);    // returns 26

Unfortunately, using a binary representation of a type will make your data harder to read outside of your code. For example, this is what you will see in the shell when you increment a value:

不幸的是,使用一种类型的二进制表示将使您的数据更难在代码之外读取。例如,当您增加一个值时,您将在shell中看到:

hbase(main):001:0> incr 't', 'r', 'f:q', 1
COUNTER VALUE = 1

hbase(main):002:0> get 't', 'r'
COLUMN                                        CELL
 f:q                                          timestamp=1369163040570, value=\x00\x00\x00\x00\x00\x00\x00\x01
1 row(s) in 0.0310 seconds

The shell makes a best effort to print a string, and in this case it decided to just print the hex. The same will happen to your row keys inside the region names. It can be okay if you know what’s being stored, but it might also be unreadable if arbitrary data can be put in the same cells. This is the main trade-off.

shell为打印字符串做出了最大的努力,并且它决定只打印十六进制。同样的情况也会发生在区域名称的行键中。如果您知道存储了什么,那么它可能是可以的,但是如果可以将任意数据放入相同的单元中,那么它也可能是不可读的。这是主要的权衡。

36.4. Reverse Timestamps

36.4。反向时间戳

Reverse Scan API

HBASE-4811 implements an API to scan a table or a range within a table in reverse, reducing the need to optimize your schema for forward or reverse scanning. This feature is available in HBase 0.98 and later. See Scan.setReversed() for more information.

HBASE-4811实现了一个API,可以在一个表中反向扫描一个表或一个范围,从而减少对前向或反向扫描优化模式的需求。该特性在HBase 0.98和以后可用。要了解更多信息,请参阅scan.setre()。

A common problem in database processing is quickly finding the most recent version of a value. A technique using reverse timestamps as a part of the key can help greatly with a special case of this problem. Also found in the HBase chapter of Tom White’s book Hadoop: The Definitive Guide (O’Reilly), the technique involves appending (Long.MAX_VALUE - timestamp) to the end of any key, e.g. [key][reverse_timestamp].

数据库处理中的一个常见问题是快速找到最新版本的值。使用反向时间戳作为键的一部分的技术可以极大地帮助解决这个问题的特殊情况。在汤姆·怀特的书《Hadoop:权威指南》(O 'Reilly)的HBase章节中也有发现,该技术包括附加(Long)。MAX_VALUE - timestamp)到任何键的末尾,例如[key][reverse_timestamp]。

The most recent value for [key] in a table can be found by performing a Scan for [key] and obtaining the first record. Since HBase keys are in sorted order, this key sorts before any older row-keys for [key] and thus is first.

通过对[key]进行扫描并获得第一个记录,可以找到表中[key]的最近值。由于HBase键是按排序顺序排列的,所以这一键在任何老的行键之前(键)之前排序,因此是第一个键。

This technique would be used instead of using Number of Versions where the intent is to hold onto all versions "forever" (or a very long time) and at the same time quickly obtain access to any other version by using the same Scan technique.

这种技术将会被使用,而不是使用许多版本,其中的目的是“永远”(或很长时间)保存所有版本,同时通过使用相同的扫描技术快速获得对任何其他版本的访问。
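
A sketch of the pattern, using the pre-2.0 Scan(startRow) constructor and assuming a fixed-length key so that plain byte concatenation is safe; the key literal is illustrative:

// Write side: append the reversed timestamp so newer versions sort first.
byte[] key = Bytes.toBytes("key-0001");
long reverseTs = Long.MAX_VALUE - System.currentTimeMillis();
byte[] rowkey = Bytes.add(key, Bytes.toBytes(reverseTs));

// Read side: the most recent entry for [key] is the first row a Scan starting at [key] returns.
Scan scan = new Scan(key);
ResultScanner scanner = table.getScanner(scan);
Result newest = scanner.next();   // check that the returned row actually starts with [key]
scanner.close();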

36.5. Rowkeys and ColumnFamilies

36.5。Rowkeys和ColumnFamilies

Rowkeys are scoped to ColumnFamilies. Thus, the same rowkey could exist in each ColumnFamily that exists in a table without collision.

Rowkeys被限定在ColumnFamilies中。因此,相同的行键可以存在于一个没有冲突的表中。

36.6. Immutability of Rowkeys

36.6。不变性的Rowkeys

Rowkeys cannot be changed. The only way they can be "changed" in a table is if the row is deleted and then re-inserted. This is a fairly common question on the HBase dist-list so it pays to get the rowkeys right the first time (and/or before you’ve inserted a lot of data).

Rowkeys不能改变。它们在表中“更改”的惟一方法是,如果行被删除,然后重新插入。这在HBase列表中是一个相当常见的问题,因此在第一次(以及/或插入大量数据之前)获得rowkeys是值得的。

36.7. Relationship Between RowKeys and Region Splits

36.7。RowKeys与区域分割的关系。

If you pre-split your table, it is critical to understand how your rowkey will be distributed across the region boundaries. As an example of why this is important, consider the example of using displayable hex characters as the lead position of the key (e.g., "0000000000000000" to "ffffffffffffffff"). Running those key ranges through Bytes.split (which is the split strategy used when creating regions in Admin.createTable(byte[] startKey, byte[] endKey, numRegions)) for 10 regions will generate the following splits…

如果您预先分割了您的表,那么理解您的rowkey将如何分布到整个区域边界是非常重要的。作为一个重要的例子,考虑使用可显示的十六进制字符作为键的主要位置的例子(例如,“0000000000000000”到“ffffffffffffffffffff”)。通过字节来运行这些键。split(在管理中创建区域时使用的拆分策略)。10个区域的createTable(byte[] startKey, byte[] endKey, numRegions)将产生以下的分割…

48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48                                // 0
54 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10                 // 6
61 -67 -67 -67 -67 -67 -67 -67 -67 -67 -67 -67 -67 -67 -67 -68                 // =
68 -124 -124 -124 -124 -124 -124 -124 -124 -124 -124 -124 -124 -124 -124 -126  // D
75 75 75 75 75 75 75 75 75 75 75 75 75 75 75 72                                // K
82 18 18 18 18 18 18 18 18 18 18 18 18 18 18 14                                // R
88 -40 -40 -40 -40 -40 -40 -40 -40 -40 -40 -40 -40 -40 -40 -44                 // X
95 -97 -97 -97 -97 -97 -97 -97 -97 -97 -97 -97 -97 -97 -97 -102                // _
102 102 102 102 102 102 102 102 102 102 102 102 102 102 102 102                // f

(note: the lead byte is listed to the right as a comment.) Given that the first split is a '0' and the last split is an 'f', everything is great, right? Not so fast.

(注意:标题字节被列在右侧作为注释。)考虑到第一个分割是一个“0”而最后一个分割是一个“f”,一切都很好,对吧?没有那么快。

The problem is that all the data is going to pile up in the first 2 regions and the last region thus creating a "lumpy" (and possibly "hot") region problem. To understand why, refer to an ASCII Table. '0' is byte 48, and 'f' is byte 102, but there is a huge gap in byte values (bytes 58 to 96) that will never appear in this keyspace because the only values are [0-9] and [a-f]. Thus, the middle regions will never be used. To make pre-splitting work with this example keyspace, a custom definition of splits (i.e., and not relying on the built-in split method) is required.

问题是,所有的数据都将堆积在前两个区域和最后一个区域,从而造成一个“块状”(可能是“热”)区域问题。要理解原因,请参考ASCII表。“0”是字节48,而“f”是字节102,但是字节值(字节58到96)之间有一个巨大的空白,因为唯一的值是[0-9]和[a-f],所以它永远不会出现在这个密钥空间中。因此,中间区域将永远不会被使用。要使用这个示例keyspace进行预分解工作,可以定义分割(也就是)。,并且不依赖内置的分割方法)是必需的。

Lesson #1: Pre-splitting tables is generally a best practice, but you need to pre-split them in such a way that all the regions are accessible in the keyspace. While this example demonstrated the problem with a hex-key keyspace, the same problem can happen with any keyspace. Know your data.

第1课:预分解表通常是最佳实践,但您需要以这样一种方式预分解它们,使所有区域都可以在keyspace中访问。这个例子演示了一个hex密钥空间的问题,同样的问题也可能发生在任何密钥空间中。知道你的数据。

Lesson #2: While generally not advisable, using hex-keys (and more generally, displayable data) can still work with pre-split tables as long as all the created regions are accessible in the keyspace.

第2课:虽然通常不可取,但是使用hex键(更一般地说,可显示数据)仍然可以使用预分割表,只要在keyspace中可以访问所有创建的区域。

To conclude this example, the following is an example of how appropriate splits can be pre-created for hex-keys:

为了完成这个示例,下面是一个示例,说明如何为hex键预先创建适当的分割:。

public static boolean createTable(Admin admin, HTableDescriptor table, byte[][] splits)
throws IOException {
  try {
    admin.createTable( table, splits );
    return true;
  } catch (TableExistsException e) {
    logger.info("table " + table.getNameAsString() + " already exists");
    // the table already exists...
    return false;
  }
}

public static byte[][] getHexSplits(String startKey, String endKey, int numRegions) {
  byte[][] splits = new byte[numRegions-1][];
  BigInteger lowestKey = new BigInteger(startKey, 16);
  BigInteger highestKey = new BigInteger(endKey, 16);
  BigInteger range = highestKey.subtract(lowestKey);
  BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions));
  lowestKey = lowestKey.add(regionIncrement);
  for(int i=0; i < numRegions-1;i++) {
    BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i)));
    byte[] b = String.format("%016x", key).getBytes();
    splits[i] = b;
  }
  return splits;
}

37. Number of Versions

37岁。数量的版本

37.1. Maximum Number of Versions

37.1。最大数量的版本

The maximum number of row versions to store is configured per column family via HColumnDescriptor. The default for max versions is 1. This is an important parameter because, as described in the Data Model section, HBase does not overwrite row values, but rather stores different values per row by time (and qualifier). Excess versions are removed during major compactions. The number of max versions may need to be increased or decreased depending on application needs.

每个列家族通过HColumnDescriptor配置存储的行版本的最大数量。max版本的默认值是1。这是一个重要的参数,因为正如数据模型部分HBase所描述的那样,它不会覆盖行值,而是按时间(和限定符)存储不同的值。在主要的压缩过程中删除多余的版本。根据应用程序的需要,可能需要增加或减少max版本的数量。

It is not recommended to set the number of max versions to an exceedingly high level (e.g., hundreds or more) unless those old values are very dear to you, because this will greatly increase StoreFile size.

不建议将max版本的数量设置为非常高的级别(例如,数百或更多),除非这些旧值对您非常重要,因为这会大大增加StoreFile的大小。

37.2. Minimum Number of Versions

37.2。最小数量的版本

Like maximum number of row versions, the minimum number of row versions to keep is configured per column family via HColumnDescriptor. The default for min versions is 0, which means the feature is disabled. The minimum number of row versions parameter is used together with the time-to-live parameter and can be combined with the number of row versions parameter to allow configurations such as "keep the last T minutes worth of data, at most N versions, but keep at least M versions around" (where M is the value for minimum number of row versions, M<N). This parameter should only be set when time-to-live is enabled for a column family and must be less than the number of row versions.

与最大行版本数一样,通过HColumnDescriptor将每个列家庭配置的行版本的最小数量。最小版本的默认值是0,这意味着该特性是禁用的。最小数量的行版本参数是与生存时间参数一起使用,可以结合行版本参数允许的数量配置如“保持最后T分钟的数据,最多N版本,但至少保持M版本”(M的值为最小数量的行版本,M < N)。此参数只在对列族启用时才设置,并且必须小于行版本的数量。
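
A sketch of the "keep the last T minutes, at most N versions, but at least M versions" configuration described above, with illustrative numbers (T = 10 minutes, N = 5, M = 1) and the pre-2.0 HColumnDescriptor API:

HColumnDescriptor cf = new HColumnDescriptor("f1");
cf.setTimeToLive(600);    // T: expire cells older than 10 minutes (value is in seconds)
cf.setMaxVersions(5);     // N: never keep more than 5 versions of a cell
cf.setMinVersions(1);     // M: always retain at least 1 version, even past the TTL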

38. Supported Datatypes

38。支持的数据类型

HBase supports a "bytes-in/bytes-out" interface via Put and Result, so anything that can be converted to an array of bytes can be stored as a value. Input could be strings, numbers, complex objects, or even images as long as they can rendered as bytes.

HBase通过Put和Result支持“bytes-in/ byts -out”接口,因此可以将任何可以转换成字节数组的内容存储为一个值。输入可以是字符串、数字、复杂对象,甚至是图像,只要它们能以字节的形式呈现。

There are practical limits to the size of values (e.g., storing 10-50MB objects in HBase would probably be too much to ask); search the mailing list for conversations on this topic. All rows in HBase conform to the Data Model, and that includes versioning. Take that into consideration when making your design, as well as block size for the ColumnFamily.

对于值的大小有实际的限制(例如,在HBase中存储10-50MB的对象可能会要求太多);在邮件列表中搜索关于这个主题的对话。HBase中的所有行都符合数据模型,包括版本控制。在设计时要考虑到这一点,也要考虑到ColumnFamily的块大小。

38.1. Counters

38.1。计数器

One supported datatype that deserves special mention is "counters" (i.e., the ability to do atomic increments of numbers). See Increment in Table.

一个值得特别提及的支持数据类型是“计数器”。,也就是对数字进行原子增量的能力。看到增量表。

Synchronization on counters is done on the RegionServer, not in the client.

计数器上的同步是在区域服务器上完成的,而不是在客户机上。
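
A brief sketch of a counter increment; the row, family, and qualifier names are illustrative, and the same operation can also be expressed with an Increment object:

// Atomically adds 1 to the cell at (row1, cf:hits) on the RegionServer
// and returns the new total to the client.
long newValue = table.incrementColumnValue(
    Bytes.toBytes("row1"), Bytes.toBytes("cf"), Bytes.toBytes("hits"), 1L);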

39. Joins

39岁。连接

If you have multiple tables, don’t forget to factor in the potential for Joins into the schema design.

如果您有多个表,请不要忘记考虑加入模式设计的可能性。

40. Time To Live (TTL)

40。生存时间(TTL)

ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached. This applies to all versions of a row - even the current one. The TTL time encoded in HBase for the row is specified in UTC.

ColumnFamilies可以在秒内设置TTL长度,HBase在到达过期时间后自动删除行。这适用于行的所有版本——甚至是当前版本。在UTC中指定了在HBase中编码的TTL时间。

Store files which contain only expired rows are deleted on minor compaction. Setting hbase.store.delete.expired.storefile to false disables this feature. Setting the minimum number of versions to other than 0 also disables this.

只在较小的压缩过程中删除包含过期行的存储文件。设置hbase.store.delete.expired.storefile对该特性的错误禁用。将最小数量的版本设置为0也可以禁用此功能。

See HColumnDescriptor for more information.

有关更多信息,请参见HColumnDescriptor。

Recent versions of HBase also support setting time to live on a per cell basis. See HBASE-10560 for more information. Cell TTLs are submitted as an attribute on mutation requests (Appends, Increments, Puts, etc.) using Mutation#setTTL. If the TTL attribute is set, it will be applied to all cells updated on the server by the operation. There are two notable differences between cell TTL handling and ColumnFamily TTLs:

HBase的最新版本也支持在每个单元的基础上设置时间。更多信息见HBASE-10560。使用突变#setTTL,将单元TTLs作为一个属性提交给突变请求(附加、增量、放置等)。如果设置了TTL属性,它将被应用到服务器上通过操作更新的所有单元格。TTL处理与ColumnFamily TTLs有两个显著的区别:

  • Cell TTLs are expressed in units of milliseconds instead of seconds.

    单元TTLs以毫秒为单位表示,而不是以秒为单位。

  • A cell TTL cannot extend the effective lifetime of a cell beyond a ColumnFamily level TTL setting.

    一个细胞TTLs不能将一个细胞的有效寿命延长到一个ColumnFamily level TTL设置之外。
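
A sketch of a per-cell TTL on a Put, using the older put.add style from the examples earlier in this chapter; the five-minute value is illustrative, and note that it is expressed in milliseconds:

Put put = new Put(Bytes.toBytes("row1"));
put.setTTL(5 * 60 * 1000L);   // cell-level TTL of 5 minutes, in milliseconds
put.add(Bytes.toBytes("cf"), Bytes.toBytes("attr"), Bytes.toBytes("value"));
table.put(put);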

41. Keeping Deleted Cells

41岁。保持删除细胞

By default, delete markers extend back to the beginning of time. Therefore, Get or Scan operations will not see a deleted cell (row or column), even when the Get or Scan operation indicates a time range before the delete marker was placed.

默认情况下,删除标记可以追溯到时间的开始。因此,获取或扫描操作将不会看到被删除的单元格(行或列),即使在获取或扫描操作指示删除标记放置之前的时间范围。

ColumnFamilies can optionally keep deleted cells. In this case, deleted cells can still be retrieved, as long as these operations specify a time range that ends before the timestamp of any delete that would affect the cells. This allows for point-in-time queries even in the presence of deletes.

ColumnFamilies可以选择保留删除的单元格。在这种情况下,仍然可以检索被删除的单元格,只要这些操作指定一个时间范围,在任何将影响单元格的删除时间戳之前结束。即使在删除的情况下,这也允许进行时间点查询。

Deleted cells are still subject to TTL and there will never be more than "maximum number of versions" deleted cells. A new "raw" scan option returns all deleted rows and the delete markers.

已删除的单元仍受TTL的约束,并且永远不会有超过“最大版本”被删除的单元格。一个新的“原始”扫描选项返回所有被删除的行和删除标记。

Example 17. Change the Value of KEEP_DELETED_CELLS Using HBase Shell
hbase> alter 't1', NAME => 'f1', KEEP_DELETED_CELLS => true
Example 18. Change the Value of KEEP_DELETED_CELLS Using the API
...
HColumnDescriptor.setKeepDeletedCells(true);
...

Let us illustrate the basic effect of setting the KEEP_DELETED_CELLS attribute on a table.

让我们说明在表中设置keep_deleted_cell属性的基本效果。

First, without:

首先,没有:

create 'test', {NAME=>'e', VERSIONS=>2147483647}
put 'test', 'r1', 'e:c1', 'value', 10
put 'test', 'r1', 'e:c1', 'value', 12
put 'test', 'r1', 'e:c1', 'value', 14
delete 'test', 'r1', 'e:c1',  11

hbase(main):017:0> scan 'test', {RAW=>true, VERSIONS=>1000}
ROW                                              COLUMN+CELL
 r1                                              column=e:c1, timestamp=14, value=value
 r1                                              column=e:c1, timestamp=12, value=value
 r1                                              column=e:c1, timestamp=11, type=DeleteColumn
 r1                                              column=e:c1, timestamp=10, value=value
1 row(s) in 0.0120 seconds

hbase(main):018:0> flush 'test'
0 row(s) in 0.0350 seconds

hbase(main):019:0> scan 'test', {RAW=>true, VERSIONS=>1000}
ROW                                              COLUMN+CELL
 r1                                              column=e:c1, timestamp=14, value=value
 r1                                              column=e:c1, timestamp=12, value=value
 r1                                              column=e:c1, timestamp=11, type=DeleteColumn
1 row(s) in 0.0120 seconds

hbase(main):020:0> major_compact 'test'
0 row(s) in 0.0260 seconds

hbase(main):021:0> scan 'test', {RAW=>true, VERSIONS=>1000}
ROW                                              COLUMN+CELL
 r1                                              column=e:c1, timestamp=14, value=value
 r1                                              column=e:c1, timestamp=12, value=value
1 row(s) in 0.0120 seconds

Notice how delete cells are let go.

注意删除单元格是如何被释放的。

Now let’s run the same test only with KEEP_DELETED_CELLS set on the table (you can do table or per-column-family):

现在,让我们只在表上设置KEEP_DELETED_CELLS(您可以做表或每个列的家庭)来运行同一个测试:

hbase(main):005:0> create 'test', {NAME=>'e', VERSIONS=>2147483647, KEEP_DELETED_CELLS => true}
0 row(s) in 0.2160 seconds

=> Hbase::Table - test
hbase(main):006:0> put 'test', 'r1', 'e:c1', 'value', 10
0 row(s) in 0.1070 seconds

hbase(main):007:0> put 'test', 'r1', 'e:c1', 'value', 12
0 row(s) in 0.0140 seconds

hbase(main):008:0> put 'test', 'r1', 'e:c1', 'value', 14
0 row(s) in 0.0160 seconds

hbase(main):009:0> delete 'test', 'r1', 'e:c1',  11
0 row(s) in 0.0290 seconds

hbase(main):010:0> scan 'test', {RAW=>true, VERSIONS=>1000}
ROW                                                                                          COLUMN+CELL
 r1                                                                                          column=e:c1, timestamp=14, value=value
 r1                                                                                          column=e:c1, timestamp=12, value=value
 r1                                                                                          column=e:c1, timestamp=11, type=DeleteColumn
 r1                                                                                          column=e:c1, timestamp=10, value=value
1 row(s) in 0.0550 seconds

hbase(main):011:0> flush 'test'
0 row(s) in 0.2780 seconds

hbase(main):012:0> scan 'test', {RAW=>true, VERSIONS=>1000}
ROW                                                                                          COLUMN+CELL
 r1                                                                                          column=e:c1, timestamp=14, value=value
 r1                                                                                          column=e:c1, timestamp=12, value=value
 r1                                                                                          column=e:c1, timestamp=11, type=DeleteColumn
 r1                                                                                          column=e:c1, timestamp=10, value=value
1 row(s) in 0.0620 seconds

hbase(main):013:0> major_compact 'test'
0 row(s) in 0.0530 seconds

hbase(main):014:0> scan 'test', {RAW=>true, VERSIONS=>1000}
ROW                                                                                          COLUMN+CELL
 r1                                                                                          column=e:c1, timestamp=14, value=value
 r1                                                                                          column=e:c1, timestamp=12, value=value
 r1                                                                                          column=e:c1, timestamp=11, type=DeleteColumn
 r1                                                                                          column=e:c1, timestamp=10, value=value
1 row(s) in 0.0650 seconds

KEEP_DELETED_CELLS is to avoid removing Cells from HBase when the only reason to remove them is the delete marker. So with KEEP_DELETED_CELLS enabled, deleted cells would get removed if either you write more versions than the configured maximum, or you have a TTL and Cells are in excess of the configured timeout, etc.

KEEP_DELETED_CELLS是为了避免移除HBase中的单元格,因为删除它们的唯一原因是删除标记。因此,如果您编写的版本比配置的max多,或者您有一个TTL和单元格超过了配置的超时,那么使用KEEP_DELETED_CELLS将会被删除。

42. Secondary Indexes and Alternate Query Paths

42。二级索引和备选查询路径。

This section could also be titled "what if my table rowkey looks like this but I also want to query my table like that." A common example on the dist-list is where a row-key is of the format "user-timestamp" but there are reporting requirements on activity across users for certain time ranges. Thus, selecting by user is easy because it is in the lead position of the key, but time is not.

这个部分也可以命名为“如果我的表rowkey看起来像这样,但是我也想这样查询我的表。”在列表列表中,一个常见的例子是“用户时间戳”格式的行键,但是在特定的时间范围内,对用户的活动有报告要求。因此,用户选择很容易,因为它处于关键位置,但时间不是。

There is no single answer on the best way to handle this because it depends on…​

解决这个问题的最好方法没有单一的答案,因为这取决于……

  • Number of users

    用户数量

  • Data size and data arrival rate

    数据大小和数据到达率。

  • Flexibility of reporting requirements (e.g., completely ad-hoc date selection vs. pre-configured ranges)

    报告需求的灵活性(例如,完全特别的日期选择和预先配置的范围)

  • Desired execution speed of query (e.g., 90 seconds may be reasonable to some for an ad-hoc report, whereas it may be too long for others)

    期望的执行速度(例如,90秒可能对某些特定的报告来说是合理的,而对于其他人来说可能太长了)

and solutions are also influenced by the size of the cluster and how much processing power you have to throw at the solution. Common techniques are in sub-sections below. This is a comprehensive, but not exhaustive, list of approaches.

解决方案也会受到集群规模的影响,以及需要处理多少处理能力。常见的技术在下面的小节中。这是一个全面的,但不是详尽的方法列表。

It should not be a surprise that secondary indexes require additional cluster space and processing. This is precisely what happens in an RDBMS because the act of creating an alternate index requires both space and processing cycles to update. RDBMS products are more advanced in this regard to handle alternative index management out of the box. However, HBase scales better at larger data volumes, so this is a feature trade-off.

次要索引需要额外的集群空间和处理,这并不奇怪。这正是在RDBMS中发生的情况,因为创建替代索引的行为需要空间和处理周期来更新。在这方面,RDBMS产品在处理替代索引管理方面更为先进。但是,HBase在更大的数据量上更有效,因此这是一个特性权衡。

Pay attention to Apache HBase Performance Tuning when implementing any of these approaches.

在实现任何这些方法时,请注意Apache HBase性能调优。

Additionally, see the David Butler response in this dist-list thread HBase, mail # user - Stargate+hbase

另外,在这个列表列表线程HBase、mail # user - Stargate+ HBase中,可以看到David Butler的响应。

42.1. Filter Query

42.1。过滤查询

Depending on the case, it may be appropriate to use Client Request Filters. In this case, no secondary index is created. However, don’t try a full-scan on a large table like this from an application (i.e., single-threaded client).

根据情况不同,使用客户机请求筛选器可能是合适的。在这种情况下,没有创建第二个索引。但是,不要在一个应用程序(也就是)中对一个大型表进行全面扫描。单线程客户)。

42.2. Periodic-Update Secondary Index

42.2。Periodic-Update二级索引

A secondary index could be created in another table which is periodically updated via a MapReduce job. The job could be executed intra-day, but depending on load-strategy it could still potentially be out of sync with the main data table.

可以在另一个表中创建第二个索引,该表通过MapReduce作业定期更新。该作业可以在一天内执行,但取决于负载策略,它仍然可能与主数据表不同步。

See mapreduce.example.readwrite for more information.

看到mapreduce.example。读写的更多信息。

42.3. Dual-Write Secondary Index

42.3。Dual-Write二级索引

Another strategy is to build the secondary index while publishing data to the cluster (e.g., write to data table, write to index table). If this approach is taken after a data table already exists, then bootstrapping will be needed for the secondary index with a MapReduce job (see secondary.indexes.periodic).

另一个策略是在将数据发布到集群时构建次要索引(例如,写入数据表,写入到索引表)。如果这是在数据表已经存在之后采取的方法,那么使用MapReduce作业的辅助索引将需要bootstrapping(请参阅secondary.index .定期)。

42.4. Summary Tables

42.4。汇总表

Where time-ranges are very wide (e.g., year-long report) and where the data is voluminous, summary tables are a common approach. These would be generated with MapReduce jobs into another table.

在时间范围很广(例如,一年的报告)和数据大量的地方,汇总表是一种常见的方法。这些将通过MapReduce作业生成另一个表。

See mapreduce.example.summary for more information.

看到mapreduce.example。摘要以获取更多信息。

42.5. Coprocessor Secondary Index

42.5。协处理器的二级指标

Coprocessors act like RDBMS triggers. These were added in 0.92. For more information, see coprocessors

协处理器的作用类似于RDBMS触发器。这些加在0。92里。有关更多信息,请参见协处理器。

43. Constraints

43。约束

HBase currently supports 'constraints' in traditional (SQL) database parlance. The advised usage for Constraints is in enforcing business rules for attributes in the table (e.g. make sure values are in the range 1-10). Constraints could also be used to enforce referential integrity, but this is strongly discouraged as it will dramatically decrease the write throughput of the tables where integrity checking is enabled. Extensive documentation on using Constraints can be found at Constraint since version 0.94.

HBase目前支持传统(SQL)数据库用语的“约束”。约束的建议用法是为表中的属性执行业务规则(例如,确保值在1-10的范围内)。约束还可以用于强制引用完整性,但是这是非常不理想的,因为它将极大地减少启用完整性检查的表的写吞吐量。关于使用约束的大量文档可以在版本0.94的约束下找到。

44. Schema Design Case Studies

44岁。模式设计案例研究

The following will describe some typical data ingestion use-cases with HBase, and how the rowkey design and construction can be approached. Note: this is just an illustration of potential approaches, not an exhaustive list. Know your data, and know your processing requirements.

下面将描述一些典型的基于HBase的数据输入用例,以及如何处理rowkey设计和构建。注意:这只是一个潜在的方法的说明,而不是一个详尽的列表。了解您的数据,了解您的处理需求。

It is highly recommended that you read the rest of the HBase and Schema Design first, before reading these case studies.

在阅读这些案例研究之前,强烈建议您先阅读HBase和模式设计的其余部分。

The following case studies are described:

以下是个案研究:

  • Log Data / Timeseries Data

    日志数据/ Timeseries数据。

  • Log Data / Timeseries on Steroids

    使用类固醇的日志数据/ Timeseries。

  • Customer/Order

    客户/订单

  • Tall/Wide/Middle Schema Design

    高/宽/中间模式设计

  • List Data

    列表数据

44.1. Case Study - Log Data and Timeseries Data

44.1。案例研究-日志数据和Timeseries数据。

Assume that the following data elements are being collected.

假设正在收集以下数据元素。

  • Hostname

    主机名

  • Timestamp

    时间戳

  • Log event

    日志事件

  • Value/message

    价值/消息

We can store them in an HBase table called LOG_DATA, but what will the rowkey be? From these attributes the rowkey will be some combination of hostname, timestamp, and log-event - but what specifically?

我们可以将它们存储在一个名为LOG_DATA的HBase表中,但是rowkey是什么呢?从这些属性中,rowkey将是主机名、时间戳和日志事件的组合——但是具体是什么呢?

44.1.1. Timestamp In The Rowkey Lead Position

44.1.1。在Rowkey Lead位置上的时间戳。

The rowkey [timestamp][hostname][log-event] suffers from the monotonically increasing rowkey problem described in Monotonically Increasing Row Keys/Timeseries Data.

rowkey [timestamp][hostname][logevent]在单调递增的行键/Timeseries数据中所描述的单调递增的rowkey问题受到了影响。

There is another pattern frequently mentioned in the dist-lists about "bucketing" timestamps, by performing a mod operation on the timestamp. If time-oriented scans are important, this could be a useful approach. Attention must be paid to the number of buckets, because this will require the same number of scans to return results.

在列表中经常提到的另一种模式是在时间戳上执行一个mod操作。如果时间导向扫描很重要,这可能是一个有用的方法。必须注意bucket的数量,因为这需要相同数量的扫描才能返回结果。

long bucket = timestamp % numBuckets;

to construct:

构造:

[bucket][timestamp][hostname][log-event]

As stated above, to select data for a particular timerange, a Scan will need to be performed for each bucket. 100 buckets, for example, will provide a wide distribution in the keyspace but it will require 100 Scans to obtain data for a single timestamp, so there are trade-offs.

如上所述,要为特定的时间范围选择数据,需要对每个bucket执行扫描。例如,100个bucket将在keyspace中提供一个广泛的分布,但是需要100次扫描才能获得单个时间戳的数据,因此需要进行权衡。
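
A sketch of building such a key. The bucket count, the field order, and the flat byte concatenation (which in practice assumes fixed-length hostname and log-event encodings) are all illustrative, and hostname and logEvent are assumed String variables:

int numBuckets = 100;
long timestamp = System.currentTimeMillis();
int bucket = (int) (timestamp % numBuckets);

// [bucket][timestamp][hostname][log-event]
byte[] rowkey = Bytes.add(
    Bytes.toBytes(bucket),
    Bytes.toBytes(timestamp),
    Bytes.add(Bytes.toBytes(hostname), Bytes.toBytes(logEvent)));

// Reading a time range back requires one Scan per bucket (numBuckets scans in total).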

44.1.2. Host In The Rowkey Lead Position

44.1.2。主机在Rowkey领先位置。

The rowkey [hostname][log-event][timestamp] is a candidate if there is a large-ish number of hosts to spread the writes and reads across the keyspace. This approach would be useful if scanning by hostname was a priority.

如果有大量的主机来传播写操作并读取整个密钥空间,那么rowkey [hostname][日志事件][时间戳]是一个候选对象。如果以主机名进行扫描是优先级,那么这种方法将非常有用。

44.1.3. Timestamp, or Reverse Timestamp?

44.1.3。时间戳,或反向时间戳?

If the most important access path is to pull most recent events, then storing the timestamps as reverse-timestamps (e.g., timestamp = Long.MAX_VALUE – timestamp) will create the property of being able to do a Scan on [hostname][log-event] to obtain the most recently captured events.

如果最重要的访问路径是拖拽最近的事件,那么将时间戳存储为反向时间戳(例如,timestamp = Long)。MAX_VALUE - timestamp)将创建能够对[主机名][log-event]进行扫描以获取最近捕获的事件的属性。

Neither approach is wrong, it just depends on what is most appropriate for the situation.

这两种方法都不是错误的,它只取决于最适合的情况。

Reverse Scan API

HBASE-4811 implements an API to scan a table or a range within a table in reverse, reducing the need to optimize your schema for forward or reverse scanning. This feature is available in HBase 0.98 and later. See Scan.setReversed() for more information.

HBASE-4811实现了一个API,可以在一个表中反向扫描一个表或一个范围,从而减少对前向或反向扫描优化模式的需求。该特性在HBase 0.98和以后可用。要了解更多信息,请参阅scan.setre()。

44.1.4. Variable Length or Fixed Length Rowkeys?

44.1.4。可变长度或固定长度的行键?

It is critical to remember that rowkeys are stamped on every column in HBase. If the hostname is a and the event type is e1 then the resulting rowkey would be quite small. However, what if the ingested hostname is myserver1.mycompany.com and the event type is com.package1.subpackage2.subsubpackage3.ImportantService?

重要的是要记住,在HBase的每一列上都要加盖rowkeys。如果主机名是a,事件类型是e1,那么生成的rowkey将非常小。但是,如果接收的主机名是myserver1.mycompany.com,事件类型是com.package1.subpackage2. subpackage3. importantservice ?

It might make sense to use some substitution in the rowkey. There are at least two approaches: hashed and numeric. In the Host In The Rowkey Lead Position example, it might look like this:

在rowkey中使用一些替换可能是有意义的。至少有两种方法:散列和数值。在Rowkey Lead Position示例中的主机名中,它可能是这样的:

Composite Rowkey With Hashes:

复合Rowkey散列:

  • [MD5 hash of hostname] = 16 bytes

    [MD5散列的主机名]= 16字节。

  • [MD5 hash of event-type] = 16 bytes

    (事件类型的MD5哈希)= 16字节。

  • [timestamp] = 8 bytes

    (时间戳)= 8个字节

Composite Rowkey With Numeric Substitution:

数值替换的复合行键:

For this approach another lookup table would be needed in addition to LOG_DATA, called LOG_TYPES. The rowkey of LOG_TYPES would be:

对于这种方法,除了LOG_DATA(称为LOG_TYPES)之外,还需要另一个查找表。LOG_TYPES的行键为:

  • [type] (e.g., byte indicating hostname vs. event-type)

    [类型](例如,字节指示主机名与事件类型)

  • [bytes] variable length bytes for raw hostname or event-type.

    [字节]用于原始主机名或事件类型的可变长度字节。

A column for this rowkey could be a long holding the assigned number, which could be obtained by using an HBase counter.

这个行键的列可以是一个指定的编号,可以通过使用HBase计数器来获得。
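For illustration, a sketch of allocating such a number with an HBase counter is shown below. The LOG_TYPES table layout, the info family, and the "id-counter" row are assumptions rather than a prescribed design, and the put.addColumn call assumes the HBase 1.x+ client API.

// Atomically allocate the next numeric id from a shared counter cell and
// record it against the raw hostname or event-type row in LOG_TYPES.
long assignSubstitutionId(Table logTypes, byte[] typeRowkey) throws IOException {
  long id = logTypes.incrementColumnValue(Bytes.toBytes("id-counter"),
      Bytes.toBytes("info"), Bytes.toBytes("next"), 1L);
  Put put = new Put(typeRowkey);
  put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("id"), Bytes.toBytes(id));
  logTypes.put(put);
  return id;
}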

So the resulting composite rowkey would be:

因此,合成的复合rowkey将是:

  • [substituted long for hostname] = 8 bytes

    [取代长为主机名]= 8字节。

  • [substituted long for event type] = 8 bytes

    [用long表示事件类型]= 8字节。

  • [timestamp] = 8 bytes

    (时间戳)= 8个字节

In either the Hash or Numeric substitution approach, the raw values for hostname and event-type can be stored as columns.

在散列或数值替代方法中,主机名和事件类型的原始值可以存储为列。
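As a concrete example of the hashed variant, the sketch below assembles the [MD5(hostname)][MD5(event-type)][timestamp] rowkey described above using the JDK's MessageDigest. This is one plausible way to do it, not the only one.

// [MD5(hostname)] + [MD5(event-type)] + [timestamp] = 16 + 16 + 8 = 40 bytes
byte[] hashedCompositeRowkey(String hostname, String eventType, long timestamp)
    throws NoSuchAlgorithmException {
  MessageDigest md5 = MessageDigest.getInstance("MD5");
  byte[] hostHash = md5.digest(Bytes.toBytes(hostname));    // 16 bytes
  md5.reset();
  byte[] eventHash = md5.digest(Bytes.toBytes(eventType));  // 16 bytes
  return Bytes.add(hostHash, eventHash, Bytes.toBytes(timestamp));
}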

44.2. Case Study - Log Data and Timeseries Data on Steroids

44.2。案例研究-日志数据和关于类固醇的Timeseries数据。

This effectively is the OpenTSDB approach. What OpenTSDB does is re-write data and pack rows into columns for certain time-periods. For a detailed explanation, see: http://opentsdb.net/schema.html, and Lessons Learned from OpenTSDB from HBaseCon2012.

这实际上就是OpenTSDB方法。OpenTSDB所做的是在特定的时间周期内将数据重写成列。有关详细说明,请参见:http://opentsdb.net/schema.html,以及从HBaseCon2012获得的OpenTSDB的经验。

But this is how the general concept works: data is ingested, for example, in this manner…​

但这是一般概念的工作方式:数据被摄入,例如,以这种方式……

[hostname][log-event][timestamp1]
[hostname][log-event][timestamp2]
[hostname][log-event][timestamp3]

with separate rowkeys for each detailed event, but is re-written like this…​

对于每个详细的事件,使用单独的行键,但是重新编写如下…

[hostname][log-event][timerange]

and each of the above events is converted into a column stored with a time-offset relative to the beginning of the timerange (e.g., every 5 minutes). This is obviously a very advanced processing technique, but HBase makes this possible.

上面的每一个事件都被转换成存储有时间偏移量的列(例如,每5分钟)。这显然是一种非常高级的处理技术,但是HBase使这成为可能。
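To make the idea concrete, here is a minimal sketch (not the actual OpenTSDB schema) that buckets events into 5-minute rows and stores each event as a column keyed by its offset into that range. The family name "d", the 5-minute range, and the helper name are all assumptions.

static final long RANGE_MS = 5 * 60 * 1000L;  // hypothetical 5-minute row range

// Convert one event into a column on its [hostname][log-event][timerange] row.
Put toOffsetColumn(byte[] hostEventPrefix, long eventTs, byte[] message) {
  long rangeStart = eventTs - (eventTs % RANGE_MS);   // row-level timerange
  int offsetMs = (int) (eventTs - rangeStart);        // column-level offset within the range
  Put put = new Put(Bytes.add(hostEventPrefix, Bytes.toBytes(rangeStart)));
  put.addColumn(Bytes.toBytes("d"), Bytes.toBytes(offsetMs), message);
  return put;
}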

44.3. Case Study - Customer/Order

44.3。案例研究——客户/订单

Assume that HBase is used to store customer and order information. There are two core record-types being ingested: a Customer record type, and Order record type.

假设HBase用于存储客户和订单信息。有两种核心记录类型被摄入:客户记录类型和订单记录类型。

The Customer record type would include all the things that you’d typically expect:

客户记录类型将包括您通常期望的所有内容:

  • Customer number

    客户编号

  • Customer name

    客户名称

  • Address (e.g., city, state, zip)

    地址(如城市、州、邮编)

  • Phone numbers, etc.

    电话号码等。

The Order record type would include things like:

订单记录类型包括以下内容:

  • Customer number

    客户编号

  • Order number

    订单号

  • Sales date

    销售日期

  • A series of nested objects for shipping locations and line-items (see Order Object Design for details)

    用于配送位置和行项目的一系列嵌套对象(详见Order对象设计)

Assuming that the combination of customer number and sales order uniquely identify an order, these two attributes will compose the rowkey, and specifically a composite key such as:

假设客户编号和销售订单的组合惟一地标识一个订单,这两个属性将组成rowkey,特别是组合键,例如:

[customer number][order number]

for an ORDER table. However, there are more design decisions to make: are the raw values the best choices for rowkeys?

订单表。但是,还有更多的设计决策要做:原始值是行键的最佳选择吗?

The same design questions that we faced in the Log Data use-case confront us here. What is the keyspace of the customer number, and what is the format (e.g., numeric? alphanumeric?)? Because it is advantageous to use fixed-length keys in HBase, as well as keys that can support a reasonable spread in the keyspace, similar options appear:

在日志数据用例中,同样的设计问题将我们摆在这里。客户编号的关键空间是什么,格式是什么(例如,数字?)字母数字?)由于在HBase中使用固定长度的键,以及可以支持在keyspace中合理扩展的键,类似的选项出现了:

Composite Rowkey With Hashes:

复合Rowkey散列:

  • [MD5 of customer number] = 16 bytes

    (客户编号的MD5) = 16字节。

  • [MD5 of order number] = 16 bytes

    [MD5的订单号]= 16字节。

Composite Numeric/Hash Combo Rowkey:

复合数字/散列组合Rowkey:

  • [substituted long for customer number] = 8 bytes

    [取代顾客编号]= 8字节。

  • [MD5 of order number] = 16 bytes

    [MD5的订单号]= 16字节。

44.3.1. Single Table? Multiple Tables?

44.3.1。单表吗?多个表吗?

A traditional design approach would have separate tables for CUSTOMER and SALES. Another option is to pack multiple record types into a single table (e.g., CUSTOMER++).

传统的设计方法将为客户和销售提供单独的表。另一种方法是将多个记录类型打包到单个表中(例如,CUSTOMER++)。

Customer Record Type Rowkey:

客户记录类型Rowkey:

  • [customer-id]

    (客户id)

  • [type] = type indicating '1' for customer record type

    [type] =类型指示' 1'为客户记录类型。

Order Record Type Rowkey:

Rowkey顺序记录类型:

  • [customer-id]

    (客户id)

  • [type] = type indicating '2' for order record type

    [type] =类型指示' 2'的订单记录类型。

  • [order]

    (订单)

The advantage of this particular CUSTOMER++ approach is that it organizes many different record-types by customer-id (e.g., a single scan could get you everything about that customer). The disadvantage is that it’s not as easy to scan for a particular record-type.

这种特殊的客户++方法的优点是通过客户id组织了许多不同的记录类型(例如,一次扫描可以让您了解客户的一切)。缺点是不容易扫描到特定的记录类型。
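As an illustration of the "single scan gets everything about a customer" point, here is a sketch that scans every row sharing a fixed-length customer-id prefix with a PrefixFilter. The table handle, family layout, and type-byte values are assumptions.

// Read the customer record and all of its order records in one scan.
void readCustomerAndOrders(Table customerPlusPlus, byte[] customerId) throws IOException {
  Scan scan = new Scan(customerId);                 // start at the first row for this customer
  scan.setFilter(new PrefixFilter(customerId));     // stop once the customer-id prefix no longer matches
  try (ResultScanner scanner = customerPlusPlus.getScanner(scan)) {
    for (Result result : scanner) {
      // The type byte that follows the customer-id in the rowkey distinguishes
      // the customer record (1) from its order records (2).
    }
  }
}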

44.3.2. Order Object Design

44.3.2。订单对象设计

Now we need to address how to model the Order object. Assume that the class structure is as follows:

现在我们需要讨论如何建模Order对象。假设类结构如下:

Order

(an Order can have multiple ShippingLocations)

(一个订单可以有多个shippinglocation。

LineItem

(a ShippingLocation can have multiple LineItems)

(ShippingLocation可以有多个LineItems。

There are multiple options for storing this data.

存储这些数据有多个选项。

Completely Normalized
完全归一化

With this approach, there would be separate tables for ORDER, SHIPPING_LOCATION, and LINE_ITEM.

使用这种方法,将会有单独的订单、SHIPPING_LOCATION和LINE_ITEM表。

The ORDER table’s rowkey was described above: schema.casestudies.custorder

上面描述了ORDER表的rowkey: schema.casestudies.custorder。

The SHIPPING_LOCATION’s composite rowkey would be something like this:

SHIPPING_LOCATION的复合行键是这样的:

  • [order-rowkey]

    (order-rowkey)

  • [shipping location number] (e.g., 1st location, 2nd, etc.)

    [船舶位置编号](例如,第1位,第2号,等等)

The LINE_ITEM table’s composite rowkey would be something like this:

LINE_ITEM表的复合行键是这样的:

  • [order-rowkey]

    (order-rowkey)

  • [shipping location number] (e.g., 1st location, 2nd, etc.)

    [船舶位置编号](例如,第1位,第2号,等等)

  • [line item number] (e.g., 1st lineitem, 2nd, etc.)

    [行项目编号](如第1行、第2条等)

Such a normalized model is likely to be the approach with an RDBMS, but that’s not your only option with HBase. The downside of such an approach is that to retrieve information about any Order, you will need:

这种规范化模型很可能是使用RDBMS的方法,但这不是HBase的唯一选择。这种方法的缺点是检索关于任何订单的信息,您将需要:

  • Get on the ORDER table for the Order

    上订单的订单。

  • Scan on the SHIPPING_LOCATION table for that order to get the ShippingLocation instances

    为了获得ShippingLocation实例,请在SHIPPING_LOCATION表上进行扫描。

  • Scan on the LINE_ITEM for each ShippingLocation

    扫描每个ShippingLocation的LINE_ITEM。

Granted, this is what an RDBMS would do under the covers anyway, but since there are no joins in HBase you’re just more aware of this fact.

当然,这是RDBMS在覆盖范围内所做的事情,但是由于HBase中没有连接,您只是更了解这个事实。

Single Table With Record Types
单台与记录类型。

With this approach, there would exist a single table ORDER that would contain all of the record types.

使用这种方法,将会有一个包含的单个表顺序。

The Order rowkey was described above: schema.casestudies.custorder

上面描述了Order rowkey: schema.casestudies.custorder。

  • [order-rowkey]

    (order-rowkey)

  • [ORDER record type]

    (订单记录类型)

The ShippingLocation composite rowkey would be something like this:

ShippingLocation组合rowkey是这样的:

  • [order-rowkey]

    (order-rowkey)

  • [SHIPPING record type]

    (航运记录类型)

  • [shipping location number] (e.g., 1st location, 2nd, etc.)

    [船舶位置编号](例如,第1位,第2号,等等)

The LineItem composite rowkey would be something like this:

LineItem组合rowkey是这样的:

  • [order-rowkey]

    (order-rowkey)

  • [LINE record type]

    (线记录类型)

  • [shipping location number] (e.g., 1st location, 2nd, etc.)

    [船舶位置编号](例如,第1位,第2号,等等)

  • [line item number] (e.g., 1st lineitem, 2nd, etc.)

    [行项目编号](如第1行、第2条等)

Denormalized
规范化的

A variant of the Single Table With Record Types approach is to denormalize and flatten some of the object hierarchy, such as collapsing the ShippingLocation attributes onto each LineItem instance.

使用记录类型方法的单个表的一个变体是对一些对象层次结构进行非规范化和扁平化,例如将ShippingLocation属性折叠到每个LineItem实例上。

The LineItem composite rowkey would be something like this:

LineItem组合rowkey是这样的:

  • [order-rowkey]

    (order-rowkey)

  • [LINE record type]

    (线记录类型)

  • [line item number] (e.g., 1st lineitem, 2nd, etc.; care must be taken that these are unique across the entire order)

    [line项目编号](例如,第1条lineitem,第2条,等等,必须注意在整个订单中有唯一的)

and the LineItem columns would be something like this:

LineItem列是这样的

  • itemNumber

    itemNumber)

  • quantity

    数量

  • price

    价格

  • shipToLine1 (denormalized from ShippingLocation)

    从ShippingLocation shipToLine1(规范化)

  • shipToLine2 (denormalized from ShippingLocation)

    从ShippingLocation shipToLine2(规范化)

  • shipToCity (denormalized from ShippingLocation)

    从ShippingLocation shipToCity(规范化)

  • shipToState (denormalized from ShippingLocation)

    从ShippingLocation shipToState(规范化)

  • shipToZip (denormalized from ShippingLocation)

    从ShippingLocation shipToZip(规范化)

The pros of this approach include a less complex object hierarchy, but one of the cons is that updating gets more complicated in case any of this information changes.

这种方法的优点包括一个不那么复杂的对象层次结构,但是其中一个缺点是,如果这些信息发生变化,更新变得更加复杂。

Object BLOB
对象的团

With this approach, the entire Order object graph is treated, in one way or another, as a BLOB. For example, the ORDER table’s rowkey was described above: schema.casestudies.custorder, and a single column called "order" would contain an object that could be deserialized, containing an Order, its ShippingLocations, and its LineItems.

通过这种方法,整个Order对象图以一种或另一种方式被视为BLOB。例如,上面描述了ORDER表的rowkey: schema.casestudies。custorder和一个名为“order”的单一列将包含一个可以被反序列化的对象,该对象包含一个容器顺序、shippinglocation和LineItems。

There are many options here: JSON, XML, Java Serialization, Avro, Hadoop Writables, etc. All of them are variants of the same approach: encode the object graph to a byte-array. Care should be taken with this approach to ensure backward compatibility in case the object model changes such that older persisted structures can still be read back out of HBase.

这里有许多选项:JSON、XML、Java序列化、Avro、Hadoop Writables等,它们都是相同方法的变体:将对象图编码为字节数组。应该注意这种方法,以确保在对象模型更改时向后兼容,这样旧的持久化结构仍然可以从HBase中读取。

Pros are being able to manage complex object graphs with minimal I/O (e.g., a single HBase Get per Order in this example), but the cons include the aforementioned warning about backward compatibility of serialization, language dependencies of serialization (e.g., Java Serialization only works with Java clients), the fact that you have to deserialize the entire object to get any piece of information inside the BLOB, and the difficulty in getting frameworks like Hive to work with custom objects like this.

优点是能够以最小的I / O管理复杂对象图(例如,单个HBase得到每个订单在这个例子中),但缺点包括序列化的上述警告向后兼容性,语言依赖性的序列化(例如,Java序列化仅适用于Java客户端),这个事实你必须反序列化整个对象得到任何信息在BLOB,和困难这样的框架蜂巢处理自定义对象。

44.4. Case Study - "Tall/Wide/Middle" Schema Design Smackdown

44.4。案例研究——“高/宽/中”模式设计的Smackdown。

This section will describe additional schema design questions that appear on the dist-list, specifically about tall and wide tables. These are general guidelines and not laws - each application must consider its own needs.

本节将描述在列表列表中出现的其他模式设计问题,特别是关于高和宽的表。这些是通用的指导方针,而不是法律——每个应用程序都必须考虑自己的需求。

44.4.1. Rows vs. Versions

44.4.1。行和版本

A common question is whether one should prefer rows or HBase’s built-in versioning. The context is typically where there are "a lot" of versions of a row to be retained (e.g., where it is significantly above the HBase default of 1 max version). The rows-approach would require storing a timestamp in some portion of the rowkey so that successive updates do not overwrite one another.

一个常见的问题是,是否应该选择行或HBase的构建版本。上下文通常是要保留的行的“很多”版本(例如,它明显高于HBase默认的1 max版本)。行方法需要在行键的某个部分存储时间戳,这样它们就不会在每次连续更新时覆盖它们。

Preference: Rows (generally speaking).

偏好:行(一般来说)。

44.4.2. Rows vs. Columns

44.4.2。行与列

Another common question is whether one should prefer rows or columns. The context is typically in extreme cases of wide tables, such as having 1 row with 1 million attributes, or 1 million rows with 1 column apiece.

另一个常见的问题是,是否应该选择行或列。上下文通常是在一些宽表的极端情况下,例如有1行具有100万属性,或1百万行,每个列有1列。

Preference: Rows (generally speaking). To be clear, this guideline applies to extremely wide cases, not to the standard use-case where one needs to store a few dozen or hundred columns. But there is also a middle path between these two options, and that is "Rows as Columns."

偏好:行(一般来说)。要说明的是,这个指导方针是在非常广泛的情况下,而不是在标准的用例中,其中一个需要存储几十个或数百个列。但是在这两个选项之间也有一条中间路径,那就是“行作为列”。

44.4.3. Rows as Columns

44.4.3。行,列

The middle path between Rows vs. Columns is packing data that would otherwise be separate rows into columns, for certain rows. OpenTSDB is the best example of this case, where a single row represents a defined time-range, and then discrete events are treated as columns. This approach is often more complex, and may require the additional complexity of re-writing your data, but has the advantage of being I/O efficient. For an overview of this approach, see schema.casestudies.log-steroids.

行与列之间的中间路径是打包数据,这些数据将单独列成列,用于某些行。OpenTSDB是这种情况下最好的例子,其中一行表示一个已定义的时间范围,然后将离散事件作为列处理。这种方法通常比较复杂,可能需要重新编写数据的额外复杂性,但是具有I/O效率的优势。有关此方法的概述,请参见schema.casestudies.log-类固醇。

44.5. Case Study - List Data

44.5。案例研究-列表数据。

The following is an exchange from the user dist-list regarding a fairly common question: how to handle per-user list data in Apache HBase.

下面是关于一个相当常见的问题的用户列表的交换:如何处理Apache HBase中的每个用户列表数据。

  • QUESTION *

    问题*

We’re looking at how to store a large amount of (per-user) list data in HBase, and we were trying to figure out what kind of access pattern made the most sense. One option is to store the majority of the data in the key, so we could have something like:

我们正在研究如何在HBase中存储大量的(每个用户)列表数据,并且我们正在尝试找出最合理的访问模式。一种选择是将大部分数据存储在一个密钥中,这样我们就可以拥有如下内容:

<FixedWidthUserName><FixedWidthValueId1>:"" (no value)
<FixedWidthUserName><FixedWidthValueId2>:"" (no value)
<FixedWidthUserName><FixedWidthValueId3>:"" (no value)

The other option we had was to do this entirely using:

我们的另一个选择是完全使用:

<FixedWidthUserName><FixedWidthPageNum0>:<FixedWidthLength><FixedIdNextPageNum><ValueId1><ValueId2><ValueId3>...
<FixedWidthUserName><FixedWidthPageNum1>:<FixedWidthLength><FixedIdNextPageNum><ValueId1><ValueId2><ValueId3>...

where each row would contain multiple values. So in one case reading the first thirty values would be:

其中每一行都包含多个值。因此,在一个案例中,阅读前30个值是:

scan { STARTROW => 'FixedWidthUsername' LIMIT => 30}

And in the second case it would be

第二种情况是。

get 'FixedWidthUserName\x00\x00\x00\x00'

The general usage pattern would be to read only the first 30 values of these lists, with infrequent access reading deeper into the lists. Some users would have <= 30 total values in these lists, and some users would have millions (i.e. power-law distribution).

一般的使用模式是只读取这些列表的前30个值,而不经常访问更深入到列表中。一些用户会⇐30总值在这些列表,和一些用户数百万(即幂律分布)

The single-value format seems like it would take up more space on HBase, but would offer some improved retrieval / pagination flexibility. Would there be any significant performance advantages to be able to paginate via gets vs paginating with scans?

单值格式似乎需要在HBase上占用更多空间,但可以提供一些改进的检索/分页灵活性。有什么显著的性能优势可以通过获取和扫描的页面进行分页吗?

My initial understanding was that doing a scan should be faster if our paging size is unknown (and caching is set appropriately), but that Gets should be faster if we’ll always need the same page size. I’ve ended up hearing different people tell me opposite things about performance. I assume the page sizes would be relatively consistent, so for most use cases we could guarantee that we only wanted one page of data in the fixed-page-length case. I would also assume that we would have infrequent updates, but may have inserts into the middle of these lists (meaning we’d need to update all subsequent rows).

我最初的理解是,如果我们的分页大小未知(并适当地设置了缓存),那么做一个扫描应该更快,但是如果我们总是需要相同的页面大小,那就应该更快。我最终听到不同的人告诉我关于绩效的相反的事情。我假设页面大小是相对一致的,所以对于大多数用例,我们可以保证我们只希望在固定页长度的情况下只需要一页数据。我还假设我们会有不频繁的更新,但是可能会插入到这些列表的中间(这意味着我们需要更新所有后续的行)。

Thanks for help / suggestions / follow-up questions.

谢谢你的帮助/建议/后续问题。

  • ANSWER *

    答案*

If I understand you correctly, you’re ultimately trying to store triples in the form "user, valueid, value", right? E.g., something like:

如果我正确地理解了您,您最终将尝试在表单“user, valueid, value”中存储三元组,对吗?例如,类似:

"user123, firstname, Paul",
"user234, lastname, Smith"

(But the usernames are fixed width, and the valueids are fixed width).

(但是用户名是固定宽度的,而valueids是固定宽度的)。

And, your access pattern is along the lines of: "for user X, list the next 30 values, starting with valueid Y". Is that right? And these values should be returned sorted by valueid?

并且,您的访问模式是沿着:“对于用户X,列出接下来的30个值,从valueid Y开始”。是这样吗?这些值应该按照valueid的顺序返回吗?

The tl;dr version is that you should probably go with one row per user+value, and not build a complicated intra-row pagination scheme on your own unless you’re really sure it is needed.

tl;dr版本是,您应该使用每个用户的一行+值,而不是自己构建一个复杂的内部行分页方案,除非您真的确定它是必需的。

Your two options mirror a common question people have when designing HBase schemas: should I go "tall" or "wide"? Your first schema is "tall": each row represents one value for one user, and so there are many rows in the table for each user; the row key is user + valueid, and there would be (presumably) a single column qualifier that means "the value". This is great if you want to scan over rows in sorted order by row key (thus my question above, about whether these ids are sorted correctly). You can start a scan at any user+valueid, read the next 30, and be done. What you’re giving up is the ability to have transactional guarantees around all the rows for one user, but it doesn’t sound like you need that. Doing it this way is generally recommended (see here https://hbase.apache.org/book.html#schema.smackdown).

你的两种选择反映了人们在设计HBase模式时遇到的一个常见问题:我应该“高”还是“宽”?您的第一个模式是“tall”:每一行表示一个用户的一个值,因此每个用户的表中有许多行;行键是user + valueid,并且有(大概)一个表示“值”的列限定符。如果您想按行键(因此我上面的问题,关于这些id是否正确排序)进行扫描,这是很好的。您可以在任何用户+valueid上开始扫描,阅读下一个30,然后完成。你放弃的是在所有行中为一个用户提供事务保证的能力,但是听起来不像你需要的那样。这样做通常是推荐的(参见这里的https://hbase.apache.org/book.html#schema.smackdown)。

Your second option is "wide": you store a bunch of values in one row, using different qualifiers (where the qualifier is the valueid). The simple way to do that would be to just store ALL values for one user in a single row. I’m guessing you jumped to the "paginated" version because you’re assuming that storing millions of columns in a single row would be bad for performance, which may or may not be true; as long as you’re not trying to do too much in a single request, or do things like scanning over and returning all of the cells in the row, it shouldn’t be fundamentally worse. The client has methods that allow you to get specific slices of columns.

第二个选项是“宽”:在一行中存储一串值,使用不同的限定符(修饰符是valueid)。简单的方法是将一个用户的所有值存储在一行中。我猜你跳到了“分页”的版本,因为你假设在一行中存储数百万列将不利于性能,这可能是也可能不是真的;只要你不想在一个单一的请求中做太多的事情,或者做一些像扫描和返回一行中的所有单元的事情,它就不会变得更糟。客户端有允许您获得特定的列的方法。
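As one hedged illustration of "getting specific slices of columns" from such a wide row, the sketch below reads a page of 30 values using the intra-row pagination setters on Get. The column family name "v" and the helper name are assumptions.

// Fetch only the first 30 columns of the user's row; raise the offset to page deeper.
Result firstThirtyValues(Table table, byte[] userRowkey) throws IOException {
  Get get = new Get(userRowkey);
  get.addFamily(Bytes.toBytes("v"));
  get.setMaxResultsPerColumnFamily(30);   // page size
  get.setRowOffsetPerColumnFamily(0);     // start of the row; e.g. 30 for the second page
  return table.get(get);
}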

Note that neither case fundamentally uses more disk space than the other; you’re just "shifting" part of the identifying information for a value either to the left (into the row key, in option one) or to the right (into the column qualifiers in option 2). Under the covers, every key/value still stores the whole row key, and column family name. (If this is a bit confusing, take an hour and watch Lars George’s excellent video about understanding HBase schema design: http://www.youtube.com/watch?v=_HLoH_PgrLk).

注意,这两种情况都没有从根本上使用更多的磁盘空间;您只是将标识信息的一部分“转移”到左边(在行键中,在选项1中)或右边(在选项2中的列限定符中)。在覆盖下,每个键/值仍然存储整个行键和列姓。(如果这有点让人困惑的话,那就花一个小时,看看Lars George关于理解HBase模式设计的优秀视频:http://www.youtube.com/watch? v_hloh_pgrlk)。

A manually paginated version has lots more complexities, as you note, like having to keep track of how many things are in each page, re-shuffling if new values are inserted, etc. That seems significantly more complex. It might have some slight speed advantages (or disadvantages!) at extremely high throughput, and the only way to really know that would be to try it out. If you don’t have time to build it both ways and compare, my advice would be to start with the simplest option (one row per user+value). Start simple and iterate! :)

手动分页的版本有很多复杂的地方,如您所注意到的,如要跟踪每一页中有多少东西,如果插入新值,重新洗牌,等等。这似乎要复杂得多。它可能有一些轻微的速度优势(或缺点!)在极高的吞吐量,而且唯一的方法,真正知道那将是尝试它。如果您没有时间构建这两种方法并进行比较,那么我的建议将从最简单的选项(每个用户+值的一行)开始。开始简单重复!:)

45. Operational and Performance Configuration Options

45岁。操作和性能配置选项。

45.1. Tune HBase Server RPC Handling

45.1。调优HBase服务器RPC处理。

  • Set hbase.regionserver.handler.count (in hbase-site.xml) to cores x spindles for concurrency.

    设置hbase.regionserver.handler。计数(在hbase-site.xml中)到核心x轴的并发性。

  • Optionally, split the call queues into separate read and write queues for differentiated service. The parameter hbase.ipc.server.callqueue.handler.factor specifies the number of call queues:

    可选地,将调用队列分割为单独的读和写队列,以区别服务。参数hbase.ipc.server.callqueue.handler。因子指定调用队列的数量:

    • 0 means a single shared queue

      0表示单个共享队列。

    • 1 means one queue for each handler.

      1表示每个处理程序的一个队列。

    • A value between 0 and 1 allocates the number of queues proportionally to the number of handlers. For instance, a value of .5 shares one queue between each two handlers.

      在0到1之间的一个值将队列的数量按比例分配给处理程序的数量。例如,.5的值在每个处理程序之间共享一个队列。

  • Use hbase.ipc.server.callqueue.read.ratio (hbase.ipc.server.callqueue.read.share in 0.98) to split the call queues into read and write queues:

    使用hbase.ipc.server.callqueue.read。(hbase.ipc.server.callqueue.read比例。将调用队列拆分为读和写队列:

    • 0.5 means there will be the same number of read and write queues

      0.5表示将会有相同数量的读和写队列。

    • < 0.5 for more read than write

      < 0.5 for more read than write。

    • > 0.5 for more write than read

      比读更多的写>。5。

  • Set hbase.ipc.server.callqueue.scan.ratio (HBase 1.0+) to split read call queues into small-read and long-read queues:

    设置hbase.ipc.server.callqueue.scan。比率(HBase 1.0+)将读调用队列分成小读和长读队列:

    • 0.5 means that there will be the same number of short-read and long-read queues

      0.5表示将会有相同数量的短读和长读队列。

    • < 0.5 for more short-read

      < 0.5用于更短的阅读。

    • > 0.5 for more long-read

      > 0.5用于更长的阅读。

45.2. Disable Nagle for RPC

45.2。禁用对RPC纳格尔

Disable Nagle’s algorithm. Delayed ACKs can add up to ~200ms to RPC round trip time. Set the following parameters:

纳格尔禁用的算法。延迟的ack可以加到~200ms到RPC往返时间。设置以下参数:

  • In Hadoop’s core-site.xml:

    在Hadoop的core-site.xml:

    • ipc.server.tcpnodelay = true

      ipc.server。tcpnodelay = true

    • ipc.client.tcpnodelay = true

      ipc.client。tcpnodelay = true

  • In HBase’s hbase-site.xml:

    在HBase的hbase-site.xml:

    • hbase.ipc.client.tcpnodelay = true

      hbase.ipc.client。tcpnodelay = true

    • hbase.ipc.server.tcpnodelay = true

      hbase.ipc.server。tcpnodelay = true

45.3. Limit Server Failure Impact

45.3。限制服务器故障影响

Detect regionserver failure as fast as reasonable. Set the following parameters:

尽可能快地检测区域服务器故障。设置以下参数:

  • In hbase-site.xml, set zookeeper.session.timeout to 30 seconds or less to bound failure detection (20-30 seconds is a good start).

    在hbase-site。xml,zookeeper.session设置。超时到30秒或更少的绑定故障检测(20-30秒是一个好的开始)。

  • Detect and avoid unhealthy or failed HDFS DataNodes: in hdfs-site.xml and hbase-site.xml, set the following parameters:

    检测和避免不健康的或失败的HDFS DataNodes:在HDFS站点。xml和hbase-site。xml,设置以下参数:

    • dfs.namenode.avoid.read.stale.datanode = true

      dfs.namenode.avoid.read.stale.datanode = true

    • dfs.namenode.avoid.write.stale.datanode = true

      dfs.namenode.avoid.write.stale.datanode = true

45.4. Optimize on the Server Side for Low Latency

45.4。优化服务器端的低延迟。

  • Skip the network for local blocks. In hbase-site.xml, set the following parameters:

    跳过本地块的网络。在hbase-site。xml,设置以下参数:

    • dfs.client.read.shortcircuit = true

      dfs.client.read。短路= true

    • dfs.client.read.shortcircuit.buffer.size = 131072 (Important to avoid OOME)

      dfs.client.read.shortcircuit.buffer。大小= 131072(避免OOME重要)

  • Ensure data locality. In hbase-site.xml, set hbase.hstore.min.locality.to.skip.major.compact = 0.7 (Meaning that 0.7 <= n <= 1)

    确保数据本地化。在hbase-site。xml,设置hbase.hstore.min. to.skip. compact = 0.7(意思是0.7 <= n <= 1)

  • Make sure DataNodes have enough handlers for block transfers. In hdfs-site.xml, set the following parameters:

    确保DataNodes有足够的处理块传输的处理程序。在hdfs-site。xml,设置以下参数:

    • dfs.datanode.max.xcievers >= 8192

      dfs.datanode.max。xcievers > = 8192

    • dfs.datanode.handler.count = number of spindles

      dfs.datanode.handler。锭数=锭数。

45.5. JVM Tuning

45.5。JVM调优

45.5.1. Tune JVM GC for low collection latencies

45.5.1。为低收集延迟调优JVM GC。

  • Use the CMS collector: -XX:+UseConcMarkSweepGC

    使用CMS收集器:-XX:+UseConcMarkSweepGC。

  • Keep eden space as small as possible to minimize average collection time. Example:

    保持eden空间尽可能小,以最小化平均收集时间。例子:

    -XX:CMSInitiatingOccupancyFraction=70
  • Optimize for low collection latency rather than throughput: -Xmn512m

    优化低收集延迟而不是吞吐量:-Xmn512m。

  • Collect eden in parallel: -XX:+UseParNewGC

    并行收集eden: -XX:+UseParNewGC。

  • Avoid collection under pressure: -XX:+UseCMSInitiatingOccupancyOnly

    避免在压力下收集:-XX:+UseCMSInitiatingOccupancyOnly。

  • Limit per request scanner result sizing so everything fits into survivor space but doesn’t tenure. In hbase-site.xml, set hbase.client.scanner.max.result.size to 1/8th of eden space (with -Xmn512m this is ~51MB)

    限制每个请求扫描结果的大小,所以所有的东西都适合于幸存者空间,但是没有使用期限。在hbase-site。xml,hbase.client.scanner.max.result设置。大小到eden空间的1/8(使用-Xmn512m这是~51MB)

  • Set max.result.size x handler.count less than survivor space

    设置max.result。尺寸x处理器。小于幸存者空间。

45.5.2. OS-Level Tuning

45.5.2。操作系统调优

  • Turn transparent huge pages (THP) off:

    打开透明的大页面(THP):

    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
  • Set vm.swappiness = 0

    设置虚拟机。swappiness = 0

  • Set vm.min_free_kbytes to at least 1GB (8GB on larger memory systems)

    设置虚拟机。min_free_kbytes至少为1GB(大内存系统的8GB)

  • Disable NUMA zone reclaim with vm.zone_reclaim_mode = 0

    使用vm禁用NUMA区回收。zone_reclaim_mode = 0

46. Special Cases

46岁。特殊情况

46.1. For applications where failing quickly is better than waiting

46.1。对于快速失败的应用程序,要比等待更好。

  • In hbase-site.xml on the client side, set the following parameters:

    在hbase-site。xml在客户端,设置以下参数:

    • Set hbase.client.pause = 1000

      设置hbase.client。暂停= 1000

    • Set hbase.client.retries.number = 3

      设置hbase.client.retries。数量= 3

    • If you want to ride over splits and region moves, increase hbase.client.retries.number substantially (>= 20)

      如果你想跨越分裂和区域移动,增加hbase.client. retry。数量大幅(> = 20)

    • Set the RecoverableZookeeper retry count: zookeeper.recovery.retry = 1 (no retry)

      设置可恢复动物管理员重试计数:zookeeper.recovery。重试= 1(不重试)

  • In hbase-site.xml on the server side, set the Zookeeper session timeout for detecting server failures: zookeeper.session.timeout <= 30 seconds (20-30 is good).

    在hbase-site。服务器端上的xml设置了检测服务器故障的Zookeeper会话超时:zookeeper.session。超时⇐30秒(20 - 30是好的)。

46.2. For applications that can tolerate slightly out of date information

46.2。适用于那些可以稍微超出日期信息的应用程序。

HBase timeline consistency (HBASE-10070): With read replicas enabled, read-only copies of regions (replicas) are distributed over the cluster. One RegionServer services the default or primary replica, which is the only replica that can service writes. Other RegionServers serve the secondary replicas, follow the primary RegionServer, and only see committed updates. The secondary replicas are read-only, but can serve reads immediately while the primary is failing over, cutting read availability blips from seconds to milliseconds. Phoenix supports timeline consistency as of 4.4.0. Tips:

HBase时间轴一致性(HBase -10070)具有读取的副本,而区域(副本)的只读副本分布在集群上。一个区域服务器服务默认或主副本,这是唯一可以服务的副本。其他区域服务器服务于次要副本,跟随主区域服务器,只看到提交的更新。二级副本是只读的,但是在主服务器失败时可以立即执行读取操作,从秒到毫秒将读取可用性blips。Phoenix支持时间轴一致性为4.4.0的提示:

  • Deploy HBase 1.0.0 or later.

    部署HBase 1.0.0或更高版本。

  • Enable timeline consistent replicas on the server side.

    在服务器端启用时间轴一致的副本。

  • Use one of the following methods to set timeline consistency:

    使用以下方法设置时间轴一致性:

    • Use ALTER SESSION SET CONSISTENCY = 'TIMELINE’

      使用ALTER SESSION SET一致性= 'TIMELINE '

    • Set the connection property Consistency to timeline in the JDBC connect string

      将连接属性的一致性设置为JDBC连接字符串中的时间线。
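Outside of Phoenix, the plain HBase Java client (1.0+) can also request timeline-consistent reads per operation. The following is a minimal sketch under that assumption, with a hypothetical table handle and rowkey.

// Allow a secondary region replica to answer if the primary is slow or failing over.
Result timelineGet(Table table, byte[] rowkey) throws IOException {
  Get get = new Get(rowkey);
  get.setConsistency(Consistency.TIMELINE);
  Result result = table.get(get);
  if (result.isStale()) {
    // Served by a secondary replica; the data may slightly lag the primary.
  }
  return result;
}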

46.3. More Information

46.3。更多的信息

See the Performance section perf.schema for more information about operational and performance schema design options, such as Bloom Filters, Table-configured regionsizes, compression, and blocksizes.

查看性能部分perf。关于操作和性能模式设计选项的更多信息的模式,如Bloom filter、表配置的区域大小、压缩和块大小。

HBase and MapReduce

HBase和MapReduce

Apache MapReduce is a software framework used to analyze large amounts of data. It is provided by Apache Hadoop. MapReduce itself is out of the scope of this document. A good place to get started with MapReduce is https://hadoop.apache.org/docs/r2.6.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html. MapReduce version 2 (MR2) is now part of YARN.

Apache MapReduce是一个用于分析大量数据的软件框架。它由Apache Hadoop提供。MapReduce本身超出了这个文档的范围。从MapReduce开始的一个好地方是https://hadoop.apache.org/docs/r2.6.0/hadoop- MapReduce -client/hadoop- MapReduce -客户- core/mapreducetutories.html。MapReduce版本2 (MR2)现在是纱线的一部分。

This chapter discusses specific configuration steps you need to take to use MapReduce on data within HBase. In addition, it discusses other interactions and issues between HBase and MapReduce jobs. Finally, it discusses Cascading, an alternative API for MapReduce.

本章讨论了在HBase中使用MapReduce数据时需要采取的具体配置步骤。此外,还讨论了HBase与MapReduce作业之间的其他交互和问题。最后,讨论了MapReduce的一个替代API级联。

mapred and mapreduce

There are two mapreduce packages in HBase as in MapReduce itself: org.apache.hadoop.hbase.mapred and org.apache.hadoop.hbase.mapreduce. The former uses the old-style API and the latter the new API. The latter has more facilities, though you can usually find an equivalent in the older package. Pick the package that goes with your MapReduce deploy. When in doubt or starting over, pick org.apache.hadoop.hbase.mapreduce. In the notes below, we refer to o.a.h.h.mapreduce but replace with o.a.h.h.mapred if that is what you are using.

在HBase中有两个mapreduce包,如mapreduce本身:org.apache.hadoop.hbase。mapred org.apache.hadoop.hbase.mapreduce。前者采用旧式API,后者采用新模式。后者有更多的功能,尽管您通常可以在旧的包中找到等价的。选择与MapReduce部署相关的包。当有疑问或重新开始时,选择org.apache.hadoop.hbase.mapreduce。在下面的笔记中,我们提到了o.a.h.h.。mapreduce但是替换o。h。h。如果你使用的是mapred。

47. HBase, MapReduce, and the CLASSPATH

47岁。HBase、MapReduce和类路径。

By default, MapReduce jobs deployed to a MapReduce cluster do not have access to either the HBase configuration under $HBASE_CONF_DIR or the HBase classes.

默认情况下,部署到MapReduce集群的MapReduce作业不能访问HBASE_CONF_DIR或HBase类下的HBase配置。

To give the MapReduce jobs the access they need, you could add hbase-site.xml to $HADOOP_HOME/conf and add HBase jars to the $HADOOP_HOME/lib directory. You would then need to copy these changes across your cluster. Or you could edit $HADOOP_HOME/conf/hadoop-env.sh and add hbase dependencies to the HADOOP_CLASSPATH variable. Neither of these approaches is recommended because it will pollute your Hadoop install with HBase references. It also requires you to restart the Hadoop cluster before Hadoop can use the HBase data.

为了给MapReduce工作提供所需的访问权限,您可以添加hbase-site。xml_to _$HADOOP_HOME/conf,并将HBase jar添加到$HADOOP_HOME/lib目录。然后,您需要在集群中复制这些更改。或者你可以编辑$HADOOP_HOME/conf/hadoop-env。将hbase依赖项添加到hadoop - classpath变量中。这两种方法都不推荐使用,因为它会使用HBase引用污染您的Hadoop安装。它还要求您在Hadoop使用HBase数据之前重新启动Hadoop集群。

The recommended approach is to let HBase add its dependency jars and use HADOOP_CLASSPATH or -libjars.

推荐的方法是让HBase添加它的依赖项jar,并使用hadoop - classpath或-libjars。

Since HBase 0.90.x, HBase adds its dependency JARs to the job configuration itself. The dependencies only need to be available on the local CLASSPATH, and from there they’ll be picked up and bundled into the fat job jar deployed to the MapReduce cluster. A basic trick just passes the full hbase classpath (all hbase and dependent jars as well as configurations) to the mapreduce job runner, letting the hbase utility pick out from the full-on classpath what it needs and add those to the MapReduce job configuration (see the source at TableMapReduceUtil#addDependencyJars(org.apache.hadoop.mapreduce.Job) for how this is done).

自从HBase 0.90。HBase将其依赖项jar添加到作业配置本身。依赖项只需要在本地类路径上可用,从这里它们将被打包到部署到MapReduce集群的fat job jar中。基本技巧只是通过完整的hbase类路径有依赖关系的jar——hbase和mapreduce工作以及配置——跑步让hbase效用挑选从全面类路径中需要将它们添加到mapreduce任务配置(见源代码在TableMapReduceUtil # addDependencyJars(org.apache.hadoop.mapreduce.Job)这是如何实现的)。

The following example runs the bundled HBase RowCounter MapReduce job against a table named usertable. It sets into HADOOP_CLASSPATH the jars hbase needs to run in an MapReduce context (including configuration files such as hbase-site.xml). Be sure to use the correct version of the HBase JAR for your system; replace the VERSION string in the below command line w/ the version of your local hbase install. The backticks (` symbols) cause the shell to execute the sub-commands, setting the output of hbase classpath into HADOOP_CLASSPATH. This example assumes you use a BASH-compatible shell.

下面的示例针对一个名为usertable的表运行了绑定的HBase RowCounter MapReduce作业。它设置到hadoop - classpath中,jar hbase需要在MapReduce上下文中运行(包括配置文件,如hbase-site.xml)。确保您的系统使用了正确的HBase JAR版本;将版本字符串替换为以下命令行w/本地hbase安装版本。backticks(符号)导致shell执行子命令,将hbase类路径的输出设置为hadoop - classpath。本例假设您使用的是bash兼容的shell。

$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
  ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/lib/hbase-mapreduce-VERSION.jar \
  org.apache.hadoop.hbase.mapreduce.RowCounter usertable

The above command will launch a row counting mapreduce job against the hbase cluster that is pointed to by your local configuration on a cluster that the hadoop configs are pointing to.

上面的命令将在hadoop configs指向的集群上的本地配置中,启动一个针对hbase集群的行计数mapreduce作业。

The main class of the hbase-mapreduce.jar is a Driver that lists a few basic mapreduce tasks that ship with hbase. For example, presuming your install is hbase 2.0.0-SNAPSHOT:

主要用于hbase- apreduce。jar是一个驱动程序,列出了一些基本的mapreduce任务。例如,假设您的安装是hbase 2.0.0快照:

$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
  ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/lib/hbase-mapreduce-2.0.0-SNAPSHOT.jar
An example program must be given as the first argument.
Valid program names are:
  CellCounter: Count cells in HBase table.
  WALPlayer: Replay WAL files.
  completebulkload: Complete a bulk data load.
  copytable: Export a table from local cluster to peer cluster.
  export: Write table data to HDFS.
  exportsnapshot: Export the specific snapshot to a given FileSystem.
  import: Import data written by Export.
  importtsv: Import data in TSV format.
  rowcounter: Count rows in HBase table.
  verifyrep: Compare the data from tables in two different clusters. WARNING: It doesn't work for incrementColumnValues'd cells since the timestamp is changed after being appended to the log.

You can use the above listed shortnames for mapreduce jobs as in the below re-run of the row counter job (again, presuming your install is hbase 2.0.0-SNAPSHOT):

您可以使用上面列出的短名称来进行mapreduce作业,就像下面重新运行的行计数器作业一样(同样,假设您的安装是hbase 2.0.0快照):

$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
  ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/lib/hbase-mapreduce-2.0.0-SNAPSHOT.jar \
  rowcounter usertable

You might find the more selective hbase mapredcp tool output of interest; it lists the minimum set of jars needed to run a basic mapreduce job against an hbase install. It does not include configuration. You’ll probably need to add these if you want your MapReduce job to find the target cluster. You’ll probably have to also add pointers to extra jars once you start to do anything of substance. Just specify the extras by passing the system property -Dtmpjars when you run hbase mapredcp.

您可能会发现更有选择的hbase mapredcp工具输出感兴趣;它列出了在hbase安装基础上运行基本mapreduce作业所需的最小jar集。它不包括配置。如果希望MapReduce任务找到目标集群,您可能需要添加这些内容。当你开始做任何实质性的事情时,你可能还需要添加指向额外jar的指针。当您运行hbase mapredcp时,只需通过传递系统propery -Dtmpjars来指定额外的功能。

For jobs that do not package their dependencies or call TableMapReduceUtil#addDependencyJars, the following command structure is necessary:

对于不打包其依赖项或调用TableMapReduceUtil#addDependencyJars的作业,需要以下命令结构:

$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase mapredcp`:${HBASE_HOME}/conf hadoop jar MyApp.jar MyJobMainClass -libjars $(${HBASE_HOME}/bin/hbase mapredcp | tr ':' ',') ...

The example may not work if you are running HBase from its build directory rather than an installed location. You may see an error like the following:

如果您是从构建目录而不是安装位置运行HBase,那么这个示例可能不会起作用。您可能会看到如下错误:

java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper

If this occurs, try modifying the command as follows, so that it uses the HBase JARs from the target/ directory within the build environment.

如果发生这种情况,请尝试修改以下命令,以便在构建环境中使用目标/目录中的HBase jar。

$ HADOOP_CLASSPATH=${HBASE_BUILD_HOME}/hbase-mapreduce/target/hbase-mapreduce-VERSION-SNAPSHOT.jar:`${HBASE_BUILD_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_BUILD_HOME}/hbase-mapreduce/target/hbase-mapreduce-VERSION-SNAPSHOT.jar rowcounter usertable
Notice to MapReduce users of HBase between 0.96.1 and 0.98.4

Some MapReduce jobs that use HBase fail to launch. The symptom is an exception similar to the following:

一些使用HBase的MapReduce作业无法启动。该症状与以下情况类似:

Exception in thread "main" java.lang.IllegalAccessError: class
    com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass
    com.google.protobuf.LiteralByteString
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at
    org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:818)
    at
    org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:433)
    at
    org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:186)
    at
    org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:147)
    at
    org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:270)
    at
    org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:100)
...

This is caused by an optimization introduced in HBASE-9867 that inadvertently introduced a classloader dependency.

这是由HBASE-9867引入的优化导致的,它无意中引入了类加载器的依赖关系。

This affects both jobs using the -libjars option and "fat jar," those which package their runtime dependencies in a nested lib folder.

这将使用-libjar选项和“fat jar”来影响两个作业,它们将运行时依赖包打包在一个嵌套的lib文件夹中。

In order to satisfy the new classloader requirements, hbase-protocol.jar must be included in Hadoop’s classpath. See HBase, MapReduce, and the CLASSPATH for current recommendations for resolving classpath errors. The following is included for historical purposes.

为了满足新的类加载器的要求,hbase-protocol。jar必须包含在Hadoop的类路径中。有关解决类路径错误的当前建议,请参见HBase、MapReduce和类路径。以下内容为历史目的。

This can be resolved system-wide by including a reference to the hbase-protocol.jar in Hadoop’s lib directory, via a symlink or by copying the jar into the new location.

这可以通过包括对hbase协议的引用来解决整个系统。jar在Hadoop的lib目录中,通过一个符号链接或将jar复制到新的位置。

This can also be achieved on a per-job launch basis by including it in the HADOOP_CLASSPATH environment variable at job submission time. When launching jobs that package their dependencies, all three of the following job launching commands satisfy this requirement:

这也可以在每个工作的发布基础上实现,包括在作业提交时间的hadoop - classpath环境变量中。当启动包其依赖项的作业时,以下三个工作启动命令都满足以下要求:

$ HADOOP_CLASSPATH=/path/to/hbase-protocol.jar:/path/to/hbase/conf hadoop jar MyJob.jar MyJobMainClass
$ HADOOP_CLASSPATH=$(hbase mapredcp):/path/to/hbase/conf hadoop jar MyJob.jar MyJobMainClass
$ HADOOP_CLASSPATH=$(hbase classpath) hadoop jar MyJob.jar MyJobMainClass

For jars that do not package their dependencies, the following command structure is necessary:

对于不打包其依赖项的jar,需要以下命令结构:

$ HADOOP_CLASSPATH=$(hbase mapredcp):/etc/hbase/conf hadoop jar MyApp.jar MyJobMainClass -libjars $(hbase mapredcp | tr ':' ',') ...

See also HBASE-10304 for further discussion of this issue.

请参阅HBASE-10304,进一步讨论这个问题。

48. MapReduce Scan Caching

48。MapReduce扫描缓存

TableMapReduceUtil now restores the option to set scanner caching (the number of rows which are cached before returning the result to the client) on the Scan object that is passed in. This functionality was lost due to a bug in HBase 0.95 (HBASE-11558), which is fixed for HBase 0.98.5 and 0.96.3. The priority order for choosing the scanner caching is as follows:

TableMapReduceUtil现在重新存储了在传入的扫描对象上设置扫描缓存(在将结果返回给客户机之前缓存的行数)的选项。该功能由于HBase 0.95 (HBase -11558)中的bug而丢失,该缺陷在HBase 0.98.5和0.96.3中固定。选择扫描仪缓存的优先顺序如下:

  1. Caching settings which are set on the scan object.

    在扫描对象上设置的缓存设置。

  2. Caching settings which are specified via the configuration option hbase.client.scanner.caching, which can either be set manually in hbase-site.xml or via the helper method TableMapReduceUtil.setScannerCaching().

    通过配置选项hbase.client.scanner指定的缓存设置。缓存,可以在hbase站点中手动设置。xml或通过助手方法tablemapreduceutil.setscannercache()。

  3. The default value HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING, which is set to 100.

    默认值HConstants。default_hbase_client_scanner_cache设置为100。

Optimizing the caching settings is a balance between the time the client waits for a result and the number of sets of results the client needs to receive. If the caching setting is too large, the client could end up waiting for a long time or the request could even time out. If the setting is too small, the scan needs to return results in several pieces. If you think of the scan as a shovel, a bigger cache setting is analogous to a bigger shovel, and a smaller cache setting is equivalent to more shoveling in order to fill the bucket.

优化缓存设置是客户端等待结果的时间和客户端需要接收的结果集的数量之间的平衡。如果缓存设置太大,客户端可能会等待很长时间,或者请求甚至超时。如果设置太小,扫描需要返回几个片段的结果。如果你把扫描看作是一个铲,一个更大的缓存设置类似于一个更大的铲子,一个较小的缓存设置相当于更多的铲雪来填满桶。

The list of priorities mentioned above allows you to set a reasonable default, and override it for specific operations.

上面提到的优先级列表允许您设置合理的默认值,并为特定操作覆盖它。
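A short sketch of the first two priority levels follows, assuming a hypothetical MyMapper class and table name; it only shows how the two settings interact, not a complete job.

Scan scan = new Scan();
scan.setCaching(500);       // priority 1: a setting on the Scan object itself

Job job = Job.getInstance(HBaseConfiguration.create(), "ExampleScannerCaching");
TableMapReduceUtil.setScannerCaching(job, 200);   // priority 2: job-level default, used only if the Scan does not set one
TableMapReduceUtil.initTableMapperJob(
  "myTable",        // input table (hypothetical)
  scan,             // Scan instance carrying the caching setting
  MyMapper.class,   // hypothetical mapper
  null,             // mapper output key
  null,             // mapper output value
  job);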

See the API documentation for Scan for more details.

有关详细信息,请参阅API文档。

49. Bundled HBase MapReduce Jobs

49。捆绑HBase MapReduce工作

The HBase JAR also serves as a Driver for some bundled MapReduce jobs. To learn about the bundled MapReduce jobs, run the following command.

HBase JAR还充当了一些绑定MapReduce作业的驱动程序。要了解绑定的MapReduce作业,请运行以下命令。

$ ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-mapreduce-VERSION.jar
An example program must be given as the first argument.
Valid program names are:
  copytable: Export a table from local cluster to peer cluster
  completebulkload: Complete a bulk data load.
  export: Write table data to HDFS.
  import: Import data written by Export.
  importtsv: Import data in TSV format.
  rowcounter: Count rows in HBase table

Each of the valid program names are bundled MapReduce jobs. To run one of the jobs, model your command after the following example.

每个有效的程序名称都被绑定在MapReduce作业上。要运行其中一个作业,请在下面的示例中为您的命令建模。

$ ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-mapreduce-VERSION.jar rowcounter myTable

50. HBase as a MapReduce Job Data Source and Data Sink

50。HBase作为MapReduce作业数据源和数据接收器。

HBase can be used as a data source, TableInputFormat, and data sink, TableOutputFormat or MultiTableOutputFormat, for MapReduce jobs. When writing MapReduce jobs that read or write HBase, it is advisable to subclass TableMapper and/or TableReducer. See the do-nothing pass-through classes IdentityTableMapper and IdentityTableReducer for basic usage. For a more involved example, see RowCounter or review the org.apache.hadoop.hbase.mapreduce.TestTableMapReduce unit test.

HBase可作为数据源、TableInputFormat和数据接收器、TableOutputFormat或MultiTableOutputFormat,用于MapReduce作业。写MapReduce任务读或写HBase,建议子类化TableMapper和/或TableReducer。请参阅“不做”的传递类标识符和标识符类的基本用法。对于更复杂的示例,请参见RowCounter或查看org.apache.hadoop.hbase.mapreduce。TestTableMapReduce单元测试。

If you run MapReduce jobs that use HBase as a source or sink, you need to specify the source and sink table and column names in your configuration.

如果使用HBase作为源或接收器运行MapReduce作业,则需要在配置中指定源和sink表和列名称。

When you read from HBase, the TableInputFormat requests the list of regions from HBase and makes a map, which is either a map-per-region or mapreduce.job.maps map, whichever is smaller. If your job only has two maps, raise mapreduce.job.maps to a number greater than the number of regions. Maps will run on the adjacent TaskTracker/NodeManager if you are running a TaskTracker/NodeManager and RegionServer per node. When writing to HBase, it may make sense to avoid the Reduce step and write back into HBase from within your map. This approach works when your job does not need the sort and collation that MapReduce does on the map-emitted data. On insert, HBase 'sorts' so there is no point double-sorting (and shuffling data around your MapReduce cluster) unless you need to. If you do not need the Reduce, your map might emit counts of records processed for reporting at the end of the job, or set the number of Reduces to zero and use TableOutputFormat. If running the Reduce step makes sense in your case, you should typically use multiple reducers so that load is spread across the HBase cluster.

当您从HBase中读取时,TableInputFormat请求HBase中的区域列表,并生成一个map,它要么是map-per-region,要么是mapreduce.job。地图地图,以较小的为准。如果你的工作只有两张地图,那就增加mapreduce。映射到大于区域数目的数字。如果您正在运行一个TaskTracer/NodeManager和每个节点的区域服务器,那么地图将在邻近的任务跟踪器/NodeManager上运行。当写入HBase时,避免减少步骤并从映射中返回到HBase可能是有意义的。当您的作业不需要MapReduce在地图上释放的数据的排序和排序时,这种方法就会起作用。在插入时,HBase“排序”,因此,除非您需要,否则没有必要对您的MapReduce集群进行重复排序(和调整数据)。如果您不需要Reduce,那么您的映射可能会在作业结束时发出处理报告的记录计数,或者将Reduce的数量设置为0,并使用TableOutputFormat。如果在您的情况下运行Reduce步骤是有意义的,那么您应该使用多个减速器,这样负载就会分散到HBase集群中。

A new HBase partitioner, the HRegionPartitioner, can run as many reducers as there are existing regions. The HRegionPartitioner is suitable when your table is large and your upload will not greatly alter the number of existing regions upon completion. Otherwise use the default partitioner.

一个新的HBase分区者,h分区的参与者,可以在现有区域的数量上运行。当您的表很大,并且您的上载不会在完成时大大改变现有区域的数量时,hpartiator是合适的。否则使用默认的分区。

51. Writing HFiles Directly During Bulk Import

51。在批量导入期间直接编写HFiles。

If you are importing into a new table, you can bypass the HBase API and write your content directly to the filesystem, formatted into HBase data files (HFiles). Your import will run faster, perhaps an order of magnitude faster. For more on how this mechanism works, see Bulk Loading.

如果您正在导入一个新表,您可以绕过HBase API并将内容直接写到文件系统中,格式化为HBase数据文件(HFiles)。您的导入将会运行得更快,速度可能会更快。有关这个机制如何工作的更多信息,请参阅批量加载。
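A minimal sketch of wiring a job for this is shown below. It assumes the HBase 1.x+ client API, a table named usertable, a hypothetical driver class, and a hypothetical staging path; after the job completes, the HFiles still have to be handed to the cluster with the completebulkload tool (LoadIncrementalHFiles).

Configuration conf = HBaseConfiguration.create();
Job job = Job.getInstance(conf, "ExampleBulkImport");
job.setJarByClass(MyBulkImportJob.class);    // hypothetical driver class containing the mapper

try (Connection connection = ConnectionFactory.createConnection(conf)) {
  TableName tableName = TableName.valueOf("usertable");
  // Configures the partitioner, reducer, and output format so the job writes
  // HFiles that line up with the table's current region boundaries.
  HFileOutputFormat2.configureIncrementalLoad(job,
      connection.getTable(tableName),
      connection.getRegionLocator(tableName));
}
FileOutputFormat.setOutputPath(job, new Path("/tmp/bulk-output"));   // hypothetical staging directory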

52. RowCounter Example

52岁。rowcount例子

The included RowCounter MapReduce job uses TableInputFormat and does a count of all rows in the specified table. To run it, use the following command:

包含的RowCounter MapReduce作业使用TableInputFormat,并对指定表中的所有行进行计数。要运行它,请使用以下命令:

$ ./bin/hadoop jar hbase-X.X.X.jar

This will invoke the HBase MapReduce Driver class. Select rowcounter from the choice of jobs offered. This will print rowcounter usage advice to standard output. Specify the tablename, column to count, and output directory. If you have classpath errors, see HBase, MapReduce, and the CLASSPATH.

这将调用HBase MapReduce驱动程序类。从提供的工作选择中选择rowcounter。这将打印rowcounter使用建议到标准输出。指定tablename、列计数和输出目录。如果您有类路径错误,请参见HBase、MapReduce和类路径。

53. Map-Task Splitting

53岁。地图任务分解

53.1. The Default HBase MapReduce Splitter

53.1。默认的HBase MapReduce Splitter。

When TableInputFormat is used to source an HBase table in a MapReduce job, its splitter will make a map task for each region of the table. Thus, if there are 100 regions in the table, there will be 100 map-tasks for the job - regardless of how many column families are selected in the Scan.

当使用TableInputFormat在MapReduce作业中为HBase表提供源时,它的splitter将为表的每个区域生成一个映射任务。因此,如果表中有100个区域,那么无论在扫描中选择多少个列族,都将有100个映射任务。

53.2. Custom Splitters

53.2。自定义分割

For those interested in implementing custom splitters, see the method getSplits in TableInputFormatBase. That is where the logic for map-task assignment resides.

对于那些有兴趣实现自定义拆分的人,请参见TableInputFormatBase中的方法get。这就是映射任务分配的逻辑所在。

54. HBase MapReduce Examples

54。HBase MapReduce的例子

54.1. HBase MapReduce Read Example

54.1。HBase MapReduce读例子

The following is an example of using HBase as a MapReduce source in read-only manner. Specifically, there is a Mapper instance but no Reducer, and nothing is being emitted from the Mapper. The job would be defined as follows…​

下面是以只读方式使用HBase作为MapReduce源的示例。具体来说,有一个Mapper实例,但没有减速器,并且没有任何东西从映射器中发出。这项工作的定义如下……

Configuration config = HBaseConfiguration.create();
Job job = new Job(config, "ExampleRead");
job.setJarByClass(MyReadJob.class);     // class that contains mapper

Scan scan = new Scan();
scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false);  // don't set to true for MR jobs
// set other scan attrs
...

TableMapReduceUtil.initTableMapperJob(
  tableName,        // input HBase table name
  scan,             // Scan instance to control CF and attribute selection
  MyMapper.class,   // mapper
  null,             // mapper output key
  null,             // mapper output value
  job);
job.setOutputFormatClass(NullOutputFormat.class);   // because we aren't emitting anything from mapper

boolean b = job.waitForCompletion(true);
if (!b) {
  throw new IOException("error with job!");
}

…​and the mapper instance would extend TableMapper…​

并且mapper实例将扩展TableMapper…

public static class MyMapper extends TableMapper<Text, Text> {

  public void map(ImmutableBytesWritable row, Result value, Context context) throws InterruptedException, IOException {
    // process data for the row from the Result instance.
   }
}

54.2. HBase MapReduce Read/Write Example

54.2。HBase MapReduce读/写的例子

The following is an example of using HBase both as a source and as a sink with MapReduce. This example will simply copy data from one table to another.

下面是一个使用HBase作为源和使用MapReduce的接收器的示例。本例将简单地将数据从一个表复制到另一个表。

Configuration config = HBaseConfiguration.create();
Job job = new Job(config,"ExampleReadWrite");
job.setJarByClass(MyReadWriteJob.class);    // class that contains mapper

Scan scan = new Scan();
scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false);  // don't set to true for MR jobs
// set other scan attrs

TableMapReduceUtil.initTableMapperJob(
  sourceTable,      // input table
  scan,             // Scan instance to control CF and attribute selection
  MyMapper.class,   // mapper class
  null,             // mapper output key
  null,             // mapper output value
  job);
TableMapReduceUtil.initTableReducerJob(
  targetTable,      // output table
  null,             // reducer class
  job);
job.setNumReduceTasks(0);

boolean b = job.waitForCompletion(true);
if (!b) {
    throw new IOException("error with job!");
}

An explanation is required of what TableMapReduceUtil is doing, especially with the reducer. TableOutputFormat is being used as the outputFormat class, and several parameters are being set on the config (e.g., TableOutputFormat.OUTPUT_TABLE), as well as setting the reducer output key to ImmutableBytesWritable and reducer value to Writable. These could be set by the programmer on the job and conf, but TableMapReduceUtil tries to make things easier.

需要解释的是,TableMapReduceUtil在做什么,特别是在减速器上。TableOutputFormat被用作outputFormat类,并且在配置上设置了几个参数(例如,TableOutputFormat. output_table),以及将减缩输出键设置为ImmutableBytesWritable和reducer值可写。这些可以由程序员在作业和conf上设置,但是TableMapReduceUtil试图让事情变得更简单。

The following is the example mapper, which will create a Put matching the input Result and emit it. Note: this is what the CopyTable utility does.

下面的示例是mapper,它将创建一个Put并匹配输入结果并发出它。注意:这是CopyTable实用程序所做的。

public static class MyMapper extends TableMapper<ImmutableBytesWritable, Put>  {

  public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
    // this example is just copying the data from the source table...
      context.write(row, resultToPut(row,value));
    }

    private static Put resultToPut(ImmutableBytesWritable key, Result result) throws IOException {
      Put put = new Put(key.get());
      for (KeyValue kv : result.raw()) {
        put.add(kv);
      }
      return put;
    }
}

There isn’t actually a reducer step, so TableOutputFormat takes care of sending the Put to the target table.

实际上并没有减少的步骤,所以TableOutputFormat负责将Put发送到目标表。

This is just an example, developers could choose not to use TableOutputFormat and connect to the target table themselves.

这只是一个示例,开发人员可以选择不使用TableOutputFormat并连接到目标表本身。

54.3. HBase MapReduce Read/Write Example With Multi-Table Output

54.3。HBase MapReduce读/写例,具有多表输出。

TODO: example for MultiTableOutputFormat.

待办事项:MultiTableOutputFormat的示例。

54.4. HBase MapReduce Summary to HBase Example

54.4。HBase MapReduce概要到HBase示例。

The following example uses HBase as a MapReduce source and sink with a summarization step. This example will count the number of distinct instances of a value in a table and write those summarized counts in another table.

下面的示例使用HBase作为MapReduce源,并使用一个总结步骤。此示例将计算表中某个值的不同实例数,并将这些汇总计数写入另一个表中。

Configuration config = HBaseConfiguration.create();
Job job = new Job(config,"ExampleSummary");
job.setJarByClass(MySummaryJob.class);     // class that contains mapper and reducer

Scan scan = new Scan();
scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false);  // don't set to true for MR jobs
// set other scan attrs

TableMapReduceUtil.initTableMapperJob(
  sourceTable,        // input table
  scan,               // Scan instance to control CF and attribute selection
  MyMapper.class,     // mapper class
  Text.class,         // mapper output key
  IntWritable.class,  // mapper output value
  job);
TableMapReduceUtil.initTableReducerJob(
  targetTable,        // output table
  MyTableReducer.class,    // reducer class
  job);
job.setNumReduceTasks(1);   // at least one, adjust as required

boolean b = job.waitForCompletion(true);
if (!b) {
  throw new IOException("error with job!");
}

In this example mapper a column with a String-value is chosen as the value to summarize upon. This value is used as the key to emit from the mapper, and an IntWritable represents an instance counter.

在这个示例中,选择一个带有字符串值的列作为总结的值。这个值用作从映射器发出的键,而IntWritable则表示实例计数器。

public static class MyMapper extends TableMapper<Text, IntWritable>  {
  public static final byte[] CF = "cf".getBytes();
  public static final byte[] ATTR1 = "attr1".getBytes();

  private final IntWritable ONE = new IntWritable(1);
  private Text text = new Text();

  public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
    String val = new String(value.getValue(CF, ATTR1));
    text.set(val);     // we can only emit Writables...
    context.write(text, ONE);
  }
}

In the reducer, the "ones" are counted (just like any other MR example that does this), and then a Put is emitted.

在还原器中,“ones”被计数(就像其他例子一样),然后发出一个Put。

public static class MyTableReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable>  {
  public static final byte[] CF = "cf".getBytes();
  public static final byte[] COUNT = "count".getBytes();

  public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
    int i = 0;
    for (IntWritable val : values) {
      i += val.get();
    }
    Put put = new Put(Bytes.toBytes(key.toString()));
    put.add(CF, COUNT, Bytes.toBytes(i));

    context.write(null, put);
  }
}

54.5. HBase MapReduce Summary to File Example

54.5。HBase MapReduce摘要以文件为例。

This is very similar to the summary example above, with the exception that it uses HBase as the MapReduce source and HDFS as the sink. The differences are in the job setup and in the reducer. The mapper remains the same.

这与上面的概要示例非常相似,但它使用HBase作为MapReduce源,而HDFS作为接收器。不同之处在于工作的设置和减少。映射器保持不变。

Configuration config = HBaseConfiguration.create();
Job job = new Job(config,"ExampleSummaryToFile");
job.setJarByClass(MySummaryFileJob.class);     // class that contains mapper and reducer

Scan scan = new Scan();
scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false);  // don't set to true for MR jobs
// set other scan attrs

TableMapReduceUtil.initTableMapperJob(
  sourceTable,        // input table
  scan,               // Scan instance to control CF and attribute selection
  MyMapper.class,     // mapper class
  Text.class,         // mapper output key
  IntWritable.class,  // mapper output value
  job);
job.setReducerClass(MyReducer.class);    // reducer class
job.setNumReduceTasks(1);    // at least one, adjust as required
FileOutputFormat.setOutputPath(job, new Path("/tmp/mr/mySummaryFile"));  // adjust directories as required

boolean b = job.waitForCompletion(true);
if (!b) {
  throw new IOException("error with job!");
}

As stated above, the previous Mapper can run unchanged with this example. As for the Reducer, it is a "generic" Reducer instead of extending TableReducer and emitting Puts.

如上所述,前面的Mapper可以与此示例保持不变。至于减速机,它是一种“通用”的减速机,而不是扩展的制表机和发射装置。

public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable>  {

  public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
    int i = 0;
    for (IntWritable val : values) {
      i += val.get();
    }
    context.write(key, new IntWritable(i));
  }
}

54.6. HBase MapReduce Summary to HBase Without Reducer

54.6。HBase MapReduce概要到HBase而不减少。

It is also possible to perform summaries without a reducer - if you use HBase as the reducer.

如果你使用HBase作为减速器,也可以不使用减速器来执行摘要。

An HBase target table would need to exist for the job summary. The Table method incrementColumnValue would be used to atomically increment values. From a performance perspective, it might make sense to keep a Map of values to their running counts for each map-task, and make one update per key during the cleanup method of the mapper. However, your mileage may vary depending on the number of rows to be processed and unique keys.

工作总结需要有一个HBase目标表。表方法incrementColumnValue将用于原子递增值。从性能的角度来看,将值的映射与它们的值保持一致,以对每个Map -task进行递增,并在mapper的清理方法中对每个键进行一次更新,这可能是有意义的。但是,根据要处理的行数和唯一键的不同,您的里程可能会有所不同。
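
A sketch of that pattern follows (the summary table, column family, and qualifier names are illustrative, and the mapper emits nothing to the MapReduce framework itself):

public static class MyIncrementingMapper extends TableMapper<NullWritable, NullWritable> {

  public static final byte[] CF = "cf".getBytes();
  public static final byte[] ATTR1 = "attr1".getBytes();
  public static final byte[] COUNT = "count".getBytes();

  private Connection connection;
  private Table summaryTable;
  private final Map<String, Long> counts = new HashMap<>();

  public void setup(Context context) throws IOException {
    connection = ConnectionFactory.createConnection(context.getConfiguration());
    summaryTable = connection.getTable(TableName.valueOf("summaryTable"));  // illustrative name
  }

  public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException {
    String val = new String(value.getValue(CF, ATTR1));
    counts.merge(val, 1L, Long::sum);   // buffer counts in memory, one entry per distinct value
  }

  public void cleanup(Context context) throws IOException {
    // one atomic increment per distinct key, rather than one RPC per input row
    for (Map.Entry<String, Long> entry : counts.entrySet()) {
      summaryTable.incrementColumnValue(Bytes.toBytes(entry.getKey()), CF, COUNT, entry.getValue());
    }
    summaryTable.close();
    connection.close();
  }
}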

In the end, the summary results are in HBase.

最后,总结结果在HBase中。

54.7. HBase MapReduce Summary to RDBMS

54.7。HBase MapReduce摘要到RDBMS。

Sometimes it is more appropriate to generate summaries to an RDBMS. For these cases, it is possible to generate summaries directly to an RDBMS via a custom reducer. The setup method can connect to an RDBMS (the connection information can be passed via custom parameters in the context) and the cleanup method can close the connection.

有时,为RDBMS生成摘要更合适。对于这些情况,可以通过自定义减少器直接将概要文件生成到RDBMS。setup方法可以连接到RDBMS(连接信息可以通过上下文中的自定义参数传递),并且清理方法可以关闭连接。

It is critical to understand that the number of reducers for the job affects the summarization implementation, and you'll have to design this into your reducer. Specifically, whether it is designed to run as a singleton (one reducer) or as multiple reducers. Neither is right or wrong; it depends on your use-case. Recognize that the more reducers that are assigned to the job, the more simultaneous connections to the RDBMS will be created - this will scale, but only to a point.

重要的是,要了解工作中的减速器的数量会影响到总结的实现,并且您将不得不将其设计成您的减速器。具体地说,它是否被设计为单例(一个减速器)或多个减速器。两者都不是对或错,这取决于你的用例。要认识到分配给该作业的减缩器越多,就会创建与RDBMS越同步的连接——这将扩展,但只会达到一个点。

public static class MyRdbmsReducer extends Reducer<Text, IntWritable, Text, IntWritable>  {

  private java.sql.Connection c = null;
  private java.sql.PreparedStatement ps = null;

  public void setup(Context context) throws IOException {
    // create DB connection; "summary.jdbc.url" and the "summary" table are illustrative
    // names - pass the real connection information via custom parameters in the context
    try {
      c = java.sql.DriverManager.getConnection(context.getConfiguration().get("summary.jdbc.url"));
      ps = c.prepareStatement("INSERT INTO summary (summary_key, summary_count) VALUES (?, ?)");
    } catch (java.sql.SQLException e) {
      throw new IOException(e);
    }
  }

  public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
    // do summarization; in this example the keys are Text, but this is just an example
    int i = 0;
    for (IntWritable val : values) {
      i += val.get();
    }
    try {
      ps.setString(1, key.toString());
      ps.setInt(2, i);
      ps.executeUpdate();
    } catch (java.sql.SQLException e) {
      throw new IOException(e);
    }
  }

  public void cleanup(Context context) throws IOException {
    // close db connection
    try { ps.close(); c.close(); } catch (java.sql.SQLException e) { throw new IOException(e); }
  }
}

In the end, the summary results are written to your RDBMS table/s.

最后,将汇总结果写到RDBMS表/s中。

55. Accessing Other HBase Tables in a MapReduce Job

55。在MapReduce作业中访问其他HBase表。

Although the framework currently allows one HBase table as input to a MapReduce job, other HBase tables can be accessed as lookup tables, etc., in a MapReduce job by creating a Table instance in the setup method of the Mapper.

尽管框架目前允许一个HBase表作为MapReduce作业的输入,其他的HBase表可以作为查找表来访问,等等,在MapReduce作业中,通过在Mapper的setup方法中创建一个表实例。

public class MyMapper extends TableMapper<Text, LongWritable> {
  private Connection connection;
  private Table myOtherTable;

  public void setup(Context context) throws IOException {
    // Create a Connection to the cluster and save it, or reuse the Connection
    // from the existing table
    connection = ConnectionFactory.createConnection(context.getConfiguration());
    myOtherTable = connection.getTable(TableName.valueOf("myOtherTable"));
  }

  public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
    // process Result...
    // use 'myOtherTable' for lookups
  }

  public void cleanup(Context context) throws IOException {
    myOtherTable.close();
    connection.close();
  }
}

56. Speculative Execution

56。投机执行

It is generally advisable to turn off speculative execution for MapReduce jobs that use HBase as a source. This can either be done on a per-Job basis through properties, or on the entire cluster. Especially for longer running jobs, speculative execution will create duplicate map-tasks which will double-write your data to HBase; this is probably not what you want.

通常建议关闭使用HBase作为源的MapReduce作业的投机性执行。这可以通过属性或整个集群来实现。特别是对于长时间运行的作业,推测执行将创建重复的映射任务,将您的数据写入HBase;这可能不是你想要的。
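
For example, to turn speculative execution off for a single job, set the standard Hadoop properties on the job's Configuration before submitting it (these are the Hadoop 2 property names, not HBase-specific settings):

Configuration conf = HBaseConfiguration.create();
conf.setBoolean("mapreduce.map.speculative", false);     // no speculative map tasks
conf.setBoolean("mapreduce.reduce.speculative", false);  // no speculative reduce tasks
Job job = Job.getInstance(conf, "MyHBaseSourcedJob");    // job name is illustrative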

See spec.ex for more information.

有关更多信息,请参见spec.ex。

57. Cascading

57。级联

Cascading is an alternative API for MapReduce, which actually uses MapReduce, but allows you to write your MapReduce code in a simplified way.

级联是MapReduce的一个替代API,它实际上使用MapReduce,但是允许您以简化的方式编写MapReduce代码。

The following example shows a Cascading Flow which "sinks" data into an HBase cluster. The same hBaseTap API could be used to "source" data as well.

下面的示例展示了一个级联流,它将数据“汇聚”到一个HBase集群中。同样的hBaseTap API也可以用于“源”数据。

// read data from the default filesystem
// emits two fields: "offset" and "line"
Tap source = new Hfs( new TextLine(), inputFileLhs );

// store data in an HBase cluster
// accepts fields "num", "lower", and "upper"
// will automatically scope incoming fields to their proper familyname, "left" or "right"
Fields keyFields = new Fields( "num" );
String[] familyNames = {"left", "right"};
Fields[] valueFields = new Fields[] {new Fields( "lower" ), new Fields( "upper" ) };
Tap hBaseTap = new HBaseTap( "multitable", new HBaseScheme( keyFields, familyNames, valueFields ), SinkMode.REPLACE );

// a simple pipe assembly to parse the input into fields
// a real app would likely chain multiple Pipes together for more complex processing
Pipe parsePipe = new Each( "insert", new Fields( "line" ), new RegexSplitter( new Fields( "num", "lower", "upper" ), " " ) );

// "plan" a cluster executable Flow
// this connects the source Tap and hBaseTap (the sink Tap) to the parsePipe
Flow parseFlow = new FlowConnector( properties ).connect( source, hBaseTap, parsePipe );

// start the flow, and block until complete
parseFlow.complete();

// open an iterator on the HBase table we stuffed data into
TupleEntryIterator iterator = parseFlow.openSink();

while(iterator.hasNext())
  {
  // print out each tuple from HBase
  System.out.println( "iterator.next() = " + iterator.next() );
  }

iterator.close();

Securing Apache HBase

确保Apache HBase

Reporting Security Bugs
To protect existing HBase installations from exploitation, please do not use JIRA to report security-related bugs. Instead, send your report to the mailing list private@apache.org, which allows anyone to send messages, but restricts who can read them. Someone on that list will contact you to follow up on your report.

HBase adheres to the Apache Software Foundation’s policy on reported vulnerabilities, available at http://apache.org/security/.

HBase遵循Apache Software Foundation关于报告漏洞的策略,可在http://apache.org/security/上找到。

If you wish to send an encrypted report, you can use the GPG details provided for the general ASF security list. This will likely increase the response time to your report.

如果您希望发送一个加密的报告,您可以使用GPG的详细信息来提供一般的ASF安全列表。这可能会增加您报告的响应时间。

HBase provides mechanisms to secure various components and aspects of HBase and how it relates to the rest of the Hadoop infrastructure, as well as clients and resources outside Hadoop.

HBase提供了各种机制来确保HBase的各种组件和方面,以及它如何与Hadoop基础设施的其余部分、以及Hadoop外的客户端和资源相关。

58. Using Secure HTTP (HTTPS) for the Web UI

58岁。为Web UI使用安全的HTTP (HTTPS)。

A default HBase install uses insecure HTTP connections for Web UIs for the master and region servers. To enable secure HTTP (HTTPS) connections instead, set hbase.ssl.enabled to true in hbase-site.xml. This does not change the port used by the Web UI. To change the port for the web UI for a given HBase component, configure that port’s setting in hbase-site.xml. These settings are:

默认的HBase安装使用不安全的HTTP连接来为主服务器和区域服务器提供Web ui。要启用安全的HTTP (HTTPS)连接,设置hbase.ssl。在hbase-site.xml中启用。这不会改变Web UI所使用的端口。要为一个给定的HBase组件更改web UI的端口,可以在HBase -site.xml中配置该端口的设置。这些设置包括:

  • hbase.master.info.port

    hbase.master.info.port

  • hbase.regionserver.info.port

    hbase.regionserver.info.port
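
To turn on HTTPS for the Web UIs as described above, a minimal hbase-site.xml fragment is:

<property>
  <name>hbase.ssl.enabled</name>
  <value>true</value>
</property>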

If you enable HTTPS, clients should avoid using the non-secure HTTP connection.

If you enable secure HTTP, clients should connect to HBase using the https:// URL. Clients using the http:// URL will receive an HTTP response of 200, but will not receive any data. The following exception is logged:

如果您启用了安全的HTTP,客户端应该使用https:// URL连接到HBase。使用http:// URL的客户机将收到200个HTTP响应,但不会接收任何数据。日志记录如下异常:

javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?

This is because the same port is used for HTTP and HTTPS.

这是因为同一个端口用于HTTP和HTTPS。

HBase uses Jetty for the Web UI. Without modifying Jetty itself, it does not seem possible to configure Jetty to redirect one port to another on the same host. See Nick Dimiduk’s contribution on this Stack Overflow thread for more information. If you know how to fix this without opening a second port for HTTPS, patches are appreciated.

HBase使用Jetty作为Web UI。如果不修改Jetty本身,就不可能配置Jetty将一个端口重定向到同一主机上的另一个端口。有关更多信息,请参见Nick Dimiduk对这个堆栈溢出线程的贡献。如果您知道如何解决这个问题,而不为HTTPS打开第二个端口,那么补丁就会被欣赏。

59. Using SPNEGO for Kerberos authentication with Web UIs

59。使用SPNEGO对Web ui进行Kerberos身份验证。

Kerberos-authentication to HBase Web UIs can be enabled via configuring SPNEGO with the hbase.security.authentication.ui property in hbase-site.xml. Enabling this authentication requires that HBase is also configured to use Kerberos authentication for RPCs (e.g hbase.security.authentication = kerberos).

通过配置SPNEGO和HBase .security.authentication,可以启用对HBase Web ui的kerberos身份验证。在hbase-site.xml ui属性。启用此身份验证要求HBase也被配置为使用Kerberos身份验证(e)。g hbase.security。kerberos身份验证=)。

<property>
  <name>hbase.security.authentication.ui</name>
  <value>kerberos</value>
  <description>Controls what kind of authentication should be used for the HBase web UIs.</description>
</property>
<property>
  <name>hbase.security.authentication</name>
  <value>kerberos</value>
  <description>The authentication mechanism to use for HBase RPCs. Must be set to kerberos for the SPNEGO web UI authentication above to be usable.</description>
</property>

A number of properties exist to configure SPNEGO authentication for the web server:

有许多属性用于为web服务器配置SPNEGO身份验证:

<property>
  <name>hbase.security.authentication.spnego.kerberos.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
  <description>Required for SPNEGO, the Kerberos principal to use for SPNEGO authentication by the
  web server. The _HOST keyword will be automatically substituted with the node's
  hostname.</description>
</property>
<property>
  <name>hbase.security.authentication.spnego.kerberos.keytab</name>
  <value>/etc/security/keytabs/spnego.service.keytab</value>
  <description>Required for SPNEGO, the Kerberos keytab file to use for SPNEGO authentication by the
  web server.</description>
</property>
<property>
  <name>hbase.security.authentication.spnego.kerberos.name.rules</name>
  <value></value>
  <description>Optional, Hadoop-style `auth_to_local` rules which will be parsed and used in the
  handling of Kerberos principals</description>
</property>
<property>
  <name>hbase.security.authentication.signature.secret.file</name>
  <value></value>
  <description>Optional, a file whose contents will be used as a secret to sign the HTTP cookies
  as a part of the SPNEGO authentication handshake. If this is not provided, Java's `Random` library
  will be used for the secret.</description>
</property>

60. Secure Client Access to Apache HBase

60。安全客户端访问Apache HBase。

Newer releases of Apache HBase (>= 0.92) support optional SASL authentication of clients. See also Matteo Bertozzi’s article on Understanding User Authentication and Authorization in Apache HBase.

Apache HBase的更新版本(>= 0.92)支持客户端可选的SASL认证。请参阅Matteo Bertozzi的文章,了解Apache HBase中的用户身份验证和授权。

This describes how to set up Apache HBase and clients for connection to secure HBase resources.

这描述了如何设置Apache HBase和客户端来连接到安全的HBase资源。

60.1. Prerequisites

60.1。先决条件

Hadoop Authentication Configuration

To run HBase RPC with strong authentication, you must set hbase.security.authentication to kerberos. In this case, you must also set hadoop.security.authentication to kerberos in core-site.xml. Otherwise, you would be using strong authentication for HBase but not for the underlying HDFS, which would cancel out any benefit.

要使用强身份验证运行HBase RPC,必须设置HBase .security。kerberos身份验证。在这种情况下,您还必须设置hadoop.security。在core-site.xml中对kerberos进行身份验证。否则,您将为HBase使用强大的身份验证,而不是针对底层的HDFS,这将抵消任何好处。

Kerberos KDC

You need to have a working Kerberos KDC.

您需要一个工作的Kerberos KDC。

60.2. Server-side Configuration for Secure Operation

60.2。用于安全操作的服务器端配置。

First, refer to security.prerequisites and ensure that your underlying HDFS configuration is secure.

首先,指的是安全。先决条件和确保底层的HDFS配置是安全的。

Add the following to the hbase-site.xml file on every server machine in the cluster:

将以下内容添加到hbase站点。集群中每个服务器机器上的xml文件:

<property>
  <name>hbase.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
<name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider</value>
</property>

A full shutdown and restart of HBase service is required when deploying these configuration changes.

在部署这些配置更改时,需要完全关闭和重新启动HBase服务。

60.3. Client-side Configuration for Secure Operation

60.3。安全操作的客户端配置。

First, refer to Prerequisites and ensure that your underlying HDFS configuration is secure.

首先,请参考先决条件,确保底层的HDFS配置是安全的。

Add the following to the hbase-site.xml file on every client:

将以下内容添加到hbase站点。每个客户端的xml文件:

<property>
  <name>hbase.security.authentication</name>
  <value>kerberos</value>
</property>

The client environment must be logged in to Kerberos from KDC or keytab via the kinit command before communication with the HBase cluster will be possible.

在与HBase集群通信之前,必须通过kinit命令将客户机环境从KDC或keytab登录到Kerberos。

Be advised that if the hbase.security.authentication in the client- and server-side site files do not match, the client will not be able to communicate with the cluster.

请注意,如果hbase.security。在客户端和服务器端站点文件中的身份验证不匹配,客户端将无法与集群通信。

Once HBase is configured for secure RPC it is possible to optionally configure encrypted communication. To do so, add the following to the hbase-site.xml file on every client:

一旦HBase配置为安全RPC,就有可能配置加密通信。为此,将以下内容添加到hbase站点。每个客户端的xml文件:

<property>
  <name>hbase.rpc.protection</name>
  <value>privacy</value>
</property>

This configuration property can also be set on a per-connection basis. Set it in the Configuration supplied to Table:

这个配置属性也可以在每个连接的基础上设置。将其设置在提供给表的配置中:

Configuration conf = HBaseConfiguration.create();
conf.set("hbase.rpc.protection", "privacy");
try (Connection connection = ConnectionFactory.createConnection(conf);
     Table table = connection.getTable(TableName.valueOf(tablename))) {
  // ... do your stuff
}

Expect a ~10% performance penalty for encrypted communication.

对于加密通信,期望有10% ~10%的性能损失。

60.4. Client-side Configuration for Secure Operation - Thrift Gateway

60.4。安全操作-节约网关的客户端配置。

Add the following to the hbase-site.xml file for every Thrift gateway:

将以下内容添加到hbase站点。每个节俭网关的xml文件:

<property>
  <name>hbase.thrift.keytab.file</name>
  <value>/etc/hbase/conf/hbase.keytab</value>
</property>
<property>
  <name>hbase.thrift.kerberos.principal</name>
  <value>$USER/_HOST@HADOOP.LOCALDOMAIN</value>
  <!-- TODO: This may need to be HTTP/_HOST@<REALM> and _HOST may not work. You may have to put the concrete full hostname. -->
</property>
<!-- Add these if you need to configure a different DNS interface from the default -->
<property>
  <name>hbase.thrift.dns.interface</name>
  <value>default</value>
</property>
<property>
  <name>hbase.thrift.dns.nameserver</name>
  <value>default</value>
</property>

Substitute the appropriate credential and keytab for $USER and $KEYTAB respectively.

分别以$USER和$ keytab替代适当的凭据和keytab。

In order to use the Thrift API principal to interact with HBase, it is also necessary to add the hbase.thrift.kerberos.principal to the acl table. For example, to give the Thrift API principal, thrift_server, administrative access, a command such as this one will suffice:

为了使用节俭API主体与HBase交互,还需要添加HBase . Thrift .kerberos。用于acl表的主体。例如,为节约API主体,thrift_server,管理访问,这样的命令就足够了:

grant 'thrift_server', 'RWCA'

For more information about ACLs, please see the Access Control Labels (ACLs) section

有关acl的更多信息,请参见访问控制标签(ACLs)部分。

The Thrift gateway will authenticate with HBase using the supplied credential. No authentication will be performed by the Thrift gateway itself. All client access via the Thrift gateway will use the Thrift gateway’s credential and have its privilege.

节约网关将使用提供的凭据对HBase进行身份验证。没有认证将由节约网关本身执行。通过节俭网关的所有客户端访问将使用节俭网关的凭据,并拥有它的特权。

60.5. Configure the Thrift Gateway to Authenticate on Behalf of the Client

60.5。配置节俭网关,以代表客户端进行身份验证。

Client-side Configuration for Secure Operation - Thrift Gateway describes how to authenticate a Thrift client to HBase using a fixed user. As an alternative, you can configure the Thrift gateway to authenticate to HBase on the client’s behalf, and to access HBase using a proxy user. This was implemented in HBASE-11349 for Thrift 1, and HBASE-11474 for Thrift 2.

安全操作-节俭网关的客户端配置描述了如何使用一个固定用户对一个节俭客户进行身份验证。作为另一种选择,您可以配置节约网关,以客户的名义向HBase进行身份验证,并使用代理用户访问HBase。这是在HBASE-11349中实现的节约1,HBASE-11474用于节约2。

Limitations with Thrift Framed Transport

If you use framed transport, you cannot yet take advantage of this feature, because SASL does not work with Thrift framed transport at this time.

如果您使用框架传输,您还不能利用这个特性,因为在这个时候SASL不使用节俭框架传输。

To enable it, do the following.

要启用它,请执行以下操作。

  1. Be sure Thrift is running in secure mode, by following the procedure described in Client-side Configuration for Secure Operation - Thrift Gateway.

    确保节俭是在安全模式下运行,遵循客户端配置中描述的安全操作—节约网关。

  2. Be sure that HBase is configured to allow proxy users, as described in REST Gateway Impersonation Configuration.

    请确保HBase配置为允许代理用户,如REST网关模拟配置所述。

  3. In hbase-site.xml for each cluster node running a Thrift gateway, set the property hbase.thrift.security.qop to one of the following three values (a sample snippet follows this list):

    在hbase-site。每个集群节点运行一个节约网关的xml,设置属性hbase.thrift.security。qop的以下三个值之一:

    • privacy - authentication, integrity, and confidentiality checking.

      隐私——认证、完整性和机密性检查。

    • integrity - authentication and integrity checking

      完整性-认证和完整性检查。

    • authentication - authentication checking only

      身份验证——只检查身份验证。

  4. Restart the Thrift gateway processes for the changes to take effect. If a node is running Thrift, the output of the jps command will list a ThriftServer process. To stop Thrift on a node, run the command bin/hbase-daemon.sh stop thrift. To start Thrift on a node, run the command bin/hbase-daemon.sh start thrift.

    重新启动节俭网关流程,以使更改生效。如果一个节点正在运行节约,jps命令的输出将列出一个ThriftServer进程。要在节点上停止节约,运行命令bin/hbase-daemon。sh停止节俭。要在节点上开始节约,运行命令bin/hbase-daemon。sh开始节俭。
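
As a reference for step 3, a sample hbase-site.xml snippet requesting the strongest protection would be:

<property>
  <name>hbase.thrift.security.qop</name>
  <value>privacy</value>
</property>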

60.6. Configure the Thrift Gateway to Use the doAs Feature

60.6。配置节俭网关以使用doAs特性。

Configure the Thrift Gateway to Authenticate on Behalf of the Client describes how to configure the Thrift gateway to authenticate to HBase on the client’s behalf, and to access HBase using a proxy user. The limitation of this approach is that after the client is initialized with a particular set of credentials, it cannot change these credentials during the session. The doAs feature provides a flexible way to impersonate multiple principals using the same client. This feature was implemented in HBASE-12640 for Thrift 1, but is currently not available for Thrift 2.

配置“节约网关”以代表客户端进行身份验证,描述如何配置“节约网关”,以客户端身份验证到HBase,并使用代理用户访问HBase。这种方法的局限性是,在客户端使用特定的凭证集初始化之后,它不能在会话期间更改这些凭据。doAs特性提供了一种灵活的方式,可以使用同一个客户机模拟多个主体。这个特性是在HBASE-12640中实现的,用于节约1,但是目前还不能用于节约2。

To enable the doAs feature, add the following to the hbase-site.xml file for every Thrift gateway:

要启用doAs功能,请将以下内容添加到hbase站点。每个节俭网关的xml文件:

<property>
  <name>hbase.regionserver.thrift.http</name>
  <value>true</value>
</property>
<property>
  <name>hbase.thrift.support.proxyuser</name>
  <value>true</value>
</property>

To allow proxy users when using doAs impersonation, add the following to the hbase-site.xml file for every HBase node:

为了在使用doAs模拟时允许代理用户,请将以下内容添加到hbase站点。每个HBase节点的xml文件:

<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hadoop.proxyuser.$USER.groups</name>
  <value>$GROUPS</value>
</property>
<property>
  <name>hadoop.proxyuser.$USER.hosts</name>
  <value>$HOSTS</value>
</property>

Take a look at the demo client to get an overall idea of how to use this feature in your client.

看一下演示客户端,了解如何在客户端使用这个特性。

60.7. Client-side Configuration for Secure Operation - REST Gateway

60.7。安全操作- REST网关的客户端配置。

Add the following to the hbase-site.xml file for every REST gateway:

将以下内容添加到hbase站点。每个REST网关的xml文件:

<property>
  <name>hbase.rest.keytab.file</name>
  <value>$KEYTAB</value>
</property>
<property>
  <name>hbase.rest.kerberos.principal</name>
  <value>$USER/_HOST@HADOOP.LOCALDOMAIN</value>
</property>

Substitute the appropriate credential and keytab for $USER and $KEYTAB respectively.

分别以$USER和$ keytab替代适当的凭据和keytab。

The REST gateway will authenticate with HBase using the supplied credential.

REST网关将使用提供的凭据对HBase进行身份验证。

In order to use the REST API principal to interact with HBase, it is also necessary to add the hbase.rest.kerberos.principal to the acl table. For example, to give the REST API principal, rest_server, administrative access, a command such as this one will suffice:

为了使用REST API主体与HBase交互,还需要添加HBase .rest.kerberos。用于acl表的主体。例如,要给REST API principal、rest_server、管理访问,这样的命令就足够了:

grant 'rest_server', 'RWCA'

For more information about ACLs, please see the Access Control Labels (ACLs) section

有关acl的更多信息,请参见访问控制标签(ACLs)部分。

HBase REST gateway supports SPNEGO HTTP authentication for client access to the gateway. To enable REST gateway Kerberos authentication for client access, add the following to the hbase-site.xml file for every REST gateway.

HBase REST网关支持为客户端访问网关的SPNEGO HTTP身份验证。要为客户端访问启用REST网关Kerberos身份验证,请将以下内容添加到hbase站点。每个REST网关的xml文件。

<property>
  <name>hbase.rest.support.proxyuser</name>
  <value>true</value>
</property>
<property>
  <name>hbase.rest.authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>hbase.rest.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@HADOOP.LOCALDOMAIN</value>
</property>
<property>
  <name>hbase.rest.authentication.kerberos.keytab</name>
  <value>$KEYTAB</value>
</property>
<!-- Add these if you need to configure a different DNS interface from the default -->
<property>
  <name>hbase.rest.dns.interface</name>
  <value>default</value>
</property>
<property>
  <name>hbase.rest.dns.nameserver</name>
  <value>default</value>
</property>

Substitute the keytab for HTTP for $KEYTAB.

将HTTP的keytab替换为$ keytab。

HBase REST gateway supports different 'hbase.rest.authentication.type': simple, kerberos. You can also implement a custom authentication by implementing Hadoop AuthenticationHandler, then specify the full class name as 'hbase.rest.authentication.type' value. For more information, refer to SPNEGO HTTP authentication.

HBase REST网关支持不同的“HBase .rest.authentication”。类型:简单,kerberos。您还可以通过实现Hadoop AuthenticationHandler来实现自定义身份验证,然后将完整的类名指定为“hbase.rest.authentication”。类型的值。有关更多信息,请参阅SPNEGO HTTP身份验证。

60.8. REST Gateway Impersonation Configuration

60.8。其他网关模拟配置

By default, the REST gateway doesn’t support impersonation. It accesses HBase on behalf of clients as the user configured in the previous section. To the HBase server, all requests are from the REST gateway user. The actual users are unknown. You can turn on impersonation support. With impersonation, the REST gateway user is a proxy user. The HBase server knows the actual/real user of each request. So it can apply proper authorizations.

默认情况下,REST网关不支持模拟。它代表客户端访问HBase,就像在前一节中配置的用户一样。对于HBase服务器,所有请求都来自REST网关用户。实际的用户是未知的。您可以打开模拟支持。通过模拟,REST gateway用户是一个代理用户。HBase服务器知道每个请求的实际/实际用户。因此,它可以应用适当的授权。

To turn on REST gateway impersonation, we need to configure HBase servers (masters and region servers) to allow proxy users, and configure the REST gateway to enable impersonation.

要打开REST网关模拟,我们需要配置HBase服务器(主机和区域服务器)来允许代理用户;配置REST网关以启用模拟。

To allow proxy users, add the following to the hbase-site.xml file for every HBase server:

要允许代理用户,请将以下内容添加到hbase站点。每个HBase服务器的xml文件:

<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hadoop.proxyuser.$USER.groups</name>
  <value>$GROUPS</value>
</property>
<property>
  <name>hadoop.proxyuser.$USER.hosts</name>
  <value>$HOSTS</value>
</property>

Substitute the REST gateway proxy user for $USER, the allowed group list for $GROUPS, and the allowed host list for $HOSTS.

将REST网关代理用户替换为$ user,并将允许的组列表替换为$GROUPS。

To enable REST gateway impersonation, add the following to the hbase-site.xml file for every REST gateway.

要启用REST网关模拟,请将以下内容添加到hbase站点。每个REST网关的xml文件。

<property>
  <name>hbase.rest.authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>hbase.rest.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@HADOOP.LOCALDOMAIN</value>
</property>
<property>
  <name>hbase.rest.authentication.kerberos.keytab</name>
  <value>$KEYTAB</value>
</property>

Substitute the keytab for HTTP for $KEYTAB.

将HTTP的keytab替换为$ keytab。

61. Simple User Access to Apache HBase

61年。简单的用户访问Apache HBase。

Newer releases of Apache HBase (>= 0.92) support optional SASL authentication of clients. See also Matteo Bertozzi’s article on Understanding User Authentication and Authorization in Apache HBase.

Apache HBase的更新版本(>= 0.92)支持客户端可选的SASL认证。请参阅Matteo Bertozzi的文章,了解Apache HBase中的用户身份验证和授权。

This describes how to set up Apache HBase and clients for simple user access to HBase resources.

这描述了如何设置Apache HBase和客户端,以便简单的用户访问HBase资源。

61.1. Simple versus Secure Access

61.1。简单和安全的访问

The following section shows how to set up simple user access. Simple user access is not a secure method of operating HBase. This method is used to prevent users from making mistakes. It can be used to mimic Access Control behavior on a development system without having to set up Kerberos.

下一节将介绍如何设置简单的用户访问。简单的用户访问不是操作HBase的安全方法。该方法用于防止用户出错。它可以用来模拟在开发系统上使用的访问控制,而不必设置Kerberos。

This method is not used to prevent malicious or hacking attempts. To make HBase secure against these types of attacks, you must configure HBase for secure operation. Refer to the section Secure Client Access to Apache HBase and complete all of the steps described there.

此方法不用于防止恶意或黑客攻击。要使HBase安全防范这些类型的攻击,您必须为安全操作配置HBase。请参见安全客户端访问Apache HBase的部分,并完成所描述的所有步骤。

61.2. Prerequisites

61.2。先决条件

None

没有一个

61.3. Server-side Configuration for Simple User Access Operation

61.3。用于简单用户访问操作的服务器端配置。

Add the following to the hbase-site.xml file on every server machine in the cluster:

将以下内容添加到hbase站点。集群中每个服务器机器上的xml文件:

<property>
  <name>hbase.security.authentication</name>
  <value>simple</value>
</property>
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>

For 0.94, add the following to the hbase-site.xml file on every server machine in the cluster:

在0.94中,将以下内容添加到hbase站点。集群中每个服务器机器上的xml文件:

<property>
  <name>hbase.rpc.engine</name>
  <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>

A full shutdown and restart of HBase service is required when deploying these configuration changes.

在部署这些配置更改时,需要完全关闭和重新启动HBase服务。

61.4. Client-side Configuration for Simple User Access Operation

61.4。简单用户访问操作的客户端配置。

Add the following to the hbase-site.xml file on every client:

将以下内容添加到hbase站点。每个客户端的xml文件:

<property>
  <name>hbase.security.authentication</name>
  <value>simple</value>
</property>

For 0.94, add the following to the hbase-site.xml file on every server machine in the cluster:

在0.94中,将以下内容添加到hbase站点。集群中每个服务器机器上的xml文件:

<property>
  <name>hbase.rpc.engine</name>
  <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>

Be advised that if the hbase.security.authentication in the client- and server-side site files do not match, the client will not be able to communicate with the cluster.

请注意,如果hbase.security。在客户端和服务器端站点文件中的身份验证不匹配,客户端将无法与集群通信。

61.4.1. Client-side Configuration for Simple User Access Operation - Thrift Gateway

61.4.1。简单用户访问操作-节俭网关的客户端配置。

The Thrift gateway user will need access. For example, to give the Thrift API user, thrift_server, administrative access, a command such as this one will suffice:

节俭网关用户将需要访问。例如,为节约API用户,thrift_server,管理访问,这样的命令就足够了:

grant 'thrift_server', 'RWCA'

For more information about ACLs, please see the Access Control Labels (ACLs) section

有关acl的更多信息,请参见访问控制标签(ACLs)部分。

The Thrift gateway will authenticate with HBase using the supplied credential. No authentication will be performed by the Thrift gateway itself. All client access via the Thrift gateway will use the Thrift gateway’s credential and have its privilege.

节约网关将使用提供的凭据对HBase进行身份验证。没有认证将由节约网关本身执行。通过节俭网关的所有客户端访问将使用节俭网关的凭据,并拥有它的特权。

61.4.2. Client-side Configuration for Simple User Access Operation - REST Gateway

61.4.2。简单用户访问操作- REST网关的客户端配置。

The REST gateway will authenticate with HBase using the supplied credential. No authentication will be performed by the REST gateway itself. All client access via the REST gateway will use the REST gateway’s credential and have its privilege.

REST网关将使用提供的凭据对HBase进行身份验证。其他网关本身不会执行任何身份验证。所有通过REST网关的客户端访问都将使用REST gateway的凭据,并具有它的特权。

The REST gateway user will need access. For example, to give the REST API user, rest_server, administrative access, a command such as this one will suffice:

其他网关用户将需要访问。例如,为了给REST API用户、rest_server、管理访问,这样的命令就足够了:

grant 'rest_server', 'RWCA'

For more information about ACLs, please see the Access Control Labels (ACLs) section

有关acl的更多信息,请参见访问控制标签(ACLs)部分。

It should be possible for clients to authenticate with the HBase cluster through the REST gateway in a pass-through manner via SPNEGO HTTP authentication. This is future work.

客户机可以通过SPNEGO HTTP身份验证通过其他网关对HBase集群进行身份验证。这是未来的工作。

62. Securing Access to HDFS and ZooKeeper

62年。确保对HDFS和ZooKeeper的访问。

Secure HBase requires secure ZooKeeper and HDFS so that users cannot access and/or modify the metadata and data from under HBase. HBase uses HDFS (or configured file system) to keep its data files as well as write ahead logs (WALs) and other data. HBase uses ZooKeeper to store some metadata for operations (master address, table locks, recovery state, etc).

安全的HBase需要安全的ZooKeeper和HDFS,这样用户就不能访问和/或修改HBase下的元数据和数据。HBase使用HDFS(或配置的文件系统)来保存其数据文件,并写入前面的日志(WALs)和其他数据。HBase使用ZooKeeper来存储一些操作元数据(主地址、表锁、恢复状态等)。

62.1. Securing ZooKeeper Data

62.1。保护动物园管理员数据

ZooKeeper has a pluggable authentication mechanism to enable access from clients using different methods. ZooKeeper even allows authenticated and un-authenticated clients at the same time. The access to znodes can be restricted by providing Access Control Lists (ACLs) per znode. An ACL contains two components, the authentication method and the principal. ACLs are NOT enforced hierarchically. See ZooKeeper Programmers Guide for details.

ZooKeeper有一个可插入的身份验证机制,可以使用不同的方法来访问客户端。ZooKeeper甚至允许同时进行身份验证和未经身份验证的客户端。通过提供每个znode的访问控制列表(ACLs),可以限制对znode的访问。ACL包含两个组件:身份验证方法和主体。acl不是分层执行的。详细信息请参见ZooKeeper程序员指南。

HBase daemons authenticate to ZooKeeper via SASL and kerberos (See SASL Authentication with ZooKeeper). HBase sets up the znode ACLs so that only the HBase user and the configured hbase superuser (hbase.superuser) can access and modify the data. In cases where ZooKeeper is used for service discovery or sharing state with the client, the znodes created by HBase will also allow anyone (regardless of authentication) to read these znodes (clusterId, master address, meta location, etc), but only the HBase user can modify them.

HBase守护进程通过SASL和kerberos对ZooKeeper进行身份验证(见SASL认证与ZooKeeper)。HBase设置了znode acl,这样只有HBase用户和配置的HBase超级用户(HBase .superuser)才能访问和修改数据。在将ZooKeeper用于服务发现或与客户共享状态的情况下,HBase创建的znode还允许任何人(无论身份验证)读取这些znode (clusterId、master address、meta location等),但只有HBase用户可以修改它们。

62.2. Securing File System (HDFS) Data

62.2。保护文件系统(HDFS)数据。

All of the data under management is kept under the root directory in the file system (hbase.rootdir). Access to the data and WAL files in the filesystem should be restricted so that users cannot bypass the HBase layer, and peek at the underlying data files from the file system. HBase assumes the filesystem used (HDFS or other) enforces permissions hierarchically. If sufficient protection from the file system (both authorization and authentication) is not provided, HBase level authorization control (ACLs, visibility labels, etc) is meaningless since the user can always access the data from the file system.

管理中的所有数据都保存在文件系统的根目录下(hbase.rootdir)。应该限制对文件系统中的数据和WAL文件的访问,这样用户就不能绕过HBase层,并从文件系统中查看底层数据文件。HBase假定所使用的文件系统(HDFS或其他)以分层的方式执行权限。如果没有提供来自文件系统(授权和身份验证)的充分保护,那么HBase级别的授权控制(ACLs、可见性标签等)是没有意义的,因为用户总是可以从文件系统访问数据。

HBase enforces the posix-like permissions 700 (rwx------) to its root directory. It means that only the HBase user can read or write the files in FS. The default setting can be changed by configuring hbase.rootdir.perms in hbase-site.xml. A restart of the active master is needed so that it changes the used permissions. For versions before 1.2.0, you can check whether HBASE-13780 is committed, and if not, you can manually set the permissions for the root directory if needed. Using HDFS, the command would be:

HBase将类似posix的权限700 (rwx-----)强制到它的根目录。这意味着只有HBase用户可以读取或写入FS中的文件。可以通过配置hbase.rootdir来更改默认设置。在hbase-site.xml烫发。需要重新启动active master,以便它更改已使用的权限。对于1.2.0之前的版本,您可以检查HBASE-13780是否已提交,如果没有,您可以在需要时手动设置根目录的权限。使用HDFS时,命令如下:

sudo -u hdfs hadoop fs -chmod 700 /hbase

You should change /hbase if you are using a different hbase.rootdir.

如果使用不同的hbase.rootdir,您应该更改/hbase。
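
For completeness, a minimal hbase-site.xml fragment for overriding the root directory permissions described above (the value shown simply restates the default):

<property>
  <name>hbase.rootdir.perms</name>
  <value>700</value>
</property>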

In secure mode, SecureBulkLoadEndpoint should be configured and used for properly handing off user files created from MR jobs to the HBase daemons and HBase user. The staging directory in the distributed file system used for bulk load (hbase.bulkload.staging.dir, defaults to /tmp/hbase-staging) should have mode 711 (rwx--x--x) so that users can access the staging directory created under that parent directory, but cannot do any other operation. See Secure Bulk Load for how to configure SecureBulkLoadEndPoint.

在安全模式下,应该配置SecureBulkLoadEndpoint,并用于正确地将由jobs创建的用户文件交给HBase守护进程和HBase用户。用于批量加载的分布式文件系统中的临时目录(hbase.bulkload.staging)。dir(默认为/tmp/hbase-staging)应该有(模式711,或rwx - x - x),这样用户就可以访问在父目录下创建的临时目录,但是不能执行其他操作。有关如何配置SecureBulkLoadEndPoint的安全批量加载。
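
For reference, a sketch of the corresponding hbase-site.xml entry (the path shown is the documented default):

<property>
  <name>hbase.bulkload.staging.dir</name>
  <value>/tmp/hbase-staging</value>
</property>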

63. Securing Access To Your Data

63年。确保访问您的数据。

After you have configured secure authentication between HBase client and server processes and gateways, you need to consider the security of your data itself. HBase provides several strategies for securing your data:

在HBase客户端和服务器进程和网关之间配置了安全认证之后,您需要考虑数据本身的安全性。HBase提供了几种保护数据的策略:

  • Role-based Access Control (RBAC) controls which users or groups can read and write to a given HBase resource or execute a coprocessor endpoint, using the familiar paradigm of roles.

    基于角色的访问控制(RBAC)控件,用户或组可以使用熟悉的角色范例来读取和写入给定的HBase资源或执行协处理器端点。

  • Visibility Labels which allow you to label cells and control access to labelled cells, to further restrict who can read or write to certain subsets of your data. Visibility labels are stored as tags. See hbase.tags for more information.

    可见标签允许您标记单元格并控制对标记单元的访问,从而进一步限制谁可以读取或写入您的数据的某些子集。可见标签存储为标签。看到hbase。标签的更多信息。

  • Transparent encryption of data at rest on the underlying filesystem, both in HFiles and in the WAL. This protects your data at rest from an attacker who has access to the underlying filesystem, without the need to change the implementation of the client. It can also protect against data leakage from improperly disposed disks, which can be important for legal and regulatory compliance.

    透明的加密数据在底层文件系统上,包括在HFiles和在WAL中。这将保护您的数据免受攻击者的休息,因为攻击者可以访问底层文件系统,而不需要更改客户端的实现。它还可以防止不正确地配置磁盘的数据泄漏,这对于法律和法规遵从性很重要。

Server-side configuration, administration, and implementation details of each of these features are discussed below, along with any performance trade-offs. An example security configuration is given at the end, to show these features all used together, as they might be in a real-world scenario.

下面将讨论每个特性的服务器端配置、管理和实现细节,以及任何性能权衡。最后给出了一个示例安全性配置,以显示这些特性都是一起使用的,因为它们可能存在于实际场景中。

All aspects of security in HBase are in active development and evolving rapidly. Any strategy you employ for security of your data should be thoroughly tested. In addition, some of these features are still in the experimental stage of development. To take advantage of many of these features, you must be running HBase 0.98+ and using the HFile v3 file format.
Protecting Sensitive Files

Several procedures in this section require you to copy files between cluster nodes. When copying keys, configuration files, or other files containing sensitive strings, use a secure method, such as ssh, to avoid leaking sensitive data.

本节中的几个步骤要求您在集群节点之间复制文件。在复制密钥、配置文件或包含敏感字符串的其他文件时,使用安全方法,例如ssh,以避免泄漏敏感数据。

Procedure: Basic Server-Side Configuration
  1. Enable HFile v3, by setting hfile.format.version to 3 in hbase-site.xml. This is the default for HBase 1.0 and newer.

    启用HFile v3,通过设置HFile .format。在hbase-site.xml中版本为3。这是HBase 1.0和更新版本的默认值。

    <property>
      <name>hfile.format.version</name>
      <value>3</value>
    </property>
  2. Enable SASL and Kerberos authentication for RPC and ZooKeeper, as described in security.prerequisites and SASL Authentication with ZooKeeper.

    如安全性所述,启用RPC和ZooKeeper的SASL和Kerberos身份验证。与ZooKeeper的先决条件和SASL认证。

63.1. Tags

63.1。标签

Tags are a feature of HFile v3. A tag is a piece of metadata which is part of a cell, separate from the key, value, and version. Tags are an implementation detail which provides a foundation for other security-related features such as cell-level ACLs and visibility labels. Tags are stored in the HFiles themselves. It is possible that in the future, tags will be used to implement other HBase features. You don’t need to know a lot about tags in order to use the security features they enable.

标签是HFile v3的一个特性。标记是元数据的一部分,它是单元格的一部分,与键、值和版本分开。标签是一种实现细节,它为其他与安全相关的特性提供了基础,如:细胞级acl和可见性标签。标记存储在HFiles中。在将来可能会使用标记来实现其他HBase特性。为了使用它们启用的安全特性,您不需要知道很多关于标记的信息。

63.1.1. Implementation Details

63.1.1。实现细节

Every cell can have zero or more tags. Every tag has a type and the actual tag byte array.

每个单元格都可以有零个或多个标记。每个标记都有一个类型和实际的标签字节数组。

Just as row keys, column families, qualifiers and values can be encoded (see data.block.encoding.types), tags can also be encoded as well. You can enable or disable tag encoding at the level of the column family, and it is enabled by default. Use the HColumnDescriptor#setCompressionTags(boolean compressTags) method to manage encoding settings on a column family. You also need to enable the DataBlockEncoder for the column family, for encoding of tags to take effect.

正如行键、列族、限定符和值可以被编码(请参阅data.block.encoding.types),标签也可以被编码。您可以在列家族级别启用或禁用标记编码,默认情况下启用它。使用HColumnDescriptor#setCompressionTags(boolean compressTags)方法来管理列家族的编码设置。您还需要为列家族启用DataBlockEncoder,以便对标记进行编码以生效。

You can enable compression of each tag in the WAL, if WAL compression is also enabled, by setting the value of hbase.regionserver.wal.tags.enablecompression to true in hbase-site.xml. Tag compression uses dictionary encoding.

您可以通过设置hbase. local server.wal.tags的值,在WAL中启用对每个标记的压缩,如果还启用了WAL压缩。在hbase-site.xml中实现了对true的支持。标记压缩使用字典编码。
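
A minimal hbase-site.xml fragment for the setting described above (WAL compression itself must also be enabled for it to have any effect):

<property>
  <name>hbase.regionserver.wal.tags.enablecompression</name>
  <value>true</value>
</property>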

Tag compression is not supported when using WAL encryption.

使用WAL加密时,不支持标记压缩。

63.2. Access Control Labels (ACLs)

63.2。访问控制标签(acl)

63.2.1. How It Works

63.2.1。它是如何工作的

ACLs in HBase are based upon a user’s membership in or exclusion from groups, and a given group’s permissions to access a given resource. ACLs are implemented as a coprocessor called AccessController.

HBase中的acl基于用户的成员资格,或从组中排除,以及给定组访问给定资源的权限。acl被实现为一个称为AccessController的协处理器。

HBase does not maintain a private group mapping, but relies on a Hadoop group mapper, which maps between entities in a directory such as LDAP or Active Directory, and HBase users. Any supported Hadoop group mapper will work. Users are then granted specific permissions (Read, Write, Execute, Create, Admin) against resources (global, namespaces, tables, cells, or endpoints).

HBase不维护一个私有组映射,而是依赖于一个Hadoop group mapper,它映射一个目录中的实体(如LDAP或Active directory)和HBase用户之间的映射。任何支持的Hadoop group mapper都可以工作。然后,用户被授予特定权限(读、写、执行、创建、管理)与资源(全局、名称空间、表、单元或端点)。

With Kerberos and Access Control enabled, client access to HBase is authenticated and user data is private unless access has been explicitly granted.

HBase has a simpler security model than relational databases, especially in terms of client operations. No distinction is made between an insert (new record) and update (of existing record), for example, as both collapse down into a Put.

HBase具有比关系数据库更简单的安全模型,特别是在客户端操作方面。例如,在插入(新记录)和更新(现有记录)之间没有区别,例如,两者都崩溃到一个Put中。

Understanding Access Levels
了解访问级别

HBase access levels are granted independently of each other and allow for different types of operations at a given scope.

HBase访问级别是相互独立的,并允许在给定的范围内进行不同类型的操作。

  • Read (R) - can read data at the given scope

    读取(R) -可以在给定的范围内读取数据。

  • Write (W) - can write data at the given scope

    写入(W) -可以在给定的范围内写入数据。

  • Execute (X) - can execute coprocessor endpoints at the given scope

    执行(X) -可以在给定的范围执行协处理器端点。

  • Create (C) - can create tables or drop tables (even those they did not create) at the given scope

    创建(C) -可以在给定的范围内创建表或删除表(甚至是它们没有创建的表)。

  • Admin (A) - can perform cluster operations such as balancing the cluster or assigning regions at the given scope

    Admin (A) -可以执行集群操作,例如在给定的范围内平衡集群或分配区域。

The possible scopes are:

可能的范围是:

  • Superuser - superusers can perform any operation available in HBase, to any resource. The user who runs HBase on your cluster is a superuser, as are any principals assigned to the configuration property hbase.superuser in hbase-site.xml on the HMaster.

    超级用户-超级用户可以在HBase中执行任何操作,任何资源。在集群上运行HBase的用户是一个超级用户,和分配给配置属性HBase的任何主体一样。在hbase-site超级用户。HMaster xml。

  • Global - permissions granted at global scope allow the admin to operate on all tables of the cluster.

    全局范围内授予的全局权限允许管理员在集群的所有表上运行。

  • Namespace - permissions granted at namespace scope apply to all tables within a given namespace.

    命名空间范围内授予的名称空间——适用于给定名称空间中的所有表。

  • Table - permissions granted at table scope apply to data or metadata within a given table.

    表范围内授予的权限适用于给定表中的数据或元数据。

  • ColumnFamily - permissions granted at ColumnFamily scope apply to cells within that ColumnFamily.

    ColumnFamily (ColumnFamily)的权限适用于该列家庭中的单元格。

  • Cell - permissions granted at cell scope apply to that exact cell coordinate (key, value, timestamp). This allows for policy evolution along with data.

    在单元格范围内授予的单元格权限适用于精确的单元坐标(键、值、时间戳)。这允许策略随数据一起演进。

    To change an ACL on a specific cell, write an updated cell with new ACL to the precise coordinates of the original.

    若要更改特定单元格上的ACL,请使用新ACL将更新后的单元格写入原始的精确坐标。

    If you have a multi-versioned schema and want to update ACLs on all visible versions, you need to write new cells for all visible versions. The application has complete control over policy evolution.

    如果您有一个多版本的模式,并且想要更新所有可见版本的acl,那么您需要为所有可见版本编写新的单元格。应用程序完全控制策略的演变。

    The exception to the above rule is append and increment processing. Appends and increments can carry an ACL in the operation. If one is included in the operation, then it will be applied to the result of the append or increment. Otherwise, the ACL of the existing cell you are appending to or incrementing is preserved.

    上述规则的例外是附加和递增处理。附加和增量可以在操作中携带ACL。如果其中一个包含在操作中,那么它将被应用到附加或增量的结果中。否则,您所添加的现有单元格的ACL将被保留。

The combination of access levels and scopes creates a matrix of possible access levels that can be granted to a user. In a production environment, it is useful to think of access levels in terms of what is needed to do a specific job. The following list describes appropriate access levels for some common types of HBase users. It is important not to grant more access than is required for a given user to perform their required tasks.

访问级别和作用域的组合创建了一个可授予用户的可能访问级别的矩阵。在生产环境中,考虑到需要做特定工作所需的访问级别是很有用的。下面的列表描述了一些普通类型的HBase用户的适当访问级别。重要的是,不授予比给定用户执行所需任务所需的更多访问权限。

  • Superusers - In a production system, only the HBase user should have superuser access. In a development environment, an administrator may need superuser access in order to quickly control and manage the cluster. However, this type of administrator should usually be a Global Admin rather than a superuser.

    超级用户——在生产系统中,只有HBase用户应该拥有超级用户访问权限。在开发环境中,管理员可能需要超级用户访问,以便快速控制和管理集群。但是,这种类型的管理员通常应该是一个全局管理员,而不是一个超级用户。

  • Global Admins - A global admin can perform tasks and access every table in HBase. In a typical production environment, an admin should not have Read or Write permissions to data within tables.

    全局管理员——全局管理员可以执行任务并访问HBase中的每个表。在典型的生产环境中,管理员不应该对表中的数据进行读或写权限。

  • A global admin with Admin permissions can perform cluster-wide operations on the cluster, such as balancing, assigning or unassigning regions, or calling an explicit major compaction. This is an operations role.

    具有管理权限的全局管理可以在集群上执行集群范围的操作,比如平衡、分配或取消区域,或者调用显式的主压缩。这是一个操作角色。

  • A global admin with Create permissions can create or drop any table within HBase. This is more of a DBA-type role.

    具有创建权限的全局管理可以在HBase中创建或删除任何表。这更像是dba类型的角色。

    In a production environment, it is likely that different users will have only one of Admin and Create permissions.

    在生产环境中,不同的用户可能只有一个管理员并创建权限。

    In the current implementation, a Global Admin with Admin permission can grant himself Read and Write permissions on a table and gain access to that table’s data. For this reason, only grant Global Admin permissions to trusted user who actually need them.

    在当前实现中,具有Admin权限的全局管理可以允许自己在表上读取和写入权限,并访问该表的数据。出于这个原因,只授予真正需要的可信用户的全局管理权限。

    Also be aware that a Global Admin with Create permission can perform a Put operation on the ACL table, simulating a grant or revoke and circumventing the authorization check for Global Admin permissions.

    还要注意,具有创建权限的全局管理可以在ACL表上执行Put操作,模拟grant或revoke,并规避全局管理权限的授权检查。

    Due to these issues, be cautious with granting Global Admin privileges.

    由于这些问题,要谨慎地授予全局管理员权限。

  • Namespace Admins - a namespace admin with Create permissions can create or drop tables within that namespace, and take and restore snapshots. A namespace admin with Admin permissions can perform operations such as splits or major compactions on tables within that namespace.

    名称空间管理器—具有创建权限的名称空间管理器可以在该名称空间中创建或删除表,并获取和恢复快照。具有管理权限的名称空间管理可以在该名称空间内的表上执行拆分或大型压缩等操作。

  • Table Admins - A table admin can perform administrative operations only on that table. A table admin with Create permissions can create snapshots from that table or restore that table from a snapshot. A table admin with Admin permissions can perform operations such as splits or major compactions on that table.

    表管理员—一个表管理员只能在该表上执行管理操作。具有创建权限的表管理可以从该表创建快照或从快照恢复该表。具有管理权限的表管理员可以执行操作,例如在该表上的拆分或主要的压缩。

  • Users - Users can read or write data, or both. Users can also execute coprocessor endpoints, if given Executable permissions.

    用户——用户可以读取或写入数据,或者两者都可以。如果给定可执行权限,用户还可以执行协处理器端点。

Table 8. Real-World Example of Access Levels

Job Title            | Scope  | Permissions    | Description
---------------------|--------|----------------|----------------------------------------------------------------
Senior Administrator | Global | Access, Create | Manages the cluster and gives access to Junior Administrators.
Junior Administrator | Global | Create         | Creates tables and gives access to Table Administrators.
Table Administrator  | Table  | Access         | Maintains a table from an operations point of view.
Data Analyst         | Table  | Read           | Creates reports from HBase data.
Web Application      | Table  | Read, Write    | Puts data into HBase and uses HBase data to perform operations.

ACL Matrix

For more details on how ACLs map to specific HBase operations and tasks, see appendix acl matrix.

有关ACLs如何映射特定的HBase操作和任务的详细信息,请参阅附录acl矩阵。

Implementation Details
实现细节

Cell-level ACLs are implemented using tags (see Tags). In order to use cell-level ACLs, you must be using HFile v3 and HBase 0.98 or newer.

单元级acl使用标记实现(参见标签)。为了使用单元级别的acl,必须使用HFile v3和HBase 0.98或更新。

  1. Files created by HBase are owned by the operating system user running the HBase process. To interact with HBase files, you should use the API or bulk load facility.

    由HBase创建的文件由运行HBase进程的操作系统用户拥有。要与HBase文件交互,您应该使用API或批量加载工具。

  2. HBase does not model "roles" internally in HBase. Instead, group names can be granted permissions. This allows external modeling of roles via group membership. Groups are created and manipulated externally to HBase, via the Hadoop group mapping service.

    HBase并不在HBase内部建立“角色”。相反,可以授予组名权限。这允许通过组成员关系对角色进行外部建模。通过Hadoop组映射服务,将组从外部创建和操作到HBase。

Server-Side Configuration
服务器端配置
  1. As a prerequisite, perform the steps in Procedure: Basic Server-Side Configuration.

    作为先决条件,执行过程中的步骤:基本的服务器端配置。

  2. Install and configure the AccessController coprocessor, by setting the following properties in hbase-site.xml. These properties take a list of classes.

    通过在hbase-site.xml中设置以下属性,安装并配置AccessController协处理器。这些属性包含类的列表。

    If you use the AccessController along with the VisibilityController, the AccessController must come first in the list, because with both components active, the VisibilityController will delegate access control on its system tables to the AccessController. For an example of using both together, see Security Configuration Example.
    <property>
      <name>hbase.security.authorization</name>
      <value>true</value>
    </property>
    <property>
      <name>hbase.coprocessor.region.classes</name>
      <value>org.apache.hadoop.hbase.security.access.AccessController, org.apache.hadoop.hbase.security.token.TokenProvider</value>
    </property>
    <property>
      <name>hbase.coprocessor.master.classes</name>
      <value>org.apache.hadoop.hbase.security.access.AccessController</value>
    </property>
    <property>
      <name>hbase.coprocessor.regionserver.classes</name>
      <value>org.apache.hadoop.hbase.security.access.AccessController</value>
    </property>
    <property>
      <name>hbase.security.exec.permission.checks</name>
      <value>true</value>
    </property>

    Optionally, you can enable transport security, by setting hbase.rpc.protection to privacy. This requires HBase 0.98.4 or newer.

    可选地,您可以通过设置hbase.rpc来启用传输安全性。保护隐私。这需要HBase 0.98.4或更新。

  3. Set up the Hadoop group mapper in the Hadoop namenode’s core-site.xml. This is a Hadoop file, not an HBase file. Customize it to your site’s needs. Following is an example.

    在Hadoop namenode的core-site.xml中设置Hadoop group mapper。这是一个Hadoop文件,不是HBase文件。根据站点的需要定制它。下面是一个例子。

    <property>
      <name>hadoop.security.group.mapping</name>
      <value>org.apache.hadoop.security.LdapGroupsMapping</value>
    </property>
    
    <property>
      <name>hadoop.security.group.mapping.ldap.url</name>
      <value>ldap://server</value>
    </property>
    
    <property>
      <name>hadoop.security.group.mapping.ldap.bind.user</name>
      <value>Administrator@example-ad.local</value>
    </property>
    
    <property>
      <name>hadoop.security.group.mapping.ldap.bind.password</name>
      <value>****</value>
    </property>
    
    <property>
      <name>hadoop.security.group.mapping.ldap.base</name>
      <value>dc=example-ad,dc=local</value>
    </property>
    
    <property>
      <name>hadoop.security.group.mapping.ldap.search.filter.user</name>
      <value>(&amp;(objectClass=user)(sAMAccountName={0}))</value>
    </property>
    
    <property>
      <name>hadoop.security.group.mapping.ldap.search.filter.group</name>
      <value>(objectClass=group)</value>
    </property>
    
    <property>
      <name>hadoop.security.group.mapping.ldap.search.attr.member</name>
      <value>member</value>
    </property>
    
    <property>
      <name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>
      <value>cn</value>
    </property>
  4. Optionally, enable the early-out evaluation strategy. Prior to HBase 0.98.0, if a user was not granted access to a column family, or at least a column qualifier, an AccessDeniedException would be thrown. HBase 0.98.0 removed this exception in order to allow cell-level exceptional grants. To restore the old behavior in HBase 0.98.0-0.98.6, set hbase.security.access.early_out to true in hbase-site.xml. In HBase 0.98.6, the default has been returned to true.

    可选地,启用早期的评估策略。在HBase 0.98.0之前,如果用户没有获得对列家族的访问权,或者至少是一个列限定符,那么将抛出一个AccessDeniedException。HBase 0.98.0删除了这个异常,以便允许单元级别的异常授予。要恢复HBase 0.98.0-0.98.6的旧行为,设置hbase.security.access。在hbase-site.xml中,early_out为true。在HBase 0.98.6中,默认值已返回true。

  5. Distribute your configuration and restart your cluster for changes to take effect.

    分配您的配置并重新启动集群以使更改生效。

  6. To test your configuration, log into HBase Shell as a given user and use the whoami command to report the groups your user is part of. In this example, the user is reported as being a member of the services group.

    要测试您的配置,请登录到HBase Shell作为一个给定的用户,并使用whoami命令来报告您的用户所属的组。在本例中,用户被报告为服务组的成员。

    hbase> whoami
    service (auth:KERBEROS)
        groups: services
Administration
政府

Administration tasks can be performed from HBase Shell or via an API.

可以从HBase Shell或通过API执行管理任务。

API Examples

Many of the API examples below are taken from source files hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java and hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/SecureTestUtil.java.

下面的许多API示例都是取自源文件hbase-server/src/test/java/ org/apache/hadoop/hbase/security/access/testaccesscontroller。java和java /org/apache/hadoop/hbase/security/access/SecureTestUtil.java hbase-server / src /测试/。

Neither the examples, nor the source files they are taken from, are part of the public HBase API, and are provided for illustration only. Refer to the official API for usage instructions.

这些示例和它们所使用的源文件都不是公共HBase API的一部分,仅供参考。请参考使用说明的官方API。

  1. User and Group Administration

    用户和组管理

    Users and groups are maintained external to HBase, in your directory.

    在您的目录中,用户和组被维护在HBase外部。

  2. Granting Access To A Namespace, Table, Column Family, or Cell

    允许访问名称空间、表、列族或单元格。

    There are a few different types of syntax for grant statements. The first, and most familiar, is as follows, with the table and column family being optional:

    grant语句有几种不同的语法。第一个,也是最熟悉的,如下,表和列家庭是可选的:

    grant 'user', 'RWXCA', 'TABLE', 'CF', 'CQ'

    Groups and users are granted access in the same way, but groups are prefixed with an @ symbol. Tables and namespaces are also specified in the same way, but namespaces are prefixed with an @ symbol.

    组和用户以相同的方式被授予访问权限,但是组以@符号作为前缀。同样,表和名称空间也是以同样的方式指定的,但是名称空间是用@符号来前缀的。

    It is also possible to grant multiple permissions against the same resource in a single statement, as in this example. The first sub-clause maps users to ACLs and the second sub-clause specifies the resource.

    还可以在单个语句中对同一资源授予多个权限,如本例中所示。第一个子句将用户映射到acl,第二个子句指定资源。

    HBase Shell support for granting and revoking access at the cell level is for testing and verification support, and should not be employed for production use because it won’t apply the permissions to cells that don’t exist yet. The correct way to apply cell level permissions is to do so in the application code when storing the values.
    ACL Granularity and Evaluation Order

    ACLs are evaluated from least granular to most granular, and when an ACL is reached that grants permission, evaluation stops. This means that cell ACLs do not override ACLs granted at a coarser granularity.

    ACLs从最小粒度到最细粒度,当到达ACL时,授予权限,评估停止。这意味着单元ACLs不会以更低的粒度覆盖acl。

    Example 19. HBase Shell
    • Global:

      全球:

      hbase> grant '@admins', 'RWXCA'
    • Namespace:

      名称空间:

      hbase> grant 'service', 'RWXCA', '@test-NS'
    • Table:

      表:

      hbase> grant 'service', 'RWXCA', 'user'
    • Column Family:

      列族:

      hbase> grant '@developers', 'RW', 'user', 'i'
    • Column Qualifier:

      列限定符:

      hbase> grant 'service', 'RW', 'user', 'i', 'foo'
    • Cell:

      细胞:

      The syntax for granting cell ACLs is as follows:

      授予单元acl的语法使用以下语法:

      grant <table>, \
        { '<user-or-group>' => \
          '<permissions>', ... }, \
        { <scanner-specification> }
    • <user-or-group> is the user or group name, prefixed with @ in the case of a group.

      是用户或组名称,在组中以@前缀。

    • <permissions> is a string containing any or all of "RWXCA", though only R and W are meaningful at cell scope.

      <权限> 是一个包含任何或全部“RWXCA”的字符串,尽管只有R和W在单元范围内是有意义的。

    • <scanner-specification> is the scanner specification syntax and conventions used by the 'scan' shell command. For some examples of scanner specifications, issue the following HBase Shell command.

      是“扫描”shell命令使用的扫描器规范语法和约定。对于一些扫描器规范的例子,发出下面的HBase Shell命令。

      hbase> help "scan"

      If you need to enable cell ACLs, the hfile.format.version option in hbase-site.xml should be greater than or equal to 3, and the hbase.security.access.early_out option should be set to false. This example grants read access to the 'testuser' user and read/write access to the 'developers' group, on cells in the 'pii' column which match the filter.

      如果需要启用cell acl,请使用hfile.format。在hbase-site版本选项。xml应该大于或等于3,而hbase。security.access。early_out选项应该设置为false。这个示例授予对“testuser”用户的读访问权限,并在“pii”列中与筛选器匹配的单元格上读/写访问“开发人员”组。

      hbase> grant 'user', \
        { '@developers' => 'RW', 'testuser' => 'R' }, \
        { COLUMNS => 'pii', FILTER => "(PrefixFilter ('test'))" }

      The shell will run a scanner with the given criteria, rewrite the found cells with new ACLs, and store them back to their exact coordinates.

      shell将运行一个带有给定条件的扫描器,用新的acl重写已发现的单元,并将它们存储回精确的坐标。

    Example 20. API

    The following example shows how to grant access at the table level.

    下面的示例演示如何在表级别授予访问权限。

    public static void grantOnTable(final HBaseTestingUtility util, final String user,
        final TableName table, final byte[] family, final byte[] qualifier,
        final Permission.Action... actions) throws Exception {
      SecureTestUtil.updateACLs(util, new Callable<Void>() {
        @Override
        public Void call() throws Exception {
          try (Connection connection = ConnectionFactory.createConnection(util.getConfiguration());
               Table acl = connection.getTable(AccessControlLists.ACL_TABLE_NAME)) {
            BlockingRpcChannel service = acl.coprocessorService(HConstants.EMPTY_START_ROW);
            AccessControlService.BlockingInterface protocol =
              AccessControlService.newBlockingStub(service);
            AccessControlUtil.grant(null, protocol, user, table, family, qualifier, false, actions);
          }
          return null;
        }
      });
    }

    To grant permissions at the cell level, you can use the Mutation.setACL method:

    要在单元级别授予权限,您可以使用该突变。setACL方法:

    Mutation.setACL(String user, Permission perms)
    Mutation.setACL(Map<String, Permission> perms)

    Specifically, this example provides read permission to a user called user1 on any cells contained in a particular Put operation:

    具体地说,这个示例向一个名为user1的用户提供了对特定Put操作中所包含的任何单元的read权限:

    put.setACL(user1, new Permission(Permission.Action.READ))
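
    As a slightly fuller sketch (the table, column family, qualifier, row, and user names here are illustrative, and table is assumed to be an open Table instance):

    Put put = new Put(Bytes.toBytes("row1"));
    put.addColumn(Bytes.toBytes("i"), Bytes.toBytes("pii"), Bytes.toBytes("value"));
    // The ACL applies only to the cells written by this Put.
    put.setACL("user1", new Permission(Permission.Action.READ));
    table.put(put);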
  3. Revoking Access Control From a Namespace, Table, Column Family, or Cell

    从名称空间、表、列族或单元格撤消访问控制。

    The revoke command and API are twins of the grant command and API, and the syntax is exactly the same. The only exception is that you cannot revoke permissions at the cell level. You can only revoke access that has previously been granted, and a revoke statement is not the same thing as explicit denial to a resource.

    revoke命令和API是grant命令和API的双胞胎,语法是完全相同的。惟一的例外是您不能在单元级别撤销权限。您只能撤销以前授予的访问权限,撤销声明与对资源的显式拒绝是不一样的。
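
    For example, a hedged HBase Shell sketch (the user, group, table, and column family names are illustrative); the shell form takes no permission list and removes the access previously granted at that scope:

    hbase> revoke 'bobsmith', 'user', 'i'
    hbase> revoke '@developers', 'user'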

    HBase Shell support for granting and revoking access is for testing and verification support, and should not be employed for production use because it won’t apply the permissions to cells that don’t exist yet. The correct way to apply cell-level permissions is to do so in the application code when storing the values.
    Example 21. Revoking Access To a Table
    public static void revokeFromTable(final HBaseTestingUtility util, final String user,
        final TableName table, final byte[] family, final byte[] qualifier,
        final Permission.Action... actions) throws Exception {
      SecureTestUtil.updateACLs(util, new Callable<Void>() {
        @Override
        public Void call() throws Exception {
          try (Connection connection = ConnectionFactory.createConnection(util.getConfiguration());
               Table acl = connection.getTable(AccessControlLists.ACL_TABLE_NAME)) {
            BlockingRpcChannel service = acl.coprocessorService(HConstants.EMPTY_START_ROW);
            AccessControlService.BlockingInterface protocol =
                AccessControlService.newBlockingStub(service);
            ProtobufUtil.revoke(protocol, user, table, family, qualifier, actions);
          }
          return null;
        }
      });
    }
  4. Showing a User’s Effective Permissions

    显示用户的有效权限。

    Example 22. HBase Shell
    hbase> user_permission 'user'
    
    hbase> user_permission '.*'
    
    hbase> user_permission JAVA_REGEX
Example 23. API
public static void verifyAllowed(User user, AccessTestAction action, int count) throws Exception {
  try {
    Object obj = user.runAs(action);
    if (obj != null && obj instanceof List<?>) {
      List<?> results = (List<?>) obj;
      if (results != null && results.isEmpty()) {
        fail("Empty non null results from action for user '" + user.getShortName() + "'");
      }
      assertEquals(count, results.size());
    }
  } catch (AccessDeniedException ade) {
    fail("Expected action to pass for user '" + user.getShortName() + "' but was denied");
  }
}

63.3. Visibility Labels

63.3。可见性标签

Visibility label control can be used to permit only users or principals associated with a given label to read or access cells with that label. For instance, you might label a cell top-secret, and only grant access to that label to the managers group. Visibility labels are implemented using Tags, which are a feature of HFile v3, and allow you to store metadata on a per-cell basis. A label is a string, and labels can be combined into expressions by using logical operators (&, |, or !), and using parentheses for grouping. HBase does not do any kind of validation of expressions beyond basic well-formedness. Visibility labels have no meaning on their own, and may be used to denote sensitivity level, privilege level, or any other arbitrary semantic meaning.

可见标签控件只能用于允许与给定标签关联的用户或主体使用该标签读取或访问单元格。例如,您可能将一个单元格的最高机密标记为,并且只授予管理组访问该标签的权限。可见性标签是使用标签实现的,它是HFile v3的一个特性,允许您在每个单元的基础上存储元数据。标签是一个字符串,标签可以通过使用逻辑运算符(&,|,或者!)组合成表达式,并使用圆括号来分组。HBase除了基本的良好格式之外,没有任何形式的表达式验证。可见性标签本身没有意义,可用来表示敏感级别、特权级别或任何其他任意语义含义。

If a user’s labels do not match a cell’s label or expression, the user is denied access to the cell.

如果用户的标签与单元格的标签或表达式不匹配,则用户无法访问该单元格。

In HBase 0.98.6 and newer, UTF-8 encoding is supported for visibility labels and expressions. When creating labels using the addLabels(conf, labels) method provided by the org.apache.hadoop.hbase.security.visibility.VisibilityClient class and passing labels in Authorizations via Scan or Get, labels can contain UTF-8 characters, as well as the logical operators normally used in visibility labels, with normal Java notations, without needing any escaping method. However, when you pass a CellVisibility expression via a Mutation, you must enclose the expression with the CellVisibility.quote() method if you use UTF-8 characters or logical operators. See TestExpressionParser and the source file hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestScan.java.

在HBase 0.98.6和更新的UTF-8编码支持可见标签和表达式。在创建标签时,使用org.apache.hadoop.hbase. security.security.security.visibility提供的addlabel (conf, label)方法。VisibilityClient类和通过扫描或Get获得授权的标签,标签可以包含UTF-8字符,以及在可见标签中通常使用的逻辑运算符,使用普通的Java符号,不需要任何转义方法。但是,当您通过一个突变传递一个CellVisibility表达式时,您必须用CellVisibility.quote()方法将表达式括起来,如果您使用UTF-8字符或逻辑运算符。参见TestExpressionParser和源文件hbase-client/src/test/java/ org/apache/hadoop/hbase/client/testscan.java。
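
A brief sketch of quoting a UTF-8 label in a mutation (the table, family, qualifier, and label are illustrative, and table is assumed to be an open Table instance):

Put put = new Put(Bytes.toBytes("row1"));
put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("q"), Bytes.toBytes("value"));
// Quote the expression because the label contains non-ASCII characters.
put.setCellVisibility(new CellVisibility(CellVisibility.quote("企业")));
table.put(put);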

A user adds visibility expressions to a cell during a Put operation. In the default configuration, the user does not need to have access to a label in order to label cells with it. This behavior is controlled by the configuration option hbase.security.visibility.mutations.checkauths. If you set this option to true, the labels the user is modifying as part of the mutation must be associated with the user, or the mutation will fail. Whether a user is authorized to read a labelled cell is determined during a Get or Scan, and results which the user is not allowed to read are filtered out. This incurs the same I/O penalty as if the results were returned, but reduces load on the network.

在Put操作中,用户在单元格中添加可见性表达式。在默认配置中,用户不需要访问一个标签,就可以用它来标记单元格。此行为由配置选项hbase.security.visibility.mutations.checkauths控制。如果您将此选项设置为true,那么用户正在修改的标签必须与用户关联,否则该突变将失败。在获取或扫描过程中,用户是否被授权读取被标记的单元格,而不允许用户读取的结果被过滤掉。这将导致相同的I/O惩罚,就像返回结果一样,但是减少了网络上的负载。

Visibility labels can also be specified during Delete operations. For details about visibility labels and Deletes, see HBASE-10885.

可见标签也可以在删除操作中指定。有关可见标签和删除的详细信息,请参见HBASE-10885。

The user’s effective label set is built in the RPC context when a request is first received by the RegionServer. The way that users are associated with labels is pluggable. The default plugin passes through labels specified in Authorizations added to the Get or Scan and checks those against the calling user’s authenticated labels list. When the client passes labels for which the user is not authenticated, the default plugin drops them. You can pass a subset of user authenticated labels via the Get#setAuthorizations(Authorizations(String,…)) and Scan#setAuthorizations(Authorizations(String,…)) methods.

当区域服务器首次接收到请求时,用户的有效标签集构建在RPC上下文中。用户与标签关联的方式是可插入的。默认插件通过添加到Get或扫描的授权中指定的标签,并检查那些与调用用户的经过身份验证的标签列表相对应的标签。当客户端通过未验证用户的标签时,默认插件会将其删除。您可以通过Get# setauthorize(授权(String,…))和扫描# setauthorize(授权(String,…))传递用户身份验证标签的子集。方法。
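
A brief sketch of passing a subset of labels with a request (the labels and row key are illustrative):

Get get = new Get(Bytes.toBytes("row1"));
get.setAuthorizations(new Authorizations("secret", "topsecret"));

Scan scan = new Scan();
scan.setAuthorizations(new Authorizations("secret"));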

Groups can be granted visibility labels the same way as users. Groups are prefixed with an @ symbol. When checking visibility labels of a user, the server will include the visibility labels of the groups of which the user is a member, together with the user’s own labels. When the visibility labels are retrieved using API VisibilityClient#getAuths or Shell command get_auths for a user, we will return labels added specifically for that user alone, not the group level labels.

组可以像用户一样被授予可见标签。组以@符号作为前缀。在检查用户的可见性标签时,服务器将包括用户所属的组的可见性标签,以及用户自己的标签。当使用API VisibilityClient#getAuths或Shell命令get_auths对一个用户检索可见性标签时,我们会返回专门为该用户添加的标签,而不是组级标签。

Visibility label access checking is performed by the VisibilityController coprocessor. You can use interface VisibilityLabelService to provide a custom implementation and/or control the way that visibility labels are stored with cells. See the source file hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithCustomVisLabService.java for one example.

可见标签访问检查由VisibilityController协处理器执行。您可以使用接口VisibilityLabelService来提供自定义的实现和/或控制可见标签存储在单元格中的方式。查看源文件hbase-server/src/test/java/ org/apache/hadoop/hbase/security/visibility/testvisibility/testvisibilitylabelswithcustomvislabservice。java的一个例子。

Visibility labels can be used in conjunction with ACLs.

可见标签可以与acl一起使用。

The labels have to be explicitly defined before they can be used in visibility labels. See below for an example of how this can be done.
There is currently no way to determine which labels have been applied to a cell. See HBASE-12470 for details.
Visibility labels are not currently applied for superusers.
Table 9. Examples of Visibility Expressions
Expression Interpretation
fulltime

Allow access to users associated with the fulltime label.

允许访问与fulltime标签关联的用户。

!public

Allow access to users not associated with the public label.

允许访问与公共标签无关的用户。

( secret | topsecret ) & !probationary

Allow access to users associated with either the secret or topsecret label and not associated with the probationary label.

允许访问与秘密或topsecret标签相关的用户,而不与试用标签关联。

63.3.1. Server-Side Configuration

63.3.1。服务器端配置

  1. As a prerequisite, perform the steps in Procedure: Basic Server-Side Configuration.

    作为先决条件,执行过程中的步骤:基本的服务器端配置。

  2. Install and configure the VisibilityController coprocessor by setting the following properties in hbase-site.xml. These properties take a list of class names.

    通过在hbase-site.xml中设置以下属性来安装和配置VisibilityController协处理器。这些属性包含类名的列表。

    <property>
      <name>hbase.security.authorization</name>
      <value>true</value>
    </property>
    <property>
      <name>hbase.coprocessor.region.classes</name>
      <value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
    </property>
    <property>
      <name>hbase.coprocessor.master.classes</name>
      <value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
    </property>
    If you use the AccessController and VisibilityController coprocessors together, the AccessController must come first in the list, because with both components active, the VisibilityController will delegate access control on its system tables to the AccessController.
  3. Adjust Configuration

    调整配置

    By default, users can label cells with any label, including labels they are not associated with, which means that a user can Put data that he cannot read. For example, a user could label a cell with the (hypothetical) 'topsecret' label even if the user is not associated with that label. If you only want users to be able to label cells with labels they are associated with, set hbase.security.visibility.mutations.checkauths to true. In that case, the mutation will fail if it makes use of labels the user is not associated with.

    默认情况下,用户可以用任何标签标记单元格,包括与之无关的标签,这意味着用户可以将他无法读取的数据放入其中。例如,用户可以用(假设的)标记一个单元格“topsecret”标签,即使用户与该标签没有关联。如果你只希望用户能够给带有标签的细胞贴上标签,那么就设置hbase.security.visibility.突变。checkauths为true。在这种情况下,如果使用了与用户无关的标签,那么该突变就会失败。
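
    For example, a sketch of the hbase-site.xml entry:

    <property>
      <name>hbase.security.visibility.mutations.checkauths</name>
      <value>true</value>
    </property>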

  4. Distribute your configuration and restart your cluster for changes to take effect.

    分配您的配置并重新启动集群以使更改生效。

63.3.2. Administration

63.3.2。管理

Administration tasks can be performed using the HBase Shell or the Java API. For defining the list of visibility labels and associating labels with users, the HBase Shell is probably simpler.

可以使用HBase Shell或Java API执行管理任务。对于定义可见性标签的列表并将标签与用户关联起来,HBase Shell可能更简单。

API Examples

Many of the Java API examples in this section are taken from the source file hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabels.java. Refer to that file or the API documentation for more context.

本节中的许多Java API示例都是从源文件hbase-server/src/test/ Java / org/apache/hadoop/hbase/security/visibility/testvisibility/testvisibilitylabels.java中获取的。请参阅该文件或API文档以获得更多上下文。

Neither these examples, nor the source file they were taken from, are part of the public HBase API, and are provided for illustration only. Refer to the official API for usage instructions.

这些示例,以及它们所使用的源文件都不是公共HBase API的一部分,仅供参考。请参考使用说明的官方API。

  1. Define the List of Visibility Labels

    定义可见标签的列表。

    Example 24. HBase Shell
    hbase> add_labels [ 'admin', 'service', 'developer', 'test' ]
    Example 25. Java API
    public static void addLabels() throws Exception {
      PrivilegedExceptionAction<VisibilityLabelsResponse> action = new PrivilegedExceptionAction<VisibilityLabelsResponse>() {
        public VisibilityLabelsResponse run() throws Exception {
          String[] labels = { SECRET, TOPSECRET, CONFIDENTIAL, PUBLIC, PRIVATE, COPYRIGHT, ACCENT,
              UNICODE_VIS_TAG, UC1, UC2 };
          try {
            VisibilityClient.addLabels(conf, labels);
          } catch (Throwable t) {
            throw new IOException(t);
          }
          return null;
        }
      };
      SUPERUSER.runAs(action);
    }
  2. Associate Labels with Users

    将标签与用户

    Example 26. HBase Shell
    hbase> set_auths 'service', [ 'service' ]
    hbase> set_auths 'testuser', [ 'test' ]
    hbase> set_auths 'qa', [ 'test', 'developer' ]
    hbase> set_auths '@qagroup', [ 'test' ]
    Example 27. Java API
    public void testSetAndGetUserAuths() throws Throwable {
      final String user = "user1";
      PrivilegedExceptionAction<Void> action = new PrivilegedExceptionAction<Void>() {
        public Void run() throws Exception {
          String[] auths = { SECRET, CONFIDENTIAL };
          try {
            VisibilityClient.setAuths(conf, auths, user);
          } catch (Throwable e) {
          }
          return null;
        }
        ...
  3. Clear Labels From Users

    从用户明确的标签

    Example 28. HBase Shell
    hbase> clear_auths 'service', [ 'service' ]
    hbase> clear_auths 'testuser', [ 'test' ]
    hbase> clear_auths 'qa', [ 'test', 'developer' ]
    hbase> clear_auths '@qagroup', [ 'test', 'developer' ]
    Example 29. Java API
    ...
    auths = new String[] { SECRET, PUBLIC, CONFIDENTIAL };
    VisibilityLabelsResponse response = null;
    try {
      response = VisibilityClient.clearAuths(conf, auths, user);
    } catch (Throwable e) {
      fail("Should not have failed");
      ...
    }
  4. Apply a Label or Expression to a Cell

    将标签或表达式应用于单元格。

    The label is only applied when data is written. The label is associated with a given version of the cell.

    只有在写入数据时才应用该标签。标签与一个给定版本的单元格关联。

    Example 30. HBase Shell
    hbase> set_visibility 'user', 'admin|service|developer', { COLUMNS => 'i' }
    hbase> set_visibility 'user', 'admin|service', { COLUMNS => 'pii' }
    hbase> set_visibility 'user', 'test', { COLUMNS => [ 'i', 'pii' ], FILTER => "(PrefixFilter ('test'))" }
    HBase Shell support for applying labels or permissions to cells is for testing and verification support, and should not be employed for production use because it won’t apply the labels to cells that don’t exist yet. The correct way to apply cell level labels is to do so in the application code when storing the values.
    Example 31. Java API
    static Table createTableAndWriteDataWithLabels(TableName tableName, String... labelExps)
        throws Exception {
      Configuration conf = HBaseConfiguration.create();
      Connection connection = ConnectionFactory.createConnection(conf);
      Table table = null;
      try {
        table = TEST_UTIL.createTable(tableName, fam);
        int i = 1;
        List<Put> puts = new ArrayList<Put>();
        for (String labelExp : labelExps) {
          Put put = new Put(Bytes.toBytes("row" + i));
          put.add(fam, qual, HConstants.LATEST_TIMESTAMP, value);
          put.setCellVisibility(new CellVisibility(labelExp));
          puts.add(put);
          i++;
        }
        table.put(puts);
      } finally {
        if (table != null) {
          table.flushCommits();
        }
      }
      return table;
    }

63.3.3. Reading Cells with Labels

63.3.3。阅读细胞与标签

When you issue a Scan or Get, HBase uses your default set of authorizations to filter out cells that you do not have access to. A superuser can set the default set of authorizations for a given user by using the set_auths HBase Shell command or the VisibilityClient.setAuths() method.

当您发出扫描或获取时,HBase使用默认的授权集来过滤您无法访问的单元格。超级用户可以通过使用set_auth HBase Shell命令或VisibilityClient.setAuths()方法为给定用户设置默认的授权集。

You can specify a different authorization during the Scan or Get, by passing the AUTHORIZATIONS option in HBase Shell, or the Scan.setAuthorizations() method if you use the API. This authorization will be combined with your default set as an additional filter. It will further filter your results, rather than giving you additional authorization.

您可以在扫描或获取过程中指定不同的授权,通过在HBase Shell中传递授权选项,或者在使用API时使用Scan.setAuthorizations()方法。此授权将与您的默认设置合并为一个额外的过滤器。它将进一步过滤您的结果,而不是给您额外的授权。

Example 32. HBase Shell
hbase> get_auths 'myUser'
hbase> scan 'table1', AUTHORIZATIONS => ['private']
Example 33. Java API
...
public Void run() throws Exception {
  String[] auths1 = { SECRET, CONFIDENTIAL };
  GetAuthsResponse authsResponse = null;
  try {
    VisibilityClient.setAuths(conf, auths1, user);
    try {
      authsResponse = VisibilityClient.getAuths(conf, user);
    } catch (Throwable e) {
      fail("Should not have failed");
    }
  } catch (Throwable e) {
  }
  List<String> authsList = new ArrayList<String>();
  for (ByteString authBS : authsResponse.getAuthList()) {
    authsList.add(Bytes.toString(authBS.toByteArray()));
  }
  assertEquals(2, authsList.size());
  assertTrue(authsList.contains(SECRET));
  assertTrue(authsList.contains(CONFIDENTIAL));
  return null;
}
...

63.3.4. Implementing Your Own Visibility Label Algorithm

63.3.4。实现自己的可见性标签算法。

Interpreting the labels authenticated for a given get/scan request is a pluggable algorithm.

对给定的get/scan请求进行身份验证的标签是可插入的算法。

You can specify a custom plugin or plugins by using the property hbase.regionserver.scan.visibility.label.generator.class. The output for the first ScanLabelGenerator will be the input for the next one, until the end of the list.

您可以通过使用属性hbase. domain server.scan.visibility.label.generator.class来指定自定义插件或插件。第一个ScanLabelGenerator的输出将是下一个的输入,直到列表的末尾。
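
For example, a sketch of the hbase-site.xml entry listing two generators in order (the class names assume the default generators live in the org.apache.hadoop.hbase.security.visibility package):

<property>
  <name>hbase.regionserver.scan.visibility.label.generator.class</name>
  <value>org.apache.hadoop.hbase.security.visibility.FeedUserAuthScanLabelGenerator,
  org.apache.hadoop.hbase.security.visibility.DefinedSetFilterScanLabelGenerator</value>
</property>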

The default implementation, which was implemented in HBASE-12466, loads two plugins, FeedUserAuthScanLabelGenerator and DefinedSetFilterScanLabelGenerator. See Reading Cells with Labels.

默认实现在HBASE-12466中实现,加载两个插件、FeedUserAuthScanLabelGenerator和DefinedSetFilterScanLabelGenerator。阅读带有标签的阅读单元。

63.3.5. Replicating Visibility Tags as Strings

63.3.5。将可见标签复制为字符串。

As mentioned in the above sections, the interface VisibilityLabelService could be used to implement a different way of storing the visibility expressions in the cells. Clusters with replication enabled also must replicate the visibility expressions to the peer cluster. If DefaultVisibilityLabelServiceImpl is used as the implementation for VisibilityLabelService, all the visibility expressions are converted to the corresponding expression based on the ordinals for each visibility label stored in the labels table. During replication, visible cells are also replicated with the ordinal-based expression intact. The peer cluster may not have the same labels table with the same ordinal mapping for the visibility labels. In that case, replicating the ordinals makes no sense. It would be better if the replication occurred with the visibility expressions transmitted as strings. To replicate the visibility expressions as strings to the peer cluster, create a RegionServerObserver configuration which works based on the implementation of the VisibilityLabelService interface. The configuration below enables replication of visibility expressions to peer clusters as strings. See HBASE-11639 for more details.

正如前面提到的,接口VisibilityLabelService可以用来实现存储单元中可见性表达式的不同方法。启用了复制的集群也必须将可见性表达式复制到对等集群。如果使用DefaultVisibilityLabelServiceImpl作为VisibilityLabelService的实现,那么所有可见性表达式都将根据存储在标签表中的每个可见性标签的顺序转换为相应的表达式。在复制过程中,可见的细胞也被复制,并保持了基于顺序的表达。对于可见性标签,对等集群可能没有相同的标签表和相同的序号映射。在这种情况下,复制序号毫无意义。如果复制发生在以字符串形式传输的可见性表达式中,则会更好。为了将可见性表达式复制到对等集群,创建一个基于VisibilityLabelService接口实现的区域服务器观察者配置。下面的配置允许将可见性表达式复制到对等集群作为字符串。参见HBASE-11639了解更多细节。

<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.security.visibility.VisibilityController$VisibilityReplication</value>
</property>

63.4. Transparent Encryption of Data At Rest

63.4。透明加密数据的休息。

HBase provides a mechanism for protecting your data at rest, in HFiles and the WAL, which reside within HDFS or another distributed filesystem. A two-tier architecture is used for flexible and non-intrusive key rotation. "Transparent" means that no implementation changes are needed on the client side. When data is written, it is encrypted. When it is read, it is decrypted on demand.

HBase提供了一种机制来保护您的数据在rest、HFiles和WAL中,它们驻留在HDFS或另一个分布式文件系统中。一个两层架构用于灵活和非侵入性的关键旋转。“透明”意味着客户端不需要进行任何实现更改。当写入数据时,它是加密的。当它被读取时,它会根据需要解密。

63.4.1. How It Works

63.4.1。它是如何工作的

The administrator provisions a master key for the cluster, which is stored in a key provider accessible to every trusted HBase process, including the HMaster, RegionServers, and clients (such as HBase Shell) on administrative workstations. The default key provider is integrated with the Java KeyStore API and any key management systems with support for it. Other custom key provider implementations are possible. The key retrieval mechanism is configured in the hbase-site.xml configuration file. The master key may be stored on the cluster servers, protected by a secure KeyStore file, or on an external keyserver, or in a hardware security module. This master key is resolved as needed by HBase processes through the configured key provider.

管理员为集群提供了一个主密钥,该密钥存储在每个可信的HBase进程(包括HMaster、区域性服务器和客户端(比如HBase Shell))上的每个可信的HBase过程中。默认密钥提供程序与Java KeyStore API和任何支持它的密钥管理系统集成。其他定制密钥提供程序实现是可能的。关键的检索机制是在hbase-site中配置的。xml配置文件。主密钥可以存储在集群服务器上,由安全密钥存储文件、外部密钥服务器或硬件安全模块保护。通过配置的密钥提供程序,通过HBase进程来解决这个主键。

Next, encryption use can be specified in the schema, per column family, by creating or modifying a column descriptor to include two additional attributes: the name of the encryption algorithm to use (currently only "AES" is supported), and optionally, a data key wrapped (encrypted) with the cluster master key. If a data key is not explicitly configured for a ColumnFamily, HBase will create a random data key per HFile. This provides an incremental improvement in security over the alternative. Unless you need to supply an explicit data key, such as in a case where you are generating encrypted HFiles for bulk import with a given data key, only specify the encryption algorithm in the ColumnFamily schema metadata and let HBase create data keys on demand. Per Column Family keys facilitate low impact incremental key rotation and reduce the scope of any external leak of key material. The wrapped data key is stored in the ColumnFamily schema metadata, and in each HFile for the Column Family, encrypted with the cluster master key. After the Column Family is configured for encryption, any new HFiles will be written encrypted. To ensure encryption of all HFiles, trigger a major compaction after enabling this feature.

接下来,可以通过创建或修改一个列描述符来在模式中指定一个列描述符,以包含两个额外的属性:使用的加密算法的名称(目前只支持“AES”),还可以选择用集群主密钥封装的数据密钥(加密的)。如果没有为ColumnFamily显式配置数据键,HBase将在每个HFile中创建一个随机数据密钥。这在安全性上提供了一种额外的改进。除非您需要提供一个显式的数据键,比如您正在使用给定的数据键为批量导入生成加密的hfile,但只在ColumnFamily模式元数据中指定加密算法,并让HBase根据需要创建数据键。每列的家庭钥匙可以促进低影响的增量关键旋转和减少任何外部泄漏的关键材料。封装的数据密钥存储在ColumnFamily模式元数据中,在每个HFile中存储列族,使用集群主密钥进行加密。在将列家族配置为加密后,任何新的HFiles都将被加密。为了确保所有HFiles的加密,在启用该特性之后触发一个主要的压缩。

When the HFile is opened, the data key is extracted from the HFile, decrypted with the cluster master key, and used for decryption of the remainder of the HFile. The HFile will be unreadable if the master key is not available. If a remote user somehow acquires access to the HFile data because of some lapse in HDFS permissions, or from inappropriately discarded media, it will not be possible to decrypt either the data key or the file data.

打开HFile后,从HFile中提取数据密钥,并使用集群主密钥解密,并用于解密HFile的其余部分。如果主密钥不可用,则HFile将不可读。如果一个远程用户由于HDFS权限的某些错误而获得了对HFile数据的访问,或者从不适当的丢弃的媒体中获取,那么就不可能对数据密钥或文件数据进行解密。

It is also possible to encrypt the WAL. Even though WALs are transient, it is necessary to encrypt the WALEdits to avoid circumventing HFile protections for encrypted column families, in the event that the underlying filesystem is compromised. When WAL encryption is enabled, all WALs are encrypted, regardless of whether the relevant HFiles are encrypted.

也可以加密WAL。尽管WALs是暂时的,但是需要对WALEdits进行加密,以避免在底层文件系统被破坏的情况下,对加密的列家庭绕过HFile保护。当启用了WAL加密时,所有的WALs都是加密的,不管相关的HFiles是否被加密。

63.4.2. Server-Side Configuration

63.4.2。服务器端配置

This procedure assumes you are using the default Java keystore implementation. If you are using a custom implementation, check its documentation and adjust accordingly.

这个过程假设您使用的是默认的Java keystore实现。如果您正在使用自定义实现,请检查其文档并进行相应的调整。

  1. Create a secret key of appropriate length for AES encryption, using the keytool utility.

    使用keytool工具为AES加密创建适当长度的密钥。

    $ keytool -keystore /path/to/hbase/conf/hbase.jks \
      -storetype jceks -storepass **** \
      -genseckey -keyalg AES -keysize 128 \
      -alias <alias>

    Replace **** with the password for the keystore file and <alias> with the username of the HBase service account, or an arbitrary string. If you use an arbitrary string, you will need to configure HBase to use it, and that is covered below. Specify a keysize that is appropriate. Do not specify a separate password for the key, but press Return when prompted.

    用密钥存储库文件的密码替换****,并使用HBase服务帐户的用户名或任意字符串来使用 。如果您使用任意字符串,您将需要配置HBase以使用它,下面将介绍它。指定适当的密钥大小。不要为键指定单独的密码,但在提示时按回车键。

  2. Set appropriate permissions on the keyfile and distribute it to all the HBase servers.

    在密钥文件上设置适当的权限,并将其分发给所有HBase服务器。

    The previous command created a file called hbase.jks in the HBase conf/ directory. Set the permissions and ownership on this file such that only the HBase service account user can read the file, and securely distribute the key to all HBase servers.

    前面的命令创建了一个名为hbase的文件。jks在HBase conf/目录中。设置此文件的权限和所有权,这样只有HBase服务帐户用户才能读取文件,并安全地将密钥分发给所有HBase服务器。

  3. Configure the HBase daemons.

    配置HBase守护进程。

    Set the following properties in hbase-site.xml on the region servers, to configure HBase daemons to use a key provider backed by the KeyStore file or retrieving the cluster master key. In the example below, replace **** with the password.

    在hbase站点中设置以下属性。在区域服务器上,配置HBase守护进程,以使用由KeyStore文件支持的密钥提供程序或检索集群主密钥。在下面的示例中,用密码替换****。

    <property>
      <name>hbase.crypto.keyprovider</name>
      <value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
    </property>
    <property>
      <name>hbase.crypto.keyprovider.parameters</name>
      <value>jceks:///path/to/hbase/conf/hbase.jks?password=****</value>
    </property>

    By default, the HBase service account name will be used to resolve the cluster master key. However, you can store it with an arbitrary alias (in the keytool command). In that case, set the following property to the alias you used.

    默认情况下,HBase服务帐户名称将用于解析集群主密钥。但是,您可以使用任意别名(在keytool命令中)存储它。在这种情况下,将以下属性设置为所使用的别名。

    <property>
      <name>hbase.crypto.master.key.name</name>
      <value>my-alias</value>
    </property>

    You also need to be sure your HFiles use HFile v3, in order to use transparent encryption. This is the default configuration for HBase 1.0 onward. For previous versions, set the following property in your hbase-site.xml file.

    您还需要确保HFiles使用HFile v3,以便使用透明加密。这是HBase 1.0的默认配置。对于以前的版本,在您的hbase站点中设置以下属性。xml文件。

    <property>
      <name>hfile.format.version</name>
      <value>3</value>
    </property>

    Optionally, you can use a different cipher provider, either a Java Cryptography Encryption (JCE) algorithm provider or a custom HBase cipher implementation.

    可选地,您可以使用不同的密码提供程序,或者使用Java加密加密(JCE)算法提供程序或定制的HBase密码实现。

    • JCE:

      JCE:

      • Install a signed JCE provider (supporting AES/CTR/NoPadding mode with 128 bit keys)

        安装一个签名的JCE提供程序(支持AES/CTR/ nopadd模式,使用128位密钥)

      • Add it with highest preference to the JCE site configuration file $JAVA_HOME/lib/security/java.security.

        将它添加到JCE站点配置文件$JAVA_HOME/lib/security/java.security中。

      • Update hbase.crypto.algorithm.aes.provider and hbase.crypto.algorithm.rng.provider options in hbase-site.xml.

        更新hbase.crypto.algorithm.aes。提供者和hbase.crypto.algorithm.rng。供应商在hbase-site.xml选项。

    • Custom HBase Cipher:

      自定义HBase密码:

      • Implement org.apache.hadoop.hbase.io.crypto.CipherProvider.

        实现org.apache.hadoop.hbase.io.crypto.CipherProvider。

      • Add the implementation to the server classpath.

        将实现添加到服务器类路径。

      • Update hbase.crypto.cipherprovider in hbase-site.xml.

        更新hbase.crypto。cipherprovider hbase-site.xml。

  4. Configure WAL encryption.

    配置WAL加密。

    Configure WAL encryption in every RegionServer’s hbase-site.xml, by setting the following properties. You can include these in the HMaster’s hbase-site.xml as well, but the HMaster does not have a WAL and will not use them.

    在每个区域服务器的hbase站点中配置WAL加密。xml,通过设置以下属性。您可以将这些内容包括在HMaster的hbase站点中。xml也一样,但是HMaster没有WAL,也不会使用它们。

    <property>
      <name>hbase.regionserver.hlog.reader.impl</name>
      <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
    </property>
    <property>
      <name>hbase.regionserver.hlog.writer.impl</name>
      <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
    </property>
    <property>
      <name>hbase.regionserver.wal.encryption</name>
      <value>true</value>
    </property>
  5. Configure permissions on the hbase-site.xml file.

    在hbase站点上配置权限。xml文件。

    Because the keystore password is stored in the hbase-site.xml, you need to ensure that only the HBase user can read the hbase-site.xml file, using file ownership and permissions.

    因为密钥存储库密码存储在hbase站点中。xml,您需要确保只有HBase用户才能读取HBase站点。xml文件,使用文件所有权和权限。

  6. Restart your cluster.

    重新启动集群。

    Distribute the new configuration file to all nodes and restart your cluster.

    将新配置文件分发到所有节点并重新启动集群。

63.4.3. Administration

63.4.3。管理

Administrative tasks can be performed in HBase Shell or the Java API.

管理任务可以在HBase Shell或Java API中执行。

Java API

Java API examples in this section are taken from the source file hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckEncryption.java.

本节中的Java API示例取自源文件hbase-server/src/test/ Java / org/apache/hadoop/hbase/util/testhbasefsckencryption.java。

Neither these examples, nor the source files they are taken from, are part of the public HBase API, and are provided for illustration only. Refer to the official API for usage instructions.

这些示例和它们所使用的源文件都不是公共HBase API的一部分,仅供参考。请参考使用说明的官方API。

Enable Encryption on a Column Family

To enable encryption on a column family, you can either use HBase Shell or the Java API. After enabling encryption, trigger a major compaction. When the major compaction completes, the HFiles will be encrypted.

要在列家族中启用加密,您可以使用HBase Shell或Java API。启用加密后,触发一个主要的压缩。当主压缩完成时,HFiles将被加密。
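
For example, a hedged HBase Shell sketch for a hypothetical table 'mytable' with column family 'cf1' (the ENCRYPTION attribute names the algorithm described in How It Works, and the major compaction re-encrypts existing HFiles):

hbase> alter 'mytable', { NAME => 'cf1', ENCRYPTION => 'AES' }
hbase> major_compact 'mytable'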

Rotate the Data Key

To rotate the data key, first change the ColumnFamily key in the column descriptor, then trigger a major compaction. When compaction is complete, all HFiles will be re-encrypted using the new data key. Until the compaction completes, the old HFiles will still be readable using the old key.

要旋转数据键,首先在列描述符中更改ColumnFamily键,然后触发一个主要的压缩。当compaction完成时,所有HFiles将使用新的数据键重新加密。在压缩完成之前,旧的HFiles仍然可以通过使用旧密钥来读取。

Switching Between Using a Random Data Key and Specifying A Key

If you configured a column family to use a specific key and you want to return to the default behavior of using a randomly-generated key for that column family, use the Java API to alter the HColumnDescriptor so that no value is sent with the key ENCRYPTION_KEY.

如果您将一个列家族配置为使用一个特定的键,并且您想要返回到使用该列家族的随机生成键的默认行为,那么使用Java API来修改HColumnDescriptor,这样就不会使用密钥ENCRYPTION_KEY来发送任何值。
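
A minimal sketch, assuming the HBase 1.x admin API, an open Admin instance named admin, and an illustrative table 'mytable' with family 'cf1':

TableName tableName = TableName.valueOf("mytable");
HColumnDescriptor hcd = admin.getTableDescriptor(tableName).getFamily(Bytes.toBytes("cf1"));
// Clear the explicit key; HBase falls back to a random data key per HFile.
hcd.setEncryptionKey(null);
admin.modifyColumn(tableName, hcd);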

Rotate the Master Key

To rotate the master key, first generate and distribute the new key. Then update the KeyStore to contain a new master key, and keep the old master key in the KeyStore using a different alias. Next, configure fallback to the old master key in the hbase-site.xml file.

要旋转主密钥,首先生成并分发新密钥。然后更新密钥库以包含一个新的主密钥,并使用不同的别名保存密钥存储库中的老主密钥。接下来,配置fallback到hbase站点中的老主键。xml文件。
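
A sketch of the relevant hbase-site.xml entries during rotation (the alias names are illustrative):

<property>
  <name>hbase.crypto.master.key.name</name>
  <value>hbase-new</value>
</property>
<property>
  <name>hbase.crypto.master.alternate.key.name</name>
  <value>hbase-old</value>
</property>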

63.5. Secure Bulk Load

63.5。安全的批量加载

Bulk loading in secure mode is a bit more involved than normal setup, since the client has to transfer the ownership of the files generated from the MapReduce job to HBase. Secure bulk loading is implemented by a coprocessor, named SecureBulkLoadEndpoint, which uses a staging directory configured by the configuration property hbase.bulkload.staging.dir, which defaults to /tmp/hbase-staging/.

在安全模式下的批量加载比一般的设置要复杂一些,因为客户端必须将从MapReduce作业生成的文件的所有权转移到HBase。安全批量加载是由一个名为SecureBulkLoadEndpoint的协处理器实现的,它使用由配置属性hbase.bulkload.分段配置的临时目录。该目录默认为/tmp/hbase-staging/。

Secure Bulk Load Algorithm
  • One time only, create a staging directory which is world-traversable and owned by the user which runs HBase (mode 711, or rwx--x--x). A listing of this directory will look similar to the following:

    只有一次,创建一个临时目录,该目录是由运行HBase(模式711,或rwx - x - x)的用户所拥有的。该目录的清单将类似于以下内容:

    $ ls -ld /tmp/hbase-staging
    drwx--x--x  2 hbase  hbase  68  3 Sep 14:54 /tmp/hbase-staging
  • A user writes out data to a secure output directory owned by that user. For example, /user/foo/data.

    用户将数据写入到该用户拥有的安全输出目录中。例如,/ user / foo /数据。

  • Internally, HBase creates a secret staging directory which is globally readable/writable (-rwxrwxrwx, 777). For example, /tmp/hbase-staging/averylongandrandomdirectoryname. The name and location of this directory is not exposed to the user. HBase manages creation and deletion of this directory.

    在内部,HBase创建一个秘密的staging目录,它是全局可读/可写的(-rwxrwxrwx, 777)。例如,/ tmp / hbase-staging / averylongandrandomdirectoryname。此目录的名称和位置不向用户公开。HBase管理这个目录的创建和删除。

  • The user makes the data world-readable and world-writable, moves it into the random staging directory, then calls the SecureBulkLoadClient#bulkLoadHFiles method.

    用户使数据世界可读和可写,将其移动到随机的staging目录中,然后调用SecureBulkLoadClient#bulkLoadHFiles方法。

The strength of the security lies in the length and randomness of the secret directory.

安全的力量在于秘密目录的长度和随机性。

To enable secure bulk load, add the following properties to hbase-site.xml.

要启用安全批量加载,请将以下属性添加到hbase-site.xml。

<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.bulkload.staging.dir</name>
  <value>/tmp/hbase-staging</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,
  org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint</value>
</property>

63.6. Secure Enable

63.6。安全启用

After hbase-2.x, the default value of 'hbase.security.authorization' changed. Before hbase-2.x, it defaulted to true; in later HBase versions, the default became false. So to enable HBase authorization, the following property must be configured in hbase-site.xml. See HBASE-19483.

hbase-2之后。x,默认“hbase.security。授权的改变。hbase-2之前。它默认为true,在后来的HBase版本中,默认为false。因此,为了启用hbase授权,必须在hbase-site.xml中配置下面的propertie。看到hbase - 19483;

<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>

64. Security Configuration Example

64年。安全配置示例

This configuration example includes support for HFile v3, ACLs, Visibility Labels, and transparent encryption of data at rest and the WAL. All options have been discussed separately in the sections above.

这个配置示例包括支持HFile v3、acl、可见性标签和对rest和WAL的数据的透明加密。以上各节分别讨论了所有选项。

Example 34. Example Security Settings in hbase-site.xml
<!-- HFile v3 Support -->
<property>
  <name>hfile.format.version</name>
  <value>3</value>
</property>
<!-- HBase Superuser -->
<property>
  <name>hbase.superuser</name>
  <value>hbase, admin</value>
</property>
<!-- Coprocessors for ACLs and Visibility Tags -->
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController,
  org.apache.hadoop.hbase.security.visibility.VisibilityController,
  org.apache.hadoop.hbase.security.token.TokenProvider</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController,
  org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
</property>
<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController,
  org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
</property>
<!-- Executable ACL for Coprocessor Endpoints -->
<property>
  <name>hbase.security.exec.permission.checks</name>
  <value>true</value>
</property>
<!-- Whether a user needs authorization for a visibility tag to set it on a cell -->
<property>
  <name>hbase.security.visibility.mutations.checkauths</name>
  <value>false</value>
</property>
<!-- Secure RPC Transport -->
<property>
  <name>hbase.rpc.protection</name>
  <value>privacy</value>
 </property>
 <!-- Transparent Encryption -->
<property>
  <name>hbase.crypto.keyprovider</name>
  <value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
</property>
<property>
  <name>hbase.crypto.keyprovider.parameters</name>
  <value>jceks:///path/to/hbase/conf/hbase.jks?password=***</value>
</property>
<property>
  <name>hbase.crypto.master.key.name</name>
  <value>hbase</value>
</property>
<!-- WAL Encryption -->
<property>
  <name>hbase.regionserver.hlog.reader.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
</property>
<property>
  <name>hbase.regionserver.hlog.writer.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
</property>
<property>
  <name>hbase.regionserver.wal.encryption</name>
  <value>true</value>
</property>
<!-- For key rotation -->
<property>
  <name>hbase.crypto.master.alternate.key.name</name>
  <value>hbase.old</value>
</property>
<!-- Secure Bulk Load -->
<property>
  <name>hbase.bulkload.staging.dir</name>
  <value>/tmp/hbase-staging</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,
  org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint</value>
</property>
Example 35. Example Group Mapper in Hadoop core-site.xml

Adjust these settings to suit your environment.

调整这些设置以适应您的环境。

<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.LdapGroupsMapping</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.url</name>
  <value>ldap://server</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.bind.user</name>
  <value>Administrator@example-ad.local</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.bind.password</name>
  <value>****</value> <!-- Replace with the actual password -->
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.base</name>
  <value>dc=example-ad,dc=local</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.filter.user</name>
  <value>(&amp;(objectClass=user)(sAMAccountName={0}))</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.filter.group</name>
  <value>(objectClass=group)</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.attr.member</name>
  <value>member</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>
  <value>cn</value>
</property>

Architecture

体系结构

65. Overview

65年。概述

65.1. NoSQL?

65.1。NoSQL吗?

HBase is a type of "NoSQL" database. "NoSQL" is a general term meaning that the database isn’t an RDBMS which supports SQL as its primary access language, but there are many types of NoSQL databases: BerkeleyDB is an example of a local NoSQL database, whereas HBase is very much a distributed database. Technically speaking, HBase is really more a "Data Store" than "Data Base" because it lacks many of the features you find in an RDBMS, such as typed columns, secondary indexes, triggers, and advanced query languages, etc.

HBase是一种“NoSQL”数据库。“NoSQL”是一个通用术语,意思是数据库不是支持SQL作为主要访问语言的RDBMS,但是有许多类型的NoSQL数据库:BerkeleyDB是一个本地NoSQL数据库的示例,而HBase是一个非常多的分布式数据库。从技术上讲,HBase实际上是一个“数据存储”,而不是“数据库”,因为它缺乏在RDBMS中发现的许多特性,比如类型化列、二级索引、触发器和高级查询语言等等。

However, HBase has many features which supports both linear and modular scaling. HBase clusters expand by adding RegionServers that are hosted on commodity class servers. If a cluster expands from 10 to 20 RegionServers, for example, it doubles both in terms of storage and as well as processing capacity. An RDBMS can scale well, but only up to a point - specifically, the size of a single database server - and for the best performance requires specialized hardware and storage devices. HBase features of note are:

然而,HBase有许多支持线性和模块化扩展的特性。通过添加托管在商品类服务器上的区域服务器来扩展HBase集群。例如,如果一个集群从10个区域服务器扩展到20个区域服务器,那么它的存储和处理能力就会翻倍。RDBMS可以很好地扩展,但只需要达到一个点——具体地说,是单个数据库服务器的大小——最好的性能需要专门的硬件和存储设备。HBase的特点是:

  • Strongly consistent reads/writes: HBase is not an "eventually consistent" DataStore. This makes it very suitable for tasks such as high-speed counter aggregation.

    强烈一致的读/写:HBase不是一个“最终一致”的数据存储。这使得它非常适合诸如高速计数器聚合之类的任务。

  • Automatic sharding: HBase tables are distributed on the cluster via regions, and regions are automatically split and re-distributed as your data grows.

    自动分片:HBase表通过区域分布在集群上,随着数据的增长,区域会自动分割和重新分布。

  • Automatic RegionServer failover

    RegionServer自动故障转移

  • Hadoop/HDFS Integration: HBase supports HDFS out of the box as its distributed file system.

    Hadoop/HDFS集成:HBase支持HDFS作为其分布式文件系统。

  • MapReduce: HBase supports massively parallelized processing via MapReduce for using HBase as both source and sink.

    MapReduce: HBase支持通过MapReduce进行大规模并行处理,以使用HBase作为源和接收器。

  • Java Client API: HBase supports an easy to use Java API for programmatic access.

    Java客户端API: HBase支持使用Java API进行编程访问。

  • Thrift/REST API: HBase also supports Thrift and REST for non-Java front-ends.

    节约/REST API: HBase也支持非java前端的节约和休息。

  • Block Cache and Bloom Filters: HBase supports a Block Cache and Bloom Filters for high volume query optimization.

    块缓存和开放过滤器:HBase支持一个块缓存和开放过滤器,用于高容量的查询优化。

  • Operational Management: HBase provides built-in web-pages for operational insight as well as JMX metrics.

    操作管理:HBase为业务洞察力和JMX度量提供了内置的web页面。

65.2. When Should I Use HBase?

65.2。我应该什么时候使用HBase?

HBase isn’t suitable for every problem.

HBase不适合所有问题。

First, make sure you have enough data. If you have hundreds of millions or billions of rows, then HBase is a good candidate. If you only have a few thousand/million rows, then using a traditional RDBMS might be a better choice due to the fact that all of your data might wind up on a single node (or two) and the rest of the cluster may be sitting idle.

首先,确保你有足够的数据。如果你有数亿或数十亿行,那么HBase是一个很好的候选者。如果您只有几千/百万行,那么使用传统RDBMS可能是更好的选择,因为您的所有数据可能会在单个节点(或两个节点)上结束,而集群的其余部分可能处于闲置状态。

Second, make sure you can live without all the extra features that an RDBMS provides (e.g., typed columns, secondary indexes, transactions, advanced query languages, etc.). An application built against an RDBMS cannot be "ported" to HBase by simply changing a JDBC driver, for example. Consider moving from an RDBMS to HBase as a complete redesign as opposed to a port.

其次,确保您能够在没有RDBMS提供的所有额外特性的情况下生存(例如,类型列、二级索引、事务、高级查询语言等),而基于RDBMS的应用程序不能通过简单地更改JDBC驱动程序来“移植”到HBase。考虑从RDBMS到HBase,作为完全的重新设计,而不是一个端口。

Third, make sure you have enough hardware. Even HDFS doesn’t do well with anything less than 5 DataNodes (due to things such as HDFS block replication which has a default of 3), plus a NameNode.

第三,确保你有足够的硬件。即使是HDFS也不能很好地处理任何少于5个DataNodes(由于诸如HDFS块复制,默认为3),加上一个NameNode。

HBase can run quite well stand-alone on a laptop - but this should be considered a development configuration only.

HBase可以在笔记本电脑上独立运行,但这只需要考虑开发配置。

65.3. What Is The Difference Between HBase and Hadoop/HDFS?

65.3。HBase和Hadoop/HDFS有什么区别?

HDFS is a distributed file system that is well suited for the storage of large files. Its documentation states that it is not, however, a general purpose file system, and does not provide fast individual record lookups in files. HBase, on the other hand, is built on top of HDFS and provides fast record lookups (and updates) for large tables. This can sometimes be a point of conceptual confusion. HBase internally puts your data in indexed "StoreFiles" that exist on HDFS for high-speed lookups. See the Data Model and the rest of this chapter for more information on how HBase achieves its goals.

HDFS是一个分布式文件系统,非常适合存储大型文件。但是,它的文档说明它不是一个通用的文件系统,并且不提供文件中快速的单独记录查找。另一方面,HBase构建在HDFS之上,并为大型表提供快速的记录查找(和更新)。这有时可能是概念上的混乱。HBase在内部将您的数据放入索引的“StoreFiles”中,这些“StoreFiles”存在于HDFS上,用于高速查找。有关HBase如何实现其目标的更多信息,请参见数据模型和本章的其余部分。

66. Catalog Tables

66年。目录表

The catalog table hbase:meta exists as an HBase table and is filtered out of the HBase shell’s list command, but is in fact a table just like any other.

目录表hbase:meta作为一个hbase表存在,并从hbase shell的列表命令中过滤掉,但实际上与其他表一样。

66.1. hbase:meta

66.1。hbase:元

The hbase:meta table (previously called .META.) keeps a list of all regions in the system, and the location of hbase:meta is stored in ZooKeeper.

hbase:元表(以前称为. meta .)保存系统中所有区域的列表,以及hbase的位置:meta存储在ZooKeeper中。

The hbase:meta table structure is as follows:

hbase:元表结构如下:

Key
  • Region key of the format ([table],[region start key],[region id])

    格式的区域键([表],[区域启动键],[区域id])

Values
  • info:regioninfo (serialized HRegionInfo instance for this region)

    info:区域信息(该区域的序列化HRegionInfo实例)

  • info:server (server:port of the RegionServer containing this region)

    信息:服务器(服务器:包含该区域的区域服务器端口)

  • info:serverstartcode (start-time of the RegionServer process containing this region)

    info:serverstartcode(包含该区域的区域服务器进程的启动时间)

When a table is in the process of splitting, two other columns will be created, called info:splitA and info:splitB. These columns represent the two daughter regions. The values for these columns are also serialized HRegionInfo instances. After the region has been split, eventually this row will be deleted.

当一个表处于分裂的过程中,将会创建另外两个列,称为info:splitA和info:splitB。这些列表示两个子区域。这些列的值也是序列化的h区域性信息实例。区域被分割后,最终将删除这一行。
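
You can inspect these columns directly from the HBase Shell, for example:

hbase> scan 'hbase:meta', { COLUMNS => 'info:regioninfo', LIMIT => 1 }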

Note on HRegionInfo

The empty key is used to denote table start and table end. A region with an empty start key is the first region in a table. If a region has both an empty start and an empty end key, it is the only region in the table.

空键用于表示表启动和表结束。一个具有空启动键的区域是表中的第一个区域。如果一个区域有一个空的开始和一个空的结束键,它是表中唯一的区域。

In the (hopefully unlikely) event that programmatic processing of catalog metadata is required, see the RegionInfo.parseFrom utility.

在(希望不大可能)事件中,需要对目录元数据进行程序化处理,请参阅区域信息。parseFrom效用。

66.2. Startup Sequencing

66.2。启动顺序

First, the location of hbase:meta is looked up in ZooKeeper. Next, hbase:meta is updated with server and startcode values.

首先,hbase的位置:meta在ZooKeeper中查找。接下来,hbase:meta更新了服务器和startcode值。

For information on region-RegionServer assignment, see Region-RegionServer Assignment.

有关区域-区域服务器分配的信息,请参见区域-区域服务器分配。

67. Client

67年。客户端

The HBase client finds the RegionServers that are serving the particular row range of interest. It does this by querying the hbase:meta table. See hbase:meta for details. After locating the required region(s), the client contacts the RegionServer serving that region, rather than going through the master, and issues the read or write request. This information is cached in the client so that subsequent requests need not go through the lookup process. Should a region be reassigned either by the master load balancer or because a RegionServer has died, the client will requery the catalog tables to determine the new location of the user region.

HBase客户端发现服务于特定行范围的区域服务器。它通过查询hbase:元表来实现这一点。详情见hbase:元。在定位所需区域之后,客户端会联系服务该区域的区域服务器,而不是通过主服务器,并发出读或写请求。此信息被缓存在客户端,以便后续请求不需要经过查找过程。如果一个区域被主负载均衡器重新分配,或者由于区域服务器已经死亡,客户端将重新查询目录表来确定用户区域的新位置。

See Runtime Impact for more information about the impact of the Master on HBase Client communication.

请参阅运行时影响,了解关于主服务器对HBase客户端通信的影响的更多信息。

Administrative functions are done via an instance of Admin

管理功能是通过Admin实例完成的。

67.1. Cluster Connections

67.1。集群连接

The API changed in HBase 1.0. For connection configuration information, see Client configuration and dependencies connecting to an HBase cluster.

在HBase 1.0中,API发生了变化。有关连接配置信息,请参阅连接到HBase集群的客户端配置和依赖项。

67.1.1. API as of HBase 1.0.0

67.1.1。API为HBase 1.0.0。

It’s been cleaned up and users are returned Interfaces to work against rather than particular types. In HBase 1.0, obtain a Connection object from ConnectionFactory and thereafter, get from it instances of Table, Admin, and RegionLocator on an as-need basis. When done, close the obtained instances. Finally, be sure to cleanup your Connection instance before exiting. Connections are heavyweight objects but thread-safe so you can create one for your application and keep the instance around. Table, Admin and RegionLocator instances are lightweight. Create as you go and then let go as soon as you are done by closing them. See the Client Package Javadoc Description for example usage of the new HBase 1.0 API.

它被清理了,用户返回的接口是针对而不是特定类型的。在HBase 1.0中,从ConnectionFactory获得一个连接对象,然后在需要的基础上从表、管理和区域定位器中获取连接对象。完成后,关闭已获得的实例。最后,在退出之前一定要清理您的连接实例。连接是重量级的对象,但线程安全,因此您可以为应用程序创建一个,并保留实例。表、管理和区域定位器实例是轻量级的。当你离开的时候就去创造,当你关闭它们的时候就放手。参见客户机包Javadoc描述,例如使用新的HBase 1.0 API。
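
A short sketch of that pattern (the table name is illustrative):

Configuration conf = HBaseConfiguration.create();
try (Connection connection = ConnectionFactory.createConnection(conf);
     Admin admin = connection.getAdmin();
     Table table = connection.getTable(TableName.valueOf("myTable"))) {
  // Table and Admin instances are lightweight; create and close them as needed.
  // The Connection itself is heavyweight; try-with-resources closes it on exit.
}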

67.1.2. API before HBase 1.0.0

67.1.2。API在HBase 1.0.0

Instances of HTable are the way to interact with an HBase cluster earlier than 1.0.0. Table instances are not thread-safe. Only one thread can use an instance of Table at any given time. When creating Table instances, it is advisable to use the same HBaseConfiguration instance. This will ensure sharing of ZooKeeper and socket instances to the RegionServers which is usually what you want. For example, this is preferred:

HTable的实例是在1.0.0之前与HBase集群交互的方式。表实例不是线程安全的。在任何给定的时间,只有一个线程可以使用表的实例。在创建表实例时,建议使用相同的HBaseConfiguration实例。这将确保将ZooKeeper和socket实例共享到通常您想要的区域服务器。例如,这是首选:

HBaseConfiguration conf = HBaseConfiguration.create();
HTable table1 = new HTable(conf, "myTable");
HTable table2 = new HTable(conf, "myTable");

as opposed to this:

而不是:

HBaseConfiguration conf1 = HBaseConfiguration.create();
HTable table1 = new HTable(conf1, "myTable");
HBaseConfiguration conf2 = HBaseConfiguration.create();
HTable table2 = new HTable(conf2, "myTable");

For more information about how connections are handled in the HBase client, see ConnectionFactory.

有关在HBase客户机中如何处理连接的更多信息,请参见ConnectionFactory。

Connection Pooling
连接池

For applications which require high-end multithreaded access (e.g., web-servers or application servers that may serve many application threads in a single JVM), you can pre-create a Connection, as shown in the following example:

对于需要高端多线程访问的应用程序(例如,web服务器或应用服务器可以在单个JVM中提供许多应用程序线程),您可以预先创建一个连接,如下面的示例所示:

Example 36. Pre-Creating a Connection
// Create a connection to the cluster.
Configuration conf = HBaseConfiguration.create();
try (Connection connection = ConnectionFactory.createConnection(conf);
     Table table = connection.getTable(TableName.valueOf(tablename))) {
  // use table as needed, the table returned is lightweight
}
HTablePool is Deprecated

Previous versions of this guide discussed HTablePool, which was deprecated in HBase 0.94, 0.95, and 0.96 and removed in 0.98.1 by HBASE-6580, as well as HConnection, which was deprecated in HBase 1.0 in favor of Connection. Please use Connection instead.

该指南的以前版本讨论了HTablePool,它在HBase 0.94、0.95和0.96中被弃用,在0.98.1中被HBase -6580或HConnection删除,这在HBase 1.0中是通过连接被弃用的。请使用连接。

67.2. WriteBuffer and Batch Methods

67.2。WriteBuffer和批处理方法

In HBase 1.0 and later, HTable is deprecated in favor of Table. Table does not use autoflush. To do buffered writes, use the BufferedMutator class.

在HBase 1.0和以后,HTable被弃用,支持表。表不使用自动刷新。要进行缓冲写入,请使用BufferedMutator类。

In HBase 2.0 and later, HTable does not use BufferedMutator to execute the Put operation. Refer to HBASE-18500 for more information.

在HBase 2.0和之后,HTable不使用BufferedMutator来执行Put操作。更多信息请参考HBASE-18500。
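
The following is a minimal sketch of buffered writes with BufferedMutator; the table name "myTable", family "cf" and qualifier "q" are placeholders, and the usual client imports are assumed:

Configuration conf = HBaseConfiguration.create();
try (Connection connection = ConnectionFactory.createConnection(conf);
     BufferedMutator mutator = connection.getBufferedMutator(TableName.valueOf("myTable"))) {
  for (int i = 0; i < 1000; i++) {
    Put put = new Put(Bytes.toBytes("row-" + i));
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value-" + i));
    mutator.mutate(put);   // buffered client-side, not sent immediately
  }
  mutator.flush();         // push any remaining buffered mutations to the cluster
}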

For additional information on write durability, review the ACID semantics page.

有关写入持久性的其他信息,请参阅ACID语义页。

For fine-grained control of batching of Puts or Deletes, see the batch methods on Table.

对于添加或删除的批处理的细粒度控制,请参见表中的批处理方法。
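
As an illustration, here is a minimal sketch of Table.batch() mixing a Put and a Delete; the table, family and row names are hypothetical, and results[i] corresponds to the i-th action:

List<Row> actions = new ArrayList<>();
Put put = new Put(Bytes.toBytes("row1"));
put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
actions.add(put);
actions.add(new Delete(Bytes.toBytes("row2")));
Object[] results = new Object[actions.size()];
try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
     Table table = connection.getTable(TableName.valueOf("myTable"))) {
  table.batch(actions, results);   // throws InterruptedException as well as IOException
}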

67.3. Asynchronous Client

67.3。异步客户端

It is a new API introduced in HBase 2.0 which aims to provide the ability to access HBase asynchronously.

它是HBase 2.0中引入的一个新API,旨在提供异步访问HBase的能力。

You can obtain an AsyncConnection from ConnectionFactory, and then get a asynchronous table instance from it to access HBase. When done, close the AsyncConnection instance(usually when your program exits).

您可以从ConnectionFactory获得一个AsyncConnection,然后从它获得一个异步表实例来访问HBase。完成后,关闭AsyncConnection实例(通常在程序退出时)。

For the asynchronous table, most methods have the same meaning as in the old Table interface, except that the return value is usually wrapped in a CompletableFuture. There is no write buffer here, so there is no close method for the asynchronous table and you do not need to close it. It is also thread safe.

对于异步表,大多数方法与旧表接口具有相同的含义,期望返回值通常用一个CompletableFuture包装。这里没有任何缓冲区,因此对于异步表没有关闭方法,您不需要关闭它。它是线程安全的。
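
The following is a minimal sketch of a single asynchronous Get, assuming HBase 2.0+ and hypothetical table, family and qualifier names:

CompletableFuture<AsyncConnection> connFuture =
    ConnectionFactory.createAsyncConnection(HBaseConfiguration.create());
AsyncConnection asyncConn = connFuture.join();
AsyncTable<AdvancedScanResultConsumer> table =
    asyncConn.getTable(TableName.valueOf("myTable"));
table.get(new Get(Bytes.toBytes("rowkey")))
    .thenAccept(result -> System.out.println(
        Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q")))))
    .join();
asyncConn.close();   // usually done once, when the program exits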

There are several differences for scan:

扫描有几个不同之处:

  • There is still a getScanner method which returns a ResultScanner. You can use it in the old way and it works like the old ClientAsyncPrefetchScanner.

    仍然有一个getScanner方法返回一个ResultScanner。你可以用旧的方式使用它,就像老的ClientAsyncPrefetchScanner一样。

  • There is a scanAll method which will return all the results at once. It aims to provide a simpler way for small scans where you usually want all the results at once.

    有一个scanAll方法,它将立即返回所有的结果。它的目的是提供一种更简单的方法来进行小的扫描,而你想要在通常情况下得到整个结果。

  • The Observer Pattern. There is a scan method which accepts a ScanResultConsumer as a parameter. It will pass the results to the consumer.

    观察者模式。有一个扫描方法,它接受一个ScanResultConsumer作为参数。它将把结果传递给消费者。

Notice that the AsyncTable interface is templatized. The template parameter specifies the type of ScanResultConsumerBase used by scans, which means the observer-style scan APIs are different. The two types of scan consumers are ScanResultConsumer and AdvancedScanResultConsumer.

注意,AsyncTable接口是模板化的。模板参数指定扫描使用的ScanResultConsumerBase的类型,这意味着观察者样式的扫描api是不同的。两种类型的扫描消费者是- ScanResultConsumer和AdvancedScanResultConsumer。

ScanResultConsumer needs a separate thread pool which is used to execute the callbacks registered to the returned CompletableFuture. Because the use of separate thread pool frees up RPC threads, callbacks are free to do anything. Use this if the callbacks are not quick, or when in doubt.

ScanResultConsumer需要一个单独的线程池,该线程池用于执行向返回的CompletableFuture注册的回调。因为使用单独的线程池释放了RPC线程,所以回调可以自由地做任何事情。如果回调不快,或者在有疑问时使用它。
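
Here is a minimal sketch of an observer-style scan with ScanResultConsumer; the asyncConn variable from the earlier sketch, the thread pool and the table name are assumptions for illustration:

ExecutorService pool = Executors.newFixedThreadPool(4);   // runs the callbacks
AsyncTable<ScanResultConsumer> table =
    asyncConn.getTable(TableName.valueOf("myTable"), pool);
table.scan(new Scan(), new ScanResultConsumer() {
  @Override
  public boolean onNext(Result result) {
    // process one row; return false to stop the scan early
    return true;
  }
  @Override
  public void onError(Throwable error) {
    error.printStackTrace();
  }
  @Override
  public void onComplete() {
    System.out.println("scan finished");
  }
});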

AdvancedScanResultConsumer executes callbacks inside the framework thread. It is not allowed to do time-consuming work in the callbacks, else it will likely block the framework threads and cause a very bad performance impact. As its name suggests, it is designed for advanced users who want to write high performance code. See org.apache.hadoop.hbase.client.example.HttpProxyExample for how to write fully asynchronous code with it.

AdvancedScanResultConsumer在框架线程中执行回调。在回调中不允许做耗时的工作,否则它可能会阻塞框架线程,并造成非常糟糕的性能影响。作为它的名称,它是为那些想要编写高性能代码的高级用户设计的。看到org.apache.hadoop.hbase.client.example。如何用它编写完全异步代码的HttpProxyExample。

67.4. Asynchronous Admin

67.4。异步管理

You can obtain an AsyncConnection from ConnectionFactory, and then get an AsyncAdmin instance from it to access HBase. Notice that there are two getAdmin methods to get an AsyncAdmin instance. One method has an extra thread pool parameter which is used to execute callbacks. It is designed for normal users. The other method doesn’t need a thread pool and all the callbacks are executed inside the framework thread, so it is not allowed to do time-consuming work in the callbacks. It is designed for advanced users.

您可以从ConnectionFactory获得一个AsyncConnection,然后从它获得一个AsyncAdmin实例以访问HBase。注意,有两个getAdmin方法来获得一个AsyncAdmin实例。一个方法有一个额外的线程池参数,用于执行回调。它是为普通用户设计的。另一种方法不需要线程池,所有回调都在框架线程中执行,因此不允许在回调中执行耗时的工作。它是为高级用户设计的。

The default getAdmin methods will return an AsyncAdmin instance which uses default configs. If you want to customize some configs, you can use the getAdminBuilder methods to get an AsyncAdminBuilder for creating an AsyncAdmin instance. Users are free to only set the configs they care about when creating a new AsyncAdmin instance.

默认的getAdmin方法将返回一个使用默认配置的AsyncAdmin实例。如果您想定制一些configs,您可以使用getAdminBuilder方法来获得创建AsyncAdmin实例的AsyncAdminBuilder。用户可以自由地设置他们所关心的配置,以创建一个新的AsyncAdmin实例。

For the AsyncAdmin interface, most methods have the same meaning as in the old Admin interface, except that the return value is usually wrapped in a CompletableFuture.

对于AsyncAdmin接口,大多数方法与旧的管理界面具有相同的含义,期望返回值通常用一个CompletableFuture包装。

For most admin operations, when the returned CompletableFuture is done, it means the admin operation has also been done. But for the compact operation, it only means the compact request was sent to HBase; the compaction itself may need some time to finish. For the rollWALWriter method, it only means the rollWALWriter request was sent to the region server; the operation itself may need some time to finish.

对于大多数管理操作,当完成返回的CompletableFuture时,意味着管理操作也已经完成。但是对于紧凑的操作,它只意味着紧凑的请求被发送到HBase,可能需要一些时间来完成紧凑的操作。对于rollWALWriter方法,它只意味着rollWALWriter请求被发送到区域服务器,可能需要一些时间来完成rollWALWriter操作。
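
A minimal sketch, reusing the asyncConn variable from the earlier sketch and a hypothetical table name; note that the future returned by compact() completes when the request has been sent, not when the compaction has finished:

AsyncAdmin admin = asyncConn.getAdmin();   // callbacks run on framework threads
admin.listTableNames()
    .thenAccept(names -> names.forEach(n -> System.out.println(n.getNameAsString())))
    .join();
admin.compact(TableName.valueOf("myTable")).join();   // compaction continues in the background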

For region names, we only accept byte[] as the parameter type, and it may be a full region name or an encoded region name. For server names, we only accept ServerName as the parameter type. For table names, we only accept TableName as the parameter type. For list* operations, we only accept Pattern as the parameter type if you want to do regex matching.

对于区域名称,我们只接受byte[]作为参数类型,它可能是一个完整的区域名称或编码的区域名称。对于服务器名,我们只接受ServerName作为参数类型。对于表名,我们只接受TableName作为参数类型。对于列表*操作,如果您想要进行regex匹配,我们只接受模式作为参数类型。

67.5. External Clients

67.5。外部客户

Information on non-Java clients and custom protocols is covered in Apache HBase External APIs.

Apache HBase外部api包含关于非java客户端和自定义协议的信息。

68. Client Request Filters

68年。客户端请求过滤器

Get and Scan instances can be optionally configured with filters which are applied on the RegionServer.

获取和扫描实例可以被选择性地配置为在区域服务器上应用的过滤器。

Filters can be confusing because there are many different types, and it is best to approach them by understanding the groups of Filter functionality.

过滤器可能会混淆,因为有许多不同的类型,最好通过理解过滤功能组来接近它们。

68.1. Structural

68.1。结构

Structural Filters contain other Filters.

结构过滤器包含其他过滤器。

68.1.1. FilterList

68.1.1。FilterList

FilterList represents a list of Filters with a relationship of FilterList.Operator.MUST_PASS_ALL or FilterList.Operator.MUST_PASS_ONE between the Filters. The following example shows an 'or' between two Filters (checking for either 'my value' or 'my other value' on the same attribute).

FilterList表示带有FilterList. operator关系的过滤器列表。MUST_PASS_ALL或FilterList.Operator。MUST_PASS_ONE之间的过滤器。下面的示例显示了两个过滤器之间的“or”(在同一个属性上检查“my value”或“my other value”)。

FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ONE);
SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
  cf,
  column,
  CompareOperator.EQUAL,
  Bytes.toBytes("my value")
  );
list.add(filter1);
SingleColumnValueFilter filter2 = new SingleColumnValueFilter(
  cf,
  column,
  CompareOperator.EQUAL,
  Bytes.toBytes("my other value")
  );
list.add(filter2);
scan.setFilter(list);

68.2. Column Value

68.2。列值

68.2.1. SingleColumnValueFilter

68.2.1。SingleColumnValueFilter

A SingleColumnValueFilter (see: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html) can be used to test column values for equivalence (CompareOperator.EQUAL), inequality (CompareOperator.NOT_EQUAL), or ranges (e.g., CompareOperator.GREATER). The following is an example of testing equivalence of a column to a String value "my value"…​

可以使用一个SingleColumnValueFilter(参见:https://hbase.org/apidocs/org/apache/org/apache/hadoop/hbase / filter/singlecolumnvaluefilter.html)来测试等价(compareoper主动脉. equal)、不等式(compareoper主动脉. not_equal)或范围(例如,compareoper主动脉. greater)的列值。下面是一个测试等价的列到字符串值“my value”的例子。

SingleColumnValueFilter filter = new SingleColumnValueFilter(
  cf,
  column,
  CompareOperator.EQUAL,
  Bytes.toBytes("my value")
  );
scan.setFilter(filter);

68.2.2. ColumnValueFilter

68.2.2。ColumnValueFilter

Introduced in the HBase-2.0.0 version as a complement to SingleColumnValueFilter, ColumnValueFilter returns only the matched cell, while SingleColumnValueFilter returns the entire row (with its other columns and values) to which the matched cell belongs. The parameters of the ColumnValueFilter constructor are the same as those of SingleColumnValueFilter.

在HBase-2.0.0版本中作为SingleColumnValueFilter的补充,ColumnValueFilter只得到匹配的单元格,而singlecolumvaluefilter获取整个行(包含其他列和值),匹配的单元格属于该行。ColumnValueFilter的构造函数的参数与SingleColumnValueFilter相同。

ColumnValueFilter filter = new ColumnValueFilter(
  cf,
  column,
  CompareOperator.EQUAL,
  Bytes.toBytes("my value")
  );
scan.setFilter(filter);

Note. For a simple query such as "equals to a family:qualifier:value", we highly recommend using the following approach instead of SingleColumnValueFilter or ColumnValueFilter:

请注意。对于简单的查询,比如“等于一个家庭:qualifier:value”,我们强烈建议使用以下方法,而不是使用SingleColumnValueFilter或ColumnValueFilter:

Scan scan = new Scan();
scan.addColumn(Bytes.toBytes("family"), Bytes.toBytes("qualifier"));
ValueFilter vf = new ValueFilter(CompareOperator.EQUAL,
  new BinaryComparator(Bytes.toBytes("value")));
scan.setFilter(vf);
...

This scan restricts itself to the specified column 'family:qualifier', avoiding scans of unrelated families and columns, which gives better performance; the ValueFilter is the condition used to do the value filtering.

该扫描将限制到指定的列“家庭:限定符”,避免扫描不相关的家庭和列,这具有更好的性能,而ValueFilter是用于执行值筛选的条件。

But if your query is more complicated than this, then please make the appropriate choice case by case.

但是如果查询比这本书复杂得多,那么请根据情况选择合适的案例。

68.3. Column Value Comparators

68.3。列值比较器

There are several Comparator classes in the Filter package that deserve special mention. These Comparators are used in concert with other Filters, such as SingleColumnValueFilter.

在过滤包中有几个比较器类值得特别提及。这些比较器与其他过滤器一起使用,例如SingleColumnValueFilter。

68.3.1. RegexStringComparator

68.3.1。RegexStringComparator

RegexStringComparator supports regular expressions for value comparisons.

RegexStringComparator支持值比较的正则表达式。

RegexStringComparator comp = new RegexStringComparator("my.");   // any value that starts with 'my'
SingleColumnValueFilter filter = new SingleColumnValueFilter(
  cf,
  column,
  CompareOperator.EQUAL,
  comp
  );
scan.setFilter(filter);

See the Oracle JavaDoc for supported RegEx patterns in Java.

请参阅Oracle JavaDoc以支持Java中的RegEx模式。

68.3.2. SubstringComparator

68.3.2。SubstringComparator

SubstringComparator can be used to determine if a given substring exists in a value. The comparison is case-insensitive.

SubstringComparator可以用来确定给定的子字符串是否存在于一个值中。比较是不区分大小写的。

SubstringComparator comp = new SubstringComparator("y val");   // looking for 'my value'
SingleColumnValueFilter filter = new SingleColumnValueFilter(
  cf,
  column,
  CompareOperator.EQUAL,
  comp
  );
scan.setFilter(filter);

68.3.3. BinaryPrefixComparator

68.3.3。BinaryPrefixComparator

See BinaryPrefixComparator.

看到BinaryPrefixComparator。

68.3.4. BinaryComparator

68.3.4。BinaryComparator

See BinaryComparator.

看到BinaryComparator。

68.4. KeyValue Metadata

68.4。KeyValue元数据

As HBase stores data internally as KeyValue pairs, KeyValue Metadata Filters evaluate the existence of keys (i.e., ColumnFamily:Column qualifiers) for a row, as opposed to values (the subject of the previous section).

当HBase将数据作为键值对存储在内部时,KeyValue元数据过滤器会评估键的存在(即:与上一节的值相反,列的列限定符。

68.4.1. FamilyFilter

68.4.1。FamilyFilter

FamilyFilter can be used to filter on the ColumnFamily. It is generally a better idea to select ColumnFamilies in the Scan than to do it with a Filter.

FamilyFilter可用于对ColumnFamily进行过滤。通常,在扫描中选择ColumnFamilies比使用过滤器更好。

68.4.2. QualifierFilter

68.4.2。QualifierFilter

QualifierFilter can be used to filter based on Column (aka Qualifier) name.

QualifierFilter可用于基于列(也称为限定符)的名称进行筛选。

68.4.3. ColumnPrefixFilter

68.4.3。ColumnPrefixFilter

ColumnPrefixFilter can be used to filter based on the lead portion of Column (aka Qualifier) names.

ColumnPrefixFilter可用于基于列(也称为限定符)名称的主要部分进行筛选。

A ColumnPrefixFilter seeks ahead to the first column matching the prefix in each row and for each involved column family. It can be used to efficiently get a subset of the columns in very wide rows.

一个ColumnPrefixFilter在第一个列中查找与每个行中的前缀相匹配,并为每个涉及的列族。它可以用于有效地获取非常宽行中的列的子集。

Note: The same column qualifier can be used in different column families. This filter returns all matching columns.

注意:相同的列限定符可以用于不同的列族。这个过滤器返回所有匹配的列。

Example: Find all columns in a row and family that start with "abc"

示例:查找从“abc”开始的行和家庭中的所有列

Table t = ...;
byte[] row = ...;
byte[] family = ...;
byte[] prefix = Bytes.toBytes("abc");
Scan scan = new Scan(row, row); // (optional) limit to one row
scan.addFamily(family); // (optional) limit to one family
Filter f = new ColumnPrefixFilter(prefix);
scan.setFilter(f);
scan.setBatch(10); // set this if there could be many columns returned
ResultScanner rs = t.getScanner(scan);
for (Result r = rs.next(); r != null; r = rs.next()) {
  for (KeyValue kv : r.raw()) {
    // each kv represents a column
  }
}
rs.close();

68.4.4. MultipleColumnPrefixFilter

68.4.4。MultipleColumnPrefixFilter

MultipleColumnPrefixFilter behaves like ColumnPrefixFilter but allows specifying multiple prefixes.

MultipleColumnPrefixFilter的行为类似于ColumnPrefixFilter,但允许指定多个前缀。

Like ColumnPrefixFilter, MultipleColumnPrefixFilter efficiently seeks ahead to the first column matching the lowest prefix and also seeks past ranges of columns between prefixes. It can be used to efficiently get discontinuous sets of columns from very wide rows.

与ColumnPrefixFilter一样,MultipleColumnPrefixFilter有效地寻找与最低前缀匹配的第一列,并查找前缀之间的列的过去范围。它可以用来有效地从非常宽的行中得到不连续的列。

Example: Find all columns in a row and family that start with "abc" or "xyz"

示例:查找从“abc”或“xyz”开始的行和家庭中的所有列

Table t = ...;
byte[] row = ...;
byte[] family = ...;
byte[][] prefixes = new byte[][] {Bytes.toBytes("abc"), Bytes.toBytes("xyz")};
Scan scan = new Scan(row, row); // (optional) limit to one row
scan.addFamily(family); // (optional) limit to one family
Filter f = new MultipleColumnPrefixFilter(prefixes);
scan.setFilter(f);
scan.setBatch(10); // set this if there could be many columns returned
ResultScanner rs = t.getScanner(scan);
for (Result r = rs.next(); r != null; r = rs.next()) {
  for (KeyValue kv : r.raw()) {
    // each kv represents a column
  }
}
rs.close();

68.4.5. ColumnRangeFilter

68.4.5。ColumnRangeFilter

A ColumnRangeFilter allows efficient intra row scanning.

一个ColumnRangeFilter允许有效的内部扫描。

A ColumnRangeFilter can seek ahead to the first matching column for each involved column family. It can be used to efficiently get a 'slice' of the columns of a very wide row. i.e. you have a million columns in a row but you only want to look at columns bbbb-bbdd.

一个ColumnRangeFilter可以为每个涉及的列家庭寻找第一个匹配的列。它可以被用来有效地获取一个非常宽行的列的“切片”。也就是说,你有一百万列,但是你只需要看bbbb-bbdd。

Note: The same column qualifier can be used in different column families. This filter returns all matching columns.

注意:相同的列限定符可以用于不同的列族。这个过滤器返回所有匹配的列。

Example: Find all columns in a row and family between "bbbb" (inclusive) and "bbdd" (inclusive)

示例:在“bbbb”(包含)和“bbdd”(包括)之间查找行和家庭中的所有列

Table t = ...;
byte[] row = ...;
byte[] family = ...;
byte[] startColumn = Bytes.toBytes("bbbb");
byte[] endColumn = Bytes.toBytes("bbdd");
Scan scan = new Scan(row, row); // (optional) limit to one row
scan.addFamily(family); // (optional) limit to one family
Filter f = new ColumnRangeFilter(startColumn, true, endColumn, true);
scan.setFilter(f);
scan.setBatch(10); // set this if there could be many columns returned
ResultScanner rs = t.getScanner(scan);
for (Result r = rs.next(); r != null; r = rs.next()) {
  for (KeyValue kv : r.raw()) {
    // each kv represents a column
  }
}
rs.close();

Note: Introduced in HBase 0.92

注意:在HBase 0.92中引入。

68.5. RowKey

68.5。RowKey

68.5.1. RowFilter

68.5.1。RowFilter

It is generally a better idea to use the startRow/stopRow methods on Scan for row selection, however RowFilter can also be used.

使用startRow/stopRow方法来扫描行选择通常是一个更好的主意,但也可以使用RowFilter。
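
The following minimal sketch contrasts the two approaches; the row keys and the comparison are hypothetical:

// Preferred: restrict the scan itself to a row range.
Scan rangeScan = new Scan();
rangeScan.setStartRow(Bytes.toBytes("row-aaa"));   // withStartRow(...) in HBase 2.0+
rangeScan.setStopRow(Bytes.toBytes("row-zzz"));    // withStopRow(...) in HBase 2.0+

// Possible, but usually less efficient: filter rows server-side by key.
Scan filterScan = new Scan();
filterScan.setFilter(new RowFilter(CompareOperator.EQUAL,
    new BinaryComparator(Bytes.toBytes("row-abc"))));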

68.6. Utility

68.6。实用程序

68.6.1. FirstKeyOnlyFilter

68.6.1。FirstKeyOnlyFilter

This is primarily used for rowcount jobs. See FirstKeyOnlyFilter.

这主要用于rowcount作业。看到FirstKeyOnlyFilter。
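
A minimal row-counting sketch, assuming an existing Table instance named table:

Scan scan = new Scan();
scan.setFilter(new FirstKeyOnlyFilter());   // only the first KeyValue of each row is returned
long rowCount = 0;
try (ResultScanner scanner = table.getScanner(scan)) {
  for (Result r : scanner) {
    rowCount++;
  }
}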

69. Master

69年。主

HMaster is the implementation of the Master Server. The Master server is responsible for monitoring all RegionServer instances in the cluster, and is the interface for all metadata changes. In a distributed cluster, the Master typically runs on the NameNode. J Mohamed Zahoor goes into some more detail on the Master Architecture in this blog posting, HBase HMaster Architecture .

HMaster是主服务器的实现。主服务器负责监控集群中的所有区域服务器实例,并且是所有元数据更改的接口。在分布式集群中,Master通常在NameNode上运行。J Mohamed Zahoor在这篇博客文章HBase HMaster Architecture中详细介绍了主架构。

69.1. Startup Behavior

69.1。创业行为

If run in a multi-Master environment, all Masters compete to run the cluster. If the active Master loses its lease in ZooKeeper (or the Master shuts down), then the remaining Masters jostle to take over the Master role.

如果在一个多主机环境中运行,所有的主机都竞争运行集群。如果主动主人失去了在动物园管理员(或主人关闭)的租约,那么剩下的主人就会争夺主人的角色。

69.2. Runtime Impact

69.2。运行时的影响

A common dist-list question involves what happens to an HBase cluster when the Master goes down. Because the HBase client talks directly to the RegionServers, the cluster can still function in a "steady state". Additionally, per Catalog Tables, hbase:meta exists as an HBase table and is not resident in the Master. However, the Master controls critical functions such as RegionServer failover and completing region splits. So while the cluster can still run for a short time without the Master, the Master should be restarted as soon as possible.

一个常见的列表问题包括当主节点下降时,HBase集群会发生什么情况。由于HBase客户端直接与区域服务器对话,集群仍然可以在“稳定状态”中运行。另外,每个目录表,hbase:meta作为一个hbase表存在,并不是驻留在主表中。但是,主控制关键功能,如区域服务器故障转移和完成区域分割。因此,虽然集群仍然可以在没有主服务器的情况下运行很短的时间,但是主服务器应该尽快重新启动。

69.3. Interface

69.3。接口

The methods exposed by HMasterInterface are primarily metadata-oriented methods:

HMasterInterface公开的方法主要是面向元数据的方法:

  • Table (createTable, modifyTable, removeTable, enable, disable)

    表(createTable, modifyTable, removeTable, enable, disable)

  • ColumnFamily (addColumn, modifyColumn, removeColumn)

    ColumnFamily(addColumn modifyColumn removeColumn)

  • Region (move, assign, unassign) For example, when the Admin method disableTable is invoked, it is serviced by the Master server.

    例如,当调用管理方法disableTable时,它由主服务器提供服务。

69.4. Processes

69.4。流程

The Master runs several background threads:

Master运行几个后台线程:

69.4.1. LoadBalancer

69.4.1。loadbalance

Periodically, and when there are no regions in transition, a load balancer will run and move regions around to balance the cluster’s load. See Balancer for configuring this property.

当转换中没有区域时,负载均衡器会运行并移动区域以平衡集群的负载。请参阅平衡器来配置此属性。

See Region-RegionServer Assignment for more information on region assignment.

有关区域分配的更多信息,请参见区域-区域服务器分配。

69.4.2. CatalogJanitor

69.4.2。CatalogJanitor

Periodically checks and cleans up the hbase:meta table. See hbase:meta for more information on the meta table.

定期检查并清理hbase:元表。参见hbase:meta表中的更多信息。

70. RegionServer

70年。RegionServer

HRegionServer is the RegionServer implementation. It is responsible for serving and managing regions. In a distributed cluster, a RegionServer runs on a DataNode.

区域服务器是区域服务器实现。负责区域的服务和管理。在分布式集群中,区域服务器运行在DataNode上。

70.1. Interface

70.1。接口

The methods exposed by HRegionInterface contain both data-oriented and region-maintenance methods:

h分区域接口公开的方法包含数据导向和区域维护方法:

  • Data (get, put, delete, next, etc.)

    数据(get、put、delete、next等)

  • Region (splitRegion, compactRegion, etc.) For example, when the Admin method majorCompact is invoked on a table, the client is actually iterating through all regions for the specified table and requesting a major compaction directly to each region.

    例如,当在表上调用管理方法major compact时,客户端实际上是遍历指定表的所有区域,并直接向每个区域请求一个主要的压缩。

70.2. Processes

70.2。流程

The RegionServer runs a variety of background threads:

区域服务器运行各种后台线程:

70.2.1. CompactSplitThread

70.2.1。CompactSplitThread

Checks for splits and handles minor compactions.

检查分割和处理小的压缩。

70.2.2. MajorCompactionChecker

70.2.2。MajorCompactionChecker

Checks for major compactions.

检查主要件。

70.2.3. MemStoreFlusher

70.2.3。MemStoreFlusher

Periodically flushes in-memory writes in the MemStore to StoreFiles.

内存中定期刷新内存到存储文件。

70.2.4. LogRoller

70.2.4。LogRoller

Periodically checks the RegionServer’s WAL.

定期检查区域服务器的WAL。

70.3. Coprocessors

70.3。协处理器

Coprocessors were added in 0.92. There is a thorough Blog Overview of CoProcessors posted. Documentation will eventually move to this reference guide, but the blog is the most current information available at this time.

在0.92中加入了协处理器。有一个完整的关于协处理器的博客概述。文档最终会转移到这个参考指南,但是这个博客是目前可用的最多的信息。

70.4. Block Cache

70.4。块缓存

HBase provides two different BlockCache implementations: the default on-heap LruBlockCache and the BucketCache, which is (usually) off-heap. This section discusses benefits and drawbacks of each implementation, how to choose the appropriate option, and configuration options for each.

HBase提供了两种不同的BlockCache实现:默认的on-heap LruBlockCache和BucketCache(通常是堆外的)。本节讨论每个实现的优点和缺点,如何选择合适的选项,以及每种实现的配置选项。

Block Cache Reporting: UI

See the RegionServer UI for detail on caching deploy. Since HBase 0.98.4, the Block Cache detail has been significantly extended showing configurations, sizings, current usage, time-in-the-cache, and even detail on block counts and types.

有关缓存部署的详细信息,请参见区域服务器UI。由于HBase 0.98.4,块缓存的细节已经大大扩展,显示了配置、sizings、当前使用、缓存时间,甚至是块计数和类型的细节。

70.4.1. Cache Choices

70.4.1。缓存的选择

LruBlockCache is the original implementation, and is entirely within the Java heap. BucketCache is mainly intended for keeping block cache data off-heap, although BucketCache can also keep data on-heap and serve from a file-backed cache.

LruBlockCache是最初的实现,完全在Java堆中。BucketCache主要用于保持块缓存数据堆外,尽管BucketCache还可以保存数据,并从文件支持的缓存中提供服务。

BucketCache is production ready as of HBase 0.98.6

To run with BucketCache, you need HBASE-11678. This was included in 0.98.6.

要使用BucketCache,您需要HBASE-11678。这包括在0.98.6中。

Fetching will always be slower when fetching from BucketCache, as compared to the native on-heap LruBlockCache. However, latencies tend to be less erratic across time, because there is less garbage collection when you use BucketCache, since it is managing BlockCache allocations, not the GC. If the BucketCache is deployed in off-heap mode, this memory is not managed by the GC at all. This is why you’d use BucketCache: so your latencies are less erratic, and to mitigate GCs and heap fragmentation. See Nick Dimiduk’s BlockCache 101 for comparisons of on-heap vs off-heap tests. Also see Comparing BlockCache Deploys, which finds that if your dataset fits inside your LruBlockCache deploy, use it; otherwise, if you are experiencing cache churn (or you want your cache to exist beyond the vagaries of java GC), use BucketCache.

当从BucketCache中获取时,抓取总是比在堆上的LruBlockCache更慢。但是,延迟在跨时间的情况下会变得不那么不稳定,因为当您使用BucketCache时,会减少垃圾收集,因为它是管理阻塞缓存分配,而不是GC。如果BucketCache被部署在堆外模式中,那么这个内存就不是由GC管理的。这就是为什么您要使用BucketCache,因此您的延迟不那么不稳定,并且可以减轻GCs和堆碎片。请参阅Nick Dimiduk的BlockCache 101,用于比较运行在堆和堆上的测试。还可以看到比较BlockCache部署,它发现如果您的数据集适合您的LruBlockCache部署,则使用它,否则如果您正在经历高速缓存(或者您希望您的缓存超出java GC的异常),使用BucketCache。

When you enable BucketCache, you are enabling a two tier caching system, an L1 cache which is implemented by an instance of LruBlockCache and an off-heap L2 cache which is implemented by BucketCache. Management of these two tiers and the policy that dictates how blocks move between them is done by CombinedBlockCache. It keeps all DATA blocks in the L2 BucketCache and meta blocks — INDEX and BLOOM blocks — on-heap in the L1 LruBlockCache. See Off-heap Block Cache for more detail on going off-heap.

当启用BucketCache时,您可以启用两个层缓存系统,一个L1缓存由一个LruBlockCache实例和一个由BucketCache实现的非堆L2缓存实现。这两层的管理和规定如何在它们之间移动的策略由组合块缓存完成。它在L1 LruBlockCache中保存了L2 BucketCache和meta块索引和BLOOM块中的所有数据块。请参阅堆外块缓存,以获得更多关于关闭堆的详细信息。

70.4.2. General Cache Configurations

70.4.2。通用缓存配置

Apart from the cache implementation itself, you can set some general configuration options to control how the cache performs. See https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html. After setting any of these options, restart or rolling restart your cluster for the configuration to take effect. Check logs for errors or unexpected behavior.

除了缓存实现本身之外,您还可以设置一些常规配置选项来控制缓存的执行方式。见https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html。在设置了这些选项后,重新启动或滚动重新启动您的集群以使配置生效。检查记录错误或意外行为。

See also Prefetch Option for Blockcache, which discusses a new option introduced in HBASE-9857.

还可以看到Blockcache的预取选项,它讨论了在HBASE-9857中引入的新选项。

70.4.3. LruBlockCache Design

70.4.3。LruBlockCache设计

The LruBlockCache is an LRU cache that contains three levels of block priority to allow for scan-resistance and in-memory ColumnFamilies:

LruBlockCache是一个LRU缓存,它包含三个级别的块优先级,以支持scan-resistance和内存中的ColumnFamilies:

  • Single access priority: The first time a block is loaded from HDFS it normally has this priority and it will be part of the first group to be considered during evictions. The advantage is that scanned blocks are more likely to get evicted than blocks that are getting more usage.

    单一访问优先级:第一次从HDFS加载一个块时,它通常有这个优先级,并且它将是在被驱逐期间被考虑的第一个组的一部分。优点是,扫描块比那些得到更多使用的块更容易被逐出。

  • Multi access priority: If a block in the previous priority group is accessed again, it upgrades to this priority. It is thus part of the second group considered during evictions.

    多访问优先级:如果先前优先级组中的一个块再次被访问,它将升级到这个优先级。这是在驱逐期间考虑的第二组的一部分。

  • In-memory access priority: If the block’s family was configured to be "in-memory", it will be part of this priority disregarding the number of times it was accessed. Catalog tables are configured like this. This group is the last one considered during evictions.

    内存访问优先级:如果块的家庭被配置为“内存中”,那么它将是这个优先级的一部分,不考虑它被访问的次数。编目表是这样配置的。这一组是在驱逐期间考虑的最后一组。

    To mark a column family as in-memory, call

    将列族标记为内存中,调用。

HColumnDescriptor.setInMemory(true);

if creating a table from java, or set IN_MEMORY ⇒ true when creating or altering a table in the shell: e.g.

如果在创建或更改shell中的表时,从java创建一个表,或者设置IN_MEMORY,例如:

hbase(main):003:0> create  't', {NAME => 'f', IN_MEMORY => 'true'}

For more information, see the LruBlockCache source

有关更多信息,请参见LruBlockCache源代码。

70.4.4. LruBlockCache Usage

70.4.4。LruBlockCache用法

Block caching is enabled by default for all the user tables which means that any read operation will load the LRU cache. This might be good for a large number of use cases, but further tunings are usually required in order to achieve better performance. An important concept is the working set size, or WSS, which is: "the amount of memory needed to compute the answer to a problem". For a website, this would be the data that’s needed to answer the queries over a short amount of time.

默认情况下,所有用户表都启用了块缓存,这意味着任何读操作都将加载LRU缓存。这可能对大量的用例有好处,但是为了获得更好的性能,通常需要进行进一步的调整。一个重要的概念是工作集大小,即WSS,即:“计算问题的答案所需的内存数量”。对于一个网站来说,这将是在短时间内回答查询所需要的数据。

The way to calculate how much memory is available in HBase for caching is:

计算HBase中缓存的可用内存的方法是:

number of region servers * heap size * hfile.block.cache.size * 0.99

The default value for the block cache is 0.25, which represents 25% of the available heap. The last value (99%) is the default acceptable loading factor in the LRU cache, after which eviction is started. The reason it is included in this equation is that it would be unrealistic to claim that 100% of the available memory can be used, since that would cause the process to block at the point where it loads new blocks. Here are some examples:

块缓存的默认值是0.25,表示可用堆的25%。最后一个值(99%)是LRU缓存中默认可接受的加载因子,在此之后将开始驱逐。之所以将其包含在这个等式中,是因为认为可以使用100%的可用内存是不现实的,因为这将使进程从加载新块的地方阻塞。下面是一些例子:

  • One region server with the heap size set to 1 GB and the default block cache size will have 253 MB of block cache available.

    将堆大小设置为1 GB的区域服务器和默认的块缓存大小将有253 MB的块缓存可用。

  • 20 region servers with the heap size set to 8 GB and a default block cache size will have 39.6 GB of block cache.

    将堆大小设置为8 GB的区域服务器和默认的块缓存大小将有39.6块缓存。

  • 100 region servers with the heap size set to 24 GB and a block cache size of 0.5 will have about 1.16 TB of block cache.

    100个区域服务器,堆大小设置为24 GB,块缓存大小为0.5,将有大约1.16 TB的块缓存。

Your data is not the only resident of the block cache. Here are others that you may have to take into account:

您的数据并不是块缓存的唯一驻留。以下是一些你可能不得不考虑到的问题:

Catalog Tables

The hbase:meta table is forced into the block cache and has the in-memory priority, which means that it is harder to evict.

hbase:元表被强制进入块缓存并具有内存优先级,这意味着它们更难被驱逐。

The hbase:meta tables can occupy a few MBs depending on the number of regions.
HFiles Indexes

An HFile is the file format that HBase uses to store data in HDFS. It contains a multi-layered index which allows HBase to seek to the data without having to read the whole file. The size of those indexes is a factor of the block size (64KB by default), the size of your keys and the amount of data you are storing. For big data sets it’s not unusual to see numbers around 1GB per region server, although not all of it will be in cache because the LRU will evict indexes that aren’t used.

HFile是HBase用于在HDFS中存储数据的文件格式。它包含一个多层索引,允许HBase在不读取整个文件的情况下查找数据。这些索引的大小是块大小的一个因素(默认为64KB)、密钥的大小和存储的数据量。对于大数据集,在1GB /区域服务器上看到数字并不少见,尽管不是所有的数据都在缓存中,因为LRU会驱逐未使用的索引。

Keys

The values that are stored are only half the picture, since each value is stored along with its keys (row key, family qualifier, and timestamp). See Try to minimize row and column sizes.

存储的值只是图像的一半,因为每个值都与它的键(行键、家庭限定符和时间戳)一起存储。请尝试最小化行和列的大小。

Bloom Filters

Just like the HFile indexes, those data structures (when enabled) are stored in the LRU.

就像HFile索引一样,这些数据结构(启用时)存储在LRU中。

Currently the recommended way to measure HFile indexes and bloom filter sizes is to look at the region server web UI and check out the relevant metrics. For keys, sampling can be done by using the HFile command line tool and looking for the average key size metric. Since HBase 0.98.3, you can view details on BlockCache stats and metrics in a special Block Cache section in the UI.

目前推荐的测量HFile索引和bloom filter大小的方法是查看区域服务器web UI并检查相关的指标。对于密钥,可以使用HFile命令行工具进行抽样,并查找平均密钥大小度量。由于HBase 0.98.3,您可以在UI中一个特殊的块缓存部分查看BlockCache属性和指标的详细信息。

It’s generally bad to use block caching when the WSS doesn’t fit in memory. This is the case when you have for example 40GB available across all your region servers' block caches but you need to process 1TB of data. One of the reasons is that the churn generated by the evictions will trigger more garbage collections unnecessarily. Here are two use cases:

当WSS不适合内存时,使用块缓存通常是不好的。例如,在所有区域服务器的块缓存中都有40GB可用,但您需要处理1TB的数据。其中一个原因是,驱逐所产生的搅动会导致不必要的垃圾收集。这里有两个用例:

  • Fully random reading pattern: This is a case where you almost never access the same row twice within a short amount of time, so the chance of hitting a cached block is close to 0. Setting block caching on such a table is a waste of memory and CPU cycles, all the more so because it will generate more garbage for the JVM to pick up. For more information on monitoring GC, see JVM Garbage Collection Logs.

    完全随机阅读模式:这是一种情况,在短时间内,您几乎从不访问相同的行,这样就可以将缓存块的命中率接近于0。在这样的表上设置块缓存是对内存和CPU周期的浪费,因此它会产生更多的垃圾来接收JVM。有关监视GC的更多信息,请参见JVM垃圾收集日志。

  • Mapping a table: In a typical MapReduce job that takes a table in input, every row will be read only once so there’s no need to put them into the block cache. The Scan object has the option of turning this off via the setCacheBlocks method (set it to false); a short sketch follows this list. You can still keep block caching turned on on this table if you need fast random read access. An example would be counting the number of rows in a table that serves live traffic: caching every block of that table would create massive churn and would surely evict data that’s currently in use.

    映射表:在典型的MapReduce作业中,每一行将只读取一次,因此不需要将它们放入块缓存中。扫描对象可以选择通过setcache方法将其关闭(将其设置为false)。如果您需要快速的随机读取访问,您仍然可以在这个表上打开块缓存。一个例子是计算一个服务于实时流量的表中的行数,缓存该表的每个块会产生大量的搅动,并且肯定会驱逐当前正在使用的数据。
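
A minimal sketch of such a scan, disabling block caching while keeping a larger scanner caching value (both values are illustrative):

Scan scan = new Scan();
scan.setCacheBlocks(false);   // do not pollute the BlockCache with this one-pass scan
scan.setCaching(500);         // still fetch rows from the RegionServer in larger batches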

Caching META blocks only (DATA blocks in fscache)
仅缓存元数据块(fscache中的数据块)

An interesting setup is one where we cache META blocks only and we read DATA blocks in on each access. If the DATA blocks fit inside fscache, this alternative may make sense when access is completely random across a very large dataset. To enable this setup, alter your table and for each column family set BLOCKCACHE ⇒ 'false'. You are 'disabling' the BlockCache for this column family only. You can never disable the caching of META blocks. Since HBASE-4683 Always cache index and bloom blocks, we will cache META blocks even if the BlockCache is disabled.

一个有趣的设置是我们只缓存元数据块,并在每个访问中读取数据块。如果数据块适合于fscache,那么当访问完全随机地跨一个非常大的数据集时,这个选择可能是有意义的。要启用这个设置,请修改您的表,并为每个列家族设置BLOCKCACHE“false”。您仅为这个列家族“禁用”阻塞缓存。您永远不能禁用元块的缓存。由于HBASE-4683总是缓存索引和bloom块,所以即使禁用了BlockCache,我们也会缓存元块。

70.4.5. Off-heap Block Cache

70.4.5。堆块缓存

How to Enable BucketCache
如何启用BucketCache

The usual deploy of BucketCache is via a managing class that sets up two caching tiers: an L1 on-heap cache implemented by LruBlockCache and a second L2 cache implemented with BucketCache. The managing class is CombinedBlockCache by default. The previous link describes the caching 'policy' implemented by CombinedBlockCache. In short, it works by keeping meta blocks (INDEX and BLOOM) in the L1, on-heap LruBlockCache tier, while DATA blocks are kept in the L2, BucketCache tier. Since HBase 1.0 it is possible to amend this behavior and ask that a column family have both its meta and DATA blocks hosted on-heap in the L1 tier, by setting cacheDataInL1 via HColumnDescriptor.setCacheDataInL1(true) from Java, or in the shell by creating or amending column families with CACHE_DATA_IN_L1 set to true: e.g.

BucketCache的通常部署是通过一个管理类来设置两个缓存层:一个由LruBlockCache实现的L1 on堆缓存和一个使用BucketCache实现的第二个L2缓存。在默认情况下,管理类是组合块缓存。前面的链接描述了由组合块缓存实现的缓存策略。简而言之,它的工作原理是保持元数据块——索引和在L1中开放,堆上的LruBlockCache层——和数据块保存在L2, BucketCache层中。可以在HBase中修改这个行为,因为版本1.0并要求一个列家族通过设置cacheDataInL1通过(HColumnDescriptor.setCacheDataInL1(true)或shell,创建或修改列家族将CACHE_DATA_IN_L1设置为true,从而将其元数据块和数据块都托管在L1层中。

hbase(main):003:0> create 't', {NAME => 't', CONFIGURATION => {CACHE_DATA_IN_L1 => 'true'}}

The BucketCache Block Cache can be deployed on-heap, off-heap, or file based. You choose which via the hbase.bucketcache.ioengine setting. Setting it to heap will have BucketCache deployed inside the allocated Java heap. Setting it to offheap will have BucketCache make its allocations off-heap, and an ioengine setting of file:PATH_TO_FILE will direct BucketCache to use file-based caching (useful in particular if you have some fast I/O attached to the box, such as SSDs).

BucketCache块缓存可以部署在堆、堆外或基于文件。通过hbase.bucketcache设置。ioengine设置。将其设置为heap将会在已分配的Java堆中部署BucketCache。将其设置为offheap将会有BucketCache使其分配到堆外,并且一个文件的ioengine设置:PATH_TO_FILE将直接使用BucketCache来使用一个文件缓存(特别是如果您有一些快速的I/O附加到诸如ssd之类的框中)。

It is possible to deploy an L1+L2 setup where we bypass the CombinedBlockCache policy and have BucketCache working as a strict L2 cache to the L1 LruBlockCache. For such a setup, set CacheConfig.BUCKET_CACHE_COMBINED_KEY to false. In this mode, on eviction from L1, blocks go to L2. When a block is cached, it is cached first in L1. When we go to look for a cached block, we look first in L1 and if none found, then search L2. Let us call this deploy format, Raw L1+L2.

可以部署L1+L2设置,我们绕过了组合块缓存策略,并将BucketCache作为一个严格的L2缓存作为L1 LruBlockCache。对于这样的设置,设置CacheConfig。BUCKET_CACHE_COMBINED_KEY为假。在这种模式下,从L1中被逐出,block进入L2。当一个块被缓存时,它首先在L1中缓存。当我们去寻找一个缓存的块时,我们先看L1,如果没有找到,就搜索L2。让我们调用这个部署格式,原始L1+L2。

Other BucketCache configs include: specifying a location to persist cache to across restarts, how many threads to use writing the cache, etc. See the CacheConfig class for configuration options and descriptions.

其他的BucketCache configs包括:指定一个位置,以在重新启动时持久化缓存,有多少线程用于写入缓存,等等。用于配置选项和描述的html类。

BucketCache Example Configuration
BucketCache示例配置

This sample provides a configuration for a 4 GB off-heap BucketCache with a 1 GB on-heap cache.

这个示例提供了一个4 GB的非堆桶缓存的配置,其中有一个1 GB的堆缓存。

Configuration is performed on the RegionServer.

在区域服务器上执行配置。

Setting hbase.bucketcache.ioengine and hbase.bucketcache.size > 0 enables CombinedBlockCache. Let us presume that the RegionServer has been set to run with a 5G heap: i.e. HBASE_HEAPSIZE=5g.

设置hbase.bucketcache。ioengine hbase.bucketcache。尺寸> 0支持组合块缓存。让我们假定区域服务器已被设置为以5G堆运行:即HBASE_HEAPSIZE= 5G。

  1. First, edit the RegionServer’s hbase-env.sh and set HBASE_OFFHEAPSIZE to a value greater than the off-heap size wanted, in this case, 4 GB (expressed as 4G). Let’s set it to 5G. That’ll be 4G for our off-heap cache and 1G for any other uses of off-heap memory (there are other users of off-heap memory other than BlockCache; e.g. DFSClient in RegionServer can make use of off-heap memory). See Direct Memory Usage In HBase.

    首先,编辑区域服务器的hbase-env。sh和将HBASE_OFFHEAPSIZE设置为大于非堆大小的值,在本例中为4 GB(表示为4G)。我们把它设为5G。这将是我们的外堆缓存的4G,对于其他非堆内存的使用也将是1G(除了BlockCache,还有其他非堆内存的用户;区域服务器中的DFSClient可以利用堆外存储器。参见HBase中的直接内存使用。

    HBASE_OFFHEAPSIZE=5G
  2. Next, add the following configuration to the RegionServer’s hbase-site.xml.

    接下来,将以下配置添加到区域服务器的hbase-site.xml中。

    <property>
      <name>hbase.bucketcache.ioengine</name>
      <value>offheap</value>
    </property>
    <property>
      <name>hfile.block.cache.size</name>
      <value>0.2</value>
    </property>
    <property>
      <name>hbase.bucketcache.size</name>
      <value>4196</value>
    </property>
  3. Restart or rolling restart your cluster, and check the logs for any issues.

    重启或滚动重新启动您的集群,并检查日志是否有任何问题。

In the above, we set the BucketCache to be 4G. We configured the on-heap LruBlockCache to have 20% (0.2) of the RegionServer’s heap size (0.2 * 5G = 1G). In other words, you configure the L1 LruBlockCache as you would normally (as if there were no L2 cache present).

在上面,我们将BucketCache设置为4G。我们配置了分区服务器堆大小的20% (0.2)(0.2 * 5G = 1G)。换句话说,您可以像往常一样配置L1 LruBlockCache(就像没有L2缓存一样)。

HBASE-10641 introduced the ability to configure multiple sizes for the buckets of the BucketCache, in HBase 0.98 and newer. To configure multiple bucket sizes, set the new property hfile.block.cache.sizes (instead of hfile.block.cache.size) to a comma-separated list of block sizes, ordered from smallest to largest, with no spaces. The goal is to optimize the bucket sizes based on your data access patterns. The following example configures buckets of size 4096 and 8192.

HBase -10641介绍了在HBase 0.98和更新版本中为桶缓存配置多个大小的能力。要配置多个bucket大小,请配置新属性hfile.block.cache。大小(而不是hfile.block.cache.size)到一个由逗号分隔的块大小列表,从最小到最大,没有空格。目标是根据您的数据访问模式优化bucket的大小。下面的示例配置大小为4096和8192的桶。

<property>
  <name>hfile.block.cache.sizes</name>
  <value>4096,8192</value>
</property>
Direct Memory Usage In HBase

The default maximum direct memory varies by JVM. Traditionally it is 64M or some relation to allocated heap size (-Xmx) or no limit at all (JDK7 apparently). HBase servers use direct memory, in particular short-circuit reading, the hosted DFSClient will allocate direct memory buffers. If you do off-heap block caching, you’ll be making use of direct memory. Starting your JVM, make sure the -XX:MaxDirectMemorySize setting in conf/hbase-env.sh is set to some value that is higher than what you have allocated to your off-heap BlockCache (hbase.bucketcache.size). It should be larger than your off-heap block cache and then some for DFSClient usage (How much the DFSClient uses is not easy to quantify; it is the number of open HFiles * hbase.dfs.client.read.shortcircuit.buffer.size where hbase.dfs.client.read.shortcircuit.buffer.size is set to 128k in HBase — see hbase-default.xml default configurations). Direct memory, which is part of the Java process heap, is separate from the object heap allocated by -Xmx. The value allocated by MaxDirectMemorySize must not exceed physical RAM, and is likely to be less than the total available RAM due to other memory requirements and system constraints.

默认的最大直接内存因JVM而异。传统上,它是64M或一些与分配的堆大小(-Xmx)或根本没有限制(显然是JDK7)的关系。HBase服务器使用直接内存,特别是在短路读取时,托管的DFSClient将分配直接内存缓冲区。如果您使用堆块缓存,则将使用直接内存。启动JVM,确保在conf/hbase-env中的-XX:MaxDirectMemorySize设置。sh被设置为比您分配给您的堆外块缓存(hbase.bucketcache.size)更高的值。它应该大于您的堆外块缓存,然后一些用于DFSClient使用(DFSClient使用多少不容易量化;它是打开的HFiles * hbase.dfs.client. read.buffer。hbase.dfs.client.read.shortcircuit.buffer大小。大小设置为128k在HBase -见HBase -default。xml默认配置)。直接内存是Java进程堆的一部分,它与-Xmx分配的对象堆是分开的。MaxDirectMemorySize所分配的值必须不超过物理RAM,并且由于其他内存需求和系统约束,可能会小于总可用RAM。

You can see how much memory — on-heap and off-heap/direct — a RegionServer is configured to use and how much it is using at any one time by looking at the Server Metrics: Memory tab in the UI. It can also be gotten via JMX. In particular the direct memory currently used by the server can be found on the java.nio.type=BufferPool,name=direct bean. Terracotta has a good write up on using off-heap memory in Java. It is for their product BigMemory but a lot of the issues noted apply in general to any attempt at going off-heap. Check it out.

您可以看到一个区域服务器被配置为使用多少内存——堆和堆外/直接——通过查看服务器指标:UI中的memory选项卡,可以在任何时间使用多少内存。它也可以通过JMX获得。特别是服务器当前使用的直接内存可以在java.nio上找到。类型=缓冲池名称=直接bean。Terracotta在Java中使用非堆内存有一个很好的记录。这是为了他们的产品BigMemory,但是很多问题都提到了对任何试图离开堆的尝试。检查出来。

hbase.bucketcache.percentage.in.combinedcache

This is a pre-HBase 1.0 configuration that was removed because it was confusing. It was a float that you would set to some value between 0.0 and 1.0. Its default was 0.9. If the deploy was using CombinedBlockCache, then the LruBlockCache L1 size was calculated to be (1 - hbase.bucketcache.percentage.in.combinedcache) * size-of-bucket-cache and the BucketCache size was hbase.bucketcache.percentage.in.combinedcache * size-of-bucket-cache, where size-of-bucket-cache itself is EITHER the value of the configuration hbase.bucketcache.size IF it was specified in Megabytes, OR hbase.bucketcache.size * -XX:MaxDirectMemorySize if hbase.bucketcache.size is between 0 and 1.0.

这是一个pre-HBase 1.0配置,因为它令人困惑。它是一个浮点数,在0。0到1。0之间。其默认为0.9。如果部署使用的是组合块缓存,那么LruBlockCache L1的大小就会被计算为(1 - hbase. BucketCache . . . . . .) *大小的桶缓存大小是hbase. bucketcache.l % age.l。在这种情况下,大小缓存本身就是配置hbase.bucketcache的值。如果指定为兆字节或hbase.bucketcache,那么大小。size * -XX:MaxDirectMemorySize如果hbase.bucketcache。大小介于0和1.0之间。

In 1.0, it should be more straight-forward. L1 LruBlockCache size is set as a fraction of java heap using hfile.block.cache.size setting (not the best name) and L2 is set as above either in absolute Megabytes or as a fraction of allocated maximum direct memory.

在1.0中,它应该更直接。L1 LruBlockCache大小设置为使用hfile.block.cache的java堆的一小部分。大小设置(不是最好的名称)和L2设置为绝对兆字节或分配的最大直接内存的一小部分。

70.4.6. Compressed BlockCache

70.4.6。压缩BlockCache

HBASE-11331 introduced lazy BlockCache decompression, more simply referred to as compressed BlockCache. When compressed BlockCache is enabled, data and encoded data blocks are cached in the BlockCache in their on-disk format, rather than being decompressed and decrypted before caching.

HBASE-11331引入了惰性块缓存解压,更简单地称为压缩的块缓存。当被压缩的块缓存被启用时,数据块和编码数据块以磁盘的格式缓存到块缓存中,而不是在缓存之前被解压和解密。

For a RegionServer hosting more data than can fit into cache, enabling this feature with SNAPPY compression has been shown to result in a 50% increase in throughput and a 30% improvement in mean latency, while increasing garbage collection by 80% and increasing overall CPU load by 2%. See HBASE-11331 for more details about how performance was measured and achieved. For a RegionServer hosting data that can comfortably fit into cache, or if your workload is sensitive to extra CPU or garbage-collection load, you may receive less benefit.

对于一个承载更多数据的区域服务器来说,可以使用SNAPPY压缩来实现这一功能,结果显示吞吐量增加了50%,平均延迟提高了30%,垃圾收集增加了80%,总体CPU负载增加了2%。有关如何测量和实现性能的更多细节,请参见HBASE-11331。对于能够轻松适应高速缓存的区域服务器,或者如果您的工作负载对额外的CPU或垃圾收集负载敏感,您可能会收到较少的好处。

The compressed BlockCache is disabled by default. To enable it, set hbase.block.data.cachecompressed to true in hbase-site.xml on all RegionServers.

默认情况下禁用压缩的BlockCache。要启用它,设置hbase.block.data。在hbase网站上的cachecompressed。在所有RegionServers xml。

70.5. RegionServer Splitting Implementation

70.5。RegionServer分裂实现

As write requests are handled by the region server, they accumulate in an in-memory storage system called the memstore. Once the memstore fills, its contents are written to disk as additional store files. This event is called a memstore flush. As store files accumulate, the RegionServer will compact them into fewer, larger files. After each flush or compaction finishes, the amount of data stored in the region has changed. The RegionServer consults the region split policy to determine if the region has grown too large or should be split for another policy-specific reason. A region split request is enqueued if the policy recommends it.

由于写请求是由区域服务器处理的,它们在内存存储系统中积累,称为memstore。一旦memstore被填充,它的内容就会被写入磁盘作为额外的存储文件。这个事件被称为内存存储刷新。随着存储文件的积累,区域服务器将把它们压缩成更少、更大的文件。每次刷新或压缩完成后,存储在该区域的数据量就会发生变化。区域服务器咨询区域分割政策,以确定该区域是否已经变得太大,或者应该为另一个特定于政策的原因而划分。如果策略建议,区域拆分请求将被排队。

Logically, the process of splitting a region is simple. We find a suitable point in the keyspace of the region where we should divide the region in half, then split the region’s data into two new regions at that point. The details of the process however are not simple. When a split happens, the newly created daughter regions do not rewrite all the data into new files immediately. Instead, they create small files similar to symbolic link files, named Reference files, which point to either the top or bottom part of the parent store file according to the split point. The reference file is used just like a regular data file, but only half of the records are considered. The region can only be split if there are no more references to the immutable data files of the parent region. Those reference files are cleaned gradually by compactions, so that the region will stop referring to its parents files, and can be split further.

从逻辑上讲,分割区域的过程很简单。我们在该区域的关键区域找到一个合适的点,我们应该将该区域划分为一半,然后将该区域的数据分割成两个新的区域。然而,这个过程的细节并不简单。当分离发生时,新创建的子区域不会立即将所有数据重新写入新文件。相反,它们会创建类似于符号链接文件的小文件,命名为Reference files,它指向父存储文件的顶部或底部,根据分叉点。引用文件就像普通的数据文件一样使用,但是只有一半的记录被考虑。如果没有对父区域的不可变数据文件的引用,该区域只能被分割。这些引用文件会逐渐被压缩,从而使该区域不再引用其父母文件,并且可以进一步拆分。

Although splitting the region is a local decision made by the RegionServer, the split process itself must coordinate with many actors. The RegionServer notifies the Master before and after the split, updates the .META. table so that clients can discover the new daughter regions, and rearranges the directory structure and data files in HDFS. Splitting is a multi-task process. To enable rollback in case of an error, the RegionServer keeps an in-memory journal about the execution state. The steps taken by the RegionServer to execute the split are illustrated in RegionServer Split Process. Each step is labeled with its step number. Actions from RegionServers or the Master are shown in red, while actions from the clients are shown in green.

虽然分割该区域是区域服务器的本地决策,但拆分过程本身必须与许多参与者协调。区域服务器在拆分之前和之后通知主机,更新. meta。表以便客户端能够发现新的子区域,并重新安排HDFS中的目录结构和数据文件。分裂是一个多任务过程。为了在出错时启用回滚,区域服务器保存了一个关于执行状态的内存日志。区域服务器执行拆分所采取的步骤在分区服务器拆分过程中进行了说明。每一步都标上它的步数。区域服务器或主服务器的操作显示为红色,而来自客户机的操作显示为绿色。

Region Split Process
Figure 1. RegionServer Split Process
  1. The RegionServer decides locally to split the region, and prepares the split. THE SPLIT TRANSACTION IS STARTED. As a first step, the RegionServer acquires a shared read lock on the table to prevent schema modifications during the splitting process. Then it creates a znode in zookeeper under /hbase/region-in-transition/region-name, and sets the znode’s state to SPLITTING.

    区域服务器决定本地分割区域,并准备分割。拆分事务开始了。作为第一步,区域服务器获取表上的共享读锁,以防止在分割过程中进行模式修改。然后,在zookeeper中创建一个znode,在/hbase/region-in- on/region-name中,并将znode的状态设置为拆分。

  2. The Master learns about this znode, since it has a watcher for the parent region-in-transition znode.

    Master了解这个znode,因为它有一个用于父区域内转换znode的监视程序。

  3. The RegionServer creates a sub-directory named .splits under the parent’s region directory in HDFS.

    区域服务器创建了一个名为.拆分的子目录,它位于HDFS的父区域目录下。

  4. The RegionServer closes the parent region and marks the region as offline in its local data structures. THE SPLITTING REGION IS NOW OFFLINE. At this point, client requests coming to the parent region will throw NotServingRegionException. The client will retry with some backoff. The closing region is flushed.

    区域服务器关闭父区域,并将该区域标记为在本地数据结构中离线。分裂区域现在处于脱机状态。此时,来自父区域的客户端请求将抛出notservingregion异常。客户将重新尝试一些备份。关闭区域被刷新。

  5. The RegionServer creates region directories under the .splits directory, for daughter regions A and B, and creates necessary data structures. Then it splits the store files, in the sense that it creates two Reference files per store file in the parent region. Those reference files will point to the parent region’s files.

    区域服务器在.拆分目录下创建区域目录,对于子区域A和B,并创建必要的数据结构。然后它分割存储文件,因为它在父区域中创建了每个存储文件的两个引用文件。这些引用文件将指向父区域的文件。

  6. The RegionServer creates the actual region directory in HDFS, and moves the reference files for each daughter.

    区域服务器在HDFS中创建实际的区域目录,并为每个女儿移动参考文件。

  7. The RegionServer sends a Put request to the .META. table, to set the parent as offline in the .META. table and add information about daughter regions. At this point, there won’t be individual entries in .META. for the daughters. Clients will see that the parent region is split if they scan .META., but won’t know about the daughters until they appear in .META.. Also, if this Put to .META. succeeds, the parent will be effectively split. If the RegionServer fails before this RPC succeeds, Master and the next Region Server opening the region will clean dirty state about the region split. After the .META. update, though, the region split will be rolled-forward by Master.

    区域服务器向. meta发送一个Put请求。在. meta中将父节点设置为脱机。表并添加关于子区域的信息。在这一点上,. meta中不会有单独的条目。的女儿。客户端将看到,如果他们扫描. meta,父区域将被分割。,但在女儿出现之前,她不会知道。另外,如果这个放到。meta。成功,父母将会有效地分裂。如果区域服务器在此RPC成功之前失败,主服务器和下一个区域服务器打开该区域将清除该区域的脏状态。在.META之后。不过,该区域的更新将由Master来完成。

  8. The RegionServer opens daughters A and B in parallel.

    区域服务器同时打开女儿A和B。

  9. The RegionServer adds the daughters A and B to .META., together with information that it hosts the regions. THE SPLIT REGIONS (DAUGHTERS WITH REFERENCES TO PARENT) ARE NOW ONLINE. After this point, clients can discover the new regions and issue requests to them. Clients cache the .META. entries locally, but when they make requests to the RegionServer or .META., their caches will be invalidated, and they will learn about the new regions from .META..

    区域服务器将女儿A和B添加到. meta。,以及它所承载的区域的信息。分裂地区(有父母的女儿)现在在网上。在此之后,客户机可以发现新的区域并向它们发出请求。客户端缓存.META。本地条目,但当它们向区域服务器或. meta请求时。他们的缓存将失效,他们将从.META了解新的区域。

  10. The RegionServer updates znode /hbase/region-in-transition/region-name in ZooKeeper to state SPLIT, so that the master can learn about it. The balancer can freely re-assign the daughter regions to other region servers if necessary. THE SPLIT TRANSACTION IS NOW FINISHED.

    区域服务器更新znode /hbase/region-in-transition/region-name在ZooKeeper中进行状态拆分,以便管理员了解它。如果需要,平衡器可以自由地将子区域重新分配给其他区域服务器。分割事务现在已经完成。

  11. After the split, .META. and HDFS will still contain references to the parent region. Those references will be removed when compactions in daughter regions rewrite the data files. Garbage collection tasks in the master periodically check whether the daughter regions still refer to the parent region’s files. If not, the parent region will be removed.

    分手后,.META。而HDFS仍然包含对父区域的引用。当子区域的compaction重写数据文件时,这些引用将被删除。主程序中的垃圾收集任务定期检查子区域是否仍然引用父区域的文件。否则,父区域将被删除。

70.6. Write Ahead Log (WAL)

70.6。提前写日志(细胞膜)

70.6.1. Purpose

70.6.1。目的

The Write Ahead Log (WAL) records all changes to data in HBase, to file-based storage. Under normal operations, the WAL is not needed because data changes move from the MemStore to StoreFiles. However, if a RegionServer crashes or becomes unavailable before the MemStore is flushed, the WAL ensures that the changes to the data can be replayed. If writing to the WAL fails, the entire operation to modify the data fails.

写在前面的日志(WAL)记录了在HBase中对数据的所有更改,到基于文件的存储。在正常操作下,不需要WAL,因为数据更改从MemStore转移到StoreFiles。但是,如果区域性服务器崩溃或在MemStore被刷新之前变得不可用,则该WAL确保对数据的更改可以重新播放。如果写入到WAL失败,整个操作修改数据失败。

HBase uses an implementation of the WAL interface. Usually, there is only one instance of a WAL per RegionServer. The RegionServer records Puts and Deletes to it, before recording them to the MemStore for the affected Store.

HBase使用了WAL接口的实现。通常,每个区域服务器只有一个实例。区域服务器记录在将其记录到受影响的存储的MemStore之前,将并删除它。

The HLog

Prior to 2.0, the interface for WALs in HBase was named HLog. In 0.94, HLog was the name of the implementation of the WAL. You will likely find references to the HLog in documentation tailored to these older versions.

在2.0之前,HBase中WALs的接口被命名为HLog。在0.94中,HLog是实现WAL的名字。您可能会发现针对这些旧版本的文档中的HLog的引用。

The WAL resides in HDFS in the /hbase/WALs/ directory (prior to HBase 0.94, they were stored in /hbase/.logs/), with subdirectories per RegionServer.

WAL驻留在/hbase/WALs/目录中的HDFS中(在hbase 0.94之前,它们存储在/hbase/.log /)中,每个区域都有子目录。

For more general information about the concept of write ahead logs, see the Wikipedia Write-Ahead Log article.

有关写前日志的概念的更一般的信息,请参阅Wikipedia写在前面的日志文章。

70.6.2. MultiWAL

70.6.2。MultiWAL

With a single WAL per RegionServer, the RegionServer must write to the WAL serially, because HDFS files must be sequential. This causes the WAL to be a performance bottleneck.

由于每个分区服务器都有一个单独的WAL,所以分区服务器必须连续地向WAL - mail写入,因为HDFS文件必须是连续的。这使WAL成为性能瓶颈。

HBase 1.0 introduces support for MultiWAL in HBASE-5699. MultiWAL allows a RegionServer to write multiple WAL streams in parallel, by using multiple pipelines in the underlying HDFS instance, which increases total throughput during writes. This parallelization is done by partitioning incoming edits by their Region. Thus, the current implementation will not help with increasing the throughput to a single Region.

HBase 1.0在HBase -5699中引入支持MultiWal。MultiWAL允许一个区域服务器并行地编写多个WAL - flow,通过在底层的HDFS实例中使用多个管道,在写入过程中增加了总吞吐量。这种并行化是通过将传入的edits划分到它们的区域来完成的。因此,当前的实现不会帮助将吞吐量增加到单个区域。

RegionServers using the original WAL implementation and those using the MultiWAL implementation can each handle recovery of either set of WALs, so a zero-downtime configuration update is possible through a rolling restart.

使用原始的WAL实现的区域服务器和使用MultiWAL实现的区域服务器都可以处理任意一组WALs的恢复,因此可以通过滚动重新启动实现零停机配置更新。

Configure MultiWAL

To configure MultiWAL for a RegionServer, set the value of the property hbase.wal.provider to multiwal by pasting in the following XML:

为了配置区域服务器的MultiWAL,设置属性hbase.wal的值。在以下XML中粘贴到多瓦的提供者:

<property>
  <name>hbase.wal.provider</name>
  <value>multiwal</value>
</property>

Restart the RegionServer for the changes to take effect.

重新启动区域服务器以使更改生效。

To disable MultiWAL for a RegionServer, unset the property and restart the RegionServer.

若要禁用区域服务器的MultiWAL,请取消设置该属性并重新启动区域服务器。

70.6.3. WAL Flushing

70.6.3。细胞膜冲洗

TODO (describe).

待办事项(描述)。

70.6.4. WAL Splitting

70.6.4。细胞膜分裂

A RegionServer serves many regions. All of the regions in a region server share the same active WAL file. Each edit in the WAL file includes information about which region it belongs to. When a region is opened, the edits in the WAL file which belong to that region need to be replayed. Therefore, edits in the WAL file must be grouped by region so that particular sets can be replayed to regenerate the data in a particular region. The process of grouping the WAL edits by region is called log splitting. It is a critical process for recovering data if a region server fails.

区域服务器服务于许多区域。区域服务器中的所有区域都共享相同的活动WAL文件。在WAL文件中的每一个编辑都包含了它所属的区域的信息。当一个区域被打开时,属于该区域的瓦尔文件的编辑需要重新播放。因此,在WAL - file中编辑必须按区域分组,这样特定的集合就可以被重放,以在特定区域重新生成数据。按区域划分WAL - edits的过程称为日志分裂。如果一个区域服务器失败,它是恢复数据的关键过程。

Log splitting is done by the HMaster during cluster start-up or by the ServerShutdownHandler as a region server shuts down. So that consistency is guaranteed, affected regions are unavailable until data is restored. All WAL edits need to be recovered and replayed before a given region can become available again. As a result, regions affected by log splitting are unavailable until the process completes.

在集群启动过程中,HMaster或服务器关闭处理服务器关闭时,由HMaster完成日志拆分。这样就保证了一致性,直到数据恢复之前,受影响的区域是不可用的。所有的WAL - edits都需要在给定的区域再次可用之前恢复和重新播放。因此,在进程完成之前,被日志分裂影响的区域是不可用的。

Procedure: Log Splitting, Step by Step
  1. The /hbase/WALs/<host>,<port>,<startcode> directory is renamed.

    /hbase/WALs/ <主机> , <端口> , 目录被重命名。

    Renaming the directory is important because a RegionServer may still be up and accepting requests even if the HMaster thinks it is down. If the RegionServer does not respond immediately and does not heartbeat its ZooKeeper session, the HMaster may interpret this as a RegionServer failure. Renaming the logs directory ensures that existing, valid WAL files which are still in use by an active but busy RegionServer are not written to by accident.

    重命名目录是很重要的,因为即使HMaster认为它已经宕机了,区域服务器仍然可以接受请求并接受请求。如果区域服务器没有立即响应并没有心跳,那么HMaster可能会将其解释为区域性服务器故障。重命名日志目录可以确保现有的、有效的、仍在使用的、但繁忙的区域服务器仍然使用的WAL - file文件不是偶然写入的。

    The new directory is named according to the following pattern:

    新目录根据以下模式命名:

    /hbase/WALs/<host>,<port>,<startcode>-splitting

    An example of such a renamed directory might look like the following:

    这样一个重命名目录的示例可能如下所示:

    /hbase/WALs/srv.example.com,60020,1254173957298-splitting
  2. Each log file is split, one at a time.

    每个日志文件被拆分,一次一个。

    The log splitter reads the log file one edit entry at a time and puts each edit entry into the buffer corresponding to the edit’s region. At the same time, the splitter starts several writer threads. Writer threads pick up a corresponding buffer and write the edit entries in the buffer to a temporary recovered edit file. The temporary edit file is stored to disk with the following naming pattern:

    日志拆分器每次读取一个编辑条目,并将每个编辑条目放入与编辑区域对应的缓冲区中。与此同时,splitter启动了几个编写线程。编写线程获取相应的缓冲区,并将缓冲区中的编辑条目写入临时恢复的编辑文件。临时编辑文件以以下命名模式存储到磁盘:

    /hbase/<table_name>/<region_id>/recovered.edits/.temp

    This file is used to store all the edits in the WAL log for this region. After log splitting completes, the .temp file is renamed to the sequence ID of the first log written to the file.

    该文件用于存储该区域的WAL - log中的所有编辑。日志分解完成后,.temp文件被重命名为写入文件的第一个日志的序列ID。

    To determine whether all edits have been written, the sequence ID is compared to the sequence of the last edit that was written to the HFile. If the sequence of the last edit is greater than or equal to the sequence ID included in the file name, it is clear that all writes from the edit file have been completed (a small sketch of this check follows the procedure).

    为了确定是否已经编写了所有编辑,将序列ID与写入HFile的最后一个编辑序列进行比较。如果最后一个编辑的序列大于或等于文件名中包含的序列ID,那么很明显,编辑文件中的所有写入都已完成。

  3. After log splitting is complete, each affected region is assigned to a RegionServer.

    日志拆分完成后,每个受影响区域都分配给一个区域服务器。

    When the region is opened, the recovered.edits folder is checked for recovered edits files. If any such files are present, they are replayed by reading the edits and saving them to the MemStore. After all edit files are replayed, the contents of the MemStore are written to disk (HFile) and the edit files are deleted.

    当该区域被打开时,恢复。编辑文件夹检查恢复编辑文件。如果存在这样的文件,则通过读取编辑器并将其保存到MemStore来重放它们。在重新播放所有编辑文件之后,MemStore的内容被写入磁盘(HFile),并删除编辑文件。
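
The sequence-ID check described in step 2 is a simple comparison; the Java sketch below is hypothetical, with illustrative names and values only (the real logic lives in the WAL-splitting code):

long maxSeqIdAlreadyFlushed = 1500L; // highest sequence ID already persisted in the region's HFiles
long seqIdFromFileName = 1200L;      // sequence ID parsed from the recovered.edits file name
// If everything up to the file's sequence ID has already been flushed, the edit file can be skipped.
boolean editsAlreadyPersisted = maxSeqIdAlreadyFlushed >= seqIdFromFileName;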

Handling of Errors During Log Splitting
处理日志分裂期间的错误。

If you set the hbase.hlog.split.skip.errors option to true, errors are treated as follows:

如果设置了hbase.hlog.split.skip。错误选项为真,错误处理如下:

  • Any error encountered during splitting will be logged.

    在拆分过程中遇到的任何错误都会被记录。

  • The problematic WAL log will be moved into the .corrupt directory under the hbase rootdir,

    问题的WAL - log将被移到hbase rootdir下的.corrupt目录中,

  • Processing of the WAL will continue

    对沃尔的处理将继续进行。

If the hbase.hlog.split.skip.errors option is set to false, the default, the exception will be propagated and the split will be logged as failed. See HBASE-2958 When hbase.hlog.split.skip.errors is set to false, we fail the split but that’s it. We need to do more than just fail split if this flag is set.

如果hbase.hlog.split.skip。错误选项被设置为false,默认情况下,将会传播异常,并且将日志记录为失败。当hbase.hlog.split.skip看到hbase - 2958。错误被设置为false,我们失败了,但就是这样。如果设置了这个标志,我们需要做的不仅仅是失败。

How EOFExceptions are treated when splitting a crashed RegionServer’s WALs
当分割崩溃的区域服务器的WALs时,如何处理eofexception ?

If an EOFException occurs while splitting logs, the split proceeds even when hbase.hlog.split.skip.errors is set to false. An EOFException while reading the last log in the set of files to split is likely, because the RegionServer was likely in the process of writing a record at the time of a crash. For background, see HBASE-2643 Figure how to deal with eof splitting logs

如果在分割日志时发生了EOFException,那么即使在hbase.hlog.split.skip中也会发生拆分。错误设置为false。在读取文件集中的最后一个日志时,可能会出现一个EOFException,因为分区服务器很可能在崩溃的时候写入一个记录。关于背景,请参见HBASE-2643,了解如何处理拆分日志。

Performance Improvements during Log Splitting
日志拆分期间的性能改进。

WAL log splitting and recovery can be resource intensive and take a long time, depending on the number of RegionServers involved in the crash and the size of the regions. Distributed log splitting (see Enabling or Disabling Distributed Log Splitting) was developed to improve performance during log splitting.

WAL - log分裂和恢复可能是资源密集型的,而且需要很长时间,这取决于参与崩溃的区域服务器的数量和区域的大小。为了提高日志拆分期间的性能,开发了启用或禁用分布式日志拆分。

Enabling or Disabling Distributed Log Splitting

Distributed log processing is enabled by default since HBase 0.92. The setting is controlled by the hbase.master.distributed.log.splitting property, which can be set to true or false, but defaults to true.

分布式日志处理自HBase 0.92以来默认启用。该设置由hbase.master.distribu.log控制。分割属性,可以设置为真或假,但默认为真。

Distributed Log Splitting, Step by Step

After configuring distributed log splitting, the HMaster controls the process. The HMaster enrolls each RegionServer in the log splitting process, and the actual work of splitting the logs is done by the RegionServers. The general process for log splitting, as described in Log Splitting, Step by Step, still applies here.

配置分布式日志拆分之后,HMaster控制这个过程。HMaster在日志分割过程中对每个区域服务器进行卷卷,而分割日志的实际工作由区域服务器完成。日志拆分的一般过程,如分布式日志拆分中所描述的,一步一步地应用到这里。

  1. If distributed log processing is enabled, the HMaster creates a split log manager instance when the cluster is started.

    如果启用了分布式日志处理,那么当集群启动时,HMaster将创建一个split日志管理器实例。

    1. The split log manager manages all log files which need to be scanned and split.

      split日志管理器管理所有需要扫描和拆分的日志文件。

    2. The split log manager places all the logs into the ZooKeeper splitlog node (/hbase/splitlog) as tasks.

      split日志管理器将所有的日志放到ZooKeeper splitlog节点(/hbase/splitlog)中作为任务。

    3. You can view the contents of the splitlog by issuing the following zkCli command. Example output is shown.

      通过发出以下zkCli命令,您可以查看splitlog的内容。示例输出。

      ls /hbase/splitlog
      [hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost8.sample.com%2C57020%2C1340474893275-splitting%2Fhost8.sample.com%253A57020.1340474893900,
      hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost3.sample.com%2C57020%2C1340474893299-splitting%2Fhost3.sample.com%253A57020.1340474893931,
      hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost4.sample.com%2C57020%2C1340474893287-splitting%2Fhost4.sample.com%253A57020.1340474893946]

      The output contains some non-ASCII characters. When decoded, it looks much simpler:

      输出包含一些非ascii字符。当解码时,它看起来更简单:

      [hdfs://host2.sample.com:56020/hbase/.logs
      /host8.sample.com,57020,1340474893275-splitting
      /host8.sample.com%3A57020.1340474893900,
      hdfs://host2.sample.com:56020/hbase/.logs
      /host3.sample.com,57020,1340474893299-splitting
      /host3.sample.com%3A57020.1340474893931,
      hdfs://host2.sample.com:56020/hbase/.logs
      /host4.sample.com,57020,1340474893287-splitting
      /host4.sample.com%3A57020.1340474893946]

      The listing represents WAL file names to be scanned and split, which is a list of log splitting tasks.

      清单表示要扫描和拆分的WAL - file名称,这是一个日志分解任务列表。

  2. The split log manager monitors the log-splitting tasks and workers.

    拆分日志管理器监视日志分解任务和工作人员。

    The split log manager is responsible for the following ongoing tasks:

    拆分日志管理器负责下列正在进行的任务:

    • Once the split log manager publishes all the tasks to the splitlog znode, it monitors these task nodes and waits for them to be processed.

      一旦拆分日志管理器将所有任务发布到splitlog znode,它将监视这些任务节点并等待它们被处理。

    • Checks to see if there are any dead split log workers queued up. If it finds tasks claimed by unresponsive workers, it will resubmit those tasks. If the resubmit fails due to some ZooKeeper exception, the dead worker is queued up again for retry.

      检查是否有任何死劈开的日志工作人员排队。如果它发现没有响应的工作人员声称的任务,它将重新提交这些任务。如果重新提交失败,由于一些ZooKeeper异常,死去的工作人员将再次排队等待重试。

    • Checks to see if there are any unassigned tasks. If it finds any, it creates an ephemeral rescan node so that each split log worker is notified to re-scan unassigned tasks via the nodeChildrenChanged ZooKeeper event.

      检查是否有未分配的任务。如果它找到了,它将创建一个临时的重新扫描节点,以便通知每一个分割日志工作人员通过nodeChildrenChanged ZooKeeper事件重新扫描未分配的任务。

    • Checks for tasks which are assigned but expired. If any are found, they are moved back to TASK_UNASSIGNED state again so that they can be retried. It is possible that these tasks are assigned to slow workers, or they may already be finished. This is not a problem, because log splitting tasks have the property of idempotence. In other words, the same log splitting task can be processed many times without causing any problem.

      检查已分配但已过期的任务。如果找到了,则返回到TASK_UNASSIGNED状态,以便重新尝试。这些任务有可能被分配给缓慢的工人,或者他们可能已经完成了。这不是问题,因为日志分割任务具有幂等性的特性。换句话说,相同的日志分割任务可以多次处理而不会产生任何问题。

    • The split log manager watches the HBase split log znodes constantly. If any split log task node data is changed, the split log manager retrieves the node data. The node data contains the current state of the task. You can use the zkCli get command to retrieve the current state of a task. In the example output below, the first line of the output shows that the task is currently unassigned.

      分割日志管理器经常监视HBase分割日志znode。如果更改了任何分离日志任务节点数据,则拆分日志管理器将检索节点数据。节点数据包含任务的当前状态。您可以使用zkCli get命令来检索任务的当前状态。在下面的示例输出中,输出的第一行显示当前未分配任务。

      get /hbase/splitlog/hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost6.sample.com%2C57020%2C1340474893287-splitting%2Fhost6.sample.com%253A57020.1340474893945
      
      unassigned host2.sample.com:57000
      cZxid = 0x7115
      ctime = Sat Jun 23 11:13:40 PDT 2012
      ...

      Based on the state of the task whose data is changed, the split log manager does one of the following:

      根据数据被更改的任务的状态,split日志管理器执行以下操作之一:

    • Resubmit the task if it is unassigned

      如果任务未分配,则重新提交任务。

    • Heartbeat the task if it is assigned

      如果任务被分配了,那么它就是心跳。

    • Resubmit or fail the task if it is resigned (see Reasons a Task Will Fail)

      如果任务被放弃,则重新提交或失败(见任务失败的原因)

    • Resubmit or fail the task if it is completed with errors (see Reasons a Task Will Fail)

      如果任务被错误完成,则重新提交或失败(参见任务失败的原因)

    • Resubmit or fail the task if it could not complete due to errors (see Reasons a Task Will Fail)

      如果由于错误而无法完成任务,则重新提交或失败(参见任务失败的原因)

    • Delete the task if it is successfully completed or failed

      如果任务成功完成或失败,则删除该任务。

      Reasons a Task Will Fail
      • The task has been deleted.

        任务已被删除。

      • The node no longer exists.

        节点不再存在。

      • The log status manager failed to move the state of the task to TASK_UNASSIGNED.

        日志状态管理器未能将任务的状态移动到TASK_UNASSIGNED。

      • The number of resubmits is over the resubmit threshold.

        重新提交的数量超过重新提交的阈值。

  3. Each RegionServer’s split log worker performs the log-splitting tasks.

    每个区域服务器的拆分日志工作人员执行日志分解任务。

    Each RegionServer runs a daemon thread called the split log worker, which does the work to split the logs. The daemon thread starts when the RegionServer starts, and registers itself to watch HBase znodes. If any splitlog znode children change, it notifies a sleeping worker thread to wake up and grab more tasks. If a worker’s current task’s node data is changed, the worker checks to see if the task has been taken by another worker. If so, the worker thread stops work on the current task.

    每个区域服务器都运行一个名为split log worker的守护线程,该线程负责拆分日志。守护线程在区域服务器启动时启动,并注册自己以监视HBase znode。如果任何splitlog znode的子节点发生变化,它会通知睡眠工作者线程醒来并获取更多的任务。如果一个worker的当前任务的节点数据发生了变化,那么工作人员将检查该任务是否被另一个工作人员执行。如果是这样,工作线程将停止当前任务的工作。

    The worker monitors the splitlog znode constantly. When a new task appears, the split log worker retrieves the task paths and checks each one until it finds an unclaimed task, which it attempts to claim. If the claim was successful, it attempts to perform the task and updates the task’s state property based on the splitting outcome. At this point, the split log worker scans for another unclaimed task.

    工作人员不断监视splitlog znode。当一个新任务出现时,分离日志工作人员将检索任务路径并检查每个任务路径,直到找到一个无人认领的任务,并试图声明该任务。如果断言成功,它尝试执行任务,并根据分离结果更新任务的状态属性。此时,分离日志工作人员扫描另一个无人认领的任务。

    How the Split Log Worker Approaches a Task
    • It queries the task state and only takes action if the task is in `TASK_UNASSIGNED` state.

      它查询任务状态,并且只在任务处于“TASK_UNASSIGNED”状态时才采取行动。

    • If the task is in TASK_UNASSIGNED state, the worker attempts to set the state to TASK_OWNED by itself. If it fails to set the state, another worker will try to grab it. The split log manager will also ask all workers to rescan later if the task remains unassigned.

      如果任务处于TASK_UNASSIGNED状态,则worker尝试将状态设置为TASK_OWNED。如果它不能设置状态,另一个工人会尝试去抓取它。拆分日志管理器还将要求所有工作人员在任务未分配的情况下重新进行扫描。

    • If the worker succeeds in taking ownership of the task, it tries to get the task state again to make sure it really gets it asynchronously. In the meantime, it starts a split task executor to do the actual work:

      如果工作人员成功地获得了任务的所有权,那么它就会尝试再次获得任务状态,以确保它确实能够异步地得到它。与此同时,它启动了一个拆分任务执行器来完成实际工作:

      • Get the HBase root folder, create a temp folder under the root, and split the log file to the temp folder.

        获取HBase根文件夹,在根目录下创建一个临时文件夹,并将日志文件拆分为temp文件夹。

      • If the split was successful, the task executor sets the task to state TASK_DONE.

        如果拆分成功,则任务执行器将任务设置为状态TASK_DONE。

      • If the worker catches an unexpected IOException, the task is set to state TASK_ERR.

        如果工作人员捕获了一个意外的IOException,任务将被设置为状态TASK_ERR。

      • If the worker is shutting down, set the task to state TASK_RESIGNED.

        如果工人正在关闭,将任务设置为state task_辞职。

      • If the task is taken by another worker, just log it.

        如果任务是由另一个工作人员完成的,那么就把它记录下来。

  4. The split log manager monitors for uncompleted tasks.

    分割日志管理器监视未完成的任务。

    The split log manager returns when all tasks are completed successfully. If all tasks are completed with some failures, the split log manager throws an exception so that the log splitting can be retried. Due to an asynchronous implementation, in very rare cases, the split log manager loses track of some completed tasks. For that reason, it periodically checks for remaining uncompleted task in its task map or ZooKeeper. If none are found, it throws an exception so that the log splitting can be retried right away instead of hanging there waiting for something that won’t happen.

    当所有任务都成功完成时,split日志管理器将返回。如果所有任务都完成了一些失败,那么拆分日志管理器将抛出一个异常,以便重新尝试日志拆分。由于异步实现,在非常少见的情况下,分割日志管理器会丢失一些已完成任务的跟踪。出于这个原因,它会定期检查任务映射或ZooKeeper中的未完成任务。如果没有找到,就会抛出异常,以便可以立即重试日志,而不是挂在那里等待不会发生的事情。

70.6.5. WAL Compression

70.6.5。细胞膜压缩

The content of the WAL can be compressed using LRU Dictionary compression. This can be used to speed up WAL replication to different datanodes. The dictionary can store up to 2^15 elements; eviction starts after this number is exceeded.

可以使用LRU字典压缩对WAL的内容进行压缩。这可以用来加速对不同datanodes的复制。字典可以存储多达215个元素;在超过这个数字之后开始驱逐。

To enable WAL compression, set the hbase.regionserver.wal.enablecompression property to true. The default value for this property is false. By default, WAL tag compression is turned on when WAL compression is enabled. You can turn off WAL tag compression by setting the hbase.regionserver.wal.tags.enablecompression property to 'false'.

要启用WAL压缩,请设置hbase. local server. WAL。enablecompression属性为true。此属性的默认值为false。默认情况下,在启用了WAL压缩时,会启用WAL - tag压缩。您可以通过设置hbase. local server.wal.tags关闭WAL - tag压缩。enablecompression属性为“假”。

A possible downside to WAL compression is that we lose more data from the last block in the WAL if it was ill-terminated mid-write. If entries in this last block were added with new dictionary entries but we failed to persist the amended dictionary because of an abrupt termination, a read of this last block may not be able to resolve the last-written entries.

对WAL压缩的一个可能的缺点是,如果它在中端写得不好,我们会丢失更多的数据。如果最后一个块中的条目添加了新字典条目,但是由于突然终止而未能保存修订的字典,那么最后一个块的读取可能无法解析最后写入的条目。

70.6.7. Disabling the WAL

70.6.7。禁用细胞膜

It is possible to disable the WAL, to improve performance in certain specific situations. However, disabling the WAL puts your data at risk. The only situation where this is recommended is during a bulk load. This is because, in the event of a problem, the bulk load can be re-run with no risk of data loss.

可以禁用WAL,在某些特定的情况下提高性能。但是,禁用WAL会使您的数据处于危险之中。建议的唯一情况是在批量加载期间。这是因为,在出现问题时,可以重新运行大量负载,而不存在数据丢失的风险。

The WAL is disabled per Mutation through the durability field of the HBase client Mutation. Use the Mutation.setDurability(Durability.SKIP_WAL) and Mutation.getDurability() methods to set and get the field's value. There is no way to disable the WAL for only a specific table.

通过调用HBase客户字段mut. writetowal (false)来禁用WAL。使用mut_ . set持久性(Durability.SKIP_WAL)和mut. get持久性()方法来设置和获取字段的值。只有一个特定的表,没有办法禁用WAL。
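
As an illustration, here is a minimal client-side sketch of skipping the WAL for a single Put; it assumes an open Table instance named table, and the row, family, and qualifier names are placeholders:

Put put = new Put(Bytes.toBytes("row1"));
put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("attr1"), Bytes.toBytes("value1"));
put.setDurability(Durability.SKIP_WAL); // this edit is not recorded in the WAL
table.put(put);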

If you disable the WAL for anything other than bulk loads, your data is at risk.

71. Regions

71年。地区

Regions are the basic element of availability and distribution for tables, and are comprised of a Store per Column Family. The hierarchy of objects is as follows:

区域是表的可用性和分布的基本元素,由每个列族的存储组成。对象的层次结构如下:

Table                    (HBase table)
    Region               (Regions for the table)
        Store            (Store per ColumnFamily for each Region for the table)
            MemStore     (MemStore for each Store for each Region for the table)
            StoreFile    (StoreFiles for each Store for each Region for the table)
                Block    (Blocks within a StoreFile within a Store for each Region for the table)

For a description of what HBase files look like when written to HDFS, see Browsing HDFS for HBase Objects.

要了解HBase文件在编写到HDFS时的样子,请参阅浏览HBase对象的HDFS。

71.1. Considerations for Number of Regions

71.1。对区域数量的考虑。

In general, HBase is designed to run with a small (20-200) number of relatively large (5-20Gb) regions per server. The considerations for this are as follows:

一般来说,HBase的设计是在每个服务器上运行一个较小的(20-200)个相对较大的(5-20Gb)区域。对此的考虑如下:

71.1.1. Why should I keep my Region count low?

71.1.1。我为什么要让我的地域低呢?

Typically you want to keep your region count low on HBase for numerous reasons. Usually right around 100 regions per RegionServer has yielded the best results. Here are some of the reasons below for keeping region count low:

通常情况下,您希望保持您的区域在HBase上的低计数是有很多原因的。通常,每个区域服务器大约有100个区域已经产生了最好的结果。以下是以下几个保持区域数低的原因:

  1. MSLAB (MemStore-local allocation buffer) requires 2MB per MemStore (that’s 2MB per family per region). 1000 regions that have 2 families each is 3.9GB of heap used, and it’s not even storing data yet. NB: the 2MB value is configurable.

    MSLAB (MemStore-local分配缓冲区)需要2MB / MemStore(这是每个区域的2MB)。拥有2个家庭的1000个区域是3.9GB的堆,甚至还没有存储数据。NB: 2MB的值是可配置的。

  2. If you fill all the regions at somewhat the same rate, the global memory usage makes it that it forces tiny flushes when you have too many regions which in turn generates compactions. Rewriting the same data tens of times is the last thing you want. An example is filling 1000 regions (with one family) equally and let’s consider a lower bound for global MemStore usage of 5GB (the region server would have a big heap). Once it reaches 5GB it will force flush the biggest region, at that point they should almost all have about 5MB of data so it would flush that amount. 5MB inserted later, it would flush another region that will now have a bit over 5MB of data, and so on. This is currently the main limiting factor for the number of regions; see Number of regions per RS - upper bound for detailed formula.

    如果您以相同的速率填充所有区域,那么全局内存使用就会使它在您有太多的区域时产生微小的波动,而这些区域反过来又会产生压缩。把相同的数据重写成几十次是你最不想要的。一个例子是将1000个区域(一个家庭)平均地填满,让我们考虑一个更低的范围,用于全球MemStore使用5GB(该区域服务器将有一个很大的堆)。一旦它达到5GB,它就会迫使最大的区域冲水,那时它们几乎都有5MB的数据,所以它会刷新这个数字。5MB插入后,它将刷新另一个区域,现在有超过5MB的数据,等等。这是目前区域数量的主要限制因素;参见每个RS -上界的区域数,以得到详细的公式。

  3. The master, as is, is allergic to tons of regions, and will take a lot of time assigning them and moving them around in batches. The reason is that it's heavy on ZK usage, and it's not very async at the moment (this could really be improved, and has been improved a bunch in 0.96 HBase).

    主人是对许多地区的过敏,而且要花很多时间来分配他们,并且分批地移动他们。原因是它对ZK的使用很重,而且目前还不是非常异步(可以真正改进),并且在0.96 HBase中得到了改进。

  4. In older versions of HBase (pre-HFile v2, 0.90 and previous), tons of regions on a few RS can cause the store file index to rise, increasing heap usage and potentially creating memory pressure or OOME on the RSs

    在老版本的HBase (pre-HFile v2, 0.90和之前)中,在一些RS上的大量区域会导致存储文件索引的增加,增加堆的使用,并可能在RSs上创建内存压力或OOME。

Another issue is the effect of the number of regions on MapReduce jobs; it is typical to have one mapper per HBase region. Thus, hosting only 5 regions per RS may not be enough to get sufficient number of tasks for a MapReduce job, while 1000 regions will generate far too many tasks.

另一个问题是区域数量对MapReduce作业的影响;每个HBase区域有一个mapper是典型的。因此,每个RS只有5个区域可能不足以获得足够数量的MapReduce任务的任务,而1000个区域将生成太多的任务。

See Determining region count and size for configuration guidelines.

请参见确定区域计数和配置指南的大小。

71.2. Region-RegionServer Assignment

71.2。Region-RegionServer赋值

This section describes how Regions are assigned to RegionServers.

本节描述区域如何分配给区域服务器。

71.2.1. Startup

71.2.1。启动

When HBase starts, regions are assigned as follows (short version):

当HBase启动区域被分配如下(短版本):

  1. The Master invokes the AssignmentManager upon startup.

    在启动时,主调用赋值管理器。

  2. The AssignmentManager looks at the existing region assignments in hbase:meta.

    作业管理器查看hbase中的现有区域分配:meta。

  3. If the region assignment is still valid (i.e., if the RegionServer is still online) then the assignment is kept.

    如果区域分配仍然有效(即:,如果区域服务器仍然在线,那么任务就被保留了。

  4. If the assignment is invalid, then the LoadBalancerFactory is invoked to assign the region. The load balancer (StochasticLoadBalancer by default in HBase 1.0) assigns the region to a RegionServer.

    如果分配无效,则调用LoadBalancerFactory来分配该区域。负载平衡器(在HBase 1.0中默认为StochasticLoadBalancer)将该区域分配给区域服务器。

  5. hbase:meta is updated with the RegionServer assignment (if needed) and the RegionServer start codes (start time of the RegionServer process) upon region opening by the RegionServer.

    hbase:meta在区域服务器打开的区域服务器分配(如果需要)和区域服务器启动代码(区域服务器进程启动时间)更新。

71.2.2. Failover

71.2.2。故障转移

When a RegionServer fails:

当RegionServer失败:

  1. The regions immediately become unavailable because the RegionServer is down.

    由于区域服务器宕机,区域立即变得不可用。

  2. The Master will detect that the RegionServer has failed.

    主服务器将检测到区域服务器已经失败。

  3. The region assignments will be considered invalid and will be re-assigned just like the startup sequence.

    区域分配将被视为无效,将被重新分配,就像启动序列一样。

  4. In-flight queries are re-tried, and not lost.

    在飞行中的查询被重新尝试,而不是丢失。

  5. Operations are switched to a new RegionServer within the following amount of time:

    在以下时间内,业务转到新的区域服务器:

    ZooKeeper session timeout + split time + assignment/replay time

71.2.3. Region Load Balancing

71.2.3。地区负载平衡

Regions can be periodically moved by the LoadBalancer.

区域可以被负载均衡器定期移动。

71.2.4. Region State Transition

71.2.4。地区的状态转换

HBase maintains a state for each region and persists the state in hbase:meta. The state of the hbase:meta region itself is persisted in ZooKeeper. You can see the states of regions in transition in the Master web UI. Following is the list of possible region states.

HBase为每个区域维护一个状态,并在HBase中持久化状态:meta。hbase的状态:meta区域本身存在于ZooKeeper中。您可以在主web UI中看到转换区域的状态。下面是可能的区域状态列表。

Possible Region States
  • OFFLINE: the region is offline and not opening

    离线:该区域离线,不开放。

  • OPENING: the region is in the process of being opened

    开放:该地区正在开放的过程中。

  • OPEN: the region is open and the RegionServer has notified the master

    打开:该区域是开放的,区域服务器已经通知了主服务器。

  • FAILED_OPEN: the RegionServer failed to open the region

    FAILED_OPEN:区域服务器未能打开该区域。

  • CLOSING: the region is in the process of being closed

    关闭:该区域正在关闭过程中。

  • CLOSED: the RegionServer has closed the region and notified the master

    关闭:区域服务器关闭了该区域并通知了主服务器。

  • FAILED_CLOSE: the RegionServer failed to close the region

    FAILED_CLOSE:区域服务器未能关闭该区域。

  • SPLITTING: the RegionServer notified the master that the region is splitting

    分割:区域服务器通知主服务器区域正在分裂。

  • SPLIT: the RegionServer notified the master that the region has finished splitting

    分割:区域服务器通知主机,该区域已经结束分裂。

  • SPLITTING_NEW: this region is being created by a split which is in progress

    SPLITTING_NEW:这个区域是由正在进行的分割创建的。

  • MERGING: the RegionServer notified the master that this region is being merged with another region

    合并:区域服务器通知主机,该区域正在与另一个区域合并。

  • MERGED: the RegionServer notified the master that this region has been merged

    合并:区域服务器通知主机该区域已经合并。

  • MERGING_NEW: this region is being created by a merge of two regions

    MERGING_NEW:该区域由两个区域的合并创建。

Figure 2. Region State Transitions
Graph Legend
  • Brown: Offline state, a special state that can be transient (after closed before opening), terminal (regions of disabled tables), or initial (regions of newly created tables)

    Brown:脱机状态,一个特殊状态,可以是瞬态(在打开前关闭)、终端(禁用表的区域)或初始(新创建的表的区域)

  • Palegreen: Online state that regions can serve requests

    古绿:在线状态,区域可以服务请求。

  • Lightblue: Transient states

    Lightblue:瞬态状态

  • Red: Failure states that need OPS attention

    红色:故障状态,需要操作系统的注意。

  • Gold: Terminal states of regions split/merged

    黄金:分割/合并区域的终端状态。

  • Grey: Initial states of regions created through split/merge

    灰色:通过拆分/合并创建的区域的初始状态。

Transition State Descriptions
  1. The master moves a region from OFFLINE to OPENING state and tries to assign the region to a RegionServer. The RegionServer may or may not have received the open region request. The master retries sending the open region request to the RegionServer until the RPC goes through or the master runs out of retries. After the RegionServer receives the open region request, the RegionServer begins opening the region.

    master将一个区域从脱机移动到打开状态,并尝试将该区域分配给区域服务器。区域服务器可能或可能没有收到open region请求。主重试将打开的区域请求发送到区域服务器,直到RPC完成或主服务器耗尽重试。区域服务器接收到open region请求后,区域服务器开始打开该区域。

  2. If the master is running out of retries, the master prevents the RegionServer from opening the region by moving the region to CLOSING state and trying to close it, even if the RegionServer is starting to open the region.

    如果主服务器没有重试,则主服务器阻止区域服务器通过将该区域移动到关闭状态并试图关闭该区域来打开该区域,即使该区域服务器已经开始打开该区域。

  3. After the RegionServer opens the region, it continues to try to notify the master until the master moves the region to OPEN state and notifies the RegionServer. The region is now open.

    区域服务器打开该区域之后,它将继续尝试通知主服务器,直到主服务器将该区域移动到开放状态并通知区域服务器。该地区现在开放了。

  4. If the RegionServer cannot open the region, it notifies the master. The master moves the region to CLOSED state and tries to open the region on a different RegionServer.

    如果区域服务器无法打开该区域,则通知主服务器。master将该区域移动到关闭状态,并尝试在不同的区域服务器上打开该区域。

  5. If the master cannot open the region on any of a certain number of RegionServers, it moves the region to FAILED_OPEN state, and takes no further action until an operator intervenes from the HBase shell, or the server is dead.

    如果主服务器不能在某个特定的区域上打开该区域,它就会将该区域移动到FAILED_OPEN状态,直到操作员从HBase shell中进行干预,或者服务器已经死亡,才会采取进一步的操作。

  6. The master moves a region from OPEN to CLOSING state. The RegionServer holding the region may or may not have received the close region request. The master retries sending the close request to the server until the RPC goes through or the master runs out of retries.

    大师将一个区域从开放状态移动到关闭状态。持有该区域的区域服务器可能收到或可能没有收到关闭区域的请求。主重试将关闭请求发送到服务器,直到RPC完成或主服务器耗尽重试。

  7. If the RegionServer is not online, or throws NotServingRegionException, the master moves the region to OFFLINE state and re-assigns it to a different RegionServer.

    如果区域服务器没有联机,或抛出notservingregion异常,则主服务器将该区域移动到脱机状态,并将其重新分配给不同的区域服务器。

  8. If the RegionServer is online, but not reachable after the master runs out of retries, the master moves the region to FAILED_CLOSE state and takes no further action until an operator intervenes from the HBase shell, or the server is dead.

    如果区域服务器是联机的,但在主服务器运行完重试之后无法到达,则主服务器将该区域移动到FAILED_CLOSE状态,直到操作员从HBase shell中进行干预,否则服务器就会死亡。

  9. If the RegionServer gets the close region request, it closes the region and notifies the master. The master moves the region to CLOSED state and re-assigns it to a different RegionServer.

    如果区域服务器获得了关闭区域请求,它将关闭该区域并通知主服务器。master将该区域移动到关闭状态,并将其重新分配给不同的区域服务器。

  10. Before assigning a region, the master moves the region to OFFLINE state automatically if it is in CLOSED state.

    在分配一个区域之前,如果该区域处于关闭状态,则它会自动将该区域移动到脱机状态。

  11. When a RegionServer is about to split a region, it notifies the master. The master moves the region to be split from OPEN to SPLITTING state and adds the two new regions to be created to the RegionServer. These two regions are in SPLITTING_NEW state initially.

    当一个区域服务器将要分裂一个区域时,它会通知主人。master将该区域从打开状态切换到分裂状态,并将两个新区域添加到区域服务器。这两个区域最初是在SPLITTING_NEW状态中。

  12. After notifying the master, the RegionServer starts to split the region. Once past the point of no return, the RegionServer notifies the master again so the master can update the hbase:meta table. However, the master does not update the region states until it is notified by the server that the split is done. If the split is successful, the splitting region is moved from SPLITTING to SPLIT state and the two new regions are moved from SPLITTING_NEW to OPEN state.

    在通知主服务器后,区域服务器开始分割区域。一旦过了不返回的点,区域服务器将再次通知主人,这样主人就可以更新hbase:元表。但是,主服务器直到服务器通知分割完成后才更新该区域状态。如果拆分成功,分裂区域将从分裂状态转移到分裂状态,两个新区域从SPLITTING_NEW移动到OPEN状态。

  13. If the split fails, the splitting region is moved from SPLITTING back to OPEN state, and the two new regions which were created are moved from SPLITTING_NEW to OFFLINE state.

    如果分割失败,分裂区域就会从分裂状态转移到开放状态,创建的两个新区域从SPLITTING_NEW移动到脱机状态。

  14. When a RegionServer is about to merge two regions, it notifies the master first. The master moves the two regions to be merged from OPEN to MERGING state, and adds the new region which will hold the contents of the merged regions to the RegionServer. The new region is in MERGING_NEW state initially.

    当一个区域服务器将要合并两个区域时,它首先通知主服务器。master将两个区域合并,从OPEN合并到合并状态,并添加新区域,该区域将将合并区域区域的内容保存到区域服务器。这个新区域最初是在合并新状态。

  15. After notifying the master, the RegionServer starts to merge the two regions. Once past the point of no return, the RegionServer notifies the master again so the master can update the META. However, the master does not update the region states until it is notified by the RegionServer that the merge has completed. If the merge is successful, the two merging regions are moved from MERGING to MERGED state and the new region is moved from MERGING_NEW to OPEN state.

    在通知主后,区域服务器开始合并两个区域。一旦过了不返回的点,区域服务器会再次通知主人,这样主人就可以更新META。但是,主服务器不会更新区域状态,直到区域服务器通知合并已经完成。如果合并成功,两个合并区域将从合并到合并状态,新区域从合并新到开放状态。

  16. If the merge fails, the two merging regions are moved from MERGING back to OPEN state, and the new region which was created to hold the contents of the merged regions is moved from MERGING_NEW to OFFLINE state.

    如果合并失败,两个合并区域将从合并返回到OPEN状态,而创建的用于保存合并区域内容的新区域将从合并新到脱机状态。

  17. For regions in FAILED_OPEN or FAILED_CLOSE states, the master tries to close them again when they are reassigned by an operator via HBase Shell.

    对于FAILED_OPEN或FAILED_CLOSE状态的区域,当主服务器通过HBase Shell重新分配它们时,主尝试再次关闭它们。

71.3. Region-RegionServer Locality

71.3。Region-RegionServer位置

Over time, Region-RegionServer locality is achieved via HDFS block replication. The HDFS client does the following by default when choosing locations to write replicas:

随着时间的推移,区域-区域服务器位置通过HDFS块复制实现。在选择要写副本的位置时,HDFS客户机默认执行以下操作:

  1. First replica is written to local node

    第一个副本被写到本地节点。

  2. Second replica is written to a random node on another rack

    第二个副本被写入另一个机架上的一个随机节点。

  3. Third replica is written on the same rack as the second, but on a different node chosen randomly

    第三个副本与第二个副本在同一个机架上,但在另一个随机选择的节点上。

  4. Subsequent replicas are written on random nodes on the cluster. See Replica Placement: The First Baby Steps on this page: HDFS Architecture

    随后的副本被写在集群上的随机节点上。参见复制放置:这个页面上的第一个婴儿步骤:HDFS架构。

Thus, HBase eventually achieves locality for a region after a flush or a compaction. In a RegionServer failover situation a RegionServer may be assigned regions with non-local StoreFiles (because none of the replicas are local), however as new data is written in the region, or the table is compacted and StoreFiles are re-written, they will become "local" to the RegionServer.

因此,HBase在刷新或压缩后最终实现区域的位置。在区域服务器故障转移情况下,区域服务器可以被分配到非本地存储文件的区域(因为所有副本都是本地的),但是由于新数据是在区域中写入的,或者表被压缩,并且存储文件被重新编写,它们将成为区域服务器的“本地”。

For more information, see Replica Placement: The First Baby Steps on this page: HDFS Architecture and also Lars George’s blog on HBase and HDFS locality.

有关更多信息,请参见复制放置:该页面上的第一个婴儿步骤:HDFS架构,以及Lars George在HBase和HDFS地区的博客。

71.4. Region Splits

71.4。区域分割

Regions split when they reach a configured threshold. Below we treat the topic in short. For a longer exposition, see Apache HBase Region Splitting and Merging by our Enis Soztutar.

区域在到达配置阈值时发生分裂。下面我们简短地讨论一下这个话题。为了更长的阐述,请参见Apache HBase区域,它将由我们的Enis Soztutar进行拆分和合并。

Splits run unaided on the RegionServer; i.e. the Master does not participate. The RegionServer splits a region, offlines the split region and then adds the daughter regions to hbase:meta, opens daughters on the parent’s hosting RegionServer and then reports the split to the Master. See Managed Splitting for how to manually manage splits (and for why you might do this).

在区域服务器上独立运行;即大师不参与。区域服务器分割一个区域,将分割区域划分开来,然后将子区域添加到hbase:meta,在父主机的托管区域服务器上打开女儿,然后向主服务器报告分割。关于如何手动管理分割(以及为什么您可能会这样做),请参见管理拆分。

71.4.1. Custom Split Policies

71.4.1。自定义分割政策

You can override the default split policy using a custom RegionSplitPolicy (HBase 0.94+). Typically a custom split policy should extend HBase's default split policy: IncreasingToUpperBoundRegionSplitPolicy.

您可以使用自定义的区域splitpolicy (HBase 0.94+)覆盖默认的分割策略。典型的自定义拆分策略应该扩展HBase的默认拆分策略:增加toupperboundregionsplitpolicy。
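
As a sketch only, a custom policy typically subclasses the default and overrides its decision method; the class name below matches the MyCustomSplitPolicy referenced later in this section, and the logic shown is just a placeholder:

public class MyCustomSplitPolicy extends IncreasingToUpperBoundRegionSplitPolicy {
  @Override
  protected boolean shouldSplit() {
    // Add custom criteria here; fall back to the default size-based check.
    return super.shouldSplit();
  }
}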

The policy can be set globally through the HBase configuration or on a per-table basis.

策略可以通过HBase配置或基于每个表进行全局设置。

Configuring the Split Policy Globally in hbase-site.xml
<property>
  <name>hbase.regionserver.region.split.policy</name>
  <value>org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy</value>
</property>
Configuring a Split Policy On a Table Using the Java API
HTableDescriptor tableDesc = new HTableDescriptor("test");
tableDesc.setValue(HTableDescriptor.SPLIT_POLICY, ConstantSizeRegionSplitPolicy.class.getName());
tableDesc.addFamily(new HColumnDescriptor(Bytes.toBytes("cf1")));
admin.createTable(tableDesc);
Configuring the Split Policy On a Table Using HBase Shell
hbase> create 'test', {METADATA => {'SPLIT_POLICY' => 'org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy'}},{NAME => 'cf1'}

The policy can be set globally through the HBaseConfiguration used or on a per table basis:

可以通过使用的HBaseConfiguration或在每个表基础上对策略进行全局设置:

HTableDescriptor myHtd = ...;
myHtd.setValue(HTableDescriptor.SPLIT_POLICY, MyCustomSplitPolicy.class.getName());
The DisabledRegionSplitPolicy policy blocks manual region splitting.

71.5. Manual Region Splitting

71.5。手动地区分裂

It is possible to manually split your table, either at table creation (pre-splitting), or at a later time as an administrative action. You might choose to split your region for one or more of the following reasons. There may be other valid reasons, but the need to manually split your table might also point to problems with your schema design.

可以在表创建(预分解)或稍后作为管理操作的情况下手动拆分表。您可以选择将您的区域划分为一个或多个以下原因。可能还有其他一些合理的原因,但是手工拆分表的需要也可能指向模式设计的问题。
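
For reference, here is a minimal sketch of triggering a manual split through the Java Admin API; the table name and split point are assumptions for illustration, and the HBase shell split command is the more common route:

admin.split(TableName.valueOf("test"), Bytes.toBytes("m")); // split 'test' at the given split point
admin.split(TableName.valueOf("test"));                     // or let HBase pick the split point for each region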

Reasons to Manually Split Your Table
  • Your data is sorted by timeseries or another similar algorithm that sorts new data at the end of the table. This means that the Region Server holding the last region is always under load, and the other Region Servers are idle, or mostly idle. See also Monotonically Increasing Row Keys/Timeseries Data.

    您的数据是按timeseries或另一个类似的算法来排序的,这些算法在表的末尾排序新数据。这意味着保存最后一个区域的区域服务器总是处于负载状态,而其他区域服务器是空闲的,或者大部分是空闲的。也可以看到单调递增的行键/Timeseries数据。

  • You have developed an unexpected hotspot in one region of your table. For instance, an application which tracks web searches might be inundated by a lot of searches for a celebrity in the event of news about that celebrity. See perf.one.region for more discussion about this particular scenario.

    您已经在您的表的一个区域开发了一个意想不到的热点。例如,一个追踪网络搜索的应用程序可能会被大量搜索名人的信息淹没。看到perf.one。关于这个特定场景的更多讨论。

  • After a big increase in the number of RegionServers in your cluster, to get the load spread out quickly.

    在集群中的区域服务器数量大幅增加之后,可以快速地将负载分散开来。

  • Before a bulk-load which is likely to cause unusual and uneven load across regions.

    在大负荷之前,可能会造成不同地区的不寻常和不均匀的负荷。

See Managed Splitting for a discussion about the dangers and possible benefits of managing splitting completely manually.

请参见管理拆分,以讨论管理完全手动拆分的危险和可能的好处。

The DisabledRegionSplitPolicy policy blocks manual region splitting.

71.5.1. Determining Split Points

71.5.1。确定分割点

The goal of splitting your table manually is to improve the chances of balancing the load across the cluster in situations where good rowkey design alone won’t get you there. Keeping that in mind, the way you split your regions is very dependent upon the characteristics of your data. It may be that you already know the best way to split your table. If not, the way you split your table depends on what your keys are like.

手动拆分表的目的是为了提高集群中负载均衡的机会,因为只有在良好的行键设计时才会实现。记住,分割区域的方式非常依赖于数据的特性。也许你已经知道了分割你的桌子的最好方法。如果不是,那么拆分表的方式取决于您的键是什么样子的。

Alphanumeric Rowkeys

If your rowkeys start with a letter or number, you can split your table at letter or number boundaries. For instance, the following example creates a table with regions that split at each vowel, so the first region has A-D, the second region has E-H, the third region has I-N, the fourth region has O-T, and the fifth region has U-Z.

如果您的rowkeys以字母或数字开头,您可以在字母或数字边界上拆分您的表。例如,下面的命令创建一个表,其中包含每个元音的区域,所以第一个区域有a - d,第二个区域有E-H,第三个区域有I-N,第四个区域有O-V,第五个区域有U-Z。
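
As an illustration only, such a pre-split table could be created through the Java Admin API; the table and column family names are assumptions:

HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("test_table"));
desc.addFamily(new HColumnDescriptor(Bytes.toBytes("cf1")));
byte[][] splitKeys = {
    Bytes.toBytes("E"), Bytes.toBytes("I"), Bytes.toBytes("O"), Bytes.toBytes("U")
};
admin.createTable(desc, splitKeys); // regions: [start,E), [E,I), [I,O), [O,U), [U,end)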

Using a Custom Algorithm

The RegionSplitter tool is provided with HBase, and uses a SplitAlgorithm to determine split points for you. As parameters, you give it the algorithm, desired number of regions, and column families. It includes two split algorithms. The first is the HexStringSplit algorithm, which assumes the row keys are hexadecimal strings. The second, UniformSplit, assumes the row keys are random byte arrays. You will probably need to develop your own SplitAlgorithm, using the provided ones as models.

区域分割器工具由HBase提供,并使用一个splitalgm来为您确定分割点。作为参数,您可以为它提供算法、所需的区域数量和列族。它包括两种分割算法。第一个是HexStringSplit算法,它假定行键是十六进制字符串。第二种,统一分割,假设行键是随机字节数组。您可能需要开发自己的splitalgm,使用提供的模型作为模型。

71.6. Online Region Merges

71.6。网络区域合并

Both Master and RegionServer participate in the event of online region merges. The client sends the merge RPC to the master, then the master moves the regions together to the RegionServer where the more heavily loaded region resides. Finally the master sends the merge request to this RegionServer which then runs the merge. Similar to the process of region splitting, region merges run as a local transaction on the RegionServer. It offlines the regions and then merges the two regions on the file system, atomically deletes the merging regions from hbase:meta and adds the merged region to hbase:meta, opens the merged region on the RegionServer and reports the merge to the Master.

主服务器和区域服务器都参与在线区域合并。客户端将合并RPC发送给主服务器,然后主服务器将区域移动到区域服务器,在区域服务器中负载更大的区域。最后,主发送合并请求到该区域服务器,然后运行合并。与区域分割过程类似,区域合并在区域服务器上作为本地事务运行。它将区域划分为多个区域,然后将两个区域合并到文件系统中,从hbase中删除合并区域:meta并将合并后的区域添加到hbase:meta,在区域服务器上打开合并区域,并向主服务器报告合并。

An example of region merges in the HBase shell

在HBase shell中有一个区域合并的例子。

$ hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME'
$ hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME', true

It's an asynchronous operation and the call returns immediately without waiting for the merge to complete. Passing true as the optional third parameter will force a merge. Normally only adjacent regions can be merged. The force parameter overrides this behaviour and is for expert use only.

它是一个异步操作,并在没有等待合并完成的情况下立即调用返回。通过true作为可选的第三个参数将强制合并。通常只有邻近区域可以合并。强制参数覆盖了这种行为,仅供专家使用。
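
A rough Java equivalent of the shell command above, given as a sketch; the encoded region names are placeholders and the third argument is the forcible flag:

// Merge two adjacent regions; pass true as the third argument to force a merge of non-adjacent regions.
admin.mergeRegions(Bytes.toBytes("ENCODED_REGIONNAME_A"), Bytes.toBytes("ENCODED_REGIONNAME_B"), false);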

71.7. Store

71.7。商店

A Store hosts a MemStore and 0 or more StoreFiles (HFiles). A Store corresponds to a column family for a table for a given region.

一个存储有一个MemStore和0个或更多的StoreFiles (HFiles)。一个存储对应于一个给定区域的表的列族。

71.7.1. MemStore

71.7.1。MemStore

The MemStore holds in-memory modifications to the Store. Modifications are Cells/KeyValues. When a flush is requested, the current MemStore is moved to a snapshot and is cleared. HBase continues to serve edits from the new MemStore and backing snapshot until the flusher reports that the flush succeeded. At this point, the snapshot is discarded. Note that when the flush happens, MemStores that belong to the same region will all be flushed.

MemStore保存对存储的内存修改。修改/ keyvalue细胞。当请求刷新时,当前的MemStore被移动到一个快照并被清除。HBase继续服务于新的MemStore和备份快照,直到刷新报告成功。此时,快照将被丢弃。注意,当刷新发生时,属于同一区域的memstore将全部刷新。

71.7.2. MemStore Flush

71.7.2。MemStore冲洗

A MemStore flush can be triggered under any of the conditions listed below. The minimum flush unit is per region, not at individual MemStore level.

可以在下面列出的任何条件下触发内存存储。最小的刷新单元是每个区域,而不是单个的MemStore级别。

  1. When a MemStore reaches the size specified by hbase.hregion.memstore.flush.size, all MemStores that belong to its region will be flushed out to disk.

    当一个MemStore达到hbase.hbase指定的大小。大小,属于该区域的所有memstore将被刷新到磁盘。

  2. When the overall MemStore usage reaches the value specified by hbase.regionserver.global.memstore.upperLimit, MemStores from various regions will be flushed out to disk to reduce overall MemStore usage in a RegionServer.

    当整个MemStore使用达到hbase. local .global.memstore指定的值时。来自不同区域的内存存储将被刷新到磁盘,以减少区域服务器中的整体MemStore使用。

    The flush order is based on the descending order of a region’s MemStore usage.

    刷新顺序是基于一个区域的MemStore用法的降序排列的。

    Regions will have their MemStores flushed until the overall MemStore usage drops to or slightly below hbase.regionserver.global.memstore.lowerLimit.

    区域将使其MemStore被刷新,直到整个MemStore使用量下降到或略低于hbase. global.memstore.下限。

  3. When the number of WAL log entries in a given region server’s WAL reaches the value specified in hbase.regionserver.max.logs, MemStores from various regions will be flushed out to disk to reduce the number of logs in the WAL.

    当给定区域服务器中的WAL - log项数达到hbase.regionserver.max中指定的值时。日志,来自不同区域的memstore将被刷新到磁盘,以减少在WAL中的日志数量。

    The flush order is based on time.

    刷新顺序是基于时间的。

    Regions with the oldest MemStores are flushed first until WAL count drops below hbase.regionserver.max.logs.

    有最古老的MemStores的区域首先被刷新,直到WAL count下降到hbase.org . Regions server.max.log。

71.7.3. Scans

71.7.3。扫描

  • When a client issues a scan against a table, HBase generates RegionScanner objects, one per region, to serve the scan request (a client-side sketch of issuing such a scan follows this list).

    当客户端对表进行扫描时,HBase会生成区域扫描对象,每个区域一个,以服务于扫描请求。

  • The RegionScanner object contains a list of StoreScanner objects, one per column family.

    区域扫描对象包含一个StoreScanner对象列表,每个列家庭一个。

  • Each StoreScanner object further contains a list of StoreFileScanner objects, corresponding to each StoreFile and HFile of the corresponding column family, and a list of KeyValueScanner objects for the MemStore.

    每个StoreScanner对象都包含一个StoreFileScanner对象的列表,对应于相应列族的每个StoreFile和HFile,以及一个用于MemStore的KeyValueScanner对象列表。

  • The two lists are merged into one, which is sorted in ascending order with the scan object for the MemStore at the end of the list.

    将两个列表合并为一个列表,在列表末尾的MemStore的扫描对象中按升序排序。

  • When a StoreFileScanner object is constructed, it is associated with a MultiVersionConcurrencyControl read point, which is the current memstoreTS, filtering out any new updates beyond the read point.

    当构建一个StoreFileScanner对象时,它与一个多版本的concurrencycontrol读取点相关联,这是当前的memstoreTS,它过滤掉读点之外的任何新更新。
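
Here is a minimal client-side sketch of the scan that triggers this machinery; it assumes an open Table named table and a column family cf, both of which are placeholders:

Scan scan = new Scan();
scan.addFamily(Bytes.toBytes("cf"));
try (ResultScanner scanner = table.getScanner(scan)) {
  for (Result result : scanner) {
    // Each Result has been merged from the MemStore and StoreFile scanners described above.
  }
}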

71.7.4. StoreFile (HFile)

71.7.4。StoreFile(HFile)

StoreFiles are where your data lives.

存储文件就是你的数据所在。

HFile Format
HFile格式

The HFile file format is based on the SSTable file described in the BigTable [2006] paper and on Hadoop’s TFile (The unit test suite and the compression harness were taken directly from TFile). Schubert Zhang’s blog post on HFile: A Block-Indexed File Format to Store Sorted Key-Value Pairs makes for a thorough introduction to HBase’s HFile. Matteo Bertozzi has also put up a helpful description, HBase I/O: HFile.

HFile文件格式基于BigTable[2006]文件中描述的SSTable文件和Hadoop的TFile(单元测试套件和压缩带直接取自TFile)。舒伯特·张在HFile上的博客文章:一个块索引的文件格式存储分类的键值对,这是对HBase的HFile的全面介绍。他还提出了一个有用的描述,HBase I/O: HFile。

For more information, see the HFile source code. Also see HBase file format with inline blocks (version 2) for information about the HFile v2 format that was included in 0.92.

有关更多信息,请参见HFile源代码。也可以看到HBase文件格式与内联块(版本2)有关HFile v2格式的信息,该格式包含在0.92中。

HFile Tool
HFile工具

To view a textualized version of HFile content, you can use the org.apache.hadoop.hbase.io.hfile.HFile tool. Type the following to see usage:

要查看HFile内容的textualized版本,可以使用org.apache.hadoop. hbase.o.hfile。HFile工具。键入以下内容查看使用情况:

$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile

For example, to view the content of the file hdfs://10.81.47.41:8020/hbase/TEST/1418428042/DSMP/4759508618286845475, type the following:

例如,要查看文件hdfs的内容://10.81.47.41:8020/hbase/TEST/1418428042/DSMP/4759508618286845475,类型如下:

 $ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f hdfs://10.81.47.41:8020/hbase/TEST/1418428042/DSMP/4759508618286845475

Leave off the option -v to see just a summary of the HFile. See usage for other things to do with the HFile tool.

如果您放弃选项-v,只查看HFile上的摘要。查看其他与HFile工具有关的用法。

StoreFile Directory Structure on HDFS
在HDFS上的StoreFile目录结构。

For more information of what StoreFiles look like on HDFS with respect to the directory structure, see Browsing HDFS for HBase Objects.

有关在HDFS上关于目录结构的存储文件的更多信息,请参阅浏览HBase对象的HDFS。

71.7.5. Blocks

71.7.5。块

StoreFiles are composed of blocks. The blocksize is configured on a per-ColumnFamily basis.

存储文件由块组成。块大小是按每个列的家庭基础配置的。
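
For illustration, the block size can be set when defining the column family; the 128 KB value below is an arbitrary example (the default is 64 KB):

HColumnDescriptor cf = new HColumnDescriptor(Bytes.toBytes("cf1"));
cf.setBlocksize(128 * 1024); // HFile block size for this family, in bytes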

Compression happens at the block level within StoreFiles. For more information on compression, see Compression and Data Block Encoding In HBase.

压缩发生在StoreFiles中的块级别。有关压缩的更多信息,请参阅HBase中的压缩和数据块编码。

For more information on blocks, see the HFileBlock source code.

有关块的更多信息,请参见HFileBlock源代码。

71.7.6. KeyValue

71.7.6。KeyValue

The KeyValue class is the heart of data storage in HBase. KeyValue wraps a byte array and takes offsets and lengths into the passed array which specify where to start interpreting the content as KeyValue.

KeyValue类是HBase中数据存储的核心。KeyValue封装了一个字节数组,并将偏移量和长度输入到传递的数组中,该数组指定了将内容作为键值进行解释的位置。

The KeyValue format inside a byte array is:

字节数组中的KeyValue格式是:

  • keylength

    keylength

  • valuelength

    valuelength

  • key

    关键

  • value

    价值

The Key is further decomposed as:

该密钥进一步分解为:

  • rowlength

    rowlength

  • row (i.e., the rowkey)

    行(即。rowkey)

  • columnfamilylength

    columnfamilylength

  • columnfamily

    columnfamily

  • columnqualifier

    columnqualifier

  • timestamp

    时间戳

  • keytype (e.g., Put, Delete, DeleteColumn, DeleteFamily)

    keytype(例如,Put、Delete、DeleteColumn、DeleteFamily)

KeyValue instances are not split across blocks. For example, if there is an 8 MB KeyValue, even if the block-size is 64kb this KeyValue will be read in as a coherent block. For more information, see the KeyValue source code.

键值实例不会在块之间拆分。例如,如果有一个8 MB的KeyValue,即使块大小是64kb,这个KeyValue将被作为一个一致的块读取。有关更多信息,请参见KeyValue源代码。

Example
例子

To emphasize the points above, examine what happens with two Puts for two different columns for the same row:

为了强调以上几点,请检查在同一行的两个不同的列中发生的情况:

  • Put #1: rowkey=row1, cf:attr1=value1

    # 1:rowkey =第一行,cf:attr1 = value1

  • Put #2: rowkey=row1, cf:attr2=value2

    把# 2:rowkey =第一行,cf:attr2 = value2

Even though these are for the same row, a KeyValue is created for each column:

即使这些是相同的行,每个列都创建一个KeyValue:

Key portion for Put #1:

输入#1的关键部分:

  • rowlength -----------→ 4

    rowlength - - - - - - - - - - - -→4

  • row -----------------→ row1

    行- - - - - - - - - - - - - - - - - -→第一行

  • columnfamilylength --→ 2

    columnfamilylength——→2

  • columnfamily --------→ cf

    columnfamily - - - - - - - - - -→cf

  • columnqualifier -----→ attr1

    columnqualifier - - - - - -→attr1

  • timestamp -----------→ server time of Put

    时间戳- - - - - - - - - - - -→把服务器时间

  • keytype -------------→ Put

    keytype - - - - - - - - - - - - -→

Key portion for Put #2:

输入#2的关键部分:

  • rowlength -----------→ 4

    rowlength - - - - - - - - - - - -→4

  • row -----------------→ row1

    行- - - - - - - - - - - - - - - - - -→第一行

  • columnfamilylength --→ 2

    columnfamilylength——→2

  • columnfamily --------→ cf

    columnfamily - - - - - - - - - -→cf

  • columnqualifier -----→ attr2

    columnqualifier - - - - - -→attr2

  • timestamp -----------→ server time of Put

    时间戳----------服务器时间。

  • keytype -------------→ Put

    keytype - - - - - - - - - - - - -→

It is critical to understand that the rowkey, ColumnFamily, and column (aka columnqualifier) are embedded within the KeyValue instance. The longer these identifiers are, the bigger the KeyValue is.

关键是要理解rowkey、ColumnFamily和column (aka columnqualifier)都嵌入到了KeyValue实例中。这些标识符越长,键值越大。
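
The same point can be made programmatically; the small sketch below uses the KeyValue class with the illustrative row, family, qualifier, and value names from the example above:

KeyValue kv = new KeyValue(Bytes.toBytes("row1"), Bytes.toBytes("cf"), Bytes.toBytes("attr1"),
    System.currentTimeMillis(), KeyValue.Type.Put, Bytes.toBytes("value1"));
System.out.println("key length   = " + kv.getKeyLength());   // row + family + qualifier + timestamp + type
System.out.println("value length = " + kv.getValueLength()); // only the value bytes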

71.7.7. Compaction

71.7.7。压实

Ambiguous Terminology
  • A StoreFile is a facade of HFile. In terms of compaction, use of StoreFile seems to have prevailed in the past.

    一个StoreFile是HFile的外观。在压实方面,使用StoreFile似乎已经占了上风。

  • A Store is the same thing as a ColumnFamily. StoreFiles are related to a Store, or ColumnFamily.

    商店就像一根柱子。StoreFiles与商店或ColumnFamily相关。

  • If you want to read more about StoreFiles versus HFiles and Stores versus ColumnFamilies, see HBASE-11316.

    如果您想阅读更多关于StoreFiles与HFiles和存储与ColumnFamilies的内容,请参见HBASE-11316。

When the MemStore reaches a given size (hbase.hregion.memstore.flush.size), it flushes its contents to a StoreFile. The number of StoreFiles in a Store increases over time. Compaction is an operation which reduces the number of StoreFiles in a Store, by merging them together, in order to increase performance on read operations. Compactions can be resource-intensive to perform, and can either help or hinder performance depending on many factors.

当MemStore达到给定的大小(hbase.hlocal .memstore.flush.size)时,它将其内容刷新到一个StoreFile中。随着时间的推移,存储库的数量会增加。压缩是一种操作,通过将存储中的存储文件合并在一起,从而提高读取操作的性能。压缩可以是资源密集型的,可以帮助或阻碍性能,这取决于许多因素。

Compactions fall into two categories: minor and major. Minor and major compactions differ in the following ways.

compaction可分为两类:minor和major。次要和主要的压缩在以下方面有不同。

Minor compactions usually select a small number of small, adjacent StoreFiles and rewrite them as a single StoreFile. Minor compactions do not drop (filter out) deletes or expired versions, because of potential side effects. See Compaction and Deletions and Compaction and Versions for information on how deletes and versions are handled in relation to compactions. The end result of a minor compaction is fewer, larger StoreFiles for a given Store.

较小的压缩通常会选择一小部分相邻的小型存储文件,并将它们重写为单个存储文件。较小的压缩不会删除(过滤掉)删除或过期的版本,因为潜在的副作用。请参阅Compaction和Deletions和Compaction和版本,了解如何处理删除和版本处理与Compaction的关系。一个小型压缩的最终结果是一个给定存储的更少、更大的存储文件。

The end result of a major compaction is a single StoreFile per Store. Major compactions also process delete markers and max versions. See Compaction and Deletions and Compaction and Versions for information on how deletes and versions are handled in relation to compactions.

主要压缩的最终结果是每个存储的单个StoreFile。主要的压缩也处理删除标记和最大版本。请参阅Compaction和Deletions和Compaction和版本,了解如何处理删除和版本处理与Compaction的关系。

Compaction and Deletions

When an explicit deletion occurs in HBase, the data is not actually deleted. Instead, a tombstone marker is written. The tombstone marker prevents the data from being returned with queries. During a major compaction, the data is actually deleted, and the tombstone marker is removed from the StoreFile. If the deletion happens because of an expired TTL, no tombstone is created. Instead, the expired data is filtered out and is not written back to the compacted StoreFile.

当在HBase中出现显式删除时,数据实际上不会被删除。取而代之的是一个墓碑上的标记。tombstone标记可以防止数据被查询返回。在一个主要的压缩过程中,数据实际上被删除了,并且tombstone标记从存储文件中删除。如果由于TTL过期而导致删除,则不会创建tombstone。相反,过期的数据被过滤掉,并没有被写回压缩的存储文件。
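
A short client-side sketch of such an explicit deletion; the names are the illustrative ones used earlier in this chapter, and an open Table named table is assumed:

Delete delete = new Delete(Bytes.toBytes("row1"));
delete.addColumns(Bytes.toBytes("cf"), Bytes.toBytes("attr1")); // writes a tombstone for all versions of cf:attr1
table.delete(delete);
// The cell is hidden from reads immediately, but its bytes stay in the StoreFiles
// until a major compaction removes both the data and the tombstone.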

Compaction and Versions

When you create a Column Family, you can specify the maximum number of versions to keep, by specifying HColumnDescriptor.setMaxVersions(int versions). The default value is 3. If more versions than the specified maximum exist, the excess versions are filtered out and not written back to the compacted StoreFile.

当您创建一个列族时,您可以通过指定HColumnDescriptor来指定要保存的版本的最大数量。setMaxVersions(int版本)。默认值是3。如果比指定的最大值存在更多的版本,多余的版本就会被过滤掉,而不会被写回压缩的存储文件中。
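
For illustration, the limit is set per column family, for example when creating or altering a table; the family name and value are placeholders:

HColumnDescriptor cf = new HColumnDescriptor(Bytes.toBytes("cf1"));
cf.setMaxVersions(1); // keep only the newest version; older versions are filtered out on major compaction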

Major Compactions Can Impact Query Results

In some situations, older versions can be inadvertently resurrected if a newer version is explicitly deleted. See Major compactions change query results for a more in-depth explanation. This situation is only possible before the compaction finishes.

在某些情况下,如果新版本被显式地删除,旧版本可能会在无意中恢复。请参阅主要的compaction更改查询结果以获得更深入的解释。只有在压缩完成之前,这种情况才可能发生。

In theory, major compactions improve performance. However, on a highly loaded system, major compactions can require an inappropriate number of resources and adversely affect performance. In a default configuration, major compactions are scheduled automatically to run once in a 7-day period. This is sometimes inappropriate for systems in production. You can manage major compactions manually. See Managed Compactions.

从理论上讲,主要的压缩可以提高性能。然而,在一个高负载的系统中,主要的压缩可能需要不适当的资源数量并对性能产生负面影响。在默认配置中,主要的compaction将自动在7天内运行一次。这有时不适用于生产系统。您可以手动管理主要的压缩。看到件管理。

Compactions do not perform region merges. See Merge for more information on region merging.

compaction不执行区域合并。有关区域合并的更多信息,请参见“合并”。

Compaction Policy - HBase 0.96.x and newer
压缩策略- HBase 0.96。x和更新

Compacting large StoreFiles, or too many StoreFiles at once, can cause more IO load than your cluster is able to handle without causing performance problems. The method by which HBase selects which StoreFiles to include in a compaction (and whether the compaction is a minor or major compaction) is called the compaction policy.

压缩大型存储文件,或者一次过多的存储文件,可能会导致比集群更大的IO负载,而不会导致性能问题。HBase选择在压缩中包含哪些存储文件的方法(以及压缩是小的还是主要的压缩)称为压缩策略。

Prior to HBase 0.96.x, there was only one compaction policy. That original compaction policy is still available as RatioBasedCompactionPolicy. The new compaction default policy, called ExploringCompactionPolicy, was subsequently backported to HBase 0.94 and HBase 0.95, and is the default in HBase 0.96 and newer. It was implemented in HBASE-7842. In short, ExploringCompactionPolicy attempts to select the best possible set of StoreFiles to compact with the least amount of work, while the RatioBasedCompactionPolicy selects the first set that meets the criteria.

HBase 0.96之前。x,只有一个压缩策略。原始的压缩策略仍然可用来作为比率压缩策略。新的compaction默认策略,称为ExploringCompactionPolicy,随后被反向移植到HBase 0.94和HBase 0.95,并且是HBase 0.96和更新的默认值。它是在HBASE-7842中实现的。简而言之,ExploringCompactionPolicy尝试选择最可能的一组存储文件,并使用最少的工作量,而RatioBasedCompactionPolicy则选择满足标准的第一个集合。

Regardless of the compaction policy used, file selection is controlled by several configurable parameters and happens in a multi-step approach. These parameters will be explained in context, and then will be given in a table which shows their descriptions, defaults, and implications of changing them.

不管使用了什么压缩策略,文件选择都由几个可配置参数控制,并且采用了多步骤的方法。这些参数将在上下文中进行解释,然后将在一个表中给出它们的描述、默认值和更改它们的含义。

Being Stuck
被卡住了

When the MemStore gets too large, it needs to flush its contents to a StoreFile. However, a Store can only have hbase.hstore.blockingStoreFiles files, so the MemStore needs to wait for the number of StoreFiles to be reduced by one or more compactions. However, if the MemStore grows larger than hbase.hregion.memstore.flush.size, it is not able to flush its contents to a StoreFile. If the MemStore is too large and the number of StoreFiles is also too high, the algorithm is said to be "stuck". The compaction algorithm checks for this "stuck" situation and provides mechanisms to alleviate it.

当MemStore变得太大时,它需要将其内容刷新到一个StoreFile中。然而,商店只能有hbase.hstore。blockingStoreFiles文件,因此,MemStore需要等待一个或多个压缩文件减少StoreFiles的数量。但是,如果MemStore大于hbase.hlocal .memstore.flush。大小,它不能将其内容刷新到一个StoreFile。如果MemStore太大,而存储文件的数量也太高,则该算法被认为是“卡住”的。压缩算法检查这种“卡住”的情况,并提供一些机制来减轻它。
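
A minimal, purely illustrative sketch of that condition follows (this is not HBase's implementation; the two thresholds stand in for hbase.hstore.blockingStoreFiles and hbase.hregion.memstore.flush.size, which the real code reads from configuration):

public class StuckCheckSketch {
    // Illustrative only: flushes are blocked by too many StoreFiles while the
    // MemStore has already grown past its flush size.
    static boolean isStuck(int storeFileCount, int blockingStoreFiles,
                           long memStoreSizeBytes, long flushSizeBytes) {
        return storeFileCount >= blockingStoreFiles && memStoreSizeBytes > flushSizeBytes;
    }

    public static void main(String[] args) {
        // Example: 12 StoreFiles with a blocking limit of 10, and a 256 MB MemStore
        // against a 128 MB flush size, would count as "stuck".
        System.out.println(isStuck(12, 10, 256L << 20, 128L << 20));
    }
}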

The ExploringCompactionPolicy Algorithm
ExploringCompactionPolicy算法

The ExploringCompactionPolicy algorithm considers each possible set of adjacent StoreFiles before choosing the set where compaction will have the most benefit.

探索compactionpolicy算法考虑每一种可能的相邻存储文件集,然后选择最有利于压缩的集合。

One situation where the ExploringCompactionPolicy works especially well is when you are bulk-loading data and the bulk loads create larger StoreFiles than the StoreFiles which are holding data older than the bulk-loaded data. This can "trick" HBase into choosing to perform a major compaction each time a compaction is needed, and cause a lot of extra overhead. With the ExploringCompactionPolicy, major compactions happen much less frequently because minor compactions are more efficient.

在一个情况下,当您是批量加载数据和批量加载时,您将创建较大的存储,而不是存储大于装载数据的存储数据的存储文件。这可以“欺骗”HBase在每次需要压缩时选择执行主要的压缩,并造成大量额外开销。在探索压缩策略中,主要的压缩发生的频率要低得多,因为较小的压缩更有效。

In general, ExploringCompactionPolicy is the right choice for most situations, and thus is the default compaction policy. You can also use ExploringCompactionPolicy along with Experimental: Stripe Compactions.

总的来说,在大多数情况下,探索compactionpolicy是正确的选择,因此是默认的压缩策略。您还可以使用ExploringCompactionPolicy和实验性的:Stripe compaction。

The logic of this policy can be examined in hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/ExploringCompactionPolicy.java. The following is a walk-through of the logic of the ExploringCompactionPolicy.

该策略的逻辑可以在hbase-server/src/main/java/ org/apache/hadoop/hbase/localserver/compactions/exploringcompactionpolicy.java中进行检查。下面是对探索压缩策略的逻辑的演练。

  1. Make a list of all existing StoreFiles in the Store. The rest of the algorithm filters this list to come up with the subset of HFiles which will be chosen for compaction.

    列出存储中所有现有的存储文件。该算法的其余部分将筛选这个列表,以生成将被选择为compaction的HFiles子集。

  2. If this was a user-requested compaction, attempt to perform the requested compaction type, regardless of what would normally be chosen. Note that even if the user requests a major compaction, it may not be possible to perform a major compaction. This may be because not all StoreFiles in the Column Family are available to compact or because there are too many Stores in the Column Family.

    如果这是用户请求的压缩,则尝试执行所请求的压缩类型,而不考虑通常会选择什么。请注意,即使用户请求主要的压缩,也可能无法执行主要的压缩。这可能是因为在列家族中并不是所有的存储文件都可以压缩,或者因为列家族中有太多的存储。

  3. Some StoreFiles are automatically excluded from consideration. These include:

    一些存储文件被自动排除在考虑之外。这些包括:

    • StoreFiles that are larger than hbase.hstore.compaction.max.size

      大于hbase.hstore.compaction.max.size的StoreFiles。

    • StoreFiles that were created by a bulk-load operation which explicitly excluded compaction. You may decide to exclude StoreFiles resulting from bulk loads, from compaction. To do this, specify the hbase.mapreduce.hfileoutputformat.compaction.exclude parameter during the bulk load operation.

      由一个批量加载操作创建的存储文件,它明确地排除了压缩。您可以决定不包含从压缩中产生的批量加载的存储文件。要做到这一点,请指定hbase.mapreduce.hfileoutputformat.compaction.exclude参数在批量加载操作期间。

  4. Iterate through the list from step 1, and make a list of all potential sets of StoreFiles to compact together. A potential set is a grouping of hbase.hstore.compaction.min contiguous StoreFiles in the list. For each set, perform some sanity-checking and figure out whether this is the best compaction that could be done:

    遍历步骤1中的列表,并将所有可能的StoreFiles集列表组合在一起。一个潜在的集合是一个hbase.hstore.comaction.min的列表中的连续存储文件。对于每一组,执行一些检查,并弄清楚这是否是最好的压缩方法:

    • If the number of StoreFiles in this set (not the size of the StoreFiles) is fewer than hbase.hstore.compaction.min or more than hbase.hstore.compaction.max, take it out of consideration.

      如果这个集合中的StoreFiles的数量(而不是StoreFiles的大小)小于hbase. hstore.comaction. min或多于hbase.hstore.hstore.comaction.max,那么就把它考虑在内。

    • Compare the size of this set of StoreFiles with the size of the smallest possible compaction that has been found in the list so far. If the size of this set of StoreFiles represents the smallest compaction that could be done, store it to be used as a fall-back if the algorithm is "stuck" and no StoreFiles would otherwise be chosen. See Being Stuck.

      将这组StoreFiles的大小与列表中所发现的最小可能压缩的大小进行比较。如果这一组存储文件的大小表示可以完成的最小压缩,那么如果该算法被“阻塞”,并且不选择存储文件,则将其存储为回滚。看到被卡住了。

    • Do size-based sanity checks against each StoreFile in this set of StoreFiles.

      对这组存储文件中的每个StoreFile进行基于大小的完整性检查。

      • If the size of this StoreFile is larger than hbase.hstore.compaction.max.size, take it out of consideration.

        如果这个StoreFile的大小大于hbase.hstore.comaction.max.size,那么就不要考虑它。

      • If the size is greater than or equal to hbase.hstore.compaction.min.size, sanity-check it against the file-based ratio to see whether it is too large to be considered.

        如果size大于或等于hbase.hstore.compaction.min.size,则根据基于文件的比率检查它是否太大,不能考虑。

        The sanity-checking is successful if:

        三度检查是成功的:

      • There is only one StoreFile in this set, or

        在这个集合中只有一个StoreFile,或者。

      • For each StoreFile, its size multiplied by hbase.hstore.compaction.ratio (or hbase.hstore.compaction.ratio.offpeak if off-peak hours are configured and it is during off-peak hours) is less than the sum of the sizes of the other HFiles in the set.

        对于每个StoreFile,它的大小乘以hbase.hstore.compaction.ratio(或hbase.hstore.compaction.ratio.offpeak,如果在非高峰时间配置,则在非高峰时间)小于该集合中其他HFiles的大小之和。

  5. If this set of StoreFiles is still in consideration, compare it to the previously-selected best compaction. If it is better, replace the previously-selected best compaction with this one.

    如果这组存储文件仍在考虑中,将其与预先选择的最佳压缩文件进行比较。如果是更好的,用这个替换之前选择的最佳压缩。

  6. When the entire list of potential compactions has been processed, perform the best compaction that was found. If no StoreFiles were selected for compaction, but there are multiple StoreFiles, assume the algorithm is stuck (see Being Stuck) and if so, perform the smallest compaction that was found in step 3.

    在处理所有可能的压缩操作列表时,执行所找到的最佳压缩。如果没有为compaction选择StoreFiles,但是有多个StoreFiles,假设算法被卡住(见被卡住),如果是这样,则执行步骤3中发现的最小压缩。

RatioBasedCompactionPolicy Algorithm
RatioBasedCompactionPolicy算法

The RatioBasedCompactionPolicy was the only compaction policy prior to HBase 0.96, though ExploringCompactionPolicy has now been backported to HBase 0.94 and 0.95. To use the RatioBasedCompactionPolicy rather than the ExploringCompactionPolicy, set hbase.hstore.defaultengine.compactionpolicy.class to RatioBasedCompactionPolicy in the hbase-site.xml file. To switch back to the ExploringCompactionPolicy, remove the setting from the hbase-site.xml.

在HBase 0.96之前,比值压缩策略是唯一的压缩策略,不过,ExploringCompactionPolicy现在已经被反向移植到HBase 0.94和0.95。在hbase-site中,使用RatioBasedCompactionPolicy,而不是ExploringCompactionPolicy,设置hbase.hstore.defaultengine.compactionpolicy.class到RatioBasedCompactionPolicy。xml文件。要切换回ExploringCompactionPolicy,请从hbase-site.xml中删除设置。
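
As an alternative to the cluster-wide hbase-site.xml setting, the same property can also be overridden for a single table through its descriptor. The sketch below is illustrative only; the table name is hypothetical, and the fully-qualified class name is used because some versions will not resolve the short name quoted above:

import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;

public class PerTableCompactionPolicySketch {
    public static void main(String[] args) {
        HTableDescriptor table = new HTableDescriptor(TableName.valueOf("orders_table"));
        // Table-scoped override of the compaction policy (sketch only).
        table.setConfiguration("hbase.hstore.defaultengine.compactionpolicy.class",
            "org.apache.hadoop.hbase.regionserver.compactions.RatioBasedCompactionPolicy");
        // Apply with Admin.modifyTable(table.getTableName(), table); remove the
        // setting again to fall back to the default ExploringCompactionPolicy.
    }
}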

The following section walks you through the algorithm used to select StoreFiles for compaction in the RatioBasedCompactionPolicy.

下面的小节将介绍用于在RatioBasedCompactionPolicy中为compaction选择StoreFiles的算法。

  1. The first phase is to create a list of all candidates for compaction. A list is created of all StoreFiles not already in the compaction queue, and all StoreFiles newer than the newest file that is currently being compacted. This list of StoreFiles is ordered by the sequence ID. The sequence ID is generated when a Put is appended to the write-ahead log (WAL), and is stored in the metadata of the HFile.

    第一个阶段是创建一个所有候选压缩的列表。一个列表是创建在压缩队列中的所有存储文件,并且所有的存储文件都比当前压缩的最新文件更新。这个StoreFiles列表是由序列ID排序的。当一个Put被追加到write-ahead日志(WAL)时,将生成序列ID,并存储在HFile的元数据中。

  2. Check to see if the algorithm is stuck (see Being Stuck), and if so, force a major compaction. This is a key area where The ExploringCompactionPolicy Algorithm is often a better choice than the RatioBasedCompactionPolicy.

    检查算法是否被卡住了(看是否被卡住了,如果是这样的话,主要的压缩是被迫的)。这是一个关键的领域,在这个领域中,探索compactionpolicy算法通常是一个更好的选择,而不是比值压缩策略。

  3. If the compaction was user-requested, try to perform the type of compaction that was requested. Note that a major compaction may not be possible if all HFiles are not available for compaction or if too many StoreFiles exist (more than hbase.hstore.compaction.max).

    如果compaction是用户请求的,请尝试执行所请求的压缩类型。请注意,如果所有HFiles都不能用于压缩,或者存在太多的存储文件(超过hbase. hstore.hstore.compaction .max),那么可能不可能出现大的压缩。

  4. Some StoreFiles are automatically excluded from consideration. These include:

    一些存储文件被自动排除在考虑之外。这些包括:

    • StoreFiles that are larger than hbase.hstore.compaction.max.size

      大于hbase.hstore.compaction.max.size的StoreFiles。

    • StoreFiles that were created by a bulk-load operation which explicitly excluded compaction. You may decide to exclude StoreFiles resulting from bulk loads, from compaction. To do this, specify the hbase.mapreduce.hfileoutputformat.compaction.exclude parameter during the bulk load operation.

      由一个批量加载操作创建的存储文件,它明确地排除了压缩。您可以决定不包含从压缩中产生的批量加载的存储文件。要做到这一点,请指定hbase.mapreduce.hfileoutputformat.compaction.exclude参数在批量加载操作期间。

  5. The maximum number of StoreFiles allowed in a major compaction is controlled by the hbase.hstore.compaction.max parameter. If the list contains more than this number of StoreFiles, a minor compaction is performed even if a major compaction would otherwise have been done. However, a user-requested major compaction still occurs even if there are more than hbase.hstore.compaction.max StoreFiles to compact.

    大型压缩包中允许的最大存储文件数由hbase.hstore.compaction.max参数控制。如果列表中包含的存储文件数量超过了这个数量,那么即使一个主要的压缩程序已经完成,也会执行一个较小的压缩。然而,即使有超过hbase. hstore.comaction.max StoreFiles的压缩,用户请求的主压缩仍然会发生。

  6. If the list contains fewer than hbase.hstore.compaction.min StoreFiles to compact, a minor compaction is aborted. Note that a major compaction can be performed on a single HFile. Its function is to remove deletes and expired versions, and reset locality on the StoreFile.

    如果列表中包含的小于hbase.hstore.comaction.min StoreFiles到compact,一个较小的压缩就会被终止。注意,可以在单个HFile上执行主要的压缩。它的功能是删除删除和过期的版本,并在StoreFile上重置位置。

  7. The value of the hbase.hstore.compaction.ratio parameter is multiplied by the sum of StoreFiles smaller than a given file, to determine whether that StoreFile is selected for compaction during a minor compaction. For instance, if hbase.hstore.compaction.ratio is 1.2, FileX is 5MB, FileY is 2MB, and FileZ is 3MB:

    比参数的值乘以小于给定文件的存储文件的总和,以确定在较小的压缩过程中是否选择了这个StoreFile来进行压缩。例如,如果hbase.hstore.compaction.ratio为1.2,FileX为5MB, FileY为2MB, FileZ为3MB:

    5 <= 1.2 x (2 + 3)            or            5 <= 6

    In this scenario, FileX is eligible for minor compaction. If FileX were 7MB, it would not be eligible for minor compaction. This ratio favors smaller StoreFiles. You can configure a different ratio for use in off-peak hours, using the parameter hbase.hstore.compaction.ratio.offpeak, if you also configure hbase.offpeak.start.hour and hbase.offpeak.end.hour.

    在这种情况下,FileX可以满足较小的压缩。如果FileX是7MB,那么它就不符合小型压缩的条件。这个比率有利于较小的存储文件。您可以配置一个不同的比例,以便在非高峰时间使用,使用参数hbase.hstore.compaction.offpeak,如果您也配置了hbase.offpeak.start。小时,hbase.offpeak.end.hour。

  8. If the last major compaction was too long ago and there is more than one StoreFile to be compacted, a major compaction is run, even if it would otherwise have been minor. By default, the maximum time between major compactions is 7 days, plus or minus a 4.8 hour period, and determined randomly within those parameters. Prior to HBase 0.96, the major compaction period was 24 hours. See hbase.hregion.majorcompaction in the table below to tune or disable time-based major compactions.

    如果最后一个主要的压实时间太长,并且有多个StoreFile被压缩,那么就会运行一个主要的compaction,即使它本来是次要的。在默认情况下,主压缩的最大时间为7天,正负4.8小时,并在这些参数中随机确定。在HBase 0.96之前,主要的压实周期为24小时。看到hbase.hregion。在下面的表中主要的压缩以调整或禁用基于时间的主要压缩。

Parameters Used by Compaction Algorithm
压缩算法使用的参数。

This table contains the main configuration parameters for compaction. This list is not exhaustive. To tune these parameters from the defaults, edit the hbase-default.xml file. For a full list of all configuration parameters available, see config.files

这个表包含了压缩的主要配置参数。这个列表不是详尽的。要从默认值中调优这些参数,可以编辑hbase-default。xml文件。有关所有配置参数的完整列表,请参见config.files。

hbase.hstore.compaction.min

The minimum number of StoreFiles which must be eligible for compaction before compaction can run. The goal of tuning hbase.hstore.compaction.min is to avoid ending up with too many tiny StoreFiles to compact. Setting this value to 2 would cause a minor compaction each time you have two StoreFiles in a Store, and this is probably not appropriate. If you set this value too high, all the other values will need to be adjusted accordingly. For most cases, the default value is appropriate. In previous versions of HBase, the parameter hbase.hstore.compaction.min was called hbase.hstore.compactionThreshold.

在compaction运行之前,必须符合压缩条件的最小存储文件数。优化hbase.hstore.compaction.min的目的是避免使用太多的小存储文件来压缩。将此值设置为2会导致每次在存储中有两个storefile时都会产生一个小的压缩,而这可能是不合适的。如果将此值设置得过高,则需要相应地调整所有其他值。对于大多数情况,默认值是合适的。在HBase的以前版本中,HBase .hstore.compaction.min被称为hbase.hstore.compactionThreshold。

Default: 3

默认值:3

hbase.hstore.compaction.max

The maximum number of StoreFiles which will be selected for a single minor compaction, regardless of the number of eligible StoreFiles. Effectively, the value of hbase.hstore.compaction.max controls the length of time it takes a single compaction to complete. Setting it larger means that more StoreFiles are included in a compaction. For most cases, the default value is appropriate.

将为单个小型压缩而选择的存储文件的最大数量,而不考虑合格的存储文件的数量。实际上,hbase.hstore.compaction.max控制了单个压缩完成的时间长度。更大的设置意味着在压缩中包含更多的存储文件。对于大多数情况,默认值是合适的。

Default: 10

默认值:10

hbase.hstore.compaction.min.size

A StoreFile smaller than this size will always be eligible for minor compaction. StoreFiles this size or larger are evaluated by hbase.hstore.compaction.ratio to determine if they are eligible. Because this limit represents the "automatic include" limit for all StoreFiles smaller than this value, this value may need to be reduced in write-heavy environments where many files in the 1-2 MB range are being flushed, because every StoreFile will be targeted for compaction and the resulting StoreFiles may still be under the minimum size and require further compaction. If this parameter is lowered, the ratio check is triggered more quickly. This addressed some issues seen in earlier versions of HBase but changing this parameter is no longer necessary in most situations.

小于此大小的存储文件将始终符合小型压缩的条件。这个大小的存储文件由hbase.hstore.compaction.ratio来评估,以确定它们是否符合条件。因为这个限制代表了“自动包括“限制StoreFiles小于这个值,这个值可能需要减少write-heavy环境中许多文件1 - 2 MB的范围被刷新,因为每个StoreFile将针对压实和由此产生的StoreFiles仍可能受到最小的大小和需要进一步压实。如果这个参数被降低,比率检查被触发的更快。这解决了在早期版本的HBase中所看到的一些问题,但是在大多数情况下更改此参数不再是必要的。

Default: 128 MB

默认值:128 MB

hbase.hstore.compaction.max.size

A StoreFile larger than this size will be excluded from compaction. The effect of raising hbase.hstore.compaction.max.size is fewer, larger StoreFiles that do not get compacted often. If you feel that compaction is happening too often without much benefit, you can try raising this value.

大于这个大小的StoreFile将被排除在compaction之外。增大hbase.hstore.compaction.max.size的效果更小,更大的存储文件不经常被压缩。如果你觉得压实经常发生,没有太多好处,你可以试着提高这个值。

Default: Long.MAX_VALUE

默认值:Long.MAX_VALUE

hbase.hstore.compaction.ratio

For minor compaction, this ratio is used to determine whether a given StoreFile which is larger than hbase.hstore.compaction.min.size is eligible for compaction. Its effect is to limit compaction of large StoreFile. The value of hbase.hstore.compaction.ratio is expressed as a floating-point decimal.

对于较小的压缩,此比率用于确定一个给定的存储文件是否大于hbase.hstore.hstore.comaction .min.size具有压缩的条件。它的作用是限制大型存储文件的压缩。hbase.hstore.compaction.ratio被表示为浮点小数。

  • A large ratio, such as 10, will produce a single giant StoreFile. Conversely, a value of .25 will produce behavior similar to the BigTable compaction algorithm, producing four StoreFiles.

    一个大的比率,例如10,将产生一个单一的巨大的存储文件。相反,如果值为.25,则会产生类似于BigTable compaction算法的行为,生成4个StoreFiles。

  • A moderate value of between 1.0 and 1.4 is recommended. When tuning this value, you are balancing write costs with read costs. Raising the value (to something like 1.4) will have more write costs, because you will compact larger StoreFiles. However, during reads, HBase will need to seek through fewer StoreFiles to accomplish the read. Consider this approach if you cannot take advantage of Bloom Filters.

    建议在1.0和1.4之间有一个适中的值。在调优这个值时,您需要平衡写成本和读取成本。提高值(大约1.4)会有更多的写成本,因为你会压缩更大的存储文件。但是,在读取过程中,HBase需要通过更少的存储文件来完成读取操作。如果不能利用Bloom过滤器,请考虑这种方法。

  • Alternatively, you can lower this value to something like 1.0 to reduce the background cost of writes, and use Bloom Filters to limit the number of StoreFiles touched during reads. For most cases, the default value is appropriate.

    或者,您可以将这个值降低到类似于1.0这样的东西,以降低写的背景成本,并用于限制在读取过程中被触摸的存储文件的数量。对于大多数情况,默认值是合适的。

    Default: 1.2F

    默认值:1.2度

hbase.hstore.compaction.ratio.offpeak

The compaction ratio used during off-peak compactions, if off-peak hours are also configured (see below). Expressed as a floating-point decimal. This allows for more aggressive (or less aggressive, if you set it lower than hbase.hstore.compaction.ratio) compaction during a set time period. Ignored if off-peak is disabled (default). This works the same as hbase.hstore.compaction.ratio.

在非高峰压实时使用的压实比,如果非高峰时间也被配置(见下文)。以浮点小数形式表示。这允许更积极的(或更少的攻击,如果您设置它低于hbase.hstore.comaction .ratio)在一个设置的时间段内的压缩。如果off-peak被禁用(默认),则忽略它。这与hbase.hstore.compaction.ratio相同。

Default: 5.0F

默认值:5.0度

hbase.offpeak.start.hour

The start of off-peak hours, expressed as an integer between 0 and 23, inclusive. Set to -1 to disable off-peak.

非高峰时间的开始,表示为0到23之间的整数,包括。设置为-1,以禁用非峰值。

Default: -1 (disabled)

默认值:1(禁用)

hbase.offpeak.end.hour

The end of off-peak hours, expressed as an integer between 0 and 23, inclusive. Set to -1 to disable off-peak.

非高峰时间的结束,表示为0到23之间的整数,包括。设置为-1,以禁用非峰值。

Default: -1 (disabled)

默认值:1(禁用)

hbase.regionserver.thread.compaction.throttle

There are two different thread pools for compactions, one for large compactions and the other for small compactions. This helps to keep compaction of lean tables (such as hbase:meta) fast. If a compaction is larger than this threshold, it goes into the large compaction pool. In most cases, the default value is appropriate.

compaction有两个不同的线程池,一个用于大的压缩,另一个用于小的压缩。这有助于保持瘦表(如hbase:meta)的紧凑。如果一个压缩比这个阈值大,它就会进入大的压缩池。在大多数情况下,默认值是适当的。

Default: 2 x hbase.hstore.compaction.max x hbase.hregion.memstore.flush.size (which defaults to 128)

默认值:2 x hbase. hstore.m。hbase。大小(默认为128)

hbase.hregion.majorcompaction

Time between major compactions, expressed in milliseconds. Set to 0 to disable time-based automatic major compactions. User-requested and size-based major compactions will still run. This value is multiplied by hbase.hregion.majorcompaction.jitter to cause compaction to start at a somewhat-random time during a given window of time.

主要压实之间的时间,以毫秒表示。设置为0,以禁用基于时间的自动主要压缩。用户请求和基于大小的主要压缩仍然会运行。这个值乘以hbase。hbase。在给定时间窗口内的某个随机时间,抖动导致压缩。

Default: 7 days (604800000 milliseconds)

默认:7天(604800000毫秒)

hbase.hregion.majorcompaction.jitter

A multiplier applied to hbase.hregion.majorcompaction to cause compaction to occur a given amount of time either side of hbase.hregion.majorcompaction. The smaller the number, the closer the compactions will happen to the hbase.hregion.majorcompaction interval. Expressed as a floating-point decimal.

一个用于hbase.hregion的乘数。主要压实作用,使压实发生在某一给定的时间内,即hbase.h。数字越小,hbase.hregion就会越紧密。majorcompaction区间。以浮点小数形式表示。

Default: .50F

默认值:.50F

Compaction File Selection
压缩文件选择
Legacy Information

This section has been preserved for historical reasons and refers to the way compaction worked prior to HBase 0.96.x. You can still use this behavior if you enable RatioBasedCompactionPolicy Algorithm. For information on the way that compactions work in HBase 0.96.x and later, see Compaction.

这一节由于历史原因被保留,是指在HBase 0.96.x之前的压缩方法。如果启用了RatioBasedCompactionPolicy算法,您仍然可以使用该行为。有关在HBase 0.96中压缩工作方式的信息。x和之后,参见压缩。

To understand the core algorithm for StoreFile selection, there is some ASCII-art in the Store source code that will serve as useful reference.

为了理解StoreFile选择的核心算法,在存储源代码中有一些ASCII-art作为有用的参考。

It has been copied below:

它已被复制如下:

/* normal skew:
 *
 *         older ----> newer
 *     _
 *    | |   _
 *    | |  | |   _
 *  --|-|- |-|- |-|---_-------_-------  minCompactSize
 *    | |  | |  | |  | |  _  | |
 *    | |  | |  | |  | | | | | |
 *    | |  | |  | |  | | | | | |
 */
Important knobs:
  • hbase.hstore.compaction.ratio Ratio used in compaction file selection algorithm (default 1.2f).

    在压缩文件选择算法中使用的比率比率(默认为1.2f)。

  • hbase.hstore.compaction.min (in HBase v 0.90 this is called hbase.hstore.compactionThreshold) (files) Minimum number of StoreFiles per Store to be selected for a compaction to occur (default 2).

    min(在HBase v0.90中,这被称为HBase . hstore.compactionthreshold)(文件)每个存储库的最小存储文件数量,以进行压缩(默认为2)。

  • hbase.hstore.compaction.max (files) Maximum number of StoreFiles to compact per minor compaction (default 10).

    compaction.max(文件)每个小型压缩包的最大存储文件数(默认为10)。

  • hbase.hstore.compaction.min.size (bytes) Any StoreFile smaller than this setting will automatically be a candidate for compaction. Defaults to hbase.hregion.memstore.flush.size (128 MB).

    大小(字节)任何小于此设置的存储文件,自动成为压实的候选。默认为hbase.hregion.memstore.flush。大小(128 mb)。

  • hbase.hstore.compaction.max.size (.92) (bytes) Any StoreFile larger than this setting will automatically be excluded from compaction (default Long.MAX_VALUE).

    max.size(.92)(字节)任何大于此设置的存储文件,自动被排除在compaction(默认的Long.MAX_VALUE)中。

The minor compaction StoreFile selection logic is size based, and selects a file for compaction when file <= sum(smaller_files) * hbase.hstore.compaction.ratio.

小压实StoreFile选择逻辑是基于大小,并选择一个文件时压缩文件(smaller_files)* hbase.hstore.compaction.ratio⇐数目。
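
The following is a simplified, illustrative sketch of that size-based rule; it ignores the min.size/max.size checks and bulk-load exclusions, and is not the actual Store implementation, but it reproduces the worked examples below:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MinorCompactionSelectionSketch {
    // sizes are ordered oldest to newest, in bytes. A file is selected when
    // file <= sum(newer, typically smaller, files) * ratio; once a file is selected,
    // newer files are included as well, up to maxFiles. Fewer than minFiles means no compaction.
    static List<Long> select(List<Long> sizes, double ratio, int minFiles, int maxFiles) {
        List<Long> selected = new ArrayList<>();
        for (int i = 0; i < sizes.size() && selected.size() < maxFiles; i++) {
            long sumNewer = 0;
            for (int j = i + 1; j < sizes.size(); j++) {
                sumNewer += sizes.get(j);
            }
            if (!selected.isEmpty() || sizes.get(i) <= sumNewer * ratio) {
                selected.add(sizes.get(i));
            }
        }
        return selected.size() >= minFiles ? selected : new ArrayList<Long>();
    }

    public static void main(String[] args) {
        // Example #1 below: 100, 50, 23, 12, 12 with ratio 1.0 selects [23, 12, 12].
        System.out.println(select(Arrays.asList(100L, 50L, 23L, 12L, 12L), 1.0, 3, 5));
        // Example #2 below: 100, 25, 12, 12 selects nothing (only 2 candidates, min is 3).
        System.out.println(select(Arrays.asList(100L, 25L, 12L, 12L), 1.0, 3, 5));
    }
}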

Minor Compaction File Selection - Example #1 (Basic Example)
小型压缩文件选择-例#1(基本示例)

This example mirrors an example from the unit test TestCompactSelection.

这个例子反映了单元测试TestCompactSelection的一个示例。

  • hbase.hstore.compaction.ratio = 1.0f

    hbase.hstore.compaction.ratio f = 1.0

  • hbase.hstore.compaction.min = 3 (files)

    hbase.hstore.compaction.min = 3(文件)

  • hbase.hstore.compaction.max = 5 (files)

    hbase.hstore.compaction.max = 5(文件)

  • hbase.hstore.compaction.min.size = 10 (bytes)

    hbase.hstore.compaction.min.size = 10(字节)

  • hbase.hstore.compaction.max.size = 1000 (bytes)

    hbase.hstore.compaction.max.size = 1000(字节)

The following StoreFiles exist: 100, 50, 23, 12, and 12 bytes apiece (oldest to newest). With the above parameters, the files that would be selected for minor compaction are 23, 12, and 12.

下面的存储文件有:100、50、23、12和12字节(最老的是最新的)。使用上述参数,将为小型压缩选择的文件是23、12和12。

Why?

为什么?

  • 100 → No, because sum(50, 23, 12, 12) * 1.0 = 97.

    不,因为和(50,23,12,12)* 1.0 = 97。

  • 50 → No, because sum(23, 12, 12) * 1.0 = 47.

    不,因为sum(23, 12, 12) * 1.0 = 47。

  • 23 → Yes, because sum(12, 12) * 1.0 = 24.

    是的,因为sum(12,12) * 1.0 = 24。

  • 12 → Yes, because the previous file has been included, and because this does not exceed the max-file limit of 5.

    12→是的,因为前面的文件已经包含,因为这的max-file限制不超过5

  • 12 → Yes, because the previous file had been included, and because this does not exceed the max-file limit of 5.

    是的,因为前面的文件已经被包含了,而且因为这个文件的最大值不超过5。

Minor Compaction File Selection - Example #2 (Not Enough Files ToCompact)
小型压缩文件选择-例#2(没有足够的文件)

This example mirrors an example from the unit test TestCompactSelection.

这个例子反映了单元测试TestCompactSelection的一个示例。

  • hbase.hstore.compaction.ratio = 1.0f

    hbase.hstore.compaction.ratio f = 1.0

  • hbase.hstore.compaction.min = 3 (files)

    hbase.hstore.compaction.min = 3(文件)

  • hbase.hstore.compaction.max = 5 (files)

    hbase.hstore.compaction.max = 5(文件)

  • hbase.hstore.compaction.min.size = 10 (bytes)

    hbase.hstore.compaction.min.size = 10(字节)

  • hbase.hstore.compaction.max.size = 1000 (bytes)

    hbase.hstore.compaction.max.size = 1000(字节)

The following StoreFiles exist: 100, 25, 12, and 12 bytes apiece (oldest to newest). With the above parameters, no compaction will be started.

下面的StoreFiles存在:100、25、12和12字节(最老的是最新的)。使用上述参数,将不会启动压缩。

Why?

为什么?

  • 100 → No, because sum(25, 12, 12) * 1.0 = 47

    100→不,因为金额(12)25日,12日* 1.0 = 47岁

  • 25 → No, because sum(12, 12) * 1.0 = 24

    25→不,因为金额(12日12)* 1.0 = 24

  • 12 → No. Candidate because sum(12) * 1.0 = 12, there are only 2 files to compact and that is less than the threshold of 3

    12号→因为sum(12) * 1.0 = 12,只有2个文件压缩,小于3的阈值。

  • 12 → No. Candidate because the previous StoreFile was, but there are not enough files to compact

    12号→因为之前的StoreFile是,但是没有足够的文件来压缩。

Minor Compaction File Selection - Example #3 (Limiting Files To Compact)
小型压实文件选择-例#3(限制文件压缩)

This example mirrors an example from the unit test TestCompactSelection.

这个例子反映了单元测试TestCompactSelection的一个示例。

  • hbase.hstore.compaction.ratio = 1.0f

    hbase.hstore.compaction.ratio f = 1.0

  • hbase.hstore.compaction.min = 3 (files)

    hbase.hstore.compaction.min = 3(文件)

  • hbase.hstore.compaction.max = 5 (files)

    hbase.hstore.compaction.max = 5(文件)

  • hbase.hstore.compaction.min.size = 10 (bytes)

    hbase.hstore.compaction.min.size = 10(字节)

  • hbase.hstore.compaction.max.size = 1000 (bytes)

    hbase.hstore.compaction.max.size = 1000(字节)

The following StoreFiles exist: 7, 6, 5, 4, 3, 2, and 1 bytes apiece (oldest to newest). With the above parameters, the files that would be selected for minor compaction are 7, 6, 5, 4, 3.

下面的存储文件有:7、6、5、4、3、2和1个字节(最古老的是最新的)。使用上述参数,将被选择的小型压缩文件的文件为7、6、5、4、3。

Why?

为什么?

  • 7 → Yes, because sum(6, 5, 4, 3, 2, 1) * 1.0 = 21. Also, 7 is less than the min-size

    7→是的,因为金额(6、5、4、3、2、1)* 1.0 = 21。另外,7比最小尺寸小。

  • 6 → Yes, because sum(5, 4, 3, 2, 1) * 1.0 = 15. Also, 6 is less than the min-size.

    6,是的,因为和(5,4,3,2,1)* 1.0 = 15。另外,6比最小尺寸小。

  • 5 → Yes, because sum(4, 3, 2, 1) * 1.0 = 10. Also, 5 is less than the min-size.

    5→是的,因为金额(4、3、2、1)* 1.0 = 10。另外,5比最小尺寸小。

  • 4 → Yes, because sum(3, 2, 1) * 1.0 = 6. Also, 4 is less than the min-size.

    4→是的,因为金额(3,2,1)* 1.0 = 6。另外,4比最小尺寸小。

  • 3 → Yes, because sum(2, 1) * 1.0 = 3. Also, 3 is less than the min-size.

    3→是的,因为金额(2,1)* 1.0 = 3。另外,3比最小尺寸小。

  • 2 → No. Candidate because previous file was selected and 2 is less than the min-size, but the max-number of files to compact has been reached.

    2→没有。候选人因为之前的文件被选中,2比最小的要小,但是已经达到了文件的max数量。

  • 1 → No. Candidate because previous file was selected and 1 is less than the min-size, but max-number of files to compact has been reached.

    1→没有。候选人因为之前的文件被选中,而1比最小的要少,但是已经达到了文件的max数量。

Impact of Key Configuration Options
This information is now included in the configuration parameter table in Parameters Used by Compaction Algorithm.
Date Tiered Compaction
日期分层压实

Date tiered compaction is a date-aware store file compaction strategy that is beneficial for time-range scans for time-series data.

日期分级压缩是一种数据感知存储文件压缩策略,它有利于时间序列数据的时间范围扫描。

When To Use Date Tiered Compactions
何时使用日期分级压缩。

Consider using Date Tiered Compaction for reads for limited time ranges, especially scans of recent data

考虑使用日期分级压实来读取有限的时间范围,特别是对最近数据的扫描。

Don’t use it for

不要使用它

  • random gets without a limited time range

    随机得到的时间范围是有限的。

  • frequent deletes and updates

    频繁的删除和更新

  • Frequent out of order data writes creating long tails, especially writes with future timestamps

    频繁的顺序数据写入创建长尾,特别是在未来的时间戳中写入。

  • frequent bulk loads with heavily overlapping time ranges

    频繁的批量加载和重叠的时间范围。

Performance Improvements

Performance testing has shown that the performance of time-range scans improve greatly for limited time ranges, especially scans of recent data.

性能测试表明,时间范围扫描的性能大大提高了有限的时间范围,特别是对最近数据的扫描。

Enabling Date Tiered Compaction
启用日期分层压实

You can enable Date Tiered compaction for a table or a column family, by setting its hbase.hstore.engine.class to org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine.

您可以通过设置其hbase.hstore.engine,为表或列家族启用日期分级压缩。类org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine。

You also need to set hbase.hstore.blockingStoreFiles to a high number, such as 60 (rather than the default value of 12) if you are using all default settings. If you change the parameters, use 1.5 to 2 times the projected file count, where projected file count = windows per tier x tier count + incoming window min + files older than max age

您还需要设置hbase.hstore。如果使用所有默认设置,则将storefiles设置为一个较高的数字(如60),而不是默认值为12。使用1.5~2 x的投影文件计数,如果改变参数,投影文件计数= windows每层x层计数+传入窗口最小+文件大于最大年龄。
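
As a rough sketch of that sizing arithmetic (windows per tier and incoming window min below use the defaults from the table that follows; the tier count and the number of files older than max age are hypothetical values for this sketch):

public class DateTieredFileCountSketch {
    public static void main(String[] args) {
        int windowsPerTier = 4;       // default, see hbase.hstore.compaction.date.tiered.windows.per.tier
        int incomingWindowMin = 6;    // default, see hbase.hstore.compaction.date.tiered.incoming.window.min
        int tierCount = 3;            // hypothetical value for this sketch
        int filesOlderThanMaxAge = 4; // hypothetical value for this sketch

        int projected = windowsPerTier * tierCount + incomingWindowMin + filesOlderThanMaxAge;
        // hbase.hstore.blockingStoreFiles should be roughly 1.5x to 2x the projected count.
        System.out.println("projected file count = " + projected
            + ", blockingStoreFiles ~ " + Math.round(1.5 * projected) + " to " + (2 * projected));
    }
}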

You also need to set hbase.hstore.compaction.max to the same value as hbase.hstore.blockingStoreFiles to unblock major compaction.

您还需要设置hbase.hstore.compaction.max与hbase.hstore相同的值。blockingStoreFiles以打开主要的压缩。

Procedure: Enable Date Tiered Compaction
  1. Run one of following commands in the HBase shell. Replace the table name orders_table with the name of your table.

    在HBase shell中运行以下命令之一。将表名orders_table替换为表的名称。

    alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine', 'hbase.hstore.blockingStoreFiles' => '60', 'hbase.hstore.compaction.min'=>'2', 'hbase.hstore.compaction.max'=>'60'}
    alter 'orders_table', {NAME => 'blobs_cf', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine', 'hbase.hstore.blockingStoreFiles' => '60', 'hbase.hstore.compaction.min'=>'2', 'hbase.hstore.compaction.max'=>'60'}}
    create 'orders_table', 'blobs_cf', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine', 'hbase.hstore.blockingStoreFiles' => '60', 'hbase.hstore.compaction.min'=>'2', 'hbase.hstore.compaction.max'=>'60'}
  2. Configure other options if needed. See Configuring Date Tiered Compaction for more information.

    如果需要,配置其他选项。有关更多信息,请参见配置日期分级压缩。

Procedure: Disable Date Tiered Compaction
  1. Set the hbase.hstore.engine.class option to either nil or org.apache.hadoop.hbase.regionserver.DefaultStoreEngine. Either option has the same effect. Make sure you set the other options you changed to the original settings too.

    设置hbase.hstore.engine。class选项为nil或org.apache.hadoop. hbase.org . defaultstoreengine。两种选择都有同样的效果。确保您设置的其他选项也更改为原始设置。

    alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.DefaultStoreEngine', 'hbase.hstore.blockingStoreFiles' => '12', 'hbase.hstore.compaction.min'=>'6', 'hbase.hstore.compaction.max'=>'12'}

When you change the store engine either way, a major compaction will likely be performed on most regions. This is not necessary on new tables.

当您更改存储引擎时,大多数区域都可能执行主要的压缩。这在新表中是不必要的。

Configuring Date Tiered Compaction
配置日期分层压实

Each of the settings for date tiered compaction should be configured at the table or column family level. If you use HBase shell, the general command pattern is as follows:

日期分级压缩的每个设置都应该在表或列家庭级别配置。如果使用HBase shell,一般命令模式如下:

alter 'orders_table', CONFIGURATION => {'key' => 'value', ..., 'key' => 'value'}
Tier Parameters

You can configure your date tiers by changing the settings for the following parameters:

您可以通过更改以下参数的设置来配置您的日期级别:

Table 10. Date Tier Parameters
Setting Notes

hbase.hstore.compaction.date.tiered.max.storefile.age.millis

hbase.hstore.compaction.date.tiered.max.storefile.age.millis

Files with max-timestamp smaller than this will no longer be compacted. Default at Long.MAX_VALUE.

比这更小的时间戳的文件将不再被压缩。在Long.MAX_VALUE违约。

hbase.hstore.compaction.date.tiered.base.window.millis

hbase.hstore.compaction.date.tiered.base.window.millis

Base window size in milliseconds. Default at 6 hours.

基本窗口大小以毫秒为单位。默认为6个小时。

hbase.hstore.compaction.date.tiered.windows.per.tier

hbase.hstore.compaction.date.tiered.windows.per.tier

Number of windows per tier. Default at 4.

每个层的窗口数。默认为4。

hbase.hstore.compaction.date.tiered.incoming.window.min

hbase.hstore.compaction.date.tiered.incoming.window.min

Minimal number of files to compact in the incoming window. Set it to expected number of files in the window to avoid wasteful compaction. Default at 6.

在传入窗口中压缩文件的最小数量。将其设置为期望窗口中文件的数量,以避免浪费的压缩。默认的6点。

hbase.hstore.compaction.date.tiered.window.policy.class

hbase.hstore.compaction.date.tiered.window.policy.class

The policy to select store files within the same time window. It doesn’t apply to the incoming window. Default at exploring compaction. This is to avoid wasteful compaction.

在同一时间窗口中选择存储文件的策略。它不适用于传入的窗口。默认在探索压实。这是为了避免浪费。

Compaction Throttler

With tiered compaction all servers in the cluster will promote windows to higher tier at the same time, so using a compaction throttle is recommended: Set hbase.regionserver.throughput.controller to org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController.

在分层压缩的情况下,集群中的所有服务器都将同时向更高的层提升窗口,因此建议使用压缩节流:设置hbase. local server.吞吐量。org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController控制器。

For more information about date tiered compaction, please refer to the design specification at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8
Experimental: Stripe Compactions
实验:条纹紧凑排列

Stripe compactions is an experimental feature added in HBase 0.98 which aims to improve compactions for large regions or non-uniformly distributed row keys. In order to achieve smaller and/or more granular compactions, the StoreFiles within a region are maintained separately for several row-key sub-ranges, or "stripes", of the region. The stripes are transparent to the rest of HBase, so other operations on the HFiles or data work without modification.

Stripe compaction是在HBase 0.98中添加的一个实验特性,该特性旨在改进大区域或非均匀分布的行键的压缩。为了实现更小的和/或更细粒度的压缩,区域内的存储文件分别维护区域内的几个行键子范围(或“条纹”)。条纹对HBase的其他部分是透明的,所以HFiles或数据工作的其他操作不需要修改。

Stripe compactions change the HFile layout, creating sub-regions within regions. These sub-regions are easier to compact, and should result in fewer major compactions. This approach alleviates some of the challenges of larger regions.

Stripe compaction更改HFile布局,在区域内创建子区域。这些子区域更容易压缩,并且会导致更少的主要压缩。这种方法减轻了较大地区的一些挑战。

Stripe compaction is fully compatible with Compaction and works in conjunction with either the ExploringCompactionPolicy or RatioBasedCompactionPolicy. It can be enabled for existing tables, and the table will continue to operate normally if it is disabled later.

Stripe compaction与compaction完全兼容,并与ExploringCompactionPolicy或RatioBasedCompactionPolicy一起工作。它可以对现有的表启用,如果以后禁用,表将继续正常运行。

When To Use Stripe Compactions
何时使用Stripe compaction。

Consider using stripe compaction if you have either of the following:

如果您有以下任一种情况,请考虑使用stripe compaction:

  • Large regions. You can get the positive effects of smaller regions without additional overhead for MemStore and region management overhead.

    大的区域。您可以获得较小区域的正面效果,而不需要额外的开销来存储MemStore和区域管理开销。

  • Non-uniform keys, such as time dimension in a key. Only the stripes receiving the new keys will need to compact. Old data will not compact as often, if at all

    非均匀键,例如键中的时间维度。只有接受新键的条纹才需要紧凑。旧数据将不会像往常一样紧凑。

Performance Improvements

Performance testing has shown that the performance of reads improves somewhat, and variability of performance of reads and writes is greatly reduced. An overall long-term performance improvement is seen on large non-uniform-row key regions, such as a hash-prefixed timestamp key. These performance gains are the most dramatic on a table which is already large. It is possible that the performance improvement might extend to region splits.

性能测试表明,读取的性能有所提高,读写性能的可变性大大降低。在大型的非均匀行关键区域(例如hash-prefixed timestamp key)中可以看到总体的长期性能改进。在一个已经很大的表格上,这些性能的提高是最显著的。性能改进有可能扩展到区域分割。

Enabling Stripe Compaction
使条纹压实

You can enable stripe compaction for a table or a column family, by setting its hbase.hstore.engine.class to org.apache.hadoop.hbase.regionserver.StripeStoreEngine. You also need to set the hbase.hstore.blockingStoreFiles to a high number, such as 100 (rather than the default value of 10).

通过设置它的hbase.hstore.engine,您可以为一个表或一个列家庭启用stripe压缩。类org.apache.hadoop.hbase.regionserver.StripeStoreEngine。您还需要设置hbase.hstore。将storefiles添加到一个较高的数字,比如100(而不是默认值10)。

Procedure: Enable Stripe Compaction
  1. Run one of following commands in the HBase shell. Replace the table name orders_table with the name of your table.

    在HBase shell中运行以下命令之一。将表名orders_table替换为表的名称。

    alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.StripeStoreEngine', 'hbase.hstore.blockingStoreFiles' => '100'}
    alter 'orders_table', {NAME => 'blobs_cf', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.StripeStoreEngine', 'hbase.hstore.blockingStoreFiles' => '100'}}
    create 'orders_table', 'blobs_cf', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.StripeStoreEngine', 'hbase.hstore.blockingStoreFiles' => '100'}
  2. Configure other options if needed. See Configuring Stripe Compaction for more information.

    如果需要,配置其他选项。有关更多信息,请参阅配置Stripe Compaction。

  3. Enable the table.

    启用表。

Procedure: Disable Stripe Compaction
  1. Set the hbase.hstore.engine.class option to either nil or org.apache.hadoop.hbase.regionserver.DefaultStoreEngine. Either option has the same effect.

    设置hbase.hstore.engine。class选项为nil或org.apache.hadoop. hbase.org . defaultstoreengine。两种选择都有同样的效果。

    alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.DefaultStoreEngine'}
  2. Enable the table.

    启用表。

When you enable a large table after changing the store engine either way, a major compaction will likely be performed on most regions. This is not necessary on new tables.

当您在改变存储引擎后启用一个大表时,大多数区域可能会执行一个主要的压缩。这在新表中是不必要的。

Configuring Stripe Compaction
配置条纹压实

Each of the settings for stripe compaction should be configured at the table or column family level. If you use HBase shell, the general command pattern is as follows:

条带压缩的每个设置都应该在表或列的家庭级别配置。如果使用HBase shell,一般命令模式如下:

alter 'orders_table', CONFIGURATION => {'key' => 'value', ..., 'key' => 'value'}
Region and stripe sizing

You can configure your stripe sizing based upon your region sizing. By default, your new regions will start with one stripe. On the next compaction after the stripe has grown too large (16 x MemStore flushes size), it is split into two stripes. Stripe splitting continues as the region grows, until the region is large enough to split.

您可以根据区域大小配置您的条纹大小。默认情况下,您的新区域将以一个条带开始。当条纹长得太大(16倍的MemStore刷新尺寸)后,它被分成两道。随着区域的增长,条纹分裂会继续,直到该区域大到足以分裂为止。

You can improve this pattern for your own data. A good rule is to aim for a stripe size of at least 1 GB, and about 8-12 stripes for uniform row keys. For example, if your regions are 30 GB, 12 x 2.5 GB stripes might be a good starting point.

您可以为自己的数据改进此模式。一个好的规则是针对至少1 GB的条纹大小,以及大约8-12条的均匀行键。例如,如果您的区域是30 GB, 12×2.5 GB的条纹可能是一个很好的起点。
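
A small arithmetic sketch of that rule of thumb, using the 30 GB example from the text and the sizeToSplit relationship described in the sizing settings below:

public class StripeSizingSketch {
    public static void main(String[] args) {
        double regionSizeGb = 30.0;       // example region size from the text
        double targetStripeSizeGb = 2.5;  // aim for stripes of at least 1 GB; 2.5 GB here
        long stripes = Math.round(regionSizeGb / targetStripeSizeGb);  // ~12 stripes

        // From the sizing settings below: sizeToSplit = splitPartCount * target stripe size.
        int splitPartCount = 2;                                        // default
        double sizeToSplitGb = splitPartCount * targetStripeSizeGb;    // 5 GB

        System.out.println(stripes + " stripes, sizeToSplit = " + sizeToSplitGb + " GB");
    }
}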

Table 11. Stripe Sizing Settings
Setting Notes

hbase.store.stripe.initialStripeCount

hbase.store.stripe.initialStripeCount

The number of stripes to create when stripe compaction is enabled. You can use it as follows:

启用条纹压缩时要创建的条纹的数量。你可以使用以下方法:

  • For relatively uniform row keys, if you know the approximate target number of stripes from the above, you can avoid some splitting overhead by starting with several stripes (2, 5, 10…​). If the early data is not representative of overall row key distribution, this will not be as efficient.

    对于相对均匀的行键,如果你知道上面的条纹的近似目标数,你可以从几条条纹开始(2,5,10…)如果早期数据不能代表整个行键分布,那么这将不是有效的。

  • For existing tables with a large amount of data, this setting will effectively pre-split your stripes.

    对于具有大量数据的现有表,此设置将有效地预分割您的条纹。

  • For keys such as hash-prefixed sequential keys, with more than one hash prefix per region, pre-splitting may make sense.

    对于键,例如hash-prefixed顺序键,每个区域有多个哈希前缀,预分解可能是有意义的。

hbase.store.stripe.sizeToSplit

hbase.store.stripe.sizeToSplit

The maximum size a stripe grows before splitting. Use this in conjunction with hbase.store.stripe.splitPartCount to control the target stripe size (sizeToSplit = splitPartsCount * target stripe size), according to the above sizing considerations.

在分裂之前,条纹的最大尺寸会增加。使用此与hbase.store.stripe一起使用。splitPartCount控制目标条带大小(sizeToSplit = splitPartsCount * target stripe size),根据上述大小考虑。

hbase.store.stripe.splitPartCount

hbase.store.stripe.splitPartCount

The number of new stripes to create when splitting a stripe. The default is 2, which is appropriate for most cases. For non-uniform row keys, you can experiment with increasing the number to 3 or 4, to isolate the arriving updates into narrower slice of the region without additional splits being required.

在分割一条条纹时产生的新条纹的数量。默认值为2,这在大多数情况下是合适的。对于非均匀的行键,您可以尝试将数字增加到3或4,将到达的更新隔离到区域的更窄的部分,而不需要额外的分割。

MemStore Size Settings

By default, the flush creates several files from one MemStore, according to existing stripe boundaries and row keys to flush. This approach minimizes write amplification, but can be undesirable if the MemStore is small and there are many stripes, because the files will be too small.

默认情况下,刷新会从一个MemStore创建多个文件,根据现有的条带边界和行键来刷新。这种方法最小化了写放大,但是如果MemStore很小,并且有许多条带,因为文件太小,就不可取了。

In this type of situation, you can set hbase.store.stripe.compaction.flushToL0 to true. This will cause a MemStore flush to create a single file instead. When at least hbase.store.stripe.compaction.minFilesL0 such files (by default, 4) accumulate, they will be compacted into striped files.

在这种情况下,您可以设置hbase.store.stripe.compaction.flushToL0到true。这将导致MemStore刷新以创建单个文件。当至少是hbase. store.t .compaction. minfilesl0这样的文件(默认情况下,4)积累,它们将被压缩成有条纹的文件。

Normal Compaction Configuration and Stripe Compaction

All the settings that apply to normal compactions (see Parameters Used by Compaction Algorithm) apply to stripe compactions. The exceptions are the minimum and maximum number of files, which are set to higher values by default because the files in stripes are smaller. To control these for stripe compactions, use hbase.store.stripe.compaction.minFiles and hbase.store.stripe.compaction.maxFiles, rather than hbase.hstore.compaction.min and hbase.hstore.compaction.max.

所有应用于普通压缩的设置(参见Compaction算法使用的参数)适用于stripe Compaction。异常是文件的最小值和最大数量,默认情况下,这些文件被设置为更高的值,因为条纹中的文件更小。为了控制这些条带压缩,使用hbase.store.stripe.compaction.minFiles和hbase. store.t . compaction.maxfiles,而不是hbase.hstore.hstore.comaction.m。

72. Bulk Loading

72年。批量加载

72.1. Overview

72.1。概述

HBase includes several methods of loading data into tables. The most straightforward method is to either use the TableOutputFormat class from a MapReduce job, or use the normal client APIs; however, these are not always the most efficient methods.

HBase包含了几种将数据加载到表中的方法。最直接的方法是从MapReduce作业中使用TableOutputFormat类,或者使用普通的客户机api;然而,这些并不总是最有效的方法。

The bulk load feature uses a MapReduce job to output table data in HBase’s internal data format, and then directly loads the generated StoreFiles into a running cluster. Using bulk load will use less CPU and network resources than simply using the HBase API.

批量加载特性使用MapReduce作业来输出HBase内部数据格式的表数据,然后将生成的存储文件直接加载到运行的集群中。使用批量加载比简单地使用HBase API使用更少的CPU和网络资源。

72.2. Bulk Load Limitations

72.2。批量加载限制

As bulk loading bypasses the write path, the WAL doesn’t get written to as part of the process. Replication works by reading the WAL files so it won’t see the bulk loaded data – and the same goes for the edits that use Put.setDurability(SKIP_WAL). One way to handle that is to ship the raw files or the HFiles to the other cluster and do the other processing there.

由于批量加载绕过了写入路径,所以没有将WAL写入到进程的一部分。复制的工作原理是读取WAL的文件,这样它就不会看到大量加载的数据了,而使用put . set持久性(SKIP_WAL)的编辑器也是如此。一种处理方法是将原始文件或HFiles发送到其他集群,并在那里进行其他处理。

72.3. Bulk Load Architecture

72.3。批量加载架构

The HBase bulk load process consists of two main steps.

HBase批量加载过程包括两个主要步骤。

72.3.1. Preparing data via a MapReduce job

72.3.1。通过MapReduce作业准备数据。

The first step of a bulk load is to generate HBase data files (StoreFiles) from a MapReduce job using HFileOutputFormat2. This output format writes out data in HBase’s internal storage format so that they can be later loaded very efficiently into the cluster.

批量加载的第一步是使用HFileOutputFormat2从MapReduce作业生成HBase数据文件(StoreFiles)。这个输出格式在HBase的内部存储格式中写入数据,以便以后可以非常高效地加载到集群中。

In order to function efficiently, HFileOutputFormat2 must be configured such that each output HFile fits within a single region. In order to do this, jobs whose output will be bulk loaded into HBase use Hadoop’s TotalOrderPartitioner class to partition the map output into disjoint ranges of the key space, corresponding to the key ranges of the regions in the table.

为了有效地发挥作用,必须配置HFileOutputFormat2,以便每个输出HFile都适合于单个区域。为了做到这一点,将大量加载到HBase中的作业将使用Hadoop的totalorderpartiatorclass将映射输出划分为关键空间的不连接范围,对应于表中区域的键范围。

HFileOutputFormat2 includes a convenience function, configureIncrementalLoad(), which automatically sets up a TotalOrderPartitioner based on the current region boundaries of a table.

HFileOutputFormat2包含一个方便的函数,configure - incrementalload(),它根据表的当前区域边界自动设置一个totalorderpartiator。
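
A minimal job-setup sketch using that convenience function (the job name, paths, mapper, and table name are hypothetical; the mapper is assumed to emit ImmutableBytesWritable/Put pairs):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadPrepareSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "prepare-bulk-load");          // hypothetical job name
        job.setJarByClass(BulkLoadPrepareSketch.class);
        // job.setMapperClass(...): your mapper emitting ImmutableBytesWritable/Put pairs.
        FileInputFormat.addInputPath(job, new Path("/user/todd/input"));       // hypothetical input
        FileOutputFormat.setOutputPath(job, new Path("/user/todd/myoutput"));  // HFile output directory

        TableName tableName = TableName.valueOf("mytable");
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(tableName);
             RegionLocator locator = connection.getRegionLocator(tableName)) {
            // Configures the TotalOrderPartitioner and reducer so that each output
            // HFile falls within a single region of the target table.
            HFileOutputFormat2.configureIncrementalLoad(job, table, locator);
        }
        // job.waitForCompletion(true);  // submit once the mapper and input format are set
    }
}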

72.3.2. Completing the data load

72.3.2。完成数据加载

After a data import has been prepared, either by using the importtsv tool with the “importtsv.bulk.output” option or by some other MapReduce job using the HFileOutputFormat, the completebulkload tool is used to import the data into the running cluster. This command line tool iterates through the prepared data files, and for each one determines the region the file belongs to. It then contacts the appropriate RegionServer which adopts the HFile, moving it into its storage directory and making the data available to clients.

在准备了数据导入之后,可以使用importtsv工具和“importtsv.bulk”。输出“选项或其他一些MapReduce作业使用HFileOutputFormat, completebulkload工具用于将数据导入到运行的集群中。这个命令行工具遍历已准备好的数据文件,每个文件都确定该文件所属的区域。然后,它会联系使用HFile的适当的区域服务器,将其移动到它的存储目录中,并将数据提供给客户机。

If the region boundaries have changed during the course of bulk load preparation, or between the preparation and completion steps, the completebulkload utility will automatically split the data files into pieces corresponding to the new boundaries. This process is not optimally efficient, so users should take care to minimize the delay between preparing a bulk load and importing it into the cluster, especially if other clients are simultaneously loading data through other means.

如果在批量加载准备过程中,或者在准备和完成步骤之间,区域边界发生了变化,completebulkload实用程序将自动将数据文件分割成与新边界对应的部分。这个过程不是最有效的,因此用户应该注意最小化准备批量加载和导入集群之间的延迟,特别是在其他客户机同时通过其他方式加载数据的情况下。

$ hadoop jar hbase-server-VERSION.jar completebulkload [-c /path/to/hbase/config/hbase-site.xml] /user/todd/myoutput mytable

The -c config-file option can be used to specify a file containing the appropriate hbase parameters (e.g., hbase-site.xml) if not supplied already on the CLASSPATH (In addition, the CLASSPATH must contain the directory that has the zookeeper configuration file if zookeeper is NOT managed by HBase).

可以使用-c config-file选项来指定包含适当的hbase参数(例如,hbase-site.xml)的文件,如果没有在类路径上提供的话(另外,类路径必须包含有zookeeper配置文件的目录,如果zookeeper不是由hbase管理的)。

If the target table does not already exist in HBase, this tool will create the table automatically.

72.4. See Also

72.4。另请参阅

For more information about the referenced utilities, see ImportTsv and CompleteBulkLoad.

有关引用实用程序的更多信息,请参见ImportTsv和CompleteBulkLoad。

See How-to: Use HBase Bulk Loading, and Why for a recent blog on current state of bulk loading.

参见How-to:使用HBase批量加载,以及为什么在最近的博客中关于当前的批量加载状态。

72.5. Advanced Usage

72.5。高级用法

Although the importtsv tool is useful in many cases, advanced users may want to generate data programmatically, or import data from other formats. To get started doing so, dig into ImportTsv.java and check the JavaDoc for HFileOutputFormat.

尽管importtsv工具在很多情况下是有用的,但高级用户可能希望以编程方式生成数据,或者从其他格式导入数据。为了开始这样做,请深入研究ImportTsv。java和检查JavaDoc的HFileOutputFormat。

The import step of the bulk load can also be done programmatically. See the LoadIncrementalHFiles class for more information.

批量加载的导入步骤也可以通过编程方式完成。有关更多信息,请参见loadincrementalhfile类。
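
A sketch of that programmatic import (the table name and HFile directory are hypothetical and would normally be the output of the preparation job above; the exact LoadIncrementalHFiles signature varies slightly across HBase versions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class ProgrammaticBulkLoadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        TableName tableName = TableName.valueOf("mytable");
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin();
             Table table = connection.getTable(tableName);
             RegionLocator locator = connection.getRegionLocator(tableName)) {
            LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
            // Moves the prepared HFiles under /user/todd/myoutput into the running table.
            loader.doBulkLoad(new Path("/user/todd/myoutput"), admin, table, locator);
        }
    }
}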

73. HDFS

73年。HDFS

As HBase runs on HDFS (and each StoreFile is written as a file on HDFS), it is important to have an understanding of the HDFS Architecture especially in terms of how it stores files, handles failovers, and replicates blocks.

HBase在HDFS上运行(并且每个StoreFile都是作为一个文件在HDFS上编写的),对于HDFS体系结构的理解非常重要,尤其是在它如何存储文件、处理failovers和复制块的方面。

See the Hadoop documentation on HDFS Architecture for more information.

有关更多信息,请参阅HDFS架构的Hadoop文档。

73.1. NameNode

73.1。NameNode

The NameNode is responsible for maintaining the filesystem metadata. See the above HDFS Architecture link for more information.

NameNode负责维护文件系统元数据。有关更多信息,请参见上面的HDFS架构链接。

73.2. DataNode

73.2。DataNode

The DataNodes are responsible for storing HDFS blocks. See the above HDFS Architecture link for more information.

DataNodes负责存储HDFS块。有关更多信息,请参见上面的HDFS架构链接。

74. Timeline-consistent High Available Reads

74年。Timeline-consistent高可用的读取

74.1. Introduction

74.1。介绍

HBase, architecturally, always had the strong consistency guarantee from the start. All reads and writes are routed through a single region server, which guarantees that all writes happen in an order, and all reads are seeing the most recent committed data.

从架构上来说,HBase自始至终都具有很强的一致性保证。所有的读和写都通过一个区域服务器进行路由,该服务器保证所有的写入都是按顺序进行的,所有的读操作都是看到最近提交的数据。

However, because of this single homing of the reads to a single location, if the server becomes unavailable, the regions of the table that were hosted in the region server become unavailable for some time. There are three phases in the region recovery process - detection, assignment, and recovery. Of these, the detection is usually the longest and is presently in the order of 20-30 seconds depending on the ZooKeeper session timeout. During this time and before the recovery is complete, the clients will not be able to read the region data.

但是,由于对单个位置的读取,如果服务器变得不可用,在该区域服务器上托管的表的区域在一段时间内不可用。区域恢复过程有三个阶段——检测、分配和恢复。在这些情况下,检测通常是最长的,并且根据ZooKeeper会话超时时间,目前的顺序是20-30秒。在此期间和恢复完成之前,客户端将无法读取该区域的数据。

However, for some use cases, either the data may be read-only, or doing reads against some stale data is acceptable. With timeline-consistent high available reads, HBase can be used for these kind of latency-sensitive use cases where the application can expect to have a time bound on the read completion.

但是,对于某些用例,数据可能是只读的,或者对一些陈旧的数据进行读取是可以接受的。使用timeline一致的高可用读,HBase可以用于此类延迟敏感的用例,应用程序可以期望有一个时间绑定在读取完成上。

For achieving high availability for reads, HBase provides a feature called region replication. In this model, for each region of a table, there will be multiple replicas that are opened in different RegionServers. By default, the region replication is set to 1, so only a single region replica is deployed and there will not be any changes from the original model. If region replication is set to 2 or more, then the master will assign replicas of the regions of the table. The Load Balancer ensures that the region replicas are not co-hosted in the same region servers and also in the same rack (if possible).

为了实现读取的高可用性,HBase提供了一个称为区域复制的特性。在这个模型中,对于一个表的每个区域,将会有多个在不同的区域服务器中打开的副本。默认情况下,区域复制被设置为1,因此只部署了一个区域副本,并且不会对原始模型进行任何更改。如果将区域复制设置为2或更多,则主将指定表区域的副本。负载平衡器确保区域副本不会在相同的区域服务器中共存,并且在相同的机架中(如果可能的话)。
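
For example, region replication can be requested when the table is defined; a minimal sketch using the table descriptor API (the table and column family names are hypothetical):

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;

public class RegionReplicationSketch {
    public static void main(String[] args) {
        HTableDescriptor table = new HTableDescriptor(TableName.valueOf("orders_table"));
        table.addFamily(new HColumnDescriptor("blobs_cf"));
        // Deploy one primary plus one secondary replica for every region of this table.
        table.setRegionReplication(2);
        // Pass the descriptor to Admin.createTable(table) to create the table with replicas.
    }
}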

All of the replicas for a single region will have a unique replica_id, starting from 0. The region replica having replica_id==0 is called the primary region, and the others secondary regions or secondaries. Only the primary can accept writes from the client, and the primary will always contain the latest changes. Since all writes still have to go through the primary region, the writes are not highly-available (meaning they might block for some time if the region becomes unavailable).

单个区域的所有副本都将有一个惟一的复制id,从0开始。具有replica_id==0的区域副本称为主要区域,以及其他次要区域或次级区域。只有主服务器可以接受来自客户机的写操作,主服务器将始终包含最新的更改。由于所有的写都必须经过主区域,所以写的内容并不高(意味着如果该区域不可用,它们可能会阻塞一段时间)。

74.2. Timeline Consistency

74.2。时间一致性

With this feature, HBase introduces a Consistency definition, which can be provided per read operation (get or scan).

基于此特性,HBase引入了一致性定义,可以通过读取操作(获取或扫描)来提供一致性定义。

public enum Consistency {
    STRONG,
    TIMELINE
}

Consistency.STRONG is the default consistency model provided by HBase. In case the table has region replication = 1, or in a table with region replicas but the reads are done with this consistency, the read is always performed by the primary regions, so that there will not be any change from the previous behaviour, and the client always observes the latest data.

一致性。STRONG是HBase提供的默认一致性模型。表有地区复制= 1,或与地区表副本读取完成这种一致性,阅读总是执行的主要地区,所以不会有任何变化从先前的行为,和客户端总是观察最新的数据。

In case a read is performed with Consistency.TIMELINE, then the read RPC will be sent to the primary region server first. After a short interval (hbase.client.primaryCallTimeout.get, 10ms by default), parallel RPC for secondary region replicas will also be sent if the primary does not respond back. After this, the result is returned from whichever RPC is finished first. If the response came back from the primary region replica, we can always know that the data is latest. For this purpose, the Result.isStale() API has been added to inspect the staleness. If the result is from a secondary region, then Result.isStale() will be set to true. The user can then inspect this field to possibly reason about the data.

如果读的是一致性的。时间轴,然后read RPC将首先被发送到主区域服务器。在短时间间隔后(hbase.client.primaryCallTimeout。默认情况下获得10ms,如果主服务器没有响应,则还将发送二级区域副本的并行RPC。在此之后,将从最先完成的RPC返回结果。如果响应从主区域副本返回,我们可以始终知道数据是最新的。对于这个结果,添加了is陈腐()API来检查是否过时。如果结果来自一个次要区域,那么结果。is()将被设置为true。然后,用户可以检查这个字段,从而可能对数据进行推理。
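
A minimal client-side sketch of a TIMELINE read together with the staleness check described above (the table, row, column family, and qualifier names are hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Consistency;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class TimelineReadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("orders_table"))) {
            Get get = new Get(Bytes.toBytes("row1"));
            get.setConsistency(Consistency.TIMELINE);   // allow secondary replicas to answer
            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("blobs_cf"), Bytes.toBytes("q"));
            // isStale() is true when the answer came from a secondary replica and may lag the primary.
            System.out.println("stale=" + result.isStale() + ", value=" + Bytes.toString(value));
        }
    }
}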

In terms of semantics, TIMELINE consistency as implemented by HBase differs from pure eventual consistency in these respects:

在语义方面,HBase实现的时间轴一致性与这些方面的纯最终一致性不同:

  • Single homed and ordered updates: Region replication or not, on the write side, there is still only 1 defined replica (primary) which can accept writes. This replica is responsible for ordering the edits and preventing conflicts. This guarantees that two different writes are not committed at the same time by different replicas and the data diverges. With this, there is no need to do read-repair or last-timestamp-wins kind of conflict resolution.

    单个的homed和有序的更新:在写端,仍然只有一个定义的副本(primary)可以接受写入。这个副本负责命令编辑和防止冲突。这保证了两个不同的写在同一时间不被不同的副本所提交,并且数据是发散的。有了这个,就不需要做读修复或最后一次的stamp-赢得类型的冲突解决。

  • The secondaries also apply the edits in the order that the primary committed them. This way the secondaries will contain a snapshot of the primaries data at any point in time. This is similar to RDBMS replications and even HBase’s own multi-datacenter replication, however in a single cluster.

    第二代也按照主提交的顺序应用编辑。通过这种方式,第二组将在任何时间点包含初选数据的快照。这类似于RDBMS复制,甚至是HBase自己的多数据中心复制,但是在单个集群中。

  • On the read side, the client can detect whether the read is coming from up-to-date data or is stale data. Also, the client can issue reads with different consistency requirements on a per-operation basis to ensure its own semantic guarantees.

    在读取端,客户端可以检测读取是否来自最新的数据或过时的数据。此外,客户端可以在每个操作基础上发出不同的一致性要求,以确保自己的语义保证。

  • The client can still observe edits out of order, and can go back in time if it reads from one secondary replica first and then from another secondary replica. There is no stickiness to region replicas, nor a transaction-id based guarantee. If required, this can be implemented later though.

    客户端仍然可以观察编辑的无序状态,并且可以回溯到时间,如果它观察到从一个二级副本中读取,然后另一个二级副本。对于区域副本或基于事务id的保证没有粘性。如果需要,这可以在以后实现。

Timeline Consistency
Figure 3. Timeline Consistency

To better understand the TIMELINE semantics, let’s look at the above diagram. Let’s say that there are two clients, and the first one writes x=1 at first, then x=2 and x=3 later. As above, all writes are handled by the primary region replica. The writes are saved in the write ahead log (WAL), and replicated to the other replicas asynchronously. In the above diagram, notice that replica_id=1 received 2 updates, and its data shows that x=2, while the replica_id=2 only received a single update, and its data shows that x=1.

为了更好地理解时间轴语义,让我们看一下上面的图。假设有两个客户机,第一个是x=1,然后是x=2,然后是x=3。如上所述,所有的写都由主区域副本处理。写入将保存在前面的写入日志(WAL)中,并以异步方式复制到其他副本。在上面的图中,注意到replica_id=1收到了2个更新,其数据显示x=2,而replica_id=2只收到一个更新,其数据显示x=1。

If client1 reads with STRONG consistency, it will only talk to the replica with replica_id=0, and thus is guaranteed to observe the latest value of x=3. If a client issues TIMELINE consistency reads, the RPC will go to all replicas (after the primary timeout) and the result from the first response will be returned. Thus the client can see either 1, 2 or 3 as the value of x. Let’s say that the primary region has failed and log replication cannot continue for some time. If the client does multiple reads with TIMELINE consistency, she can observe x=2 first, then x=1, and so on.

如果client1具有很强的一致性,它只会与replica_id=0进行对话,因此可以保证观察x=3的最新值。如果客户机发出时间轴一致性,则RPC将返回所有副本(在主超时之后),并返回第一个响应的结果。因此,客户端可以看到1、2或3作为x的值,假设主区域已经失败,并且日志复制不能持续一段时间。如果客户端多次读取时间轴一致性,她可以先观察x=2,然后x=1,以此类推。

74.3. Tradeoffs

74.3。权衡

Having secondary regions hosted for read availability comes with some tradeoffs which should be carefully evaluated per use case. Following are advantages and disadvantages.

拥有用于阅读可用性的次要区域带来了一些权衡,每个用例都应该仔细评估。以下是优点和缺点。

Advantages
  • High availability for read-only tables

    对只读表的高可用性。

  • High availability for stale reads

    对陈腐的阅读的高可用性。

  • Ability to do very low latency reads, even at very high percentiles (99.9%+), for stale reads

    执行非常低延迟的能力的读取率非常高(99.9%以上)。

Disadvantages
  • Double / Triple MemStore usage (depending on region replication count) for tables with region replication > 1

    对于带有区域复制> 1的表,双/三重MemStore用法(取决于区域复制计数)。

  • Increased block cache usage

    增加块缓存的使用

  • Extra network traffic for log replication

    用于日志复制的额外网络流量。

  • Extra backup RPCs for replicas

    用于副本的额外备份rpc。

To serve the region data from multiple replicas, HBase opens the regions in secondary mode in the region servers. The regions opened in secondary mode will share the same data files with the primary region replica; however, each secondary region replica will have its own MemStore to keep the unflushed data (only the primary region can do flushes). Also, to serve reads from secondary regions, the blocks of the data files may also be cached in the block caches for the secondary regions.

为了从多个副本服务区域数据,HBase在区域服务器中打开二级模式的区域。在二级模式中打开的区域将共享与主区域副本相同的数据文件,但是每个次要区域副本将有自己的MemStore来保存未刷新的数据(只有主区域可以进行刷新)。此外,为了服务于次要区域的读取,数据文件块也可以缓存到第二区域的块缓存中。

74.4. Where is the code

74.4。代码在哪里

This feature is delivered in two phases, Phase 1 and Phase 2. The first phase was completed in time for the HBase-1.0.0 release, meaning that with HBase-1.0.x you can use all the features that are marked for Phase 1. Phase 2 was committed in HBase-1.1.0, meaning all HBase versions after 1.1.0 should contain Phase 2 items.

该特性分两阶段交付,第1阶段和第2阶段。第一个阶段是在HBase-1.0.0版本中完成的。这意味着使用hbase - 1.0。x,您可以使用标记为阶段1的所有特性。阶段2是在HBase-1.1.0中提交的,这意味着在1.1.0之后的所有HBase版本都应该包含第2阶段的项目。

74.5. Propagating writes to region replicas

74.5。传播写入到区域副本。

As discussed above, writes only go to the primary region replica. For propagating the writes from the primary region replica to the secondaries, there are two different mechanisms. For read-only tables, you do not need to use any of the following methods; disabling and enabling the table should make the data available in all region replicas, as shown below. For mutable tables, you have to use only one of the following mechanisms: storefile refresher or async WAL replication. The latter is recommended.

如上所述,只写到主区域副本。为了将写从主区域复制到第二个副本,有两种不同的机制。对于只读表,您不需要使用以下任何方法。禁用和启用表应该可以使所有区域副本中的数据都可用。对于可变表,您只能使用以下机制之一:storefile刷新或async wal复制。建议采用后一种方法。
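
For example, for a read-only table the following shell commands (the table name is illustrative) re-open the table so that all region replicas pick up the current store files:

hbase> disable 'my_readonly_table'
hbase> enable 'my_readonly_table'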

74.5.1. StoreFile Refresher

74.5.1。StoreFile进修

The first mechanism is the store file refresher, introduced in HBase-1.0+. The store file refresher is a thread per region server which runs periodically and does a refresh operation for the store files of the primary region for the secondary region replicas. If enabled, the refresher will ensure that the secondary region replicas see the new flushed, compacted or bulk loaded files from the primary region in a timely manner. However, this means that only flushed data can be read back from the secondary region replicas, and only after the refresher has run, so the secondaries lag behind the primary for a longer time.

第一个机制是在HBase-1.0+中引入的存储文件刷新。存储文件刷新是每个区域服务器上的一个线程,它定期运行,并为二级区域副本的主区域的存储文件进行刷新操作。如果启用,更新器将确保二级区域副本能够及时地从主区域看到新的刷新、压缩或批量加载的文件。然而,这意味着只有刷新的数据才能从次区域副本中读取,并且在刷新之后,使得第二部分在较长时间内落后于主区域。

To turn this feature on, configure hbase.regionserver.storefile.refresh.period to a non-zero value. See the Configuration section below.

要启用这个特性,您应该配置hbase. local server.storefile.refresh。周期为非零值。请参阅下面的配置部分。

74.5.2. Async WAL replication

74.5.2。Asnyc细胞膜复制

The second mechanism for propagation of writes to secondaries is done via the “Async WAL Replication” feature, and it is only available in HBase-1.1+. This works similarly to HBase’s multi-datacenter replication, but here the data from a region is replicated to its secondary regions. Each secondary replica always receives and observes the writes in the same order that the primary region committed them. In some sense, this design can be thought of as “in-cluster replication”, where instead of replicating to a different datacenter, the data goes to secondary regions to keep the secondary regions’ in-memory state up to date. The data files are shared between the primary region and the other replicas, so that there is no extra storage overhead. However, the secondary regions will have recent non-flushed data in their memstores, which increases the memory overhead. The primary region writes flush, compaction, and bulk load events to its WAL as well, and these are also replicated through WAL replication to the secondaries. When they observe the flush/compaction or bulk load event, the secondary regions replay the event to pick up the new files and drop the old ones.

通过“Async WAL - Replication”特性,实现了对二次开发的写入的第二种传播机制,它只在HBase-1.1+中可用。这与HBase的多数据中心复制类似,但是从一个区域的数据复制到次要区域。每个次要副本总是接收和观察与主区域提交它们的相同顺序的写入。从某种意义上说,这种设计可以被认为是“集群内复制”,而不是复制到不同的数据中心,数据进入次要区域,以保持次要区域的内存状态更新。数据文件在主区域和其他副本之间共享,这样就不会有额外的存储开销。但是,次要区域将在其memstore中拥有最近的非刷新数据,这会增加内存开销。主区域将刷新、压缩和批量加载事件写入到它的WAL中,这也通过WAL - replication复制到二级。当他们观察刷新/压缩或批量加载事件时,次要区域重放事件以获取新文件并删除旧文件。

Committing writes in the same order as the primary ensures that the secondaries won’t diverge from the primary region’s data, but since the log replication is asynchronous, the data might still be stale in secondary regions. Since this feature works as a replication endpoint, the performance and latency characteristics are expected to be similar to inter-cluster replication.

提交与主服务器相同的顺序可以确保第二个分区不会偏离主区域的数据,但是由于日志复制是异步的,所以数据可能仍然会在次要区域中失效。由于该特性作为一个复制端点工作,因此性能和延迟特性将与集群内复制类似。

Async WAL Replication is disabled by default. You can enable this feature by setting hbase.region.replica.replication.enabled to true. The Async WAL Replication feature will add a new replication peer named region_replica_replication when you create a table with region replication > 1 for the first time. Once enabled, if you want to disable this feature you need to do two actions:

  • Set the configuration property hbase.region.replica.replication.enabled to false in hbase-site.xml (see the Configuration section below)

  • Disable the replication peer named region_replica_replication in the cluster using hbase shell or the Admin class:

默认情况下,Async WAL复制是禁用的。您可以通过设置hbase. local .replication. replication来启用该特性。启用为true。当您第一次创建带有区域复制> 1的表时,Asyn WAL复制特性将添加一个名为region_replica_replication的新复制节点作为复制节点。一旦启用,如果您想禁用此功能,您需要执行两个操作:*设置配置属性hbase. local .replica.replication。在hbase站点中启用了false。xml(请参阅下面的配置节)*使用hbase shell或Admin类在集群中禁用名为区域复制的复制节点:

        hbase> disable_peer 'region_replica_replication'
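
From the Admin API, the equivalent could look like the following sketch. It assumes an HBase 2.x client, where Admin exposes disableReplicationPeer(); on HBase 1.x the ReplicationAdmin class provides a similar disablePeer() call.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class DisableRegionReplicaReplication {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Admin admin = connection.getAdmin()) {
      // Stop shipping edits to secondary region replicas (sketch, HBase 2.x Admin API).
      admin.disableReplicationPeer("region_replica_replication");
    }
  }
}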

74.6. Store File TTL

74.6。存储文件TTL

In both of the write propagation approaches mentioned above, store files of the primary will be opened in the secondaries independently of the primary region. So for files that the primary has compacted away, the secondaries might still be referring to these files for reading. Both features use HFileLinks to refer to files, but there is no protection (yet) guaranteeing that the file will not be deleted prematurely. Thus, as a guard, you should set the configuration property hbase.master.hfilecleaner.ttl to a larger value, such as 1 hour, to guarantee that you will not receive IOExceptions for requests going to replicas.

在上面提到的两种写传播方法中,主区域的存储文件将在独立于主区域的二级中打开。因此,对于主压缩的文件,第二个可能仍然是指这些文件用于读取。这两个特性都使用HFileLinks来引用文件,但是没有保护(但是)保证文件不会过早地被删除。因此,作为一个警卫,您应该设置配置属性hbase.master.hfilecleaner。ttl更大的值,比如1小时,以保证不会接收到要复制的请求的ioexception。

74.7. Region replication for META table’s region

74.7。元表区域的区域复制。

Currently, Async WAL Replication is not done for the META table’s WAL. The meta table’s secondary replicas still refresh themselves from the persistent store files. Hence the hbase.regionserver.meta.storefile.refresh.period needs to be set to a non-zero value for refreshing the meta store files. Note that this is configured separately from hbase.regionserver.storefile.refresh.period.

目前,对于元数据表的WAL,还没有进行异步复制。元表的二级副本仍然从持久性存储文件刷新自己。因此hbase.regionserver.meta.storefile.refresh。期间需要设置为一个非零值来刷新元存储文件。注意,这个配置的配置与hbase. server.storefile.refresh.period不同。

74.8. Memory accounting

74.8。记忆的会计

The secondary region replicas refer to the data files of the primary region replica, but they have their own memstores (in HBase-1.1+) and use the block cache as well. However, one distinction is that the secondary region replicas cannot flush the data when there is memory pressure on their memstores. They can only free up memstore memory when the primary region does a flush and this flush is replicated to the secondary. Since a region server may host primary replicas for some regions and secondaries for others, the secondaries might cause extra flushes to the primary regions on the same host. In extreme situations, there can be no memory left for adding new writes coming from the primary via WAL replication. To unblock this situation (and because the secondary cannot flush by itself), the secondary is allowed to do a “store file refresh”: a file system list operation to pick up new files from the primary, possibly followed by dropping its memstore. This refresh will only be performed if the memstore size of the biggest secondary region replica is at least hbase.region.replica.storefile.refresh.memstore.multiplier (default 4) times bigger than the biggest memstore of a primary replica. One caveat is that if this is performed, the secondary can observe partial row updates across column families (since column families are flushed independently). The default should be good enough to avoid doing this operation frequently. You can set this value to a large number to disable this feature if desired, but be warned that it might cause the replication to block forever.

二级区域副本是指主区域副本的数据文件,但是它们有自己的memstores(在HBase-1.1+中),并使用块缓存。但是,有一个区别是,当内存对其内存存储有压力时,次要区域复制不能刷新数据。当主区域进行刷新时,它们只能释放memstore内存,并将此刷新复制到次要区域。由于在一个区域服务器中为某些区域和一些其他区域提供了主副本,因此,二级服务器可能会对同一主机的主要区域造成额外的影响。在极端情况下,可以不留下任何内存来添加来自主要通过wal复制的新写。为了消除这种情况(并且由于次要服务器不能自行刷新),可以通过执行文件系统列表操作来从主服务器获取新文件,并可能删除其memstore,从而允许进行“存储文件刷新”。只有在最大的二级区域副本的memstore大小至少是hbase.region.replica.storefile.refresh.memstore时,才会执行此刷新。乘数(默认4)倍于主副本最大的memstore。需要注意的是,如果执行了这个操作,那么第二部分可以观察到跨列家庭的部分行更新(因为列家族是独立地刷新的)。默认情况下,最好不要频繁地执行这个操作。如果需要,可以将这个值设置为一个大的数字来禁用该功能,但是要注意,它可能会导致复制永远阻塞。

74.9. Secondary replica failover

74.9。二次副本故障转移

When a secondary region replica first comes online, or fails over, it may have served some edits from its memstore. Since the recovery is handled differently for secondary replicas, the secondary has to ensure that it does not go back in time before it starts serving requests after assignment. To do that, the secondary waits until it observes a full flush cycle (start flush, commit flush) or a “region open event” replicated from the primary. Until this happens, the secondary region replica will reject all read requests by throwing an IOException with the message “The region’s reads are disabled”. However, the other replicas will probably still be available to read, so there is no impact on RPCs issued with TIMELINE consistency. To facilitate faster recovery, the secondary region will trigger a flush request from the primary when it is opened. The configuration property hbase.region.replica.wait.for.primary.flush (enabled by default) can be used to disable this feature if needed.

当一个次级区域复制首先出现在网上,或者失败了,它可能从它的memstore中提供了一些编辑。由于二次复制的恢复处理方式不同,因此次要副本必须确保在任务完成后开始服务请求之前不会返回。为此,次要等待直到它观察到一个完整的刷新周期(开始刷新、提交刷新)或从主服务器复制的“区域开放事件”。在这种情况发生之前,二级区域副本将拒绝所有读取请求,抛出一个IOException,并将“该区域的读取被禁用”。但是,其他副本可能仍然可以读取,因此不会对rpc的时间轴一致性造成任何影响。为了促进更快的恢复,第二区域在打开时将触发来自主的刷新请求。hbase.region.replica.wait.for.primary配置属性。刷新(默认启用)可以在需要时禁用此功能。

74.10. Configuration properties

74.10。配置属性

To use highly available reads, you should set the following properties in the hbase-site.xml file. There is no specific configuration to enable or disable region replicas. Instead, you can change the number of region replicas per table, increasing or decreasing it at table creation or with alter table. The following configuration is for using async WAL replication and 3 meta replicas.

要使用高可用的读取,应该在hbase-site中设置以下属性。xml文件。没有特定的配置来启用或禁用区域副本。相反,您可以更改每个表的区域副本数量,以增加或减少表创建或更改表。下面的配置是使用async wal复制,并使用3的元副本。

74.10.1. Server side properties

74.10.1。服务器端性能

<property>
    <name>hbase.regionserver.storefile.refresh.period</name>
    <value>0</value>
    <description>
      The period (in milliseconds) for refreshing the store files for the secondary regions. 0 means this feature is disabled. Secondary regions see new files (from flushes and compactions) from the primary once the secondary region refreshes the list of files in the region (there is no notification mechanism). But too frequent refreshes might cause extra Namenode pressure. If the files cannot be refreshed for longer than HFile TTL (hbase.master.hfilecleaner.ttl) the requests are rejected. Configuring HFile TTL to a larger value is also recommended with this setting.
    </description>
</property>

<property>
    <name>hbase.regionserver.meta.storefile.refresh.period</name>
    <value>300000</value>
    <description>
      The period (in milliseconds) for refreshing the store files for the hbase:meta table’s secondary regions. 0 means this feature is disabled. Secondary regions see new files (from flushes and compactions) from the primary once the secondary region refreshes the list of files in the region (there is no notification mechanism). But too frequent refreshes might cause extra Namenode pressure. If the files cannot be refreshed for longer than HFile TTL (hbase.master.hfilecleaner.ttl) the requests are rejected. Configuring HFile TTL to a larger value is also recommended with this setting. This should be a non-zero number if meta replicas are enabled (via hbase.meta.replica.count set to greater than 1).
    </description>
</property>

<property>
    <name>hbase.region.replica.replication.enabled</name>
    <value>true</value>
    <description>
      Whether asynchronous WAL replication to the secondary region replicas is enabled or not. If this is enabled, a replication peer named "region_replica_replication" will be created which will tail the logs and replicate the mutations to region replicas for tables that have region replication > 1. If this is enabled once, disabling this replication also requires disabling the replication peer using shell or Admin java class. Replication to secondary region replicas works over standard inter-cluster replication.
    </description>
</property>
<property>
  <name>hbase.region.replica.replication.memstore.enabled</name>
  <value>true</value>
  <description>
    If you set this to `false`, replicas do not receive memstore updates from
    the primary RegionServer. If you set this to `true`, you can still disable
    memstore replication on a per-table basis, by setting the table's
    `REGION_MEMSTORE_REPLICATION` configuration property to `false`. If
    memstore replication is disabled, the secondaries will only receive
    updates for events like flushes and bulkloads, and will not have access to
    data which the primary has not yet flushed. This preserves the guarantee
    of row-level consistency, even when the read requests `Consistency.TIMELINE`.
  </description>
</property>

<property>
    <name>hbase.master.hfilecleaner.ttl</name>
    <value>3600000</value>
    <description>
      The period (in milliseconds) to keep store files in the archive folder before deleting them from the file system.</description>
</property>

<property>
    <name>hbase.meta.replica.count</name>
    <value>3</value>
    <description>
      Region replication count for the meta regions. Defaults to 1.
    </description>
</property>


<property>
    <name>hbase.region.replica.storefile.refresh.memstore.multiplier</name>
    <value>4</value>
    <description>
      The multiplier for a “store file refresh” operation for the secondary region replica. If a region server has memory pressure, the secondary region will refresh its store files if the memstore size of the biggest secondary replica is this many times bigger than the memstore size of the biggest primary replica. Set this to a very big value to disable this feature (not recommended).
    </description>
</property>

<property>
 <name>hbase.region.replica.wait.for.primary.flush</name>
    <value>true</value>
    <description>
      Whether to wait for observing a full flush cycle from the primary before starting to serve data in a secondary. Disabling this might cause the secondary region replicas to go back in time for reads between region movements.
    </description>
</property>

One thing to keep in mind is that the region replica placement policy is only enforced by the StochasticLoadBalancer, which is the default balancer. If you are using a custom load balancer property in hbase-site.xml (hbase.master.loadbalancer.class), replicas of regions might end up being hosted on the same server.

要记住的一点是,区域副本放置策略只由默认的平衡器执行,即StochasticLoadBalancer。如果您在hbase站点使用自定义负载平衡器属性。xml (hbase.master.loadbalancer.class)副本可能会在同一个服务器上被托管。

74.10.2. Client side properties

74.10.2。客户端属性

Ensure that the following are set for all clients (and servers) that will use region replicas.

确保为所有使用区域副本的客户端(和服务器)设置以下内容。

<property>
    <name>hbase.ipc.client.specificThreadForWriting</name>
    <value>true</value>
    <description>
      Whether to enable interruption of RPC threads at the client side. This is required for region replicas with fallback RPCs to secondary regions.
    </description>
</property>
<property>
  <name>hbase.client.primaryCallTimeout.get</name>
  <value>10000</value>
  <description>
    The timeout (in microseconds) before secondary fallback RPCs are submitted for get requests with Consistency.TIMELINE to the secondary replicas of the regions. Defaults to 10ms. Setting this lower will increase the number of RPCs, but will lower the p99 latencies.
  </description>
</property>
<property>
  <name>hbase.client.primaryCallTimeout.multiget</name>
  <value>10000</value>
  <description>
      The timeout (in microseconds) before secondary fallback RPCs are submitted for multi-get requests (Table.get(List<Get>)) with Consistency.TIMELINE to the secondary replicas of the regions. Defaults to 10ms. Setting this lower will increase the number of RPCs, but will lower the p99 latencies.
  </description>
</property>
<property>
  <name>hbase.client.replicaCallTimeout.scan</name>
  <value>1000000</value>
  <description>
    The timeout (in microseconds) before secondary fallback RPCs are submitted for scan requests with Consistency.TIMELINE to the secondary replicas of the regions. Defaults to 1 sec. Setting this lower will increase the number of RPCs, but will lower the p99 latencies.
  </description>
</property>
<property>
    <name>hbase.meta.replicas.use</name>
    <value>true</value>
    <description>
      Whether to use meta table replicas or not. Default is false.
    </description>
</property>

Note HBase-1.0.x users should use hbase.ipc.client.allowsInterrupt rather than hbase.ipc.client.specificThreadForWriting.

注意hbase - 1.0。x用户应该使用hbase.ipc.client。allowsInterrupt而不是hbase.ipc.client.specificThreadForWriting。
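
On such HBase-1.0.x clients the analogous setting would look like the following sketch (the value shown is an assumption based on the property name; verify it against your client defaults):

<property>
  <name>hbase.ipc.client.allowsInterrupt</name>
  <value>true</value>
</property>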

74.11. User Interface

74.11。用户界面

In the Master’s user interface, the region replicas of a table are also shown together with the primary regions. You will notice that the replicas of a region share the same start and end keys and the same region name prefix. The only differences are the appended replica_id (which is encoded as hex) and the region’s encoded name. You can also see the replica ids shown explicitly in the UI.

在masters用户界面中,表的区域副本也与主要区域一起显示。您可以注意到,一个区域的副本将共享相同的开始和结束键以及相同的区域名称前缀。惟一的区别是append replica_id(它被编码为十六进制),而区域编码的名称将会不同。您还可以看到在UI中显式显示的副本id。

74.12. Creating a table with region replication

74.12。创建具有区域复制的表。

Region replication is a per-table property. All tables have REGION_REPLICATION = 1 by default, which means that there is only one replica per region. You can set and change the number of replicas per region of a table by supplying the REGION_REPLICATION property in the table descriptor.

区域复制是每个表的属性。默认情况下,所有表都有区域性复制= 1,这意味着每个区域只有一个副本。您可以通过提供表描述符中的region _replication属性来设置和更改表的每个区域的副本数量。

74.12.1. Shell

74.12.1。壳牌

create 't1', 'f1', {REGION_REPLICATION => 2}

describe 't1'
for i in 1..100
put 't1', "r#{i}", 'f1:c1', i
end
flush 't1'

74.12.2. Java

74.12.2。Java

HTableDescriptor htd = new HTableDescriptor(TableName.valueOf(test_table));
htd.setRegionReplication(2);
...
admin.createTable(htd);

You can also use setRegionReplication() and alter table to increase or decrease the region replication for a table, as sketched below.

您还可以使用setRegionReplication()和alter table来增加一个表的区域复制。
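
A sketch of altering an existing table from the shell; depending on the HBase version, the table may need to be disabled before changing REGION_REPLICATION:

hbase> disable 't1'
hbase> alter 't1', {REGION_REPLICATION => 3}
hbase> enable 't1'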

74.13. Read API and Usage

74.13。阅读API和使用

74.13.1. Shell

74.13.1。壳牌

You can do reads in the shell using Consistency.TIMELINE semantics as follows:

您可以使用一个一致性来读取shell。时间轴语义如下

hbase(main):001:0> get 't1','r6', {CONSISTENCY => "TIMELINE"}

You can simulate a region server pausing or becoming unavailable and do a read from the secondary replica:

您可以模拟一个区域服务器暂停或变得不可用,并从第二个副本读取数据:

$ kill -STOP <pid of primary region server>

hbase(main):001:0> get 't1','r6', {CONSISTENCY => "TIMELINE"}

Using scans is also similar

使用扫描也很相似。

hbase> scan 't1', {CONSISTENCY => 'TIMELINE'}

74.13.2. Java

74.13.2。Java

You can set the consistency for Gets and Scans and do requests as follows.

您可以设置获取和扫描的一致性,并按如下方式执行请求。

Get get = new Get(row);
get.setConsistency(Consistency.TIMELINE);
...
Result result = table.get(get);

You can also pass multiple gets:

你也可以通过多个get:

Get get1 = new Get(row);
get1.setConsistency(Consistency.TIMELINE);
...
ArrayList<Get> gets = new ArrayList<Get>();
gets.add(get1);
...
Result[] results = table.get(gets);

And Scans:

和扫描:

Scan scan = new Scan();
scan.setConsistency(Consistency.TIMELINE);
...
ResultScanner scanner = table.getScanner(scan);

You can inspect whether the results are coming from the primary region or not by calling the Result.isStale() method:

您可以通过调用result . is陈旧()方法来检查结果是否来自主区域。

Result result = table.get(get);
if (result.isStale()) {
  ...
}

74.14. Resources

74.14。资源

  1. More information about the design and implementation can be found at the jira issue: HBASE-10070

    关于设计和实现的更多信息可以在jira问题上找到:HBASE-10070。

  2. HBaseCon 2014 talk: HBase Read High Availability Using Timeline-Consistent Region Replicas also contains some details and slides.

    HBaseCon 2014年的演讲:HBase使用时间一致的区域副本读取高可用性,并且包含一些细节和幻灯片。

75. Storing Medium-sized Objects (MOB)

75年。中型存储对象(群)

Data comes in many sizes, and saving all of your data in HBase, including binary data such as images and documents, is ideal. While HBase can technically handle binary objects with cells that are larger than 100 KB in size, HBase’s normal read and write paths are optimized for values smaller than 100KB in size. When HBase deals with large numbers of objects over this threshold, referred to here as medium objects, or MOBs, performance is degraded due to write amplification caused by splits and compactions. When using MOBs, ideally your objects will be between 100KB and 10MB (see the FAQ). HBase FIX_VERSION_NUMBER adds support for better managing large numbers of MOBs while maintaining performance, consistency, and low operational overhead. MOB support is provided by the work done in HBASE-11339. To take advantage of MOB, you need to use HFile version 3. Optionally, configure the MOB file reader’s cache settings for each RegionServer (see Configuring the MOB Cache), then configure specific columns to hold MOB data. Client code does not need to change to take advantage of HBase MOB support. The feature is transparent to the client.

数据有很多大小,在HBase中保存所有的数据,包括像图像和文档这样的二进制数据是理想的。虽然HBase在技术上可以处理大于100KB大小的单元的二进制对象,但是HBase的正常读写路径对小于100KB大小的值进行了优化。当HBase处理超过这个阈值的大量对象时,称为介质对象,或者是MOBs,由于拆分和压缩导致的写入放大,性能下降。在使用MOBs时,理想情况下您的对象将在100KB到10MB之间(参见FAQ)。HBase FIX_VERSION_NUMBER添加了支持,以便更好地管理大量的MOBs,同时保持性能、一致性和较低的操作开销。在HBASE-11339所做的工作提供了MOB支持。为了利用MOB,您需要使用HFile版本3。可选地,为每个区域服务器配置MOB文件阅读器的缓存设置(参见配置MOB缓存),然后配置特定的列来保存MOB数据。客户端代码不需要更改以利用HBase MOB支持。该特性对客户机是透明的。
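
If your deployment still writes older HFiles, the following hbase-site.xml sketch requests HFile version 3; recent HBase releases already default hfile.format.version to 3, so treat this as a check rather than a required change:

<property>
  <name>hfile.format.version</name>
  <value>3</value>
</property>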

75.1. Configuring Columns for MOB

75.1。配置列暴徒

You can configure columns to support MOB during table creation or alteration, either in HBase Shell or via the Java API. The two relevant properties are the boolean IS_MOB and the MOB_THRESHOLD, which is the number of bytes at which an object is considered to be a MOB. Only IS_MOB is required. If you do not specify the MOB_THRESHOLD, the default threshold value of 100 KB is used.

您可以配置列来支持在表创建或更改中支持MOB,无论是在HBase Shell中,还是通过Java API。两个相关属性是布尔型IS_MOB和MOB_THRESHOLD,即一个对象被认为是一个MOB的字节数。只有IS_MOB是必需的。如果没有指定MOB_THRESHOLD,则使用默认阈值为100kb。

Example 37. Configure a Column for MOB Using HBase Shell
hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400}
hbase> alter 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400}
Example 38. Configure a Column for MOB Using the Java API
...
HColumnDescriptor hcd = new HColumnDescriptor(f);
hcd.setMobEnabled(true);
...
hcd.setMobThreshold(102400L);
...

75.2. Testing MOB

75.2。测试暴徒

The utility org.apache.hadoop.hbase.IntegrationTestIngestWithMOB is provided to assist with testing the MOB feature. The utility is run as follows:

该实用程序org.apache.hadoop.hbase。提供集成测试,以协助测试MOB特性。本实用程序运行如下:

$ sudo -u hbase hbase org.apache.hadoop.hbase.IntegrationTestIngestWithMOB \
            -threshold 1024 \
            -minMobDataSize 512 \
            -maxMobDataSize 5120
  • threshold is the threshold at which cells are considered to be MOBs. The default is 1 kB, expressed in bytes.

    阈值是细胞被认为是暴民的阈值。默认值是1kb,以字节表示。

  • minMobDataSize is the minimum value for the size of MOB data. The default is 512 B, expressed in bytes.

    minMobDataSize是MOB数据大小的最小值。默认值是512b,以字节表示。

  • maxMobDataSize is the maximum value for the size of MOB data. The default is 5 kB, expressed in bytes.

    maxMobDataSize是MOB数据大小的最大值。默认值是5kb,以字节表示。

75.3. Configuring the MOB Cache

75.3。配置暴徒缓存

Because there can be a large number of MOB files at any time, as compared to the number of HFiles, MOB files are not always kept open. The MOB file reader cache is an LRU cache which keeps the most recently used MOB files open. To configure the MOB file reader’s cache on each RegionServer, add the following properties to the RegionServer’s hbase-site.xml, customize the configuration to suit your environment, and restart or rolling restart the RegionServer.

因为在任何时候都可以有大量的MOB文件,与HFiles的数量相比,MOB文件并不总是被打开。MOB文件阅读器缓存是一个LRU缓存,它保持最近使用的MOB文件打开。要在每个区域服务器上配置MOB文件阅读器的缓存,请将以下属性添加到区域服务器的hbase-site中。xml,自定义配置以适应您的环境,重新启动或滚动重新启动区域服务器。

Example 39. Example MOB Cache Configuration
<property>
    <name>hbase.mob.file.cache.size</name>
    <value>1000</value>
    <description>
      Number of opened file handlers to cache.
      A larger value will benefit reads by providing more file handlers per mob
      file cache and would reduce frequent file opening and closing.
      However, if this is set too high, it could lead to a "too many opened file handlers" error.
      The default value is 1000.
    </description>
</property>
<property>
    <name>hbase.mob.cache.evict.period</name>
    <value>3600</value>
    <description>
      The amount of time in seconds after which an unused file is evicted from the
      MOB cache. The default value is 3600 seconds.
    </description>
</property>
<property>
    <name>hbase.mob.cache.evict.remain.ratio</name>
    <value>0.5f</value>
    <description>
      A multiplier (between 0.0 and 1.0) which determines the fraction of files that remain
      cached after a cache eviction occurs; evictions are triggered by reaching the
      `hbase.mob.file.cache.size` threshold.
      The default value is 0.5f, which means that half the files (the least-recently-used
      ones) are evicted.
    </description>
</property>

75.4. MOB Optimization Tasks

75.4。暴徒优化任务

75.4.1. Manually Compacting MOB Files

75.4.1。手动压实暴徒文件

To manually compact MOB files, rather than waiting for the configuration to trigger compaction, use the compact or major_compact HBase shell commands. These commands require the first argument to be the table name, take a column family as the second argument, and take the compaction type as the third argument.

要手动压缩MOB文件,而不是等待配置触发压缩,使用compact或major_compact HBase shell命令。这些命令要求第一个参数为表名,并以列族作为第二个参数。然后以压缩类型作为第三个参数。

hbase> compact 't1', 'c1', 'MOB'
hbase> major_compact 't1', 'c1', 'MOB'

These commands are also available via the Admin.compact and Admin.majorCompact methods, as sketched below.

这些命令也可以通过Admin.compact和Admin获得。majorCompact方法。
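
A minimal Java sketch of the same MOB compaction, assuming an HBase 2.x Admin API where CompactType.MOB is available; the table and column family names are illustrative:

import java.io.IOException;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.CompactType;
import org.apache.hadoop.hbase.util.Bytes;

public class MobCompactionExample {
  // Request a MOB major compaction for one column family of a table (sketch).
  static void majorCompactMob(Admin admin) throws IOException, InterruptedException {
    admin.majorCompact(TableName.valueOf("t1"), Bytes.toBytes("c1"), CompactType.MOB);
  }
}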

Backup and Restore

备份和恢复

76. Overview

76年。概述

Backup and restore is a standard operation provided by many databases. An effective backup and restore strategy helps ensure that users can recover data in case of unexpected failures. The HBase backup and restore feature helps ensure that enterprises using HBase as a canonical data repository can recover from catastrophic failures. Another important feature is the ability to restore the database to a particular point-in-time, commonly referred to as a snapshot.

备份和恢复是许多数据库提供的标准操作。有效的备份和恢复策略有助于确保用户在出现意外故障时能够恢复数据。HBase备份和恢复特性有助于确保使用HBase作为规范数据存储库的企业可以从灾难性故障中恢复。另一个重要特性是能够将数据库恢复到特定的时间点(通常称为快照)。

The HBase backup and restore feature provides the ability to create full backups and incremental backups on tables in an HBase cluster. The full backup is the foundation on which incremental backups are applied to build iterative snapshots. Incremental backups can be run on a schedule to capture changes over time, for example by using a Cron task. Incremental backups are more cost-effective than full backups because they only capture the changes since the last backup and they also enable administrators to restore the database to any prior incremental backup. Furthermore, the utilities also enable table-level data backup-and-recovery if you do not want to restore the entire dataset of the backup.

HBase备份和恢复功能提供了在HBase集群中创建完整备份和增量备份的能力。完整备份是用于构建迭代快照的增量备份的基础。增量备份可以在计划中运行,以捕获随时间变化的变化,例如使用Cron任务。增量备份比完全备份更具成本效益,因为它们只捕获自上次备份以来的更改,而且它们还使管理员能够将数据库恢复到任何以前的增量备份。此外,如果不希望恢复备份的整个数据集,那么这些实用程序还可以支持表级数据的反向恢复。
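
For example, a crontab entry along the following lines could take a nightly incremental backup once a full backup exists; the schedule, destination path, and backup set name (nightly_set) are illustrative assumptions:

# Nightly incremental backup at 02:00, run as the HBase superuser (illustrative).
0 2 * * * hbase backup create incremental hdfs://host5:8020/data/backup -s nightly_set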

The backup and restore feature supplements the HBase Replication feature. While HBase replication is ideal for creating "hot" copies of the data (where the replicated data is immediately available for query), the backup and restore feature is ideal for creating "cold" copies of data (where a manual step must be taken to restore the system). Previously, users only had the ability to create full backups via the ExportSnapshot functionality. The incremental backup implementation is the novel improvement over the previous "art" provided by ExportSnapshot.

备份和恢复特性补充了HBase复制特性。虽然HBase复制非常适合创建数据的“热”副本(复制的数据可以立即用于查询),但是备份和恢复特性是创建“冷”数据副本的理想方法(在这里,必须采取手动步骤来恢复系统)。以前,用户只能通过ExportSnapshot功能创建完整的备份。增量备份实现是对ExportSnapshot提供的先前“艺术”的新改进。

77. Terminology

77年。术语

The backup and restore feature introduces new terminology which can be used to understand how control flows through the system.

备份和恢复特性引入了新的术语,可以用来理解控制是如何通过系统的。

  • A backup: A logical unit of data and metadata which can restore a table to its state at a specific point in time.

    备份:数据和元数据的逻辑单元,可以在特定的时间点将表恢复到它的状态。

  • Full backup: a type of backup which wholly encapsulates the contents of the table at a point in time.

    完全备份:一种类型的备份,它在某个时间点完全封装了表的内容。

  • Incremental backup: a type of backup which contains the changes in a table since a full backup.

    增量备份:一种备份,它包含自完全备份以来的表中的更改。

  • Backup set: A user-defined name which references one or more tables over which a backup can be executed.

    备份集:一个用户定义的名称,它引用一个或多个备份可以执行的表。

  • Backup ID: A unique name which identifies one backup from the rest, e.g. backupId_1467823988425

    备份ID:一个惟一的名称,用来标识其他的备份,例如backupId_1467823988425。

78. Planning

78年。规划

There are some common strategies which can be used to implement backup and restore in your environment. The following section shows how these strategies are implemented and identifies potential tradeoffs with each.

有一些通用的策略可以用于在您的环境中实现备份和恢复。下一节将展示这些策略是如何实现的,并指出每种策略的潜在权衡。

The backup and restore tools have not been tested on Transparent Data Encryption (TDE) enabled HDFS clusters. This is related to the open issue HBASE-16178.

78.1. Backup within a cluster

78.1。备份在一个集群

This strategy stores the backups on the same cluster where the backup was taken. This approach is only appropriate for testing, as it does not provide any additional safety on top of what the software itself already provides.

该策略将备份存储在与备份节点相同的集群上。这种方法只适合于测试,因为它不提供任何额外的安全性,而这正是软件本身所提供的。

backup intra cluster
Figure 4. Intra-Cluster Backup

78.2. Backup using a dedicated cluster

78.2。使用专用集群进行备份。

This strategy provides greater fault tolerance and provides a path towards disaster recovery. In this setting, you will store the backup on a separate HDFS cluster by supplying the backup destination cluster’s HDFS URL to the backup utility. You should consider backing up to a different physical location, such as a different data center.

该策略提供了更大的容错,并为灾难恢复提供了一条路径。在此设置中,您将在一个单独的HDFS集群上存储备份,将备份目标集群的HDFS URL提供给备份实用程序。您应该考虑备份到不同的物理位置,例如不同的数据中心。

Typically, a backup-dedicated HDFS cluster uses a more economical hardware profile to save money.

通常,一个备份专用的HDFS集群使用更经济的硬件配置文件来节省资金。

backup dedicated cluster
Figure 5. Dedicated HDFS Cluster Backup

78.3. Backup to the Cloud or a storage vendor appliance

78.3。备份到云或存储供应商设备。

Another approach to safeguarding HBase incremental backups is to store the data on provisioned, secure servers that belong to third-party vendors and that are located off-site. The vendor can be a public cloud provider or a storage vendor who uses a Hadoop-compatible file system, such as S3 and other HDFS-compatible destinations.

另一种保护HBase增量备份的方法是将数据存储在提供的、安全的服务器上,这些服务器属于第三方供应商,并且位于站点之外。供应商可以是公共云提供商,也可以是使用hadoop兼容文件系统的存储供应商,比如S3和其他兼容hdfs的目的地。

backup cloud appliance
Figure 6. Backup to Cloud or Vendor Storage Solutions
The HBase backup utility does not support backup to multiple destinations. A workaround is to manually create copies of the backup files from HDFS or S3.

79. First-time configuration steps

79年。第一次配置步骤

This section contains the necessary configuration changes that must be made in order to use the backup and restore feature. As this feature makes significant use of YARN’s MapReduce framework to parallelize these I/O heavy operations, configuration changes extend outside of just hbase-site.xml.

此部分包含必须进行的配置更改,以便使用备份和恢复功能。由于这个特性显著地使用了纱线的MapReduce框架来并行化这些I/O重操作,配置更改扩展到hbase-site.xml之外。

79.1. Allow the "hbase" system user in YARN

79.1。允许“hbase”系统用户使用纱线。

The YARN container-executor.cfg configuration file must have the following property setting: allowed.system.users=hbase. No spaces are allowed in entries of this configuration file.

纱线container-executor。cfg配置文件必须具有以下属性设置:允许。system.users=hbase。在此配置文件的条目中不允许使用空格。

Skipping this step will result in runtime errors when executing the first backup tasks.

Example of a valid container-executor.cfg file for backup and restore:

一个有效的容器执行器示例。cfg文件备份和恢复:

yarn.nodemanager.log-dirs=/var/log/hadoop/mapred
yarn.nodemanager.linux-container-executor.group=yarn
banned.users=hdfs,yarn,mapred,bin
allowed.system.users=hbase
min.user.id=500

79.2. HBase specific changes

79.2。HBase具体变化

Add the following properties to hbase-site.xml and restart HBase if it is already running.

将以下属性添加到hbase站点。如果已经运行了xml,则重新启动HBase。

The ",…​" is an ellipsis meant to imply that this is a comma-separated list of values, not literal text which should be added to hbase-site.xml.
<property>
  <name>hbase.backup.enable</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master.logcleaner.plugins</name>
  <value>org.apache.hadoop.hbase.backup.master.BackupLogCleaner,...</value>
</property>
<property>
  <name>hbase.procedure.master.classes</name>
  <value>org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager,...</value>
</property>
<property>
  <name>hbase.procedure.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager,...</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.backup.BackupObserver,...</value>
</property>
<property>
  <name>hbase.master.hfilecleaner.plugins</name>
  <value>org.apache.hadoop.hbase.backup.BackupHFileCleaner,...</value>
</property>

80. Backup and Restore commands

80年。备份和恢复命令

This covers the command-line utilities that administrators would run to create, restore, and merge backups. Tools to inspect details on specific backup sessions are covered in the next section, Administration of Backup Images.

这包括管理员将运行创建、恢复和合并备份的命令行实用程序。下一节将介绍用于检查特定备份会话细节的工具,管理备份映像。

Run the command hbase backup help <command> to access the online help that provides basic information about a command and its options. The information below is captured in this help message for each command.

运行命令hbase备份帮助 <命令> 来访问提供关于命令及其选项的基本信息的联机帮助。下面的信息在每个命令的帮助消息中被捕获。
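
For example, to print the options accepted by the create subcommand:

$ hbase backup help create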

80.1. Creating a Backup Image

80.1。创建一个备份映像

For HBase clusters also using Apache Phoenix: include the SQL system catalog tables in the backup. In the event that you need to restore the HBase backup, access to the system catalog tables enables you to resume Phoenix interoperability with the restored data.

对于HBase集群,也使用Apache Phoenix:在备份中包含SQL系统编目表。在需要恢复HBase备份的情况下,访问系统目录表使您能够恢复与恢复的数据的Phoenix互操作性。

The first step in running the backup and restore utilities is to perform a full backup and to store the data in a separate image from the source. At a minimum, you must do this to get a baseline before you can rely on incremental backups.

运行备份和恢复实用程序的第一步是执行完整的备份,并将数据存储在来自源的单独映像中。至少,在依赖增量备份之前,您必须这样做以获得一个基线。

Run the following command as HBase superuser:

运行以下命令作为HBase超级用户:

hbase backup create <type> <backup_path>

After the command finishes running, the console prints a SUCCESS or FAILURE status message. The SUCCESS message includes a backup ID. The backup ID is the Unix time (also known as Epoch time) that the HBase master received the backup request from the client.

在命令结束运行后,控制台打印成功或失败状态消息。成功消息包括一个备份ID。备份ID是Unix时间(也称为Epoch时间),HBase主服务器接收来自客户机的备份请求。

Record the backup ID that appears at the end of a successful backup. In case the source cluster fails and you need to recover the dataset with a restore operation, having the backup ID readily available can save time.

记录在成功备份结束时出现的备份ID。如果源集群失败,您需要恢复使用恢复操作的数据集,可以随时使用备份ID,这样可以节省时间。

80.1.1. Positional Command-Line Arguments

80.1.1。位置命令行参数

type

The type of backup to execute: full or incremental. As a reminder, an incremental backup requires a full backup to already exist.

执行的备份类型:full或增量。作为提醒,增量备份需要完全备份。

backup_path

The backup_path argument specifies the full filesystem URI of where to store the backup image. Valid prefixes are hdfs:, webhdfs:, gpfs:, and s3fs:.

backup_path参数指定存储备份映像的位置的完整文件系统URI。有效的前缀是hdfs:, webhdfs:, gpfs:,和s3fs:。

80.1.2. Named Command-Line Arguments

80.1.2。指定命令行参数

-t <table_name[,table_name]>

A comma-separated list of tables to back up. If no tables are specified, all tables are backed up. No regular-expression or wildcard support is present; all table names must be explicitly listed. See Backup Sets for more information about performing operations on collections of tables. Mutually exclusive with the -s option; one of these named options is required.

要备份的表的逗号分隔列表。如果没有指定表,则备份所有表。没有正则表达式或通配符支持;所有表名必须显式列出。请参阅备份集,以获得关于在表集合上进行的关于peform操作的更多信息。与-s选项互斥;其中一个命名选项是必需的。

-s <backup_set_name>

Identify tables to back up based on a backup set. See Using Backup Sets for the purpose and usage of backup sets. Mutually exclusive with the -t option.

根据备份集确定表的备份。请参阅使用备份集的目的和使用备份集。与-t选项互斥。

-w <number_workers>

(Optional) Specifies the number of parallel workers to copy data to the backup destination. Backups are currently executed by MapReduce jobs, so this value corresponds to the number of Mappers that will be spawned by the job.

(可选)指定将数据复制到备份目的地的并行工作人员的数量。备份目前由MapReduce作业执行,所以这个值对应于由作业生成的映射器的数量。

-b <bandwidth_per_worker>

(Optional) Specifies the bandwidth of each worker in MB per second.

(可选)指定每个工作人员每秒的带宽。

-d

(Optional) Enables "DEBUG" mode which prints additional logging about the backup creation.

(可选)启用“调试”模式,该模式打印关于备份创建的额外日志记录。

-q <name>

(Optional) Allows specification of the name of the YARN queue in which the MapReduce job that creates the backup should be executed. This option is useful to prevent backup tasks from stealing resources away from other MapReduce jobs of high importance.

(可选)允许指定一个纱线队列的名称,以便在MapReduce任务中创建备份。此选项有助于防止备份任务从其他高度重要的MapReduce作业中窃取资源。

80.1.3. Example usage

80.1.3。示例使用

$ hbase backup create full hdfs://host5:8020/data/backup -t SALES2,SALES3 -w 3

This command creates a full backup image of two tables, SALES2 and SALES3, in the HDFS instance whose NameNode is host5:8020, in the path /data/backup. The -w option specifies that no more than three parallel workers complete the operation.

该命令在HDFS实例中创建了两个表、SALES2和SALES3的完整备份映像,在路径/数据/备份中NameNode是host5:8020。w选项指定不超过三个并行工作完成操作。

80.2. Restoring a Backup Image

80.2。恢复备份映像

Run the following command as an HBase superuser. You can only restore a backup on a running HBase cluster because the data must be redistributed to the RegionServers for the operation to complete successfully.

运行以下命令作为HBase超级用户。您只能在运行的HBase集群上恢复备份,因为必须将数据重新分配到区域服务器,以便操作成功完成。

hbase restore <backup_path> <backup_id>

80.2.1. Positional Command-Line Arguments

80.2.1。位置命令行参数

backup_path

The backup_path argument specifies the full filesystem URI of where to store the backup image. Valid prefixes are hdfs:, webhdfs:, gpfs:, and s3fs:.

backup_path参数指定存储备份映像的位置的完整文件系统URI。有效的前缀是hdfs:, webhdfs:, gpfs:,和s3fs:。

backup_id

The backup ID that uniquely identifies the backup image to be restored.

唯一标识要恢复的备份映像的备份ID。

80.2.2. Named Command-Line Arguments

80.2.2。指定命令行参数

-t <table_name[,table_name]>

A comma-separated list of tables to restore. See Backup Sets for more information about performing operations on collections of tables. Mutually exclusive with the -s option; one of these named options is required.

要恢复的表的逗号分隔列表。请参阅备份集,以获得关于在表集合上进行的关于peform操作的更多信息。与-s选项互斥;其中一个命名选项是必需的。

-s <backup_set_name>

Identify tables to restore based on a backup set. See Using Backup Sets for the purpose and usage of backup sets. Mutually exclusive with the -t option.

根据备份集确定表的备份。请参阅使用备份集的目的和使用备份集。与-t选项互斥。

-q <name>

(Optional) Allows specification of the name of the YARN queue in which the MapReduce job run by this command should be executed. This option is useful to prevent these tasks from stealing resources away from other MapReduce jobs of high importance.

(可选)允许指定一个纱线队列的名称,以便在MapReduce任务中创建备份。此选项有助于防止备份任务从其他高度重要的MapReduce作业中窃取资源。

-c

(Optional) Perform a dry-run of the restore. The actions are checked, but not executed.

(可选)执行恢复的干运行。操作会被检查,但不会执行。

-m <target_tables>

(Optional) A comma-separated list of tables to restore into. If this option is not provided, the original table name is used. When this option is provided, there must be an equal number of entries provided in the -t option.

(可选)以逗号分隔的表列表,以恢复。如果没有提供此选项,则使用原始表名。当提供此选项时,在-t选项中必须提供相等数量的条目。

-o

(Optional) Overwrites the target table for the restore if the table already exists.

(可选)如果表已经存在,则重写目标表。

80.2.3. Example of Usage

80.2.3。使用的例子

hbase restore /tmp/backup_incremental backupId_1467823988425 -t mytable1,mytable2

This command restores two tables of an incremental backup image. In this example:

  • /tmp/backup_incremental is the path to the directory containing the backup image.

  • backupId_1467823988425 is the backup ID.

  • mytable1 and mytable2 are the names of tables in the backup image to be restored.

这个命令恢复了一个增量备份映像的两个表。在本例中:•/tmp/ backup_递增是包含备份映像的目录的路径。•backupId_1467823988425是备份ID。•mytable1和mytable2是备份映像中要恢复的表名。

80.3. Merging Incremental Backup Images

80.3。合并增量备份映像

This command can be used to merge two or more incremental backup images into a single incremental backup image. This can be used to consolidate multiple, small incremental backup images into a single larger incremental backup image. This command could be used to merge hourly incremental backups into a daily incremental backup image, or daily incremental backups into a weekly incremental backup.

此命令可用于将两个或多个增量备份映像合并到单个增量备份映像中。这可以用于将多个小增量备份映像合并到一个更大的增量备份映像中。该命令可用于将每小时增量备份合并为每日增量备份映像,或将每日增量备份合并为每周增量备份。

$ hbase backup merge <backup_ids>

80.3.1. Positional Command-Line Arguments

80.3.1。位置命令行参数

backup_ids

A comma-separated list of incremental backup image IDs that are to be combined into a single image.

一个由逗号分隔的增量备份图像id列表,该列表将合并成一个图像。

80.3.2. Named Command-Line Arguments

80.3.2。指定命令行参数

None.

一个也没有。

80.3.3. Example usage

80.3.3。示例使用

$ hbase backup merge backupId_1467823988425,backupId_1467827588425

80.4. Using Backup Sets

80.4。使用备份集

Backup sets can ease the administration of HBase data backups and restores by reducing the amount of repetitive input of table names. You can group tables into a named backup set with the hbase backup set add command. You can then use the -s option to invoke the name of a backup set in hbase backup create or hbase backup restore rather than list every table in the group individually. You can have multiple backup sets.

备份集可以通过减少表名重复输入的数量来简化HBase数据备份和恢复的管理。您可以使用hbase备份集添加命令将表分组到指定的备份集。然后,您可以使用-set选项来调用hbase备份创建或hbase备份恢复中的备份集的名称,而不是单独列出组中的每个表。您可以有多个备份集。
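
For example, once a set named Q1Data exists (see the set add example later in this section), a full backup of every table in that set can be taken with a single command; the destination path reuses the example cluster shown earlier:

$ hbase backup create full hdfs://host5:8020/data/backup -s Q1Data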

Note the differentiation between the hbase backup set add command and the -s option. The hbase backup set add command must be run before using the -s option in a different command, because backup sets must be named and defined before they can be used as a shortcut.

If you run the hbase backup set add command and specify a backup set name that does not yet exist on your system, a new set is created. If you run the command with the name of an existing backup set, the tables that you specify are added to the set.

如果运行hbase备份集添加命令并指定系统中尚未存在的备份集名称,则会创建一个新集合。如果使用现有备份集名称的名称运行该命令,则将指定的表添加到集合中。

In this command, the backup set name is case-sensitive.

在这个命令中,备份集名称是区分大小写的。

The metadata of backup sets is stored within HBase. If you do not have access to the original HBase cluster with the backup set metadata, then you must specify individual table names to restore the data.

To create a backup set, run the following command as the HBase superuser:

要创建备份集,请运行以下命令作为HBase超级用户:

$ hbase backup set <subcommand> <backup_set_name> <tables>

80.4.1. Backup Set Subcommands

80.4.1。备份设置子命令

The following list details subcommands of the hbase backup set command.

下面的列表详细说明了hbase备份集命令的子命令。

You must enter one (and no more than one) of the following subcommands after hbase backup set to complete an operation. Also, the backup set name is case-sensitive in the command-line utility.
add

Adds table[s] to a backup set. Specify a backup_set_name value after this argument to create a backup set.

将表[s]添加到备份集。在此参数之后指定backup_set_name值,以创建备份集。

remove

Removes tables from the set. Specify the tables to remove in the tables argument.

从集合中删除表。在表参数中指定要删除的表。

list

Lists all backup sets.

列出所有的备份集。

describe

Displays a description of a backup set. The information includes whether the set has full or incremental backups, start and end times of the backups, and a list of the tables in the set. This subcommand must precede a valid backup_set_name value.

显示一个备份集的描述。该信息包括该集合是否有完整或增量备份、开始和结束的备份时间,以及集合中的表的列表。这个子命令必须在backup_set_name值的有效值之前。

delete

Deletes a backup set. Enter the value for the backup_set_name option directly after the hbase backup set delete command.

删除备份集。在hbase备份集删除命令后,直接输入backup_set_name选项的值。

80.4.2. Positional Command-Line Arguments

80.4.2。位置命令行参数

backup_set_name

Use to assign or invoke a backup set name. The backup set name must contain only printable characters and cannot have any spaces.

用于分配或调用备份集名称。备份集名称必须仅包含可打印字符,不能有任何空格。

tables

List of tables (or a single table) to include in the backup set. Enter the table names as a comma-separated list. If no tables are specified, all tables are included in the set.

备份集中包含的表(或单个表)的列表。将表名作为逗号分隔的列表输入。如果没有指定表,那么所有表都包含在集合中。

As part of your backup strategy, maintain a log or other record of the case-sensitive backup set names and the corresponding tables in each set on a separate or remote cluster. This information can help you in case of failure on the primary cluster.

80.4.3. Example of Usage

80.4.3。使用的例子

$ hbase backup set add Q1Data TEAM_3,TEAM_4

Depending on the environment, this command results in one of the following actions:

根据环境的不同,此命令将导致以下操作之一:

  • If the Q1Data backup set does not exist, a backup set containing tables TEAM_3 and TEAM_4 is created.

    如果Q1Data备份集不存在,则创建一个包含表TEAM_3和TEAM_4的备份集。

  • If the Q1Data backup set exists already, the tables TEAM_3 and TEAM_4 are added to the Q1Data backup set.

    如果已经存在Q1Data备份集,则将表TEAM_3和TEAM_4添加到Q1Data备份集。

81. Administration of Backup Images

81年。管理备份映像

The hbase backup command has several subcommands that help with administering backup images as they accumulate. Most production environments require recurring backups, so it is necessary to have utilities to help manage the data of the backup repository. Some subcommands enable you to find information that can help identify backups that are relevant in a search for particular data. You can also delete backup images.

hbase备份命令有几个子命令,这些子命令有助于在它们累积时管理备份映像。大多数生产环境都需要重复的备份,因此有必要使用实用程序来帮助管理备份存储库的数据。一些子命令使您能够找到能够帮助识别与搜索特定数据相关的备份的信息。您还可以删除备份映像。

The following list details each hbase backup subcommand that can help administer backups. Run the full command-subcommand line as the HBase superuser.

下面的列表详细说明了可以帮助管理备份的每个hbase备份子命令。作为HBase超级用户运行完整的命令子命令行。

81.1. Managing Backup Progress

81.1。管理备份进度

You can monitor a running backup in another terminal session by running the hbase backup progress command and specifying the backup ID as an argument.

您可以通过运行hbase备份进度命令并指定备份ID作为参数来监视另一个终端会话中的运行备份。

For example, run the following command as hbase superuser to view the progress of a backup

例如,运行以下命令作为hbase超级用户来查看备份的进度。

$ hbase backup progress <backup_id>

81.1.1. Positional Command-Line Arguments

81.1.1。位置命令行参数

backup_id

Specifies the backup that you want to monitor by seeing the progress information. The backupId is case-sensitive.

通过查看进度信息指定要监视的备份。backupId是区分大小写的。

81.1.2. Named Command-Line Arguments

81.1.2。指定命令行参数

None.

一个也没有。

81.1.3. Example usage

81.1.3。示例使用

hbase backup progress backupId_1467823988425

81.2. Managing Backup History

81.2。管理备份历史

This command displays a log of backup sessions. The information for each session includes backup ID, type (full or incremental), the tables in the backup, status, and start and end time. Specify the number of backup sessions to display with the optional -n argument.

这个命令显示一个备份会话日志。每个会话的信息包括备份ID、类型(全部或增量)、备份中的表、状态、开始和结束时间。指定用可选的-n参数显示的备份会话的数量。

$ hbase backup history <backup_id>

81.2.1. Positional Command-Line Arguments

81.2.1。位置命令行参数

backup_id

Specifies the backup that you want to monitor by seeing the progress information. The backupId is case-sensitive.

通过查看进度信息指定要监视的备份。backupId是区分大小写的。

81.2.2. Named Command-Line Arguments

81.2.2。指定命令行参数

-n <num_records>

(Optional) The maximum number of backup records (Default: 10).

(可选)备份记录的最大数量(默认为10)。

-p <backup_root_path>

The full filesystem URI of where backup images are stored.

存储备份映像的完整文件系统URI。

-s <backup_set_name>

The name of the backup set to obtain history for. Mutually exclusive with the -t option.

要获取历史的备份集的名称。与-t选项互斥。

-t <table_name>

The name of table to obtain history for. Mutually exclusive with the -s option.

表的名称以获取历史。与-s选项互斥。

81.2.3. Example usage

81.2.3。示例使用

$ hbase backup history
$ hbase backup history -n 20
$ hbase backup history -t WebIndexRecords

81.3. Describing a Backup Image

81.3。描述一个备份映像

This command can be used to obtain information about a specific backup image.

该命令可用于获取关于特定备份映像的信息。

$ hbase backup describe <backup_id>

81.3.1. Positional Command-Line Arguments

81.3.1。位置命令行参数

backup_id

The ID of the backup image to describe.

要描述的备份映像的ID。

81.3.2. Named Command-Line Arguments

81.3.2。指定命令行参数

None.

一个也没有。

81.3.3. Example usage

81.3.3。示例使用

$ hbase backup describe backupId_1467823988425

81.4. Deleting a Backup Image

81.4。删除备份映像

This command can be used to delete a backup image which is no longer needed.

此命令可用于删除不再需要的备份映像。

$ hbase backup delete <backup_id>

81.4.1. Positional Command-Line Arguments

81.4.1。位置命令行参数

backup_id

The ID of the backup image which should be deleted.

应该删除的备份映像的ID。

81.4.2. Named Command-Line Arguments

81.4.2。指定命令行参数

None.

一个也没有。

81.4.3. Example usage

81.4.3。示例使用

$ hbase backup delete backupId_1467823988425

81.5. Backup Repair Command

81.5。备份修复命令

This command attempts to correct any inconsistencies in persisted backup metadata which exists as the result of software errors or unhandled failure scenarios. While the backup implementation tries to correct all errors on its own, this tool may be necessary in the cases where the system cannot automatically recover on its own.

此命令试图纠正由于软件错误或未处理的故障场景而存在的持久备份元数据中的任何不一致。虽然备份实现试图自己纠正所有错误,但是在系统不能自动恢复的情况下,这个工具可能是必需的。

$ hbase backup repair

81.5.1. Positional Command-Line Arguments

81.5.1。位置命令行参数

None.

一个也没有。

81.5.2. Named Command-Line Arguments

81.5.2。指定命令行参数

None.

一个也没有。

81.5.3. Example usage

81.5.3。示例使用

$ hbase backup repair

82. Configuration keys

82年。配置钥匙

The backup and restore feature includes both required and optional configuration keys.

备份和恢复功能包括必需的和可选的配置键。

82.1. Required properties

82.1。必需的属性

hbase.backup.enable: Controls whether or not the feature is enabled (Default: false). Set this value to true.

hbase.backup。启用:控制功能是否启用(默认为false)。将此值设置为true。

hbase.master.logcleaner.plugins: A comma-separated list of classes invoked when cleaning logs in the HBase Master. Set this value to org.apache.hadoop.hbase.backup.master.BackupLogCleaner or append it to the current value.

hbase.master.logcleaner。插件:在HBase中清理日志时调用的类的逗号分隔列表。将此值设置为org.apache.hadoop.hbase.backup.master。BackupLogCleaner或将其附加到当前值。

hbase.procedure.master.classes: A comma-separated list of classes invoked with the Procedure framework in the Master. Set this value to org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager or append it to the current value.

hbase.procedure.master。类:用主程序框架调用的类的逗号分隔列表。将此值设置为org.apache.hadoop.hbase.backup.master。logrollmasterprocessduremanager或将其附加到当前值。

hbase.procedure.regionserver.classes: A comma-separated list of classes invoked with the Procedure framework in the RegionServer. Set this value to org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager or append it to the current value.

hbase.coprocessor.region.classes: A comma-separated list of RegionObservers deployed on tables. Set this value to org.apache.hadoop.hbase.backup.BackupObserver or append it to the current value.

hbase.master.hfilecleaner.plugins: A comma-separated list of HFileCleaners deployed on the Master. Set this value to org.apache.hadoop.hbase.backup.BackupHFileCleaner or append it to the current value.

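Assuming none of these keys are already set on the cluster, a minimal hbase-site.xml sketch of the required properties could look as follows; if any of the keys already carry values in your deployment, append the backup classes to the existing comma-separated lists instead of replacing them:

<!-- hbase-site.xml (sketch): prerequisites for the backup and restore feature -->
<property>
  <name>hbase.backup.enable</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master.logcleaner.plugins</name>
  <value>org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
</property>
<property>
  <name>hbase.procedure.master.classes</name>
  <value>org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
</property>
<property>
  <name>hbase.procedure.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.backup.BackupObserver</value>
</property>
<property>
  <name>hbase.master.hfilecleaner.plugins</name>
  <value>org.apache.hadoop.hbase.backup.BackupHFileCleaner</value>
</property>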

82.2. Optional properties

hbase.backup.system.ttl: The time-to-live in seconds of data in the hbase:backup tables (default: forever). This property is only relevant prior to the creation of the hbase:backup table. Use the alter command in the HBase shell to modify the TTL when this table already exists; a sketch is shown after this list of properties. See the below section for more details on the impact of this configuration property.

hbase.backup.attempts.max: The number of attempts to perform when taking hbase table snapshots (default: 10).

hbase.backup.attempts.pause.ms: The amount of time to wait between failed snapshot attempts in milliseconds (default: 10000).

hbase.backup.logroll.timeout.millis: The amount of time (in milliseconds) to wait for RegionServers to execute a WAL rolling in the Master’s procedure framework (default: 30000).

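As noted for hbase.backup.system.ttl, once the hbase:backup table exists the TTL can only be changed with the alter command in the HBase shell. The following is a rough sketch only: the column family name 'meta' is an assumption, so check the actual family names with describe before altering anything.

hbase> describe 'hbase:backup'
hbase> alter 'hbase:backup', {NAME => 'meta', TTL => 2592000}   # TTL in seconds (30 days); 'meta' is a hypothetical family name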

83. Best Practices

83.1. Formulate a restore strategy and test it.

Before you rely on a backup and restore strategy for your production environment, identify how backups must be performed, and more importantly, how restores must be performed. Test the plan to ensure that it is workable. At a minimum, store backup data from a production cluster on a different cluster or server. To further safeguard the data, use a backup location that is at a different physical location.

If you have an unrecoverable loss of data on your primary production cluster as a result of computer system issues, you may be able to restore the data from a different cluster or server at the same site. However, a disaster that destroys the whole site renders locally stored backups useless. Consider storing the backup data and necessary resources (both computing capacity and operator expertise) to restore the data at a site sufficiently remote from the production site. In the case of a catastrophe at the whole primary site (fire, earthquake, etc.), the remote backup site can be very valuable.

83.2. Secure a full backup image first.

As a baseline, you must complete a full backup of HBase data at least once before you can rely on incremental backups. The full backup should be stored outside of the source cluster. To ensure complete dataset recovery, you must run the restore utility with the option to restore the baseline full backup. The full backup is the foundation of your dataset. Incremental backup data is applied on top of the full backup during the restore operation to return you to the point in time when the last backup was taken.

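For instance, an initial full backup could be written to a separate backup cluster with a command along these lines (assuming the -t table-list option of hbase backup create; the table names and destination path are placeholders):

$ hbase backup create full hdfs://backup-namenode:8020/data/backups -t db1:orders,db1:customers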

83.3. Define and use backup sets for groups of tables that are logical subsets of the entire dataset.

You can group tables into an object called a backup set. A backup set can save time when you have a particular group of tables that you expect to repeatedly back up or restore.

When you create a backup set, you type table names to include in the group. The backup set includes not only groups of related tables, but also retains the HBase backup metadata. Afterwards, you can invoke the backup set name to indicate what tables apply to the command execution instead of entering all the table names individually.

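As a sketch (the set and table names are hypothetical), a backup set is defined once and then referenced by name from later backup commands:

$ hbase backup set add app1_tables db1:orders
$ hbase backup set add app1_tables db1:customers
$ hbase backup create full hdfs://backup-namenode:8020/data/backups -s app1_tables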

83.4. Document the backup and restore strategy, and ideally log information about each backup.

Document the whole process so that the knowledge base can transfer to new administrators after employee turnover. As an extra safety precaution, also log the calendar date, time, and other relevant details about the data of each backup. This metadata can potentially help locate a particular dataset in case of source cluster failure or primary site disaster. Maintain duplicate copies of all documentation: one copy at the production cluster site and another at the backup location or wherever it can be accessed by an administrator remotely from the production cluster.

84. Scenario: Safeguarding Application Datasets on Amazon S3

This scenario describes how a hypothetical retail business uses backups to safeguard application data and then restore the dataset after failure.

The HBase administration team uses backup sets to store data from a group of tables that have interrelated information for an application called green. In this example, one table contains transaction records and the other contains customer details. The two tables need to be backed up and be recoverable as a group.

The admin team also wants to ensure daily backups occur automatically.

Figure 7. Tables Composing The Backup Set

The following is an outline of the steps and example commands that are used to back up the data for the green application and to recover it later. All commands are run when logged in as the HBase superuser.

  1. A backup set called green_set is created as an alias for both the transactions table and the customer table. The backup set can be used for all operations to avoid typing each table name. The backup set name is case-sensitive and should be formed with only printable characters and without spaces.

$ hbase backup set add green_set transactions
$ hbase backup set add green_set customer
  2. The first backup of green_set data must be a full backup. The following command example shows how credentials are passed to Amazon S3 and specifies the file system with the s3a: prefix.

$ ACCESS_KEY=ABCDEFGHIJKLMNOPQRST
$ SECRET_KEY=123456789abcdefghijklmnopqrstuvwxyzABCD
$ sudo -u hbase hbase backup create full \
  s3a://$ACCESS_KEY:$SECRET_KEY@prodhbasebackups/backups -s green_set
  3. Incremental backups should be run according to a schedule that ensures essential data recovery in the event of a catastrophe. At this retail company, the HBase admin team decides that automated daily backups secure the data sufficiently. The team decides that they can implement this by modifying an existing Cron job that is defined in /etc/crontab. Consequently, IT modifies the Cron job by adding the following line:

@daily hbase hbase backup create incremental s3a://$ACCESS_KEY:$SECRET_KEY@prodhbasebackups/backups -s green_set
  4. A catastrophic IT incident disables the production cluster that the green application uses. An HBase system administrator of the backup cluster must restore the green_set dataset to the point in time closest to the recovery objective.

If the administrator of the backup HBase cluster has the backup ID and its relevant details in accessible records, the following search with the hdfs dfs -ls command and the manual scan of the backup ID list can be bypassed. Consider continuously maintaining and protecting a detailed log of backup IDs outside the production cluster in your environment.

The HBase administrator runs the following command on the directory where backups are stored to print a list of available backup IDs:
