Let's first look at how HBase stores data in HDFS. Listing the HBase root directory reveals several notable subdirectories:
$ hdfs dfs -ls -R /hbase
Temporary files: /hbase/.tmp
Archive:         /hbase/archive
WAL logs:        /hbase/WALs/debugo01 …
Data:            /hbase/data/
Next, we create a table named 'member':
> create 'member','id','address','info'
0 row(s) in 0.4790 seconds
=> Hbase::Table - member
At this point no files belonging to member can be found in HDFS yet. Now let's put some data:
put 'member', 'debugo','id','11'
put 'member', 'debugo','info:age','27'
put 'member', 'debugo','info:birthday','1987-04-04'
put 'member', 'debugo','info:industry', 'it'
put 'member', 'debugo','address:city','beijing'
put 'member', 'debugo','address:country','china'
Check the files in HDFS again:
$ hdfs dfs -ls -R /hbase | grep member
drwxr-xr-x   - hbase hadoop    0 2015-03-12 09:57 /hbase/data/default/member
drwxr-xr-x   - hbase hadoop    0 2015-03-12 09:57 /hbase/data/default/member/.tabledesc
-rw-r--r--   3 hbase hadoop  777 2015-03-12 09:57 /hbase/data/default/member/.tabledesc/.tableinfo.0000000001
drwxr-xr-x   - hbase hadoop    0 2015-03-12 09:57 /hbase/data/default/member/.tmp
drwxr-xr-x   - hbase hadoop    0 2015-03-12 09:57 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401
-rw-r--r--   3 hbase hadoop   39 2015-03-12 09:57 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/.regioninfo
drwxr-xr-x   - hbase hadoop    0 2015-03-12 09:57 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/address
drwxr-xr-x   - hbase hadoop    0 2015-03-12 09:57 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/id
drwxr-xr-x   - hbase hadoop    0 2015-03-12 09:57 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/info
For this table, HBase maintains metadata files such as .tabledesc and .regioninfo, along with one directory per column family. The column-family directories, however, are still empty: the data we put is still in the RegionServer's memstore and has not yet been written out to HFiles in HDFS. Let's trigger a flush:
> flush 'member'
0 row(s) in 2.3840 seconds
$ hdfs dfs -ls -R /hbase | grep member
……
/hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/address
-rw-r--r--   3 hbase hadoop 1069 2015-03-12 10:10 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/address/dc68317f743d468d8fac5ab6227e2af0
……
A persisted HFile has now been generated. Next we insert more data; after another flush, a second storefile appears alongside the first.
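The memstore-then-flush behavior observed above can be sketched in a few lines. This is an illustrative model, not HBase code: all class and method names here are hypothetical, and it only captures the idea that writes buffer in memory per column family and that each flush emits a new, immutable storefile.

```python
class Region:
    """Toy model of an HBase region (illustrative only, not HBase's API)."""

    def __init__(self, column_families):
        # Writes are buffered in memory per column family (the "memstore").
        self.memstore = {cf: {} for cf in column_families}
        # Flushed data lives in immutable files (the "storefiles"/HFiles).
        self.storefiles = {cf: [] for cf in column_families}

    def put(self, row, cf, qualifier, value):
        # A put only touches the memstore; nothing hits HDFS yet.
        self.memstore[cf][(row, qualifier)] = value

    def flush(self):
        # Each flush writes every non-empty memstore out as a NEW storefile
        # and clears it; existing storefiles are never modified in place.
        for cf, cells in self.memstore.items():
            if cells:
                self.storefiles[cf].append(dict(sorted(cells.items())))
                self.memstore[cf] = {}

region = Region(["id", "address", "info"])
region.put("debugo", "info", "age", "27")
region.flush()                          # first storefile for 'info'
region.put("Sariel", "info", "age", "26")
region.flush()                          # a second storefile appears
print(len(region.storefiles["info"]))   # → 2
```

This mirrors what the HDFS listings show: before the flush the column-family directories are empty, and each flush adds one more file under them.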
put 'member', 'Sariel', 'id', '21'
put 'member', 'Sariel','info:age', '26'
put 'member', 'Sariel','info:birthday', '1988-05-09'
put 'member', 'Sariel','info:industry', 'it'
put 'member', 'Sariel','address:city', 'beijing'
put 'member', 'Sariel','address:country', 'china'
> flush 'member'
$ hdfs dfs -ls -R /hbase | grep member
……
drwxr-xr-x   - hbase hadoop    0 2015-03-12 10:16 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/address
-rw-r--r--   3 hbase hadoop 1069 2015-03-12 10:16 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/address/4eb78a90ba524c0490b0f411a3f1db05
-rw-r--r--   3 hbase hadoop 1069 2015-03-12 10:10 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/address/dc68317f743d468d8fac5ab6227e2af0
……
Now let's run a compaction. It can be triggered from the shell with compact (minor) or major_compact, or from the web UI. The usage of major_compact is as follows:
hbase> major_compact 't1'
hbase> major_compact 'ns1:t1'
Compact an entire region:
hbase> major_compact 'r1'
Compact a single column family within a region:
hbase> major_compact 'r1', 'c1'
Compact a single column family within a table:
hbase> major_compact 't1', 'c1'
Below we run a major compaction on the whole 'member' table. In production, however, you should compact at the column-family level to limit the impact on the system.
> major_compact 'member'
……
$ hdfs dfs -ls -R /hbase | grep member
drwxr-xr-x   - hbase hadoop    0 2015-03-12 10:21 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/address
-rw-r--r--   3 hbase hadoop 1098 2015-03-12 10:21 /hbase/data/default/member/52a7a893f05471d44cf239f5707c5401/address/3f46fb9d6f5d40de93ab8f226e6612fa
……
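After the major compaction, the two storefiles under address have been merged into a single new file. The core of that merge can be sketched as follows; this is an illustrative model, not HBase's implementation, and the function name is hypothetical. It shows the essential rule: all storefiles of a column family are merged into one, and when the same cell appears in several files, the newer version wins.

```python
def major_compact(storefiles):
    """Merge a column family's storefiles into one (toy model, not HBase).

    `storefiles` is a list of dicts mapping (row, column) -> value,
    ordered oldest first.
    """
    merged = {}
    # Apply older files first, so later (newer) files overwrite stale cells.
    for sf in storefiles:
        merged.update(sf)
    # The result is a single sorted storefile replacing all the inputs.
    return [dict(sorted(merged.items()))]

files = [
    {("debugo", "address:city"): "beijing"},   # written by the first flush
    {("Sariel", "address:city"): "beijing"},   # written by the second flush
]
compacted = major_compact(files)
print(len(compacted))  # → 1
```

This matches the HDFS listing above: the two HFiles from the separate flushes are gone, replaced by one merged file with a new name.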