Hadoop / HBase / HDFS corrupt blocks: EOF while reading metadata file
Problem
The error looks like this:
EOF while reading metadata file header
or
Problem reading HFile Trailer from file hdfs://host250:8020/hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be
Cause
First, cat the file to see whether it can still be read:
hadoop fs -cat /hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be
The command does produce output.
Next, check whether the table has consistency problems:
hbase hbck -details tablename
The check does report inconsistencies:
743 inconsistencies detected.
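The `tablename` argument for hbck can be read off the HFile path in the error message. The sketch below assumes the usual layout `/hbase/data/<namespace>/<table>/<region>/<family>/<hfile>` with the HBase root directory at `/hbase`, as in the paths shown here; `hfile_table` is a hypothetical helper name, not part of HBase.

```shell
#!/bin/sh
# Hypothetical helper: print the HBase table name that owns an HFile path.
# Assumes the layout /hbase/data/<namespace>/<table>/<region>/<family>/<hfile>.
hfile_table() (
  # Drop an optional scheme and authority such as hdfs://host250:8020
  p=$(printf '%s\n' "$1" | sed 's#^[a-z][a-z0-9]*://[^/]*##')
  case $p in
    /hbase/data/*/*/*/*/*) ;;
    *) echo "not an HFile path under /hbase/data: $1" >&2; exit 1 ;;
  esac
  ns=$(printf '%s\n' "$p" | cut -d/ -f4)   # namespace directory
  tbl=$(printf '%s\n' "$p" | cut -d/ -f5)  # table directory
  # Tables in the default namespace are addressed without a prefix
  if [ "$ns" = "default" ]; then
    printf '%s\n' "$tbl"
  else
    printf '%s:%s\n' "$ns" "$tbl"
  fi
)
```

For the path in the error above, `hfile_table hdfs://host250:8020/hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be` prints `chipdata`, which is the name to pass to `hbase hbck -details`.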
Since the table is inconsistent, try to repair it:
hbase hbck -repair tablename
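This operation requires administrator privileges, so use it with caution. After the repair finishes, the result looks like this: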
Summary:
Table hbase:meta is okay.
Number of regions: 1
Deployed on: ctum2f0602005.idc.wanda-group.net,60020,1482504754412
Table idctag:user_basic_info is okay.
Number of regions: 124
0 inconsistencies detected.
Status: OK
Test whether the repair worked:
hbase(main):001:0> get 'mynamespace:user_basic_info','BAC3510A922CF026500874EA3975E123'
If you already know which file has the bad blocks, check it with the HDFS fsck command:
hdfs fsck /hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be
The output is as follows:
[zzq@host252 ~]$ hdfs fsck /hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be
Connecting to namenode via http://host250:50070/fsck?ugi=zzq&path=%2Fhbase%2Fdata%2Fdefault%2Fchipdata%2Fc911d705e54912187a9d0c4dd3314550%2Fd%2Fa8bc31a879a24264a02b640786b778be
FSCK started by zzq (auth:SIMPLE) from /192.168.30.252 for path /hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be at Fri Jun 15 17:02:19 CST 2018
.
/hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be: CORRUPT blockpool BP-1746182121-192.168.30.250-1511263334312 block blk_1075413120
/hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be: CORRUPT blockpool BP-1746182121-192.168.30.250-1511263334312 block blk_1075413121
/hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be: CORRUPT blockpool BP-1746182121-192.168.30.250-1511263334312 block blk_1075413122
/hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be: MISSING 3 blocks of total size 302037078 B.Status: CORRUPT
Total size: 8355100758 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 63 (avg. block size 132620646 B)
********************************
UNDER MIN REPL'D BLOCKS: 3 (4.7619047 %)
dfs.namenode.replication.min: 1
CORRUPT FILES: 1
MISSING BLOCKS: 3
MISSING SIZE: 302037078 B
CORRUPT BLOCKS: 3
********************************
Minimally replicated blocks: 60 (95.2381 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 1.9047619
Corrupt blocks: 3
Missing replicas: 0 (0.0 %)
Number of data-nodes: 5
Number of racks: 1
FSCK ended at Fri Jun 15 17:02:19 CST 2018 in 1 milliseconds
The filesystem under path '/hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be' is CORRUPT
[zzq@host252 ~]$
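To pull just the damaged block IDs out of a report like the one above, the fsck output can be filtered. This is a minimal sketch that assumes the `CORRUPT blockpool ... block blk_...` line format shown in this output; `corrupt_blocks` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `hdfs fsck` output on stdin and print the IDs of
# blocks reported as CORRUPT, one per line, without duplicates.
corrupt_blocks() {
  # Keep only the per-block CORRUPT lines, then cut out the blk_... token.
  grep 'CORRUPT blockpool' |
    sed -n 's/.* block \(blk_[0-9_]*\).*/\1/p' |
    sort -u
}
```

Usage would be `hdfs fsck <path> | corrupt_blocks`. To find affected files across the cluster rather than in one known file, `hdfs fsck / -list-corruptfileblocks` lists them directly.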
Solution
Run the repair command:
hbase hbck -repair tablename
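If the repair has no effect, the only remaining option is to remove the file that contains the corrupt blocks.
Move the damaged file to the HDFS root directory:
hadoop fs -mv /hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be /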
If it cannot be moved, delete it (the data stored in that file is lost):
hdfs fsck /hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be -delete
or
hadoop fs -rm /hbase/data/default/chipdata/c911d705e54912187a9d0c4dd3314550/d/a8bc31a879a24264a02b640786b778be
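Because these commands are destructive, it can help to print them for review before running anything. The sketch below only prints a move-then-delete fallback line and refuses paths that do not look like an HFile under `/hbase/data`; `plan_hfile_removal` and its optional destination argument are hypothetical, and the path check assumes the layout seen in this article.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (do not execute) the commands that would
# move a damaged HFile out of the HBase tree, falling back to deletion.
plan_hfile_removal() (
  path=$1
  dest=${2:-/}   # destination directory, HDFS root by default as in the text
  case $path in
    /hbase/data/*/*/*/*/*) ;;   # namespace/table/region/family/hfile
    *) echo "refusing: $path is not an HFile under /hbase/data" >&2; exit 1 ;;
  esac
  # Try the move first; only delete if the move fails.
  printf 'hadoop fs -mv %s %s || hdfs fsck %s -delete\n' "$path" "$dest" "$path"
)
```

Review the printed line and run it by hand once you are sure the path is the right one.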
Check the cluster status:
hbase hbck
Output:
743 inconsistencies detected.
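This means hbck found 743 inconsistencies. The count refers to inconsistencies in HBase regions and metadata, not to HDFS blocks.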
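For scripted checks, the count can be extracted from the hbck summary line. This is a small sketch that assumes the `N inconsistencies detected.` wording shown above; `hbck_inconsistencies` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read hbck output on stdin and print the number from the
# "N inconsistencies detected." summary line (the last one if several appear).
hbck_inconsistencies() {
  sed -n 's/^[[:space:]]*\([0-9][0-9]*\) inconsistencies detected\..*$/\1/p' |
    tail -n 1
}
```

A possible use is `count=$(hbase hbck 2>/dev/null | hbck_inconsistencies)`, followed by a test such as `[ "$count" -eq 0 ]` to decide whether a repair is needed.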
Other useful commands
Repair the hbase:meta table:
hbase hbck -fixMeta
Reassign regions to the RegionServers:
hbase hbck -fixAssignments