2018-07-31

RocketMQ的存储索引的实现方法

RocketMQ存储索引解读

索引结构

索引包括3个部分，分别是Header、Slot、Index。这3个部分是顺序存储的。其中Header的长度为40个字节。

Header

beginTimestamp	endTimestamp	beginPhyOffset	endPhyOffset	hashSlotCount	indexCount
long	long	long	long	int	int
8	8	8	8	4	4

Slot

indexCount
int
4

Index

keyHash	phyOffset	timeDiff	slotValue
int	long	int	int
4	8	4	4

索引文件存储的长度为
IndexHeader.INDEX_HEADER_SIZE + (hashSlotNum * hashSlotSize) + (indexNum * indexSize)
默认值为
40 + 5000000 * 4 + 5000000 * 4 * 20
其中文件头的长度为40字节，slot的长度为4字节， index的长度为20字节

Slot和Index的位置算法

int keyHash = indexKeyHashMethod(key);
int slotPos = keyHash % this.hashSlotNum;
int absSlotPos = IndexHeader.INDEX_HEADER_SIZE + slotPos * hashSlotSize;
int absIndexPos = IndexHeader.INDEX_HEADER_SIZE + this.hashSlotNum * hashSlotSize
        + this.indexHeader.getIndexCount() * indexSize;

Slot和Index的内容

this.mappedByteBuffer.putInt(absIndexPos, keyHash);
this.mappedByteBuffer.putLong(absIndexPos + 4, phyOffset);
this.mappedByteBuffer.putInt(absIndexPos + 4 + 8, (int) timeDiff);
this.mappedByteBuffer.putInt(absIndexPos + 4 + 8 + 4, slotValue);

this.mappedByteBuffer.putInt(absSlotPos, this.indexHeader.getIndexCount());

搜索算法

计算KEY在索引文件中的位置方法如下：

计算KEY的哈希值
用计算的结果除以slot的数量，取余
头文件长度 + 余数 * slot的长度即为slot的位置
slot中存储的值为index的数量
头文件长度 + slot数量 slot的长度 + index数量 index的长度即为index的位置
按index的内容取出数据

索引文件名称

索引文件的名称是由时间戳构成的

String.format("%04d%02d%02d%02d%02d%02d%03d", 
    cal.get(Calendar.YEAR), 
    cal.get(Calendar.MONTH) + 1,
    cal.get(Calendar.DAY_OF_MONTH), 
    cal.get(Calendar.HOUR_OF_DAY), 
    cal.get(Calendar.MINUTE), 
    cal.get(Calendar.SECOND),
    cal.get(Calendar.MILLISECOND));

优化手段

服务在启动时会把所有的索引文件加载到内存中
定期清理过期索引文件
异步同步索引到磁盘

疑问

slot的作用是什么呢，为什么不直接使用index
对于KEY的哈希计算有一定的几率计算结果相同，即哈希值相同
由于代码中把文件锁的代码注释掉了，这样不会有并发问题吗

用到的类

ByteBuffer
MappedByteBuffer
FileChannel
FileLock