The purpose of the tabAt and setTabAt methods in ConcurrentHashMap

jiezi

6 years ago

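While studying ConcurrentHashMap, I noticed that whenever the source code operates on elements of the table array, it goes through three wrapper methods for atomic access: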

/* ---------------- Table element access -------------- */

/*
 * Atomic access methods are used for table elements as well as
 * elements of in-progress next table while resizing.  All uses of
 * the tab arguments must be null checked by callers.  All callers
 * also paranoically precheck that tab's length is not zero (or an
 * equivalent check), thus ensuring that any index argument taking
 * the form of a hash value anded with (length - 1) is a valid
 * index.  Note that, to be correct wrt arbitrary concurrency
 * errors by users, these checks must operate on local variables,
 * which accounts for some odd-looking inline assignments below.
 * Note that calls to setTabAt always occur within locked regions,
 * and so require only release ordering.
 */

@SuppressWarnings("unchecked")
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
    return (Node<K,V>)U.getObjectAcquire(tab, ((long)i << ASHIFT) + ABASE);
}

static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                    Node<K,V> c, Node<K,V> v) {
    return U.compareAndSetObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}

static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {
    U.putObjectRelease(tab, ((long)i << ASHIFT) + ABASE, v);
}
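
To make the expression ((long)i << ASHIFT) + ABASE less mysterious, here is a small sketch of the arithmetic. The values of ABASE and SCALE are assumptions chosen for illustration (in the real class they are obtained from Unsafe at class initialization and depend on the JVM), and the class name TableOffsetSketch is made up.

```java
// Sketch of the index-to-offset arithmetic behind tabAt/casTabAt/setTabAt.
// ABASE and SCALE are ASSUMED values; the real ones come from
// Unsafe.arrayBaseOffset and Unsafe.arrayIndexScale and vary by JVM.
class TableOffsetSketch {
    static final long ABASE = 16; // assumed: bytes before element 0
    static final int SCALE = 4;   // assumed: bytes per reference
    // For a power-of-two scale, log2(scale) = 31 - numberOfLeadingZeros(scale).
    static final int ASHIFT = 31 - Integer.numberOfLeadingZeros(SCALE);

    /** Bucket index for a hash in a table whose length n is a power of two. */
    static int indexFor(int hash, int n) {
        return hash & (n - 1);
    }

    /** Byte offset of element i, using the same expression as the quoted source. */
    static long offsetOf(int i) {
        return ((long) i << ASHIFT) + ABASE;
    }

    public static void main(String[] args) {
        int n = 16;
        int i = indexFor(0x7fffabcd, n);
        System.out.println("index = " + i + ", offset = " + offsetOf(i));
    }
}
```

Because the index is produced by masking with (length - 1), it is always inside the table, which is why the callers can safely skip a per-access bounds check as long as the table length is not zero.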

It is easy to see that casTabAt wraps a CAS operation on an array element, but what is the point of the other two methods?

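The author of the source uses Unsafe to read and write array elements directly through the array's memory address plus an index offset. How is that different from accessing (arr[i]) and assigning (arr[i] = x) in ordinary Java code?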
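
One difference that the method names themselves hint at is memory ordering: array elements cannot be declared volatile, so a plain arr[i] read or write carries no ordering guarantee between threads. The sketch below is not what ConcurrentHashMap does. It uses the public VarHandle API (Java 9+) as a stand-in to show an acquire read, a release write, and a CAS on array elements. The class name ArrayAccessModes and its helper methods are made up for illustration, and unlike Unsafe, a VarHandle still performs the bounds check.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Public-API stand-in for the three Unsafe-based helpers (illustrative only).
class ArrayAccessModes {
    // VarHandle that addresses the elements of a String[].
    static final VarHandle ELEM =
        MethodHandles.arrayElementVarHandle(String[].class);

    /** Read element i with acquire ordering (compare: tabAt). */
    static String getAcquire(String[] tab, int i) {
        return (String) ELEM.getAcquire(tab, i);
    }

    /** Atomically replace element i if it currently holds expected (compare: casTabAt). */
    static boolean cas(String[] tab, int i, String expected, String v) {
        return ELEM.compareAndSet(tab, i, expected, v);
    }

    /** Write element i with release ordering (compare: setTabAt). */
    static void setRelease(String[] tab, int i, String v) {
        ELEM.setRelease(tab, i, v);
    }

    public static void main(String[] args) {
        String[] tab = new String[4];
        System.out.println(cas(tab, 1, null, "a")); // slot was empty, so the CAS succeeds
        System.out.println(cas(tab, 1, null, "b")); // slot is taken, so the CAS fails
        setRelease(tab, 1, "c");
        System.out.println(getAcquire(tab, 1));
    }
}
```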

After asking Cheng (a colleague), I got one important pointer: the array-bounds exception, ArrayIndexOutOfBoundsException.

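If a Java array were just a pointer, as in C, then accessing it through arr[i] and accessing it through a memory address might make no difference. But given the well-known ArrayIndexOutOfBoundsException, we can infer that Java arrays are wrapped: an arr[i] access is specified to be checked against the array's length, whereas an Unsafe access at a raw offset is not.
Another point that indirectly supports this is arr.length.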
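
A minimal sketch of that observable behavior follows. BoundsCheckDemo and throwsOutOfBounds are names invented for this example.

```java
// Shows that arr[i] is checked against arr.length on every access.
class BoundsCheckDemo {
    /** Returns true if reading arr[i] raises ArrayIndexOutOfBoundsException. */
    static boolean throwsOutOfBounds(int[] arr, int i) {
        try {
            int unused = arr[i]; // the JVM validates i before the read
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] arr = new int[3];
        System.out.println(arr.length);                // the length travels with the array
        System.out.println(throwsOutOfBounds(arr, 2)); // last valid index
        System.out.println(throwsOutOfBounds(arr, 3)); // one past the end
        System.out.println(throwsOutOfBounds(arr, -1));
    }
}
```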

A quick search turned up the following (correctness not guaranteed):

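The JVM creates the class for an array type dynamically at runtime; its name starts with [ (for example, [I for int[]).
(In HotSpot VM) the object header of an array object contains a field that records the array's length.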
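
The first point can be checked from plain Java with reflection. This is only a small sketch, and the class name ArrayClassDemo and the helper nameOf are made up. The header layout in the second point is not visible from Java code, so the example only shows that the length can be read from any array object.

```java
import java.lang.reflect.Array;

// Inspects the runtime classes that the JVM creates for array types.
class ArrayClassDemo {
    /** Runtime class name of the given array object. */
    static String nameOf(Object array) {
        return array.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(nameOf(new int[0]));       // [I
        System.out.println(nameOf(new String[0]));    // [Ljava.lang.String;
        System.out.println(nameOf(new int[0][0]));    // [[I
        System.out.println(int[].class.isArray());    // true
        System.out.println(int[].class.getSuperclass() == Object.class); // true
        // The length is stored with the object, so it can be read reflectively
        // without knowing the element type.
        System.out.println(Array.getLength(new long[5])); // 5
    }
}
```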

I didn't dare dig any deeper; this feels like a deep rabbit hole...

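Summary: the Unsafe-based reads and writes on the table array in ConcurrentHashMap do serve an optimization purpose. They skip the per-access bounds check, and, as the quoted source comment suggests, they let the code request exactly the acquire or release ordering it needs, which a plain arr[i] access cannot express.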

The above is only meant as a starting point for discussion.

Reference:
Zhihu (知乎) - How is the length behavior of Java arrays implemented? (请问 Java 数组的 length 行为是如何实现的？)