Java: memory semantics and implementation of the synchronized keyword

1. Semantics of synchronization

The following is quoted from the JSR 133 FAQ:

Synchronization has several aspects. The most well-understood is mutual exclusion -- only one thread can hold a monitor at once, so synchronizing on a monitor means that once one thread enters a synchronized block protected by a monitor, no other thread can enter a block protected by that monitor until the first thread exits the synchronized block.

But there is more to synchronization than mutual exclusion. Synchronization ensures that memory writes by a thread before or during a synchronized block are made visible in a predictable manner to other threads which synchronize on the same monitor. After we exit a synchronized block, we release the monitor, which has the effect of flushing the cache to main memory, so that writes made by this thread can be visible to other threads. Before we can enter a synchronized block, we acquire the monitor, which has the effect of invalidating the local processor cache so that variables will be reloaded from main memory. We will then be able to see all of the writes made visible by the previous release.

Discussing this in terms of caches, it may sound as if these issues only affect multiprocessor machines. However, the reordering effects can be easily seen on a single processor. It is not possible, for example, for the compiler to move your code before an acquire or after a release. When we say that acquires and releases act on caches, we are using shorthand for a number of possible effects.

The new memory model semantics create a partial ordering on memory operations (read field, write field, lock, unlock) and other thread operations (start and join), where some actions are said to happen before other operations. When one action happens before another, the first is guaranteed to be ordered before and visible to the second. The rules of this ordering are as follows:

Each action in a thread happens before every action in that thread that comes later in the program's order.

An unlock on a monitor happens before every subsequent lock on that same monitor.

A write to a volatile field happens before every subsequent read of that same volatile.

A call to start() on a thread happens before any actions in the started thread.

All actions in a thread happen before any other thread successfully returns from a join() on that thread.

This means that any memory operations which were visible to a thread before exiting a synchronized block are visible to any thread after it enters a synchronized block protected by the same monitor, since all the memory operations happen before the release, and the release happens before the acquire.

As we can see, the semantics of synchronization cover two things: mutual exclusion and guaranteed visibility.
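
Both aspects can be observed with a small experiment. The following is a minimal sketch, not code from the original article: SyncCounterDemo and its runDemo helper are hypothetical names chosen for illustration. Several threads increment a shared counter inside a synchronized block, and the final read goes through the same monitor.

```java
class SyncCounterDemo {
    private final Object lock = new Object();
    private int count; // only read or written while holding lock

    void increment() {
        synchronized (lock) {   // mutual exclusion: one thread at a time
            count++;
        }
    }

    int get() {
        synchronized (lock) {   // locking the same monitor makes earlier writes visible
            return count;
        }
    }

    // Hypothetical helper: starts `threads` workers that each increment `perThread` times.
    static int runDemo(int threads, int perThread) {
        SyncCounterDemo demo = new SyncCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    demo.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return demo.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 50_000)); // prints 200000
    }
}
```

Without the synchronized blocks the increments could interleave and updates could be lost, so the printed total would no longer be guaranteed.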

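2. Basic usage of synchronized

According to the Java Language Specification:

Every object in Java is associated with a monitor, which a thread can lock or unlock.

For a class (static) method, the monitor associated with the Class object of the method's class is used.
For an instance method, the monitor associated with this (the object on which the method was invoked) is used.
For a synchronized block, i.e. synchronized (obj) { ... }, the monitor associated with obj is used.

For example: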

package synchronizedTest;

class Test {
    int count;

    // synchronized instance method
    synchronized void bump() {
        count++;
    }

    static int classCount;

    // synchronized class (static) method
    static synchronized void classBump() {
        classCount++;
    }
}

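The code above is equivalent to:

package synchronizedTest;

class BumpTest {
    int count;

    void bump() {
        // synchronized block on this
        synchronized (this) { count++; }
    }

    static int classCount;

    static void classBump() {
        try {
            // synchronized block on the Class object;
            // Class.forName needs the fully qualified name, including the package
            synchronized (Class.forName("synchronizedTest.BumpTest")) {
                classCount++;
            }
        } catch (ClassNotFoundException e) {
            // not expected: this class is already loaded when classBump runs
        }
    }
}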
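
Which monitor each form uses can be checked at run time with Thread.holdsLock. This is a small sketch for illustration only; MonitorChoiceDemo and its method names are hypothetical and do not appear in the original article.

```java
class MonitorChoiceDemo {
    // Synchronized instance method: locks the monitor of `this`.
    synchronized boolean instanceMethodHoldsThis() {
        return Thread.holdsLock(this);
    }

    // Synchronized static method: locks the monitor of the Class object.
    static synchronized boolean staticMethodHoldsClass() {
        return Thread.holdsLock(MonitorChoiceDemo.class);
    }

    // synchronized (obj) block: locks the monitor of obj and nothing else.
    static boolean blockHoldsOnly(Object obj) {
        synchronized (obj) {
            return Thread.holdsLock(obj) && !Thread.holdsLock(MonitorChoiceDemo.class);
        }
    }

    public static void main(String[] args) {
        MonitorChoiceDemo demo = new MonitorChoiceDemo();
        System.out.println(demo.instanceMethodHoldsThis()); // true
        System.out.println(staticMethodHoldsClass());       // true
        System.out.println(blockHoldsOnly(new Object()));   // true
        System.out.println(Thread.holdsLock(demo));         // false: released on return
    }
}
```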

3. Synchronized blocks at the JVM bytecode level

Let's disassemble the class files of the two classes above.

Compile to a class file:

javac synchronizedTest/Test.java

Disassemble the class file:

javap -p -v synchronizedTest/Test > synchronizedTest/Test.disasm

Similarly:

javac synchronizedTest/BumpTest.java
javap -p -v synchronizedTest/BumpTest > synchronizedTest/BumpTest.disasm

Below we focus on Test.disasm and BumpTest.disasm.

Excerpt of BumpTest.bump:

void bump();
  descriptor: ()V
  flags:
  Code:
    stack=3, locals=3, args_size=1
       0: aload_0
       1: dup
       2: astore_1
       3: monitorenter
       4: aload_0
       5: dup
       6: getfield      #2                  // Field count:I
       9: iconst_1
      10: iadd
      11: putfield      #2                  // Field count:I
      14: aload_1
      15: monitorexit
      16: goto          24
      19: astore_2
      20: aload_1
      21: monitorexit
      22: aload_2
      23: athrow
      24: return
    Exception table:
       from    to  target type
           4    16    19   any
          19    22    19   any
    LineNumberTable:
      line 6: 0
      line 7: 24

3.1 monitorenter: lock the monitor of a specific object.

The objectref must be of type reference.
Each object is associated with a monitor. A monitor is locked if and only if it has an owner. The thread that executes monitorenter attempts to gain ownership of the monitor associated with objectref, as follows:
If the entry count of the monitor associated with objectref is zero, the thread enters the monitor and sets its entry count to one. The thread is then the owner of the monitor.
If the thread already owns the monitor associated with objectref, it reenters the monitor, incrementing its entry count.
If another thread already owns the monitor associated with objectref, the thread blocks until the monitor's entry count is zero, then tries again to gain ownership.
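
The entry-count rules can be seen from Java source by nesting two synchronized blocks on the same object. A minimal sketch follows; ReentrancyDemo and probe are hypothetical names, and the entry counts in the comments follow the specification text quoted above rather than anything the program can read directly.

```java
import java.util.Arrays;

class ReentrancyDemo {
    private final Object monitor = new Object();

    // Returns {held inside the inner block, held after inner exit, held after outer exit}.
    boolean[] probe() {
        boolean[] held = new boolean[3];
        synchronized (monitor) {          // entry count 0 -> 1, this thread becomes owner
            synchronized (monitor) {      // owner re-enters: entry count 1 -> 2
                held[0] = Thread.holdsLock(monitor);
            }                             // entry count 2 -> 1, still the owner
            held[1] = Thread.holdsLock(monitor);
        }                                 // entry count 1 -> 0, monitor released
        held[2] = Thread.holdsLock(monitor);
        return held;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(new ReentrancyDemo().probe()));
        // prints [true, true, false]
    }
}
```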

3.2 monitorexit: unlock the monitor of a specific object.

The objectref must be of type reference.
The thread that executes monitorexit must be the owner of the monitor associated with the instance referenced by objectref.
The thread decrements the entry count of the monitor associated with objectref. If as a result the value of the entry count is zero, the thread exits the monitor and is no longer its owner. Other threads that are blocking to enter the monitor are allowed to attempt to do so.

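3.3 The exception table

The instructions shown earlier belong to the bump method. Its exception table has two rows (reproduced below); each row is called an exception table entry:

 4    16    19   any
19    22    19   any

Each exception table entry covers the bytecode in the range [from, to). If an exception is thrown there, control jumps to the bytecode at target; type is the exception type that the handler can catch (any means every exception).
So the two entries above read as follows:

from is 4, to is 16, target is 19, type is any: if an instruction in [4, 16) throws any exception, control jumps to 19.
from is 19, to is 22, target is 19, type is any: if an instruction in [19, 22) throws any exception, control jumps to 19.

In other words:

Case 1: [4, 16) runs without an exception, goto jumps to 24, and the method returns. The lock is acquired at 3: monitorenter and released at 15: monitorexit as usual.
Case 2: [4, 16) throws an exception. Control jumps to 19 and reaches 21: monitorexit. If that succeeds, 23: athrow rethrows the exception and the method completes abruptly; if anything in [19, 22) throws, control jumps back to 19 and the handler runs again.

This analysis shows that a synchronized block releases the lock it acquired whether or not an exception is thrown, so monitorenter and monitorexit are always paired.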
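
The same conclusion can be checked from Java source. The sketch below is illustrative only (ReleaseOnExceptionDemo and its methods are hypothetical names): it throws inside a synchronized block and then verifies that another thread can still acquire the monitor.

```java
class ReleaseOnExceptionDemo {
    private static final Object LOCK = new Object();

    static void failInsideBlock() {
        synchronized (LOCK) {
            throw new IllegalStateException("boom"); // leaves the block abruptly
        }
    }

    // Returns true if the monitor is free again after the exception has propagated.
    static boolean lockReleasedAfterException() {
        try {
            failInsideBlock();
        } catch (IllegalStateException expected) {
            // by now the handler covered by the exception table has run monitorexit
        }
        boolean[] acquired = new boolean[1];
        Thread other = new Thread(() -> {
            synchronized (LOCK) {
                acquired[0] = true;
            }
        });
        other.start();
        try {
            other.join(5_000); // would time out if the monitor were still held
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !Thread.holdsLock(LOCK) && acquired[0];
    }

    public static void main(String[] args) {
        System.out.println(lockReleasedAfterException()); // prints true
    }
}
```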

BumpTest.classBump is similar to BumpTest.bump; you can try analyzing it yourself.

4. Synchronized methods at the JVM bytecode level

Excerpt of Test.bump:

synchronized void bump();
  descriptor: ()V
  flags: ACC_SYNCHRONIZED
  Code:
    stack=3, locals=1, args_size=1
       0: aload_0
       1: dup
       2: getfield      #2                  // Field count:I
       5: iconst_1
       6: iadd
       7: putfield      #2                  // Field count:I
      10: return
    LineNumberTable:
      line 6: 0
      line 7: 10

According to the Java Virtual Machine Specification:

Monitor entry on invocation of a synchronized method, and monitor exit on its return, are handled implicitly by the Java Virtual Machine's method invocation and return instructions, as if monitorenter and monitorexit were used.

So synchronized methods and synchronized blocks are not fundamentally different in how they are implemented.
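
The difference is only in how the class file records the request: a flag on the method versus explicit instructions in its body. As a rough illustration, the flag is visible through reflection. SyncFlagDemo and hasSyncFlag are hypothetical names used only for this sketch.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class SyncFlagDemo {
    int count;

    // Compiled with the ACC_SYNCHRONIZED flag and no monitorenter/monitorexit.
    synchronized void viaMethod() {
        count++;
    }

    // Compiled without the flag; the body contains monitorenter/monitorexit.
    void viaBlock() {
        synchronized (this) {
            count++;
        }
    }

    // Reports whether the named no-argument method carries the synchronized modifier.
    static boolean hasSyncFlag(String methodName) {
        try {
            Method method = SyncFlagDemo.class.getDeclaredMethod(methodName);
            return Modifier.isSynchronized(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasSyncFlag("viaMethod")); // true
        System.out.println(hasSyncFlag("viaBlock"));  // false
    }
}
```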

5. synchronized at the machine-code level

Suppose we change BumpTest to the following code:

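package synchronizedTest;

class BumpTest {
    int count;

    void bump() {
        synchronized (this) { count++; }
    }

    static int classCount;

    static void classBump() {
        try {
            // Class.forName needs the fully qualified name, including the package
            synchronized (Class.forName("synchronizedTest.BumpTest")) {
                classCount++;
            }
        } catch (ClassNotFoundException e) {
            // not expected: this class is already loaded when classBump runs
        }
    }

    public static void main(String[] args) {
        // call classBump often enough for it to be JIT-compiled
        for (int i = 0; i < 100000; i++) {
            classBump();
        }
        System.out.println(classCount);
    }
}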

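Then recompile it to bytecode and dump the native machine instructions (note that the second command also needs hsdis; see my earlier article for how to install it):

javac synchronizedTest/BumpTest.java
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly synchronizedTest/BumpTest > synchronizedTest/BumpTest.native

Searching the output for 'classBump' locates the machine instructions for that method, which show that on an Intel 64 CPU synchronized is ultimately implemented with lock cmpxchg.

So the synchronized here, like volatile and the CAS in Unsafe discussed in earlier articles, is implemented with similar atomic instructions (lock cmpxchg in this case).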
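
For comparison, here is what an explicit CAS looks like at the Java level. This is a sketch under assumptions: it uses AtomicInteger.compareAndSet rather than Unsafe directly, CasCounterDemo is a hypothetical name, and the claim that HotSpot typically compiles this call to lock cmpxchg on x86-64 is general background, not something the program itself demonstrates.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasCounterDemo {
    private final AtomicInteger count = new AtomicInteger();

    // Lock-free increment: re-read and retry until the compare-and-set succeeds.
    int increment() {
        while (true) {
            int current = count.get();
            int next = current + 1;
            if (count.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Hypothetical helper: starts `threads` workers that each increment `perThread` times.
    static int runDemo(int threads, int perThread) {
        CasCounterDemo demo = new CasCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    demo.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return demo.count.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 50_000)); // prints 200000
    }
}
```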

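As for the other synchronized methods and blocks in BumpTest and Test, you can try them yourself; the result is the same.

6. Synchronization

However, not all synchronization is implemented with the atomic instruction above (that path is actually the lightweight lock). Different kinds of lock are used in different situations: heavyweight locks, lightweight locks, and biased locks. The rest of this section briefly describes how each of the three is implemented.

The HotSpot JVM uses a two-word object header. The first word is the Mark word, which holds synchronization, GC, and hashCode information; the second word is the Class pointer. The Mark word is used as follows:

[Figure: Mark word layout]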

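6.1 Heavyweight lock (heavyweight monitor)

A heavyweight lock corresponds to tag bits 10 in the Mark word above, i.e. the state is inflated.

Heavyweight locks are implemented with OS-level locking primitives such as a pthread mutex. These operations involve system calls and a switch from user mode to kernel mode, which is very expensive. Heavyweight locks can be used in every scenario.

6.2 Lightweight lock

A lightweight lock corresponds to tag bits 00 in the Mark word above, i.e. the state is lightweight-locked.

Lightweight locks are an optimization of heavyweight locks. They use one or two CPU-level atomic instructions (such as lock cmpxchg) and thereby avoid OS-level locking primitives.
Candidate lightweight locking algorithms include Metalock (CAS in both acquire and release), Thin Locks (CAS in acquire), and Relaxed-locks (CAS in acquire).

Lightweight locks have limited applicability, though. Taking Thin Locks as an example, they suit objects that are not contended, are not the target of wait, notify, or notifyAll, and are not locked to an excessive nesting depth. The vast majority of objects meet these conditions; locks on objects that do not must be implemented as heavyweight locks.

6.3 Biased lock

Biased locking was introduced in JDK 6 and corresponds to tag bits 01 in the Mark word above, i.e. the state is unlocked or biasable.

Biased locking optimizes lightweight locking further. It tries to avoid atomic instructions in both acquire and release, executing a single atomic instruction only on the first acquisition to install the locking thread's ID in the mark word.

Biased locks apply in an even narrower range of cases than lightweight locks: a single thread repeatedly acquires and releases the lock while other threads rarely touch it, i.e. the common case where an object is locked by at most one thread during its lifetime.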
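
The tag bits mentioned in 6.1 through 6.3 can be summarized in a small decoder. This is a sketch under assumptions, not HotSpot code: MarkWordTagDemo is a hypothetical name, the use of a third low bit to tell biasable from plain unlocked and the meaning of tag 11 are taken from HotSpot's mark word header of the biased-locking era and are not stated in the text above, and a real mark word cannot be read this way from plain Java.

```java
class MarkWordTagDemo {
    // Decodes the lock state from the low bits of a (simulated) mark word value.
    static String describe(long markWord) {
        int tagBits = (int) (markWord & 0b11);
        switch (tagBits) {
            case 0b00:
                return "lightweight-locked";
            case 0b10:
                return "inflated (heavyweight monitor)";
            case 0b11:
                return "marked for GC"; // assumption: not covered in the text above
            default: // 0b01
                // assumption: bit 2 is the biased-lock bit
                boolean biasable = (markWord & 0b100) != 0;
                return biasable ? "biasable" : "unlocked";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(0b000)); // lightweight-locked
        System.out.println(describe(0b010)); // inflated (heavyweight monitor)
        System.out.println(describe(0b001)); // unlocked
        System.out.println(describe(0b101)); // biasable
    }
}
```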

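For more on biased locks, see Quickly Reacquirable Locks and Biased Locking in HotSpot.

6.4 Lock transitions

In the HotSpot JVM, acquiring an object's lock is attempted in the order biased lock, lightweight lock, heavyweight lock. The complete flow:

[Figure: lock acquisition and state transition flow]

Biased-lock flow (the steps starting with 1 in the figure):
If a newly allocated object O is biasable but not yet biased (step 1 in the figure), the first lock uses a CAS to insert the ID of thread T1 into the mark word (step 1-1). Every later lock merely compares the ID of T1 in the mark word with that of the current thread T2, with two possible outcomes:

Case 1: the thread IDs match, so object O is already biased toward the current thread T2, and T2 can lock and unlock it without any CAS (step 1-2).
Case 2: the thread IDs differ, so the bias toward T1 is revoked and O is checked for rebiasing. If it can be rebiased, O is rebiased toward T2 (step 1-3); otherwise the bias is revoked and locking falls back to the normal flow (the steps starting with 2), after which the class of O can no longer be bias-locked.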
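
The decision at the start of the biased path can be modeled in a few lines. This is a toy model for illustration only and not a usable lock: BiasedPathDemo and its Path enum are hypothetical, an AtomicReference<Thread> stands in for the thread ID stored in the mark word, and revocation and rebiasing are only reported, not performed.

```java
import java.util.concurrent.atomic.AtomicReference;

class BiasedPathDemo {
    enum Path { FIRST_BIAS_WITH_CAS, BIASED_FAST_PATH, REVOKE_OR_REBIAS }

    // Stand-in for the thread ID in the mark word; null means biasable but not yet biased.
    private final AtomicReference<Thread> biasOwner = new AtomicReference<>();

    Path lock() {
        Thread me = Thread.currentThread();
        if (biasOwner.get() == me) {
            return Path.BIASED_FAST_PATH;        // plain comparison, no CAS
        }
        if (biasOwner.compareAndSet(null, me)) {
            return Path.FIRST_BIAS_WITH_CAS;     // the single CAS of the first acquisition
        }
        return Path.REVOKE_OR_REBIAS;            // biased toward a different thread
    }

    // Hypothetical helper: calls lock() from a fresh thread and returns the path taken.
    static Path lockFromOtherThread(BiasedPathDemo lock) {
        Path[] result = new Path[1];
        Thread other = new Thread(() -> result[0] = lock.lock());
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        BiasedPathDemo lock = new BiasedPathDemo();
        System.out.println(lock.lock());               // FIRST_BIAS_WITH_CAS
        System.out.println(lock.lock());               // BIASED_FAST_PATH
        System.out.println(lockFromOtherThread(lock)); // REVOKE_OR_REBIAS
    }
}
```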

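Lightweight and heavyweight lock flow (the steps starting with 2 in the figure):
If the class of a newly allocated object O is not biasable, the VM first tries to acquire a lightweight lock by setting the mark word with a CAS. If the CAS succeeds, the lightweight lock is held. If it fails, the VM checks whether this is a recursive lock: if so, the lock is already held; if not, the lock is inflated to a heavyweight lock.

Under lightweight locking, each entry into a synchronized method or block creates a new lock record in the stack frame. The lock record has two fields, displaced hdr and owner: displaced hdr saves the lock object's mark word, and owner holds a pointer to the lock object. In addition, memory alignment guarantees that the last two bits of a lock record's address are 00, which is exactly the tag that identifies a lightweight lock.

Lightweight locking: save the lock object's original mark word in the lock record's displaced hdr; point the lock record's owner at the lock object; replace the lock object's mark word with a pointer to the lock record. The first two steps are ordinary stores into the thread-private lock record; the last step, the update of the mark word, is the one attempted atomically with a CAS.
If the CAS succeeds, the lightweight lock has been acquired:

[Figure: state after a successful lightweight lock]

If the CAS fails, there are two cases to handle: recursive locking, and inflation to a heavyweight lock.
The VM first tests whether the object's mark word points into the current thread's stack.

If it does, this is a recursive lock: the current thread already owns the object's lock and can safely continue. For such a recursively locked object, the lock record is initialized to 0 instead of the object's mark word (step 2-2 in the figure).
If it does not, two different threads are synchronizing on the same object at the same time. The lightweight lock must then be inflated to a heavyweight lock, i.e. a pointer to the heavyweight monitor is stored in the object's mark word (step 2-3 in the figure).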
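
The lightweight path described above can be modeled roughly as follows. This is a toy sketch and not HotSpot's implementation: ThinLockSketch, LockRecord, and the sentinel objects are hypothetical, an AtomicReference stands in for the mark word, a stackOwner field replaces the "points into the current thread's stack" test, and unlocking and blocking are left out entirely.

```java
import java.util.concurrent.atomic.AtomicReference;

class ThinLockSketch {
    enum Outcome { LIGHTWEIGHT_LOCKED, RECURSIVE, INFLATED }

    private static final Object UNLOCKED = new Object(); // stand-in for an unlocked mark word
    private static final Object INFLATED = new Object(); // stand-in for a heavyweight monitor pointer

    static final class LockRecord {
        Object displacedHdr;                              // saved mark word; null for a recursive entry
        final Object owner;                               // the lock object
        final Thread stackOwner = Thread.currentThread(); // whose stack holds this record
        LockRecord(Object owner) { this.owner = owner; }
    }

    private final AtomicReference<Object> markWord = new AtomicReference<>(UNLOCKED);

    Outcome lock() {
        LockRecord record = new LockRecord(this);
        record.displacedHdr = UNLOCKED;                   // plain store into the thread-private record
        if (markWord.compareAndSet(UNLOCKED, record)) {   // the one CAS, on the mark word
            return Outcome.LIGHTWEIGHT_LOCKED;
        }
        Object current = markWord.get();
        if (current instanceof LockRecord
                && ((LockRecord) current).stackOwner == Thread.currentThread()) {
            record.displacedHdr = null;                   // recursive: record holds 0, not the mark word
            return Outcome.RECURSIVE;
        }
        markWord.set(INFLATED);                           // contended: mark word now points to the monitor
        return Outcome.INFLATED;
    }

    // Hypothetical helper: calls lock() from a fresh thread and returns the outcome.
    static Outcome lockFromOtherThread(ThinLockSketch lock) {
        Outcome[] result = new Outcome[1];
        Thread other = new Thread(() -> result[0] = lock.lock());
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        ThinLockSketch lock = new ThinLockSketch();
        System.out.println(lock.lock());               // LIGHTWEIGHT_LOCKED
        System.out.println(lock.lock());               // RECURSIVE
        System.out.println(lockFromOtherThread(lock)); // INFLATED
    }
}
```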

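The above is only a partial summary of the synchronization flow. For a fuller treatment, see Sun's Eliminating Synchronization Related Atomic Operations with Biased Locking and Bulk Rebiasing and Synchronization in Java SE 6 (HotSpot), as well as Synchronization (my bilingual Chinese-English translation of Synchronization) and the article "Java虚拟机是怎么实现synchronized的？" (How does the Java virtual machine implement synchronized?). You can also search bytecodeInterpreter.cpp for "Lock method if synchronized".

7. Summary

This article first reviewed the basic usage of synchronized, then looked at its implementation from the bytecode and native machine code perspectives, and finally used the official documentation to establish that the implementation tries, in turn, a biased lock (which tries to avoid atomic instructions and needs one only on the first acquisition, to install the locking thread's ID in the header word), a lightweight lock (which uses one or two CPU-level atomic instructions in lock and unlock), and a heavyweight lock (OS-level calls). In that order the three implementations apply to a wider and wider range of cases, at a higher and higher cost.

How synchronized is implemented is in fact invisible to most people, which is also why each JDK release can improve its performance. What we mostly need to do is understand the implementation in the JDK version we use and tune the corresponding JVM flags.

8. References

1.The Java Language Specification, Java SE 8 Edition, Chapter 17
2.The Java Virtual Machine Specification, Java SE 8 Edition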
3.https://www.cs.umd.edu/~pugh/...
4.https://time.geekbang.org/col...
5.https://blogs.oracle.com/dave...
6.https://wiki.openjdk.java.net...
7.https://stackoverflow.com/que...
8.https://www.zhihu.com/questio...