Preface
Go 1.13 was released recently, with many new features and improvements. This article analyzes the optimizations to sync.Pool, answering the following questions:
- Where do the sync.Pool optimizations show up?
- How are they implemented?
- What benefits do they bring?
Optimizations
The concrete optimizations are:
- Lock-free access
- A new GC strategy
Lock-free access
sync.Pool now accesses its per-P caches without a lock. Compare the two versions:
The go1.12.1 implementation:

```go
// Local per-P Pool appendix.
type poolLocalInternal struct {
	private interface{}   // Can be used only by the respective P.
	shared  []interface{} // Can be used by any P.
	Mutex                 // Protects shared.
}
```
The go1.13 implementation:

```go
// Local per-P Pool appendix.
type poolLocalInternal struct {
	private interface{} // Can be used only by the respective P.
	shared  poolChain   // Local P can pushHead/popHead; any P can popTail.
}
```
Comparing the two, the Mutex from go1.12 is gone. So how does go1.13 achieve lock-free access?
The short answer: go1.13 replaces the mutex-protected slice with poolChain, a lock-free SPMC (single-producer, multi-consumer) queue.
So what exactly is poolChain? No need to guess: the code holds no secrets, so let's read it directly.
```go
// poolChain is a dynamically-sized version of poolDequeue.
//
// This is implemented as a doubly-linked list queue of poolDequeues
// where each dequeue is double the size of the previous one. Once a
// dequeue fills up, this allocates a new one and only ever pushes to
// the latest dequeue. Pops happen from the other end of the list and
// once a dequeue is exhausted, it gets removed from the list.
type poolChain struct {
	// head is the poolDequeue to push to. This is only accessed
	// by the producer, so doesn't need to be synchronized.
	head *poolChainElt

	// tail is the poolDequeue to popTail from. This is accessed
	// by consumers, so reads and writes must be atomic.
	tail *poolChainElt
}

type poolChainElt struct {
	poolDequeue

	// next and prev link to the adjacent poolChainElts in this
	// poolChain.
	//
	// next is written atomically by the producer and read
	// atomically by the consumer. It only transitions from nil to
	// non-nil.
	//
	// prev is written atomically by the consumer and read
	// atomically by the producer. It only transitions from
	// non-nil to nil.
	next, prev *poolChainElt
}
```
For the details of how poolChain implements an SPMC lock-free queue, read the code in poolqueue.go. This article does not walk through it line by line; the key points are:
- The queue is SPMC: only the owning P pushes, but any P may pop from the tail.
- The queue grows dynamically: it is a doubly-linked list of ring buffers (slices), each twice the size of the previous one. Reading the code with this picture in mind makes it easy to follow.
- The lock-free operations are built on CAS (compare-and-swap).
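To make the CAS point concrete, here is a minimal, hypothetical sketch of a fixed-size SPMC ring buffer in the spirit of poolDequeue. Like the real one, it packs the head and tail indices into a single uint64 so a consumer can claim the tail slot with one CAS; unlike the real one, it omits the slot-clearing handshake that poolDequeue uses to handle slow consumers, so treat it as an illustration only.

```go
package main

import "sync/atomic"

// spmcQueue is a simplified, illustrative SPMC ring buffer.
// headTail packs head (high 32 bits, owned by the single producer)
// and tail (low 32 bits, shared by all consumers).
type spmcQueue struct {
	headTail uint64
	vals     []interface{} // ring buffer; len must be a power of two
}

func newSPMC(n int) *spmcQueue { return &spmcQueue{vals: make([]interface{}, n)} }

func unpack(ht uint64) (head, tail uint32) { return uint32(ht >> 32), uint32(ht) }
func pack(head, tail uint32) uint64        { return uint64(head)<<32 | uint64(tail) }

// pushHead is called only by the owning P, so it needs no CAS loop:
// writing the slot and then publishing the new head atomically is enough.
func (q *spmcQueue) pushHead(x interface{}) bool {
	head, tail := unpack(atomic.LoadUint64(&q.headTail))
	if head-tail >= uint32(len(q.vals)) {
		return false // full
	}
	q.vals[head&uint32(len(q.vals)-1)] = x
	atomic.AddUint64(&q.headTail, 1<<32) // publish the new head
	return true
}

// popTail may race with other consumers: whoever wins the CAS on
// headTail owns the tail slot and may read it.
func (q *spmcQueue) popTail() (interface{}, bool) {
	for {
		ht := atomic.LoadUint64(&q.headTail)
		head, tail := unpack(ht)
		if head == tail {
			return nil, false // empty
		}
		if atomic.CompareAndSwapUint64(&q.headTail, ht, pack(head, tail+1)) {
			i := tail & uint32(len(q.vals)-1)
			x := q.vals[i]
			q.vals[i] = nil
			return x, true
		}
	}
}
```

Note that the producer side needs no CAS at all, which is exactly why sync.Pool keeps pushes local to the owning P and only lets other Ps steal from the tail.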
Benefits
- The lock overhead is gone: the mutex is replaced by atomic operations.
GC strategy
Compared with go1.12, go1.13 adds a victim cache. Concretely:
- During GC, the objects in oldPools (the victim caches) are released.
- The objects in allPools are not released immediately; instead they are moved to oldPools and survive until the next GC cycle.
The relevant code:
go1.12.1:

```go
var (
	allPoolsMu Mutex
	allPools   []*Pool
)
```
go1.13:

```go
var (
	allPoolsMu Mutex

	// allPools is the set of pools that have non-empty primary
	// caches. Protected by either 1) allPoolsMu and pinning or 2)
	// STW.
	allPools []*Pool

	// oldPools is the set of pools that may have non-empty victim
	// caches. Protected by STW.
	oldPools []*Pool
)
```
```go
func poolCleanup() {
	// This function is called with the world stopped, at the
	// beginning of a garbage collection. It must not allocate and
	// probably should not call any runtime functions.

	// Because the world is stopped, no pool user can be in a
	// pinned section (in effect, this has all Ps pinned).

	// Drop victim caches from all pools.
	for _, p := range oldPools {
		p.victim = nil
		p.victimSize = 0
	}

	// Move primary cache to victim cache.
	for _, p := range allPools {
		p.victim = p.local
		p.victimSize = p.localSize
		p.local = nil
		p.localSize = 0
	}

	// The pools with non-empty primary caches now have non-empty
	// victim caches and no pools have primary caches.
	oldPools, allPools = allPools, nil
}
```
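The two-GC-cycle lifetime this gives pooled objects can be observed directly. The following small program (a hypothetical demo, not from the sync package) pins GOMAXPROCS to 1 so the goroutine stays on a single P and the pooled object sits in that P's private slot:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// demoVictim returns what Get finds after one GC (the object has moved
// to the victim cache but is still reachable) and after two GCs (the
// victim cache has been dropped, so the object is gone).
func demoVictim() (afterOneGC, afterTwoGCs interface{}) {
	runtime.GOMAXPROCS(1)
	var p sync.Pool

	p.Put("cached")
	runtime.GC() // poolCleanup moves the primary cache to the victim cache
	afterOneGC = p.Get() // still found, via the victim cache in getSlow

	p.Put("cached")
	runtime.GC()
	runtime.GC() // the second GC drops the victim cache
	afterTwoGCs = p.Get() // nil: the object has been reclaimed
	return
}

func main() {
	one, two := demoVictim()
	fmt.Println(one, two)
}
```

Under go1.12, a single GC was enough to empty the pool; under go1.13, an object survives one extra cycle.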
This changes how Get works. The old flow was:
- Try the poolLocal bound to the current P: first its private slot, then its shared pool.
- If that fails, try to steal from the shared pools of other Ps.
- If that also fails, allocate a fresh object from the heap.
With the victim cache, the flow becomes:
- Try the poolLocal bound to the current P: first its private slot, then its shared pool.
- If that fails, try to steal from the shared pools of other Ps.
- If that also fails, try the victim cache.
- If that also fails, allocate a fresh object from the heap.
The code:
```go
func (p *Pool) getSlow(pid int) interface{} {
	// See the comment in pin regarding ordering of the loads.
	size := atomic.LoadUintptr(&p.localSize) // load-acquire
	locals := p.local                        // load-consume
	// Try to steal one element from other procs.
	for i := 0; i < int(size); i++ {
		l := indexLocal(locals, (pid+i+1)%int(size))
		if x, _ := l.shared.popTail(); x != nil {
			return x
		}
	}

	// Try the victim cache. We do this after attempting to steal
	// from all primary caches because we want objects in the
	// victim cache to age out if at all possible.
	size = atomic.LoadUintptr(&p.victimSize)
	if uintptr(pid) >= size {
		return nil
	}
	locals = p.victim
	l := indexLocal(locals, pid)
	if x := l.private; x != nil {
		l.private = nil
		return x
	}
	for i := 0; i < int(size); i++ {
		l := indexLocal(locals, (pid+i)%int(size))
		if x, _ := l.shared.popTail(); x != nil {
			return x
		}
	}

	// Mark the victim cache as empty for future gets don't
	// bother with it.
	atomic.StoreUintptr(&p.victimSize, 0)

	return nil
}
```
Benefits
- In space: the victim cache gives Get one more place to find a reusable object, raising the reuse hit rate.
- In time: delaying reclamation by one GC cycle lengthens the window in which an object can be reused.
- Together, these reduce GC pressure and make pooled objects more effective.
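Both benefits accrue automatically to ordinary sync.Pool users. As a reminder of what that usage looks like, here is a typical pattern, pooling bytes.Buffer objects; render is a hypothetical example function, not part of any library:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool reuses bytes.Buffer objects across calls. With go1.13's
// victim cache, an idle buffer now survives one extra GC cycle before
// being reclaimed, so bursts of traffic separated by a GC still hit
// the pool instead of the heap.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset() // always reset before returning to the pool
		bufPool.Put(buf)
	}()
	fmt.Fprintf(buf, "hello, %s", name)
	return buf.String()
}

func main() {
	fmt.Println(render("gopher"))
}
```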