Preface:
Benchmarking Go's locks sounds like a dull exercise, so let me explain where it came from. A couple of days ago a friend asked me: for thread-safe reads and writes on a slice, should he use a read-write lock or a mutex, and which one performs better?
In what scenarios does RWMutex actually beat Mutex? My take is that when the critical section between Lock and Unlock contains no I/O and no heavy computation, a plain mutex is more efficient than a read-write lock. There are quite a few read-write lock implementations in the community, and most of them are built from two mutexes plus a reader counter.
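For reference, here is a minimal sketch of that community-style design: a reader counter guarded by one mutex, plus a second mutex taken by writers (and on behalf of all readers). The type and method names are my own illustration, not any particular library's API.

```go
package rwlock

import "sync"

// RWLock is an illustrative reader-counter read-write lock in the style of the
// community implementations mentioned above: one mutex guards the reader count,
// another is held whenever readers or a writer are active.
type RWLock struct {
	counterMu sync.Mutex // protects readers
	writeMu   sync.Mutex // held by the writer, or on behalf of all readers
	readers   int        // number of active readers
}

func (l *RWLock) RLock() {
	l.counterMu.Lock()
	l.readers++
	if l.readers == 1 {
		l.writeMu.Lock() // first reader locks writers out
	}
	l.counterMu.Unlock()
}

func (l *RWLock) RUnlock() {
	l.counterMu.Lock()
	l.readers--
	if l.readers == 0 {
		l.writeMu.Unlock() // last reader lets writers in again
	}
	l.counterMu.Unlock()
}

func (l *RWLock) Lock()   { l.writeMu.Lock() }
func (l *RWLock) Unlock() { l.writeMu.Unlock() }
```

Note that every RLock/RUnlock here pays for a full mutex acquisition just to adjust the counter, which is exactly the overhead an atomic reader count (as in sync.RWMutex, shown later) avoids.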
I have compared lock vs. rwlock in C++. With a simple assignment in the critical section, the benchmark matched my expectation: the mutex beat the read-write lock. With an empty I/O read/write in the middle, the rwlock beat the mutex, which is also what you would expect. But with a map lookup in the middle, the rwlock also came out ahead. On reflection that makes sense: a map is a comparatively expensive data structure, since looking up a key means computing a hash, locating the bucket for that hash, and then walking the bucket's chain to find the key, which is enough work for concurrent readers to pay off.
```
// from xiaorui.cc

simple assignment:
1. raw_lock   cost 1.832699s
2. raw_rwlock cost 3.620338s

io operation:
1. simple_lock   cost 14.058138s
2. simple_rwlock cost 9.445691s

map:
1. lock   cost 2.925601s
2. rwlock cost 0.320296s
```
With the C++ locks out of the way, let's compare Go's sync.RWMutex and sync.Mutex. Without further ado, here is the test code.
```go
// xiaorui.cc
package main

// xiaorui.cc
// github.com/rfyiamcool/golib

import (
	"fmt"
	"sync"
	"time"
)

var (
	num  = 1000 * 10
	gnum = 1000
)

func main() {
	fmt.Println("only read")
	testRwmutexReadOnly()
	testMutexReadOnly()

	fmt.Println("write and read")
	testRwmutexWriteRead()
	testMutexWriteRead()

	fmt.Println("write only")
	testRwmutexWriteOnly()
	testMutexWriteOnly()
}

func testRwmutexReadOnly() {
	var w = &sync.WaitGroup{}
	var rwmutexTmp = newRwmutex()
	w.Add(gnum)
	t1 := time.Now()
	for i := 0; i < gnum; i++ {
		go func() {
			defer w.Done()
			for in := 0; in < num; in++ {
				rwmutexTmp.get(in)
			}
		}()
	}
	w.Wait()
	fmt.Println("testRwmutexReadOnly cost:", time.Now().Sub(t1).String())
}

func testRwmutexWriteOnly() {
	var w = &sync.WaitGroup{}
	var rwmutexTmp = newRwmutex()
	w.Add(gnum)
	t1 := time.Now()
	for i := 0; i < gnum; i++ {
		go func() {
			defer w.Done()
			for in := 0; in < num; in++ {
				rwmutexTmp.set(in, in)
			}
		}()
	}
	w.Wait()
	fmt.Println("testRwmutexWriteOnly cost:", time.Now().Sub(t1).String())
}

func testRwmutexWriteRead() {
	var w = &sync.WaitGroup{}
	var rwmutexTmp = newRwmutex()
	w.Add(gnum)
	t1 := time.Now()
	for i := 0; i < gnum; i++ {
		if i%2 == 0 {
			go func() {
				defer w.Done()
				for in := 0; in < num; in++ {
					rwmutexTmp.get(in)
				}
			}()
		} else {
			go func() {
				defer w.Done()
				for in := 0; in < num; in++ {
					rwmutexTmp.set(in, in)
				}
			}()
		}
	}
	w.Wait()
	fmt.Println("testRwmutexWriteRead cost:", time.Now().Sub(t1).String())
}

func testMutexReadOnly() {
	var w = &sync.WaitGroup{}
	var mutexTmp = newMutex()
	w.Add(gnum)
	t1 := time.Now()
	for i := 0; i < gnum; i++ {
		go func() {
			defer w.Done()
			for in := 0; in < num; in++ {
				mutexTmp.get(in)
			}
		}()
	}
	w.Wait()
	fmt.Println("testMutexReadOnly cost:", time.Now().Sub(t1).String())
}

func testMutexWriteOnly() {
	var w = &sync.WaitGroup{}
	var mutexTmp = newMutex()
	w.Add(gnum)
	t1 := time.Now()
	for i := 0; i < gnum; i++ {
		go func() {
			defer w.Done()
			for in := 0; in < num; in++ {
				mutexTmp.set(in, in)
			}
		}()
	}
	w.Wait()
	fmt.Println("testMutexWriteOnly cost:", time.Now().Sub(t1).String())
}

func testMutexWriteRead() {
	var w = &sync.WaitGroup{}
	var mutexTmp = newMutex()
	w.Add(gnum)
	t1 := time.Now()
	for i := 0; i < gnum; i++ {
		if i%2 == 0 {
			go func() {
				defer w.Done()
				for in := 0; in < num; in++ {
					mutexTmp.get(in)
				}
			}()
		} else {
			go func() {
				defer w.Done()
				for in := 0; in < num; in++ {
					mutexTmp.set(in, in)
				}
			}()
		}
	}
	w.Wait()
	fmt.Println("testMutexWriteRead cost:", time.Now().Sub(t1).String())
}

func newRwmutex() *rwmutex {
	var t = &rwmutex{}
	t.mu = &sync.RWMutex{}
	t.ipmap = make(map[int]int, 100)

	for i := 0; i < 100; i++ {
		t.ipmap[i] = 0
	}
	return t
}

type rwmutex struct {
	mu    *sync.RWMutex
	ipmap map[int]int
}

func (t *rwmutex) get(i int) int {
	t.mu.RLock()
	defer t.mu.RUnlock()

	return t.ipmap[i]
}

func (t *rwmutex) set(k, v int) {
	t.mu.Lock()
	defer t.mu.Unlock()

	k = k % 100
	t.ipmap[k] = v
}

func newMutex() *mutex {
	var t = &mutex{}
	t.mu = &sync.Mutex{}
	t.ipmap = make(map[int]int, 100)

	for i := 0; i < 100; i++ {
		t.ipmap[i] = 0
	}
	return t
}

type mutex struct {
	mu    *sync.Mutex
	ipmap map[int]int
}

func (t *mutex) get(i int) int {
	t.mu.Lock()
	defer t.mu.Unlock()

	return t.ipmap[i]
}

func (t *mutex) set(k, v int) {
	t.mu.Lock()
	defer t.mu.Unlock()

	k = k % 100
	t.ipmap[k] = v
}

// xiaorui.cc
```
Test results
The tests run many goroutines against mutex and rwmutex in read-only, write-only, and mixed read/write scenarios. It appears that only in the write-only scenario is mutex slightly faster than rwmutex.
```
only read
testRwmutexReadOnly cost: 465.546765ms
testMutexReadOnly cost: 2.146494288s

write and read
testRwmutexWriteRead cost: 1.80217194s
testMutexWriteRead cost: 2.322097403s

write only
testRwmutexWriteOnly cost: 2.836979159s
testMutexWriteOnly cost: 2.490377869s
```
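As an aside, the same comparison can also be expressed with Go's testing package instead of hand-rolled timing. A minimal sketch reusing newRwmutex/newMutex from the code above (the file and benchmark names are mine):

```go
// locks_bench_test.go — hypothetical companion benchmarks; run with: go test -bench=.
package main

import "testing"

func BenchmarkRWMutexRead(b *testing.B) {
	m := newRwmutex()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			m.get(1) // read path: RLock/RUnlock
		}
	})
}

func BenchmarkMutexRead(b *testing.B) {
	m := newMutex()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			m.get(1) // same read path, but Lock/Unlock
		}
	})
}
```

RunParallel spreads b.N iterations across GOMAXPROCS goroutines, so it exercises the same contention without the fixed gnum/num loops.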
Next I replaced the map reads and writes with increments and reads of a global counter (sketched after the results below). The outcome is close to the results above: only in the write-only scenario does mutex come out slightly ahead of rwlock.
```
only read
testRwmutexReadOnly cost: 10.583448ms
testMutexReadOnly cost: 10.908006ms

write and read
testRwmutexWriteRead cost: 12.405655ms
testMutexWriteRead cost: 14.471428ms

write only
testRwmutexWriteOnly cost: 13.763028ms
testMutexWriteOnly cost: 13.112282ms
```
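For completeness, the counter variant only swaps the map access inside the critical section for a plain int, roughly like this (the names are illustrative, not the exact code of that run):

```go
package counter

import "sync"

// rwCounter sketches the second test run: the guarded map is replaced by a
// plain int, so each critical section is just a read or an increment.
type rwCounter struct {
	mu  sync.RWMutex
	cnt int
}

func (c *rwCounter) get() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.cnt
}

func (c *rwCounter) incr() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cnt++
}
```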
sync.RWMutex source code
Now let's look at how Go's sync.RWMutex is implemented. Its struct also carries a writer mutex, reader/writer semaphores, and reader counters, roughly the same fields as the community versions. The biggest difference is that the reader count is updated with atomic instructions, whereas the community implementations take a mutex just to bump the counter.
```go
type RWMutex struct {
	w           Mutex  // held if there are pending writers
	writerSem   uint32 // semaphore for writers to wait for completing readers
	readerSem   uint32 // semaphore for readers to wait for completing writers
	readerCount int32  // number of pending readers
	readerWait  int32  // number of departing readers
}
```
Acquiring the read lock: RLock simply increments readerCount with an atomic add. If the result is negative, a writer is pending (Lock drives the counter negative), so the reader waits on the reader semaphore.
```go
func (rw *RWMutex) RLock() {
	if race.Enabled {
		_ = rw.w.state
		race.Disable()
	}
	if atomic.AddInt32(&rw.readerCount, 1) < 0 {
		// A writer is pending, wait for it.
		runtime_Semacquire(&rw.readerSem)
	}
	if race.Enabled {
		race.Enable()
		race.Acquire(unsafe.Pointer(&rw.readerSem))
	}
}
```
Releasing the read lock also updates the counter atomically. If the count is negative a writer is waiting; when the last departing reader brings readerWait to zero, it releases the writer's semaphore and unblocks the writer.
```go
func (rw *RWMutex) RUnlock() {
	if race.Enabled {
		_ = rw.w.state
		race.ReleaseMerge(unsafe.Pointer(&rw.writerSem))
		race.Disable()
	}
	if r := atomic.AddInt32(&rw.readerCount, -1); r < 0 {
		if r+1 == 0 || r+1 == -rwmutexMaxReaders {
			race.Enable()
			throw("sync: RUnlock of unlocked RWMutex")
		}
		// A writer is pending.
		if atomic.AddInt32(&rw.readerWait, -1) == 0 {
			// The last reader unblocks the writer.
			runtime_Semrelease(&rw.writerSem, false)
		}
	}
	if race.Enabled {
		race.Enable()
	}
}
```
Acquiring the write lock: Lock first takes the inner mutex w to exclude other writers, then subtracts rwmutexMaxReaders from readerCount to announce a pending writer; if there are still active readers, it waits on the writer semaphore until the last reader wakes it up. Unlock reverses the announcement by adding rwmutexMaxReaders back, wakes any readers that blocked in RLock while the writer held the lock, and finally releases w so other writers can proceed.
```go
func (rw *RWMutex) Lock() {
	if race.Enabled {
		_ = rw.w.state
		race.Disable()
	}
	// First, resolve competition with other writers.
	rw.w.Lock()
	// Announce to readers there is a pending writer.
	r := atomic.AddInt32(&rw.readerCount, -rwmutexMaxReaders) + rwmutexMaxReaders
	// Wait for active readers.
	if r != 0 && atomic.AddInt32(&rw.readerWait, r) != 0 {
		runtime_Semacquire(&rw.writerSem)
	}
	if race.Enabled {
		race.Enable()
		race.Acquire(unsafe.Pointer(&rw.readerSem))
		race.Acquire(unsafe.Pointer(&rw.writerSem))
	}
}

func (rw *RWMutex) Unlock() {
	if race.Enabled {
		_ = rw.w.state
		race.Release(unsafe.Pointer(&rw.readerSem))
		race.Release(unsafe.Pointer(&rw.writerSem))
		race.Disable()
	}

	// Announce to readers there is no active writer.
	r := atomic.AddInt32(&rw.readerCount, rwmutexMaxReaders)
	if r >= rwmutexMaxReaders {
		race.Enable()
		throw("sync: Unlock of unlocked RWMutex")
	}
	// Unblock blocked readers, if any.
	for i := 0; i < int(r); i++ {
		runtime_Semrelease(&rw.readerSem, false)
	}
	// Allow other writers to proceed.
	rw.w.Unlock()
	if race.Enabled {
		race.Enable()
	}
}
```
Summary:
There isn't much to conclude beyond the obvious: lock contention is a perennial problem for highly concurrent systems. For the map + mutex pattern above, you can switch to sync.Map (available since Go 1.9). In read-heavy, write-light workloads sync.Map performs much better than sync.RWMutex + map. If you read through sync.Map's implementation, you will see that its write performance is not great: reads can go lock-free through an atomically published read-only copy (copy-on-write style), but writes still take a lock. To spread out the contention, we can shard the map and its locks, similar to the segmented locking in Java's ConcurrentHashMap.
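A minimal sketch of that sharding idea in Go follows; the shard count and the way keys map to shards are arbitrary choices of mine, not a standard-library API.

```go
package shardedmap

import "sync"

const shardCount = 32 // arbitrary; more shards means less contention per lock

// Map spreads keys across independently locked shards, so writers touching
// different shards no longer contend on a single mutex.
type Map struct {
	shards [shardCount]struct {
		sync.RWMutex
		m map[int]int
	}
}

func New() *Map {
	s := &Map{}
	for i := range s.shards {
		s.shards[i].m = make(map[int]int)
	}
	return s
}

func (s *Map) Get(k int) (int, bool) {
	sh := &s.shards[uint(k)%shardCount]
	sh.RLock()
	defer sh.RUnlock()
	v, ok := sh.m[k]
	return v, ok
}

func (s *Map) Set(k, v int) {
	sh := &s.shards[uint(k)%shardCount]
	sh.Lock()
	defer sh.Unlock()
	sh.m[k] = v
}
```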
Besides the segmented locking above, lock contention can also be tackled with optimistic concurrency built on atomic CAS instructions.
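A rough Go sketch of that optimistic-update pattern (the function and its parameters are made up for illustration, not a library API):

```go
package cas

import "sync/atomic"

// addIfBelow optimistically adds delta to *addr with CAS instead of a mutex:
// load the current value, compute the new one, and retry only if another
// goroutine changed the value in between.
func addIfBelow(addr *int64, delta, limit int64) bool {
	for {
		old := atomic.LoadInt64(addr)
		if old+delta > limit {
			return false // would exceed the limit; give up without writing
		}
		if atomic.CompareAndSwapInt64(addr, old, old+delta) {
			return true // no one raced us; the update landed
		}
		// CAS failed: another goroutine won the race, retry with the new value.
	}
}
```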