<p>I've wrote a simple concurrent hashmap for learning purpose</p>
<pre><code>package concurrent_hashmap
import (
"hash/fnv"
"sync"
)
type ConcurrentMap struct {
buckets []ThreadSafeMap
bucketCount uint32
}
type ThreadSafeMap struct {
mapLock sync.RWMutex
hashMap map[string]interface{}
}
func NewConcurrentMap(bucketSize uint32) *ConcurrentMap {
var threadSafeMapInstance ThreadSafeMap
var bucketOfThreadSafeMap []ThreadSafeMap
for i := 0; i <= int(bucketSize); i++ {
threadSafeMapInstance = ThreadSafeMap{sync.RWMutex{}, make(map[string]interface{})}
bucketOfThreadSafeMap = append(bucketOfThreadSafeMap, threadSafeMapInstance)
}
return &ConcurrentMap{bucketOfThreadSafeMap, bucketSize}
}
func (cMap *ConcurrentMap) Put(key string, val interface{}, thread int) {
bucketIndex := hash(key) % cMap.bucketCount
bucket := cMap.buckets[bucketIndex]
bucket.mapLock.Lock()
fmt.Println("start")
fmt.Println("this is my thread", thread)
fmt.Println(bucket)
fmt.Println(bucketIndex)
fmt.Println(bucket.mapLock)
fmt.Println(&bucket.mapLock)
bucket.hashMap[key] = val
bucket.mapLock.Unlock()
}
// Helper
func hash(s string) uint32 {
h := fnv.New32a()
h.Write([]byte(s))
return h.Sum32()
}
</code></pre>
<p>With benchmark</p>
<p>package concurrent_hashmap</p>
<pre><code>import (
"testing"
"runtime"
"math/rand"
"strconv"
"sync"
)
// Concurrent does not work
func BenchmarkMyFunc(b *testing.B) {
var wg sync.WaitGroup
runtime.GOMAXPROCS(runtime.NumCPU())
my_map := NewConcurrentMap(uint32(4))
for n := 0; n < b.N; n++ {
go insert(my_map, wg, n)
}
wg.Wait()
}
func insert(my_map *ConcurrentMap, wg sync.WaitGroup, thread int) {
wg.Add(1)
var rand_int int
for element_num := 0; element_num < 1000; element_num++ {
rand_int = 123//rand.Intn(100)
my_map.Put(strconv.Itoa(rand_int), rand_int, thread)
}
defer wg.Done()
}
// This works
func BenchmarkMyFuncSynchronize(b *testing.B) {
my_map := NewConcurrentMap(uint32(4))
for n := 0; n < b.N; n++ {
my_map.Put(strconv.Itoa(123), 123)
}
}
</code></pre>
<p>But I found some issue with this code</p>
<p>First, the following code</p>
<pre><code>type ThreadSafeMap struct {
mapLock sync.RWMutex
hashMap map[string]interface{}
}
</code></pre>
<p>need to change to </p>
<pre><code>type ThreadSafeMap struct {
mapLock *sync.RWMutex
hashMap map[string]interface{}
}
threadSafeMapInstance = ThreadSafeMap{&sync.RWMutex{}, make(map[string]interface{})}
</code></pre>
<p>for concurrent access to work otherwise it will throw <code>fatal error: concurrent map writes</code></p>
<p>Second after I fix the first issue, the print out of the concurrent access has the following sample output</p>
<pre><code> start
start
this is my thread 1236
this is my thread 299817
{0x421224300 map[123:123]}
{0x4212ecae0 map[123:123]}
3
3
&{{35679 1} 0 0 -1073741824 0}
&{{1194259 1} 0 0 -1073741824 0}
0x427319cf0
0x42731b6f0
start
start
this is my thread 1236
this is my thread 299817
{0x4212ecae0 map[123:123]}
{0x421224300 map[123:123]}
3
3
</code></pre>
<p>The above thread indicates couple of things</p>
<ol>
<li>2 or more thread is accessing logic wrapped in <code>bucket.mapLock.Lock()</code> which means there is race condition</li>
<li>Each mutex have different address even if I fix the input and always use the same bucket</li>
</ol>
<p>Can someone tell me why I need to use <code>&sync.RWMutex{}</code> and can't use <code>sync.RWMutex{}</code> and why does my lock have different address when accessed from different thread even though one bucket should have the one lock regardless of what thread is accessing it?</p>
<hr/>**评论:**<br/><br/>epiris: <pre><p>Your problem is that in your put method you are making a copy of the slice value. Because the buckets field is:</p>
<pre><code>buckets []ThreadSafeMap
</code></pre>
<p>When you do this:</p>
<pre><code>bucket := cMap.buckets[bucketIndex]
</code></pre>
<p>A copy of the bucket is made, if you changed it to:</p>
<pre><code>bucket := &cMap.buckets[bucketIndex]
</code></pre>
<p>Your undesired behavior should go away. You could also store a slice of pointers instead, as a slice of values that shouldn't be copied tends to sneak back up on you with subtleties like this later.</p></pre>daenney: <pre><p>Based on my understanding of what you're trying to do:</p>
<p>You should use <code>&sync.RWMutex{}</code> to get a reference to that mutex. If you do not then every call to <code>ThreadSafeMap()</code> gets a new mutex instead of the same one. Assuming you want to coordinate between those maps you'll need to ensure everyone has the same mutex (instead of a new mutex or a copy of the mutex, which is essentially a new mutex), otherwise locking/unlocking it doesn't really do you much good.</p>
<p>However, if you just replace <code>sync.RWMutex{}</code> with <code>&sync.RWMutex{}</code> you'll suffer the same issue b/c you're just passing in a reference to a new mutex every time. So assign it to a variable first <code>mut := &sync.RWMutex{}</code> and then pass in <code>mut</code> in every call to <code>ThreadSafeMap</code>. Since the map receives a pointer it should expect a <code>*sync.RWMutex</code>. Now all your <code>ThreadSafeMap</code>s will have a reference to the same mutex.</p>
<p>You can't do <code>mut := sync.RWMutex{}</code> either in this case. B/c Go is pass by value what will happen is that your <code>ThreadSafeMap</code> will receive a copy of the mutex, which is not the same thing. You really need the <code>&sync.RWMutex{}</code> assigned to a variable first so that you can pass a reference to the same mutex around.</p></pre>jlr52: <pre><p>Shouldn't this line <code>ThreadSafeMap{sync.RWMutex{}, make(map[string]interface{})}</code> guarantee a new instance of mutex is passed into <code>ThreadSafeMap struct</code> and stored in the struct during initialization?</p>
<p>Why does any subsequent call to <code>Put</code> create a new <code>sync.RWMutex{}</code> ?</p>
<p>Each <code>ThreadSafeMap</code> should have its own mutex so that subsequent write to a particular <code>ThreadSafeMap</code> will not block write to other <code>ThreadSafeMap</code></p>
<p>Note: I am implementing concurrent map with 4 partitions (ie. 4 <code>ThreadSafeMap</code> ), each <code>ThreadSafeMap</code> does not have to synchronize with some other <code>ThreadSafeMap</code></p></pre>daenney: <pre><p>Oh. Then I think I misunderstood what you were trying to do. Since you're passing in a mutex to <code>ThreadSafeMap</code> I assumed you wanted to share one in order to coordinate between instances.</p>
<p>If that's not the case I would avoid this entirely and simply embed the <code>sync.RWMutex</code> in your <code>ThreadSafeMap</code> instead:</p>
<p><code>
type ThreadSafeMap struct {
hashMap map[string]interface{}
sync.RWMutex
}
</code></p>
<p>Taking a closer look there's a few things that I wonder about. You append the <code>threadSafeMapInstance</code> to the bucket, but that would cause a copy of it to be appended if I'm not mistaken. I would expect that thing to maintain a reference to a <code>ThreadSafeMap</code>?</p></pre>jlr52: <pre><p>exactly, I do have mutex embedded into my <code>ThreadSafeMap</code></p>
<pre><code>type ThreadSafeMap struct {
mapLock sync.RWMutex
hashMap map[string]interface{}
}
</code></pre>
<p>But the problem is that I found that the lock is not enforced</p>
<p>Consider the following <code>println</code></p>
<pre><code>bucket.mapLock.Lock()
fmt.Println("start")
fmt.Println("this is my thread", thread)
fmt.Println(bucket)
fmt.Println(bucketIndex)
fmt.Println(bucket.mapLock)
fmt.Println(&bucket.mapLock)
bucket.hashMap[key] = val
bucket.mapLock.Unlock()
</code></pre>
<p>I see the following output</p>
<pre><code>start
start
this is my thread 1236
this is my thread 299817
{0x4212ecae0 map[123:123]}
{0x421224300 map[123:123]}
3
3
</code></pre>
<p>the above output is interesting because, I expect something like</p>
<pre><code>start
this is my thread
{0x4212ecae0 map[123:123]}
3
0x42731b6f0
0x42731b6f0
start
this is my thread
{0x4212ecae0 map[123:123]}
3
0x42731b6f0
0x42731b6f0
</code></pre>
<p>instead of having <code>start</code> back to back. <code>start</code> cannot be back to back because only 1 thread can gain access to the lock at a time.</p></pre>nhooyr: <pre><p>Use <code>gofmt</code> pls!</p></pre>jhlahsklfahdfkl: <pre><p>Go is pass by value, so when you pass the mutex to a function the value is copied. Pointer types are a special type in that they point to something. So when you pass a pointer into a function the value is still copied, but the copied pointer still points to the same location as before the copy. These special types allow the different parts of the program to share state.</p>
<p>Those are addresses of the pointers copied on function call, they both point to the same mutex. Try printing the address of the dereferenced pointer.</p>
<p>Go doesn't have references. Go has types and pointer-types, and those types and pointer-types hold values. Possibly the use of the term 'dereference' in literature is the source of the confusion.</p></pre>ruertar: <pre><p>Your mutex addresses look to be the same -- they were just printed in different orders.</p></pre>jlr52: <pre><blockquote>
<p>{0x4212ecae0 map[123:123]}
{0x421224300 map[123:123]}</p>
</blockquote>
<p>hmm, it doesn't look the same. Each thread seems to have a different mutex </p></pre>ruertar: <pre><p>Okay -- maybe I don't understand your code but what I really don't understand if your benchmark. I think there is something wrong there. It doesn't make any sense to benchmark the creation of a go routine.</p>
<p>The code as it is posted doesn't even compile. You should try posting some code that compiles so that others can try it out.</p></pre>janderssen: <pre><p>The reason for the &sync.RWMutex{} is you need to ensure everything "points" to the same instance, otherwise it passes by value.</p>
<p>Now for your second question, the benchmarking tool is generating multiple benchmarks, and this it the reason for what "looks" like a race is in fact not, check the address of the cMap as follows :</p>
<pre><code>fmt.Printf("start : %d bucket address %p, map address %p \r\n", bucketIndex, bucket, cMap)
</code></pre>
<p>It will show the cMap is a different instance, hence the mutex is different also.
In the BenchmarkMyFunc(b *testing.B) add the following line and you can see it generating a new instance :</p>
<pre><code>my_map := NewConcurrentMap(uint32(4))
fmt.Printf("MAP ADDRESS %p \r\n", my_map)
</code></pre>
<p>Also, I would in my opinion, change the </p>
<pre><code>type ConcurrentMap struct {
buckets []ThreadSafeMap
bucketCount uint32
}
</code></pre>
<p>too</p>
<pre><code>type ConcurrentMap struct {
buckets []*ThreadSafeMap
bucketCount uint32
}
</code></pre>
<p>When ever I am dealing with an object that must be pointing to the same objects etc, and to ensure GO does not pass by value anywhere, this kind of makes me feel alot safer as I know exactly what I am instructing GO to do.</p>
<p>Hope this helps.</p></pre>ghfhjgfthjgfjh: <pre><p>Go is pass by value, so when you pass the mutex to a function the value is copied. Pointer types are a special type in that they point to something. So when you pass a pointer into a function the value is still copied, but the copied pointer still points to the same location as before the copy. These special types allow the different parts of the program to share state.</p>
<p>Those are addresses of the pointers copied on function call, they both point to the same mutex. Try printing the address of the dereferenced pointer.</p>
<p>The quirk about mutex is that the mutex must be passed by pointer in order for it to share its state. the mutex doesn't function properly if it can't share state.</p>
<p>Go doesn't have references. Go has types and pointer-types, and those types and pointer-types hold values. Possibly the use of the term 'dereference' in literature is the source of the confusion.</p></pre>
这是一个分享于 的资源,其中的信息可能已经有所发展或是发生改变。
入群交流(和以上内容无关):加入Go大咖交流群,或添加微信:liuxiaoyan-s 备注:入群;或加QQ群:692541889
- 请尽量让自己的回复能够对别人有帮助
- 支持 Markdown 格式, **粗体**、~~删除线~~、
`单行代码`
- 支持 @ 本站用户;支持表情(输入 : 提示),见 Emoji cheat sheet
- 图片支持拖拽、截图粘贴等方式上传