<p>I'm analysing a dataset asynchronously using several goroutines, and I don't want to accidentally hit the same item more than once. I want a blacklist that's updated as they go.</p>
<p>The problem is that I'm not sure of the best way to do this. My synchronous idea would be to use a blacklist array to store the items I've already done and check whether the one I'm currently analysing is in it.</p>
<p>How do I do this with goroutines? If I don't cater for it, two of them may find the same item before it has been added to the blacklist.</p>
<p>Do I use a mutex lock? Or is there a better pattern?</p>
<hr/>**Comments:**<br/><br/>seankhliao: <pre><p>have a single goroutine read the data and feed it into a channel, then have multiple goroutines receive from the channel to process it</p>
<p>if you need to save the outputs, open another channel, have all the processing goroutines feed into that one, and have a single goroutine receive from it to write to disk/wherever</p></pre>bustyLaserCannon: <pre><p>Thanks, that makes sense, but what if the values are duplicated and nested inside the values I'd be passing to the channel to process? A processing goroutine may need to run the process again on a nested value, and it may be one that another goroutine has already done.</p></pre>seankhliao: <pre><p>there are several ways depending on your data/processing</p>
<p>if your processing doesn't require the values to be passed in together you could unroll/flatten your data and deduplicate with a simple map before you feed it in</p>
<p>if it does need the entire bundle, then it might be easier to use the new <code>sync.Map</code> and check in the processing goroutines (update the map before beginning to process a value)</p></pre>bustyLaserCannon: <pre><p>Which would be the faster approach, assuming both were possible?</p></pre>seankhliao: <pre><p>unrolling before feeding, simply because you don't need to deal with synchronization</p>
<p>you can check <a href="https://medium.com/@deckarep/the-new-kid-in-town-gos-sync-map-de24a6bf7c2c" rel="nofollow">this article</a> on the performance between the two</p>
<p>edit: unless your unrolling takes a very long time, in which case maybe try the <code>sync.Map</code></p></pre>titpetric: <pre><p>From what I understand, you're trying to deduplicate messages that are coming into several goroutines. There are several ways to go about it, but yes, generally you would keep a "seen" map that tells you whether a message has already been processed. Using a mutex is a valid way to ensure that only one goroutine at a time can read or write it. You could also use the <code>sync.Map</code> type, added in Go 1.9, which handles the locking internally.</p></pre>
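<p>As a rough illustration of the mutex-guarded "seen" map combined with the channel fan-out suggested in the comments, here is a minimal sketch. The names <code>seenSet</code>, <code>claim</code>, and <code>processAll</code> are hypothetical, items are assumed to be plain strings, and the per-item counter only stands in for the real analysis. The point it tries to show is that the check and the insert happen under the same lock, so two goroutines cannot both claim the same item.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// seenSet is a hypothetical helper: a set of keys guarded by a mutex.
type seenSet struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newSeenSet() *seenSet {
	return &seenSet{seen: make(map[string]struct{})}
}

// claim reports whether the caller is the first to see key.
// The lookup and the insert share one critical section, so at most
// one goroutine ever gets true for a given key.
func (s *seenSet) claim(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.seen[key]; ok {
		return false
	}
	s.seen[key] = struct{}{}
	return true
}

// processAll feeds items into a channel, lets `workers` goroutines
// receive from it, and returns how often each item was processed.
func processAll(items []string, workers int) map[string]int {
	set := newSeenSet()
	in := make(chan string)

	var (
		wg     sync.WaitGroup
		mu     sync.Mutex // guards counts
		counts = make(map[string]int)
	)

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range in {
				if !set.claim(item) {
					continue // another goroutine already took this item
				}
				// Placeholder for the real analysis of item.
				mu.Lock()
				counts[item]++
				mu.Unlock()
			}
		}()
	}

	// A single producer feeds the channel, then closes it so the
	// workers' range loops end.
	for _, item := range items {
		in <- item
	}
	close(in)
	wg.Wait()
	return counts
}

func main() {
	counts := processAll([]string{"a", "b", "a", "c", "b", "a"}, 4)
	fmt.Println(len(counts), counts["a"], counts["b"], counts["c"])
}
```

<p>If you would rather use <code>sync.Map</code>, its <code>LoadOrStore</code> method does the same check-and-mark in one call: the returned <code>loaded</code> value is true when the key was already present. Calling <code>Load</code> and then <code>Store</code> separately would bring back the race described in the question.</p>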