HTTP request tester, slower with goroutines

xuanbao · 1173 views
This resource was shared some time ago; the information in it may have since developed or changed.
<p>HTTP request with goroutines, read around 2000 responses in 10 seconds <a href="https://gist.github.com/vp89/8612ce5f6413e7e7bf3c">https://gist.github.com/vp89/8612ce5f6413e7e7bf3c</a></p>

<p>HTTP request without goroutines, read around 30000 responses in 10 seconds <a href="https://gist.github.com/vp89/2b4d3a4de2ad7688bae8">https://gist.github.com/vp89/2b4d3a4de2ad7688bae8</a></p>

<p>I have a simple JSON API up which reads from a database. I want to stress test it, and I tried using goroutines, but I'm finding that it's much slower than not using them.</p>

<p>Am I not using the goroutines properly?</p>

<hr/>**Comments:**

Everlag: <pre><p>Have you set GOMAXPROCS > 1?</p> <p>If you don't explicitly tell the runtime to run in parallel, the overhead of switching between goroutines can be significant.</p> <p>Goroutines are great if you are doing I/O that takes time; testing against a local server with sub-millisecond response times can negate the usefulness of concurrency alone. That could be the issue you are encountering.</p> <p>To set GOMAXPROCS > 1,</p> <p>runtime.GOMAXPROCS(runtime.NumCPU())</p> <p>should work.</p> <p>Also, I'm not sure that someInt++ is atomic. If you don't lock (sync.Mutex), or better yet keep a counter per goroutine and sum them at the end of the test, you could potentially count many fewer requests than you are actually making!</p></pre>

v89_cs: <pre><p>Nice, this made a real difference; it's now getting between 2.9-3.1k responses per second, which may be a limitation on the server side, or a coincidence that it's similar to the non-goroutine method.</p> <p><a href="https://gist.github.com/vp89/1a6f67486241ec33b769">https://gist.github.com/vp89/1a6f67486241ec33b769</a></p> <p>I played around with the number of goroutines, and when hitting localhost it seems to work best with runtime.NumCPU() goroutines.</p> <p>Using more than that locally actually gives worse performance, but if you go out to the Internet then more is more: 20 goroutines against google.com gets me 100 responses a second while 10 gets me 50, for example.</p></pre>

Everlag: <pre><p>Happy to help.</p> <p>Like I said, the time a goroutine can sleep while waiting for I/O determines the optimal amount. The effectively 0 ms latency to localhost penalizes many goroutines, while the high latency to the Internet leaves a much higher ceiling thanks to the >100 ms round trips.</p> <p>As an aside, sync/atomic is probably a package you'd want to avoid in production code. Higher-level primitives, or a different strategy that doesn't require atomicity, would be preferred.</p></pre>

jerf: <pre><p>I think it's perfectly sensible to use atomic addition for a safe counter update. All the "higher-level primitives" will use multiple such atomic operations to implement themselves, all so that you can... have the exact same result? That's not sensible.</p> <p>The danger with sync/atomic is if you're using those functions to implement your own semaphore or something, not if you just want atomic incrementing.</p> <p>And unless you're operating at a scale much higher than the OP is describing, it isn't worth your time to split the work into pieces that don't require atomicity on the counters. You won't even notice the incrementing in a profile; given the way the profiler works, it is very unlikely to even hit an increment.</p> <p>You can't just say "don't use sync/atomic".</p></pre>

blamarvt: <pre><p>Can you elaborate on avoiding sync/atomic? Is it just slow?</p></pre>

ratatask: <pre><p>That's likely because you're not I/O bound when hitting localhost, whilst over the Internet your goroutines spend more time waiting for I/O and you benefit from having more requests in flight. You should also find a better way for the last loop in main() not to burn CPU.</p></pre>

stephenalexbrowne: <pre><p>I looked over your code, and at first glance there does not appear to be anything wrong with how you are using goroutines. The thing about goroutines is that there is some overhead to starting them. I suspect what's probably happening here is that the overhead from creating the goroutines is high enough that it cancels out the performance gain you would get from sending multiple requests at once.</p> <ol> <li>The code powering your API is also potentially a factor here. Could you share that code with us?</li> <li>You should try benchmarking e.g. google.com to see what the difference is there. It's possible that anything on localhost responds so fast that you don't gain much by sending multiple requests at once. Whereas if you try google.com, their servers are much farther away and will take orders of magnitude more time per request, which means being able to send multiple requests at once is a bigger advantage.</li> <li>Did you set GOMAXPROCS? By default Go will only use one CPU core, and you can often get better performance by increasing the number of cores it uses.</li> </ol></pre>

stephenalexbrowne: <pre><p>I take that back. It looks like there is actually a race condition occurring: reqs and errors are both global variables being accessed by multiple goroutines. Instead, I would suggest keeping a set of counters for each goroutine and adding them up at the end. Could you try running with <code>go run -race .</code> to run the race detector?</p></pre>

v89_cs: <pre><p>Good points on 2 and 3; setting GOMAXPROCS made a real difference. I did not know about this before.</p> <p>It seems that locally I don't gain much from using more goroutines than the number of cores I have, but going out to the Internet you benefit from using more goroutines.</p></pre>

jerf: <pre><p>Along with all the other suggestions: what database are you using, and how are you using it? I've had situations in systems without Go at all where sequential performance was faster than multi-process performance because I had contention on the database itself. It isn't out of the question that Go is parallelizing just fine and it's actually hammering the database into the ground, especially if you're trying to write/lock the same things in all the requests; and that depends on a <em>lot</em> of things to determine correctly.</p></pre>
