Raw TCP performance question

agolangf · 480 clicks
<p>Hello everyone! </p> <p>I am writing a plain TCP server in Go, with the ultimate goal of replacing a Node process I have.</p> <p>The Go version of the program currently:</p> <ol> <li>Receives some data on a plain TCP socket</li> <li>Parses it into “event” and “payload” tags</li> <li>Generates a unique ID for that message</li> <li>Inserts it into a Postgres table. </li> </ol> <p>Right now, the Go version is already much faster than the Node version when it’s stressed. I’m seeing throughput of about 6k messages per second (each of these messages is less than 1KB, with most around the 200B mark).</p> <p>My testing so far has been to create four clients that just send 10k messages each to the server and then disconnect. The clients finish almost immediately, the server finishes in about 7 seconds.</p> <p>My question to you all is: is this normal or slow for Go? If it seems slow, and anyone is interested in more detail, I will post the code, but didn’t want to waste people’s time if this is normal speeds. </p> <p>Thanks for your time! </p> <hr/> <p>Edit:</p> <p>Here is the code: <a href="https://github.com/malexdev/datalistener-test" rel="nofollow">https://github.com/malexdev/datalistener-test</a></p> <p>Creating database tables: <a href="https://gist.github.com/malexdev/fe5df03374b0a5819274274683b89640" rel="nofollow">https://gist.github.com/malexdev/fe5df03374b0a5819274274683b89640</a></p> <p>The machine I&#39;ve been running this on is a MacBook Pro 2016: i7, 16GB memory, PCIe SSD.</p> <p>Also, there does seem to be an application bottleneck somewhere: I&#39;m putting messages into a channel while they&#39;re waiting to be inserted. That channel isn&#39;t being saturated before the inserts happen.</p> <hr/> <p>Edit 2:</p> <p>Alright! Found the bottleneck! I had a single-line query that was running on every message I received, rendering all my other optimizations like the bulk insert useless. 
I&#39;m now up to 60k messages per second without any further optimization, which I&#39;ll consider good enough for now. Thanks for your time everyone :) The repo has since been deleted. </p> <hr/>**Comments:**<br/><br/>sacrehubert: <pre><blockquote> <p>I will post the code, but didn’t want to waste people’s time if this is normal speeds.</p> </blockquote> <p>Performance is going to depend on network (i.e. hardware) configuration, latency, etc., so it&#39;s very difficult to give a meaningful answer. If anything, it&#39;s wasting time <em>not</em> to show us the code ;)</p> <p>If you post your code, we can try to spot any obvious inefficiencies.</p> <p>More to the point: is this performance fast enough for your needs? Does it need to be faster? If not, maybe this is premature optimization? </p></pre>koresho: <pre><p>Good points!</p> <p>Regarding premature optimization: the system has a requirement to accept and process as many as 30000 messages total, spread amongst up to 10 clients, within 5 seconds. While I am easily within that now, I’m also not yet processing the messages; only inserting them into the database immediately. So perhaps I should implement the processing and then come back and revisit this.</p> <p>Thanks for the input! </p></pre>sacrehubert: <pre><blockquote> <p>So perhaps I should implement the processing and then come back and revisit this.</p> </blockquote> <p>This would be my suggestion as well :)</p> <p>More likely than not, things will move around a bit as you implement the remaining pieces, which may or may not invalidate any benchmarks you perform before then.</p></pre>ratatask: <pre><p>When you have a modern network, a database, and reasonably fast code and CPU, the bottleneck is almost always the database. </p> <p>If your message integrity allows it, try to batch together inserts to the database - i.e. instead of inserting 1 message per transaction, insert 10 (or more - up to e.g. 1000) per transaction and commit(). 
You can often get many, many thousands of percent speedup. </p> <p>The theory behind this is that for each transaction, the database has to commit the change to disk and wait for the whole IO operation to complete successfully - with spinning disks this is extremely slow, and committing 1 row to disk takes essentially the same time as committing 100 rows. You get less speedup with SSDs, but usually the speedup is still significant.</p> <p>The drawback is of course that when something goes wrong, you can also lose more messages.</p></pre>qu33ksilver: <pre><p>I&#39;d wager that it&#39;s your DB insertions that are the bottleneck. </p> <p>A fan-out approach with backpressure propagation is ideal for this.</p> <p>Have a primary goroutine which takes the connection and does 2 and 3. Then push a struct to a worker queue of goroutines (I usually use <a href="https://github.com/ivpusic/grpool" rel="nofollow">https://github.com/ivpusic/grpool</a>) which does the job of inserting it into a Postgres table. When the queue gets filled up, i.e. all workers are busy, your primary goroutine should block instead of accepting further messages from the connection.</p> <p>This is a simple model which bounds the number of concurrent DB inserts and at the same time does not let the client push as many messages as it wants. You can even extend it to limit the number of connections itself. </p></pre>koresho: <pre><p>I didn’t use a 3rd party library, but this is essentially the exact implementation I have for the inserts. I am putting the messages into a buffered channel and periodically bulk inserting the messages using COPY. </p> <p>Good to know I’m on the right track in that regard :) thanks! </p> <p>Edit: wait, I was wrong. My implementation is similar (I do fan the work out to multiple routines) but not the same (as I then collapse them back down to a single worker to do the bulk insert). My bad. I’ll check out your method. Thanks! 
</p></pre>ask: <pre><p>How many messages per second can you insert into the Postgres database (using the same transaction granularity)?</p> <p>It seems a bit slow, unless that&#39;s how fast the database can go.</p> <p>The important question is: how many do you need? Is it fast enough? If so, stop and do something else.</p></pre>koresho: <pre><p>Good questions, thanks for the input! I admit I didn’t think to test raw DB insert performance; I kind of just assumed I must be doing something wrong. I’ll do that. Thanks! </p></pre>tty5: <pre><p>More likely than not, IO performance on the DB end is the bottleneck here.</p></pre>jerf: <pre><p>I&#39;d suggest <a href="https://blog.golang.org/profiling-go-programs" rel="nofollow">taking a profile</a> of the program and having a look. I&#39;d also increase the number of messages you&#39;re sending, so you have time to take a look at the Postgres DB&#39;s resource usage. You may find it&#39;s maxing out your CPU usage, depending on how the table is set up. If on a Linux system, also have a look at iotop; you may be maxing out your IOPS even if you are not writing a lot per op. Also bear in mind that if the DB is slow, that may manifest in the Go profile as excessive time apparently spent in the driver function that communicates with the DB.</p> <p>If A: it is Postgres, and B: this is somehow not fast enough for you, consider <a href="https://www.postgresql.org/docs/current/static/populate.html" rel="nofollow">having a look at this</a>. However, for non-initial-import uses I&#39;d recommend against turning off the indexes. But the first two suggestions can be helpful.</p> <p>Also:</p> <blockquote> <p>My testing so far has been to create four clients that just send 10k messages each to the server and then disconnect. The clients finish almost immediately, the server finishes in about 7 seconds.</p> </blockquote> <p>This suggests you may be lacking backpressure. 
You may want to limit the number of connections you accept simultaneously [1]. No point in accepting thousands of incoming connections if they&#39;re all going to try to jam down the same 10 DB connections, and that&#39;s all they&#39;re doing.</p> <p>[1]: I&#39;d start with something simple: running something like numberOfDatabaseConnections * 2 goroutines listening on a channel for a net.Conn, and then modifying the core .Accept() loop to send the resulting net.Conn on a channel to the goroutines rather than starting a goroutine at the time of accepting.</p></pre>koresho: <pre><p>Excellent pointers, thanks very much! Being kind of a Go noob, I had forgotten that profiling is so convenient. And I’m running on macOS at the moment, so I’ll be sure to give iotop a spin. </p> <p>Thanks! </p></pre>fakeNAcsgoPlayer: <pre><p>Slow. I would consider somewhere close to 300k per second good enough if no file or DB IO is involved; since a DB is involved, scale that part first.</p> <p>And <code>go test -bench</code> is your best friend.</p></pre>riaan53: <pre><p>Remember to use <a href="https://golang.org/pkg/bufio/" rel="nofollow">https://golang.org/pkg/bufio/</a> to save on some syscalls :)</p> <p>What I normally do on the server is to add a uint32 length frame on a <a href="https://github.com/gogo/protobuf" rel="nofollow">https://github.com/gogo/protobuf</a> payload. Use <a href="https://golang.org/pkg/bufio/#NewReader" rel="nofollow">https://golang.org/pkg/bufio/#NewReader</a> and <a href="https://golang.org/pkg/io/#ReadFull" rel="nofollow">https://golang.org/pkg/io/#ReadFull</a> to read the length of the frame, and then a ReadFull again to get the payload. Depending on the source, you might want to add a CRC check as well.</p> <p>For the DB, use a nice fast Postgres lib and not a heavyweight ORM. 
</p> <p>You will probably receive messages way faster than Postgres can handle them, so either batch them for insert, queue them internally or externally (like Kafka or NATS Streaming), and/or handle backpressure by instructing the clients to slow down, since things always go south eventually, like bad latency spikes to your DB.</p> <p>If you can point us to the code, it will help.</p></pre>koresho: <pre><p>Thanks for your time! I have added the code: <a href="https://github.com/malexdev/datalistener-test" rel="nofollow">https://github.com/malexdev/datalistener-test</a></p> <p>I am using the built-in <code>database/sql</code> in conjunction with <code>pq</code>.</p> <p>Edit: issue resolved, so the public repo has been deleted. Thanks for your help everyone :) </p></pre>zeiko_is_back: <pre><p>Show us the code!</p></pre>koresho: <pre><p>Done: <a href="https://github.com/malexdev/datalistener-test" rel="nofollow">https://github.com/malexdev/datalistener-test</a></p> <p>Thanks for your time!</p> <p>Edit: issue resolved, so the public repo has been deleted. Thanks for your help everyone :) </p></pre>sposec: <pre><p>404?</p></pre>koresho: <pre><p>I resolved the problem, so I pulled the public repo. Its actual home is on an internal server. </p> <p>I edited the main post, but I guess I should have also edited my other posts. Sorry. </p></pre>
