Finding the source of goroutine leaks

xuanbao · 536 views
I've recently detected a slow goroutine leak in my app which seems to be coming from the HTTP server (from `/pprof/goroutine?debug=1`):

```
124 @ 0x42fc43 0x42a5ee 0x429ab0 0x4ea11a 0x4ea186 0x4ebfda 0x501954 0x6311a6 0x7bd1f9 0x7be06a 0x7be0e3 0x969241 0x968fb0 0x62a7b6 0x6329a9 0x6373b7 0x460d01
#	0x429ab0	net.runtime_pollWait+0x60			/usr/local/go/src/runtime/netpoll.go:160
#	0x4ea11a	net.(*pollDesc).Wait+0x3a			/usr/local/go/src/net/fd_poll_runtime.go:73
#	0x4ea186	net.(*pollDesc).WaitRead+0x36			/usr/local/go/src/net/fd_poll_runtime.go:78
#	0x4ebfda	net.(*netFD).Read+0x23a				/usr/local/go/src/net/fd_unix.go:250
#	0x501954	net.(*conn).Read+0xe4				/usr/local/go/src/net/net.go:172
#	0x6311a6	net/http.(*connReader).Read+0x196		/usr/local/go/src/net/http/server.go:526
#	0x7bd1f9	bufio.(*Reader).fill+0x1e9			/usr/local/go/src/bufio/bufio.go:97
#	0x7be06a	bufio.(*Reader).ReadSlice+0x21a			/usr/local/go/src/bufio/bufio.go:328
#	0x7be0e3	bufio.(*Reader).ReadLine+0x53			/usr/local/go/src/bufio/bufio.go:357
#	0x969241	net/textproto.(*Reader).readLineSlice+0x81	/usr/local/go/src/net/textproto/reader.go:55
#	0x968fb0	net/textproto.(*Reader).ReadLine+0x40		/usr/local/go/src/net/textproto/reader.go:36
#	0x62a7b6	net/http.readRequest+0xb6			/usr/local/go/src/net/http/request.go:721
#	0x6329a9	net/http.(*conn).readRequest+0x359		/usr/local/go/src/net/http/server.go:705
#	0x6373b7	net/http.(*conn).serve+0x947			/usr/local/go/src/net/http/server.go:1425
```

The goroutines seem to be waiting on a `Read` from the connection. Although I could potentially "fix" this with connection timeouts, it appears to be happening on my simplest handlers, so I'd like to trace exactly which requests are causing it. The leak pattern is quite strange: for every 500 requests I send, it grows by about 10-20 goroutines.

What's the best way to figure this out? I could instrument the `http` package in the stdlib, though I was thinking there might be a more elegant solution.

---

**Comments:**

**aaron42net:**

Check `netstat -tn` (assuming you are on Linux or OS X) to see the state of the TCP connections to your app. If they don't show up at all or are in TIME_WAIT, they have leaked in the Go server somehow. If they are still ESTABLISHED, the client hasn't closed them at all. And if they are in a strange state like FIN_WAIT1, they are still closing and may stay that way for a while due to a misbehaving firewall or packet loss between the client and the server.

http.ListenAndServe() enables SetKeepAlive(true) on the TCP connection, which causes the kernel to occasionally send an empty TCP packet to verify that the client is still connected. After somewhere between a few minutes and a few hours, it will give up trying to talk to a client that is no longer responding. You can set ReadTimeout and WriteTimeout on the http.Server struct to get application-level timeouts faster than that.

**Jamo008:**

I checked `lsof -i` while I had ~500 goroutines blocked at the frame above and didn't see anything out of the ordinary, though maybe netstat will show me more.

> [SetKeepAlivesEnabled](https://golang.org/pkg/net/http/#Server.SetKeepAlivesEnabled) controls whether HTTP keep-alives are enabled. By default, keep-alives are always enabled.

Keep-alives are already enabled on the [default RoundTripper](https://golang.org/src/net/http/transport.go#L37).

While I figure this out, I've indeed just set Read/Write timeouts on the http.Server.

Thanks for the advice.

**fortytw2:**

Can you replicate the issue in a unit test? Try using a lib I pulled out of the standard library, github.com/fortytw2/leaktest, to see if anything is leaked after the test ends.

**Jamo008:**

Thanks, I'll give that a try.

**nicerobot:**

Forgive me for starting from the basics if your profiling skills are far beyond this, but are you sure you're not looking at cumulative stats? Maybe try something as simple as the following to reveal the running goroutines:

```go
func init() {
	go func() {
		ticker := time.NewTicker(5 * time.Second)
		for range ticker.C {
			// Every 5 seconds, dump the stacks of all running goroutines.
			buf := make([]byte, 1<<20)
			stacklen := runtime.Stack(buf, true)
			log.Printf("STACK TRACE\n==============\n%s\n==============\n", buf[:stacklen])
		}
	}()
}
```

**Jamo008:**

If you leave a down-vote, please also leave a comment as to why.
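---

For context on where the profile in the question comes from: a goroutine dump like the one above is served by the standard library's `net/http/pprof` package. A minimal sketch of exposing it follows; the `:8080` address is an arbitrary choice, and the OP may have mounted the handlers under a different prefix than the default `/debug/pprof/`.

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // blank import registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// With the blank import above, the goroutine profile discussed in the
	// question is available at http://localhost:8080/debug/pprof/goroutine?debug=1
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```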
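A minimal sketch of the Read/Write timeouts aaron42net recommends and Jamo008 ends up setting; the handler, address, and 10-second values are placeholders rather than anything from the thread.

```go
package main

import (
	"log"
	"net/http"
	"time"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})

	srv := &http.Server{
		Addr:    ":8080",
		Handler: mux,
		// ReadTimeout bounds how long a connection may block reading the next
		// request -- the readRequest frame the leaked goroutines are parked in.
		ReadTimeout: 10 * time.Second,
		// WriteTimeout bounds writing the response.
		WriteTimeout: 10 * time.Second,
	}
	log.Fatal(srv.ListenAndServe())
}
```

With these set, a goroutine stuck in `net/http.(*conn).readRequest` gets a read error once the deadline passes and exits, instead of waiting indefinitely for a client that never sends another request.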
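And a minimal sketch of the unit-test approach fortytw2 suggests with github.com/fortytw2/leaktest; the httptest server and handler here are hypothetical stand-ins for the OP's real handlers.

```go
package myapp

import (
	"net/http"
	"net/http/httptest"
	"testing"

	"github.com/fortytw2/leaktest"
)

func TestHandlerDoesNotLeakGoroutines(t *testing.T) {
	// leaktest snapshots the goroutines alive now and fails the test if new
	// ones are still running when the deferred check executes.
	defer leaktest.Check(t)()

	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		t.Fatal(err)
	}
	resp.Body.Close()

	// Close the client's idle keep-alive connection so its background
	// goroutines are not reported as leaks.
	http.DefaultTransport.(*http.Transport).CloseIdleConnections()
}
```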
