<p>I was using logrus until now and it caused a really nasty, hard-to-debug bug. Basically, under certain circumstances there were too many calls to <code>logrus.Infof</code>, and hundreds of goroutines were hung acquiring the lock before they could flush their log entries. </p>
<p>For example, log4j does this: <a href="https://logging.apache.org/log4j/2.x/manual/async.html">https://logging.apache.org/log4j/2.x/manual/async.html</a></p>
<hr/>**Comments:**<br/><br/>fakeNAcsgoPlayer: <pre><p>You can pipe the logs to a smaller set of goroutines dedicated to writing via the logrus API; less lock contention, but with some memory penalty.</p>
<p>Or just don't use logrus; use something like zap or similar.</p></pre>shadyabhi: <pre><p>Thanks for replying. </p>
<blockquote>
<p>You can pipe the logs to a smaller set of goroutines dedicated to writing via the logrus API; less lock contention, but with some memory penalty</p>
</blockquote>
<p>Ya, that will require me to write some code. </p>
<p>Does zap not block the calling goroutine?</p></pre>fakeNAcsgoPlayer: <pre><p>We use logrus and noticed that with logrus enabled the throughput of the system drops by 60%. We profiled and found that it is the JSON encoding that takes most of the time, not lock contention. </p>
<p>When you say you are having issue because of locking, did you profile your code?</p>
<p>That said, most of these libraries have their limitations; luckily, you can work around them by writing your own. That is why zap was written. </p></pre>shadyabhi: <pre><p>Thanks for the insight. I didn't profile the code during the issue, as this bug happened randomly on our production systems. </p>
<p>Taking a goroutine dump led me to this. There were thousands of goroutines stuck on this for over 5 minutes. </p>
<pre><code>sync.runtime_SemacquireMutex(0xc42006e2b0)
        /golang/1.8/go/src/runtime/sema.go:62 +0x34
sync.(*Mutex).Lock(0xc42006e2ac)
        /golang/1.8/go/src/sync/mutex.go:87 +0x9d
github.com/Sirupsen/logrus.Entry.log(0xc42006e280, 0xc424e2fda0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc43992da04, 0xc427324f00, ...)
        /my_code/src/github.com/Sirupsen/logrus/entry.go:101 +0x15d
</code></pre>
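<p>The thread doesn't say how the dump was taken; sending SIGQUIT to a Go process is one common way, and another is the <code>runtime/pprof</code> goroutine profile. A minimal, illustrative sketch (not the OP's code):</p>
<pre><code>package main

import (
    "os"
    "runtime/pprof"
)

func main() {
    // Write stack traces for all goroutines to stderr. debug=2 prints full
    // per-goroutine traces, including what each goroutine is blocked on.
    pprof.Lookup("goroutine").WriteTo(os.Stderr, 2)
}
</code></pre>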
<p>So, it was definitely locking in this case. </p></pre>carsncode: <pre><p>If you have hundreds of goroutines hung on logging, I have to wonder: is your logging too verbose? And if not, have you outgrown simple stdout/file-based logging? If you've got that much valuable log data, it might be time to ship directly to a logging service like Logstash, Graylog (GELF), Loggly, etc. or an intermediate log buffer like Redis or RabbitMQ, to be consumed by something like Logstash or Graylog. Like everything else, logging necessitates different approaches at larger scale.</p></pre>shadyabhi: <pre><blockquote>
<p>I have to wonder: is your logging too verbose? </p>
</blockquote>
<p>Yes, it was too verbose for that use case, and it was an unexpected bug. The first thing I did was remove that line.
This issue didn't happen in staging because the QPS wasn't high enough. We're talking about a DNS server deployed at one of the top internet companies in the world; the QPS has to be crazy high to trigger this kind of problem. </p></pre>scottjbarr: <pre><p>Instead of goroutines and complicated in-process logging, you could write the logs to stdout and redirect them to a file, where a log shipper can pick them up and move the log data to its ultimate destination:</p>
<pre><code>yourbinary > logfile.log
</code></pre>
<p>... or pipe the output into another binary that might (for example) push log data to a UDP endpoint</p>
<pre><code>yourbinary | shiptoudp
</code></pre>
<p>... or use the local syslog mechanism if it is available to you.</p></pre>stone_henge: <pre><p>That is a good point, but unrelated to the problem in the OP. Whatever is on the receiving end of the redirection is going to be the bottleneck that OP is already experiencing in-process, unless it solves the problem that OP is asking about. For all we know, OP may be the guy implementing the <code>shiptoudp</code> application.</p></pre>shadyabhi: <pre><p>It won't work because if multiple goroutines are writing to stdout at the same time, the output will be jumbled up. That's the reason <code>logrus</code> and all logging libraries have locks, so that only one thread/goroutine can write at a time. </p>
<p>A solution to this problem is to have all goroutines write to a buffered channel that is drained by a pool of consumers which log to a file/stdout/etc. That way, the goroutine that is trying to log something won't hang. Hope that made sense. </p></pre>rat9988: <pre><p>May I ask how you write to stdout in Go?</p></pre>stone_henge: <pre><p>It exists as an <code>*os.File</code> in the <code>os</code> package as <code>os.Stdout</code>. It may be better to write to stderr, though, because the <code>log</code> package uses it by default.</p></pre>rat9988: <pre><p>Thank you very much! :)</p></pre>tmornini: <pre><blockquote>
<p>It may be better to write to stderr, though, because the log package uses it by default.</p>
</blockquote>
<p>Nice! Thanks!</p></pre>tmornini: <pre><p><code>fmt.Println()</code> et al.?</p></pre>stone_henge: <pre><p>In essence, what you want to do is buffer the messages so that short-term congestion won't hang the goroutines that are logging. For example, direct all your logging to a single goroutine via a buffered channel:</p>
<pre><code>package main

import (
    "log"
    "time"
)

func main() {
    logger := make(chan string, 10000)
    go func() {
        for msg := range logger {
            log.Println(msg)
        }
    }()
    for {
        logger <- "hello"
        time.Sleep(1 * time.Second)
    }
}
</code></pre>
<p>This means that there could be 10000 pending log messages before this starts blocking any thread sending log messages. Adjust the buffer size to fit the characteristics of your system.</p></pre>shadyabhi: <pre><p>That's correct. I was wondering if there's a logging library in Go which already has that implementation.</p></pre>stone_henge: <pre><p>There's no need to switch to another library. logrus already supports setting your own output writer, so just implement the above as an <code>io.Writer</code>:</p>
<pre><code>package main

import (
    "os"
    "time"

    "github.com/sirupsen/logrus"
)

type bufwriter chan []byte

func (bw bufwriter) Write(p []byte) (int, error) {
    // Copy p before handing it off: the caller may reuse its buffer after
    // Write returns, and io.Writer implementations must not retain p.
    bw <- append([]byte(nil), p...)
    return len(p), nil
}

func NewBufwriter(n int) bufwriter {
    w := make(bufwriter, n)
    go func() {
        for p := range w {
            os.Stdout.Write(p)
        }
    }()
    return w
}

func main() {
    logrus.SetOutput(NewBufwriter(10000))
    for {
        logrus.WithFields(logrus.Fields{
            "animal": "walrus",
            "size":   10,
        }).Info("A group of walrus emerges from the ocean")
        time.Sleep(1 * time.Second)
    }
}
</code></pre>
<p>So that's 16 lines of code, IMO too trivial to justify a new dependency</p></pre>cs-guy: <pre><p>github.com/inconshreveable/log15 has code that uses buffered channels. See <a href="http://godoc.org/github.com/inconshreveable/log15#BufferedHandler" rel="nofollow">http://godoc.org/github.com/inconshreveable/log15#BufferedHandler</a>.</p></pre>cs-guy: <pre><p>I worked on adding similar capabilities to github.com/go-kit/kit/log last year in <a href="https://github.com/go-kit/kit/pull/273" rel="nofollow">https://github.com/go-kit/kit/pull/273</a>, but ultimately abandoned it because the error handling didn't meet our standards. The log15 code just discards logging errors, and if that's OK and you would rather use Go kit's log, then you can copy the AsyncLogger type from here: <a href="https://github.com/go-kit/kit/blob/2a348cc41662e402da909dd91146435171d44bbc/log/sync.go#L84" rel="nofollow">https://github.com/go-kit/kit/blob/2a348cc41662e402da909dd91146435171d44bbc/log/sync.go#L84</a></p></pre>allhatenocattle: <pre><p>That pattern can certainly help if the logging is bursty, but if the code is generating too many log events at a constant rate, you'll eventually run out of buffer, no matter how you (reasonably) size it.</p></pre>tv64738: <pre><p>Beware passing mutable data to such an asynchronous logging mechanism, those races will hurt you.</p>
<p>As another angle on your problem, <a href="https://github.com/uber-go/zap" rel="nofollow">https://github.com/uber-go/zap</a> is blazing fast.</p></pre>tmornini: <pre><p>I experimented with this and quickly discovered a myriad of potential race conditions, as the structs we log as JSON mutated asynchronously with respect to serialization. </p></pre>
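<p>A minimal, illustrative variant of the <code>bufwriter</code> above that addresses the two concerns raised here: it copies each message so the caller's buffer can safely be reused, and it drops messages instead of blocking when the channel is full, so sustained overload can't hang the callers. The names (<code>droppingWriter</code>, <code>NewDroppingWriter</code>) are made up for this sketch, not from any library:</p>
<pre><code>package main

import (
    "os"
    "time"

    "github.com/sirupsen/logrus"
)

// droppingWriter buffers log lines in a channel and drops them when it is full.
type droppingWriter chan []byte

func (dw droppingWriter) Write(p []byte) (int, error) {
    // Copy p: io.Writer implementations must not retain the caller's slice.
    msg := append([]byte(nil), p...)
    select {
    case dw <- msg:
    default:
        // Buffer full: drop the message instead of blocking the caller.
    }
    return len(p), nil
}

func NewDroppingWriter(n int) droppingWriter {
    w := make(droppingWriter, n)
    go func() {
        for p := range w {
            os.Stdout.Write(p)
        }
    }()
    return w
}

func main() {
    logrus.SetOutput(NewDroppingWriter(10000))
    for {
        logrus.Info("hello")
        time.Sleep(1 * time.Second)
    }
}
</code></pre>
<p>Dropping under sustained overload is a policy choice; counting dropped messages and logging the count periodically is a common refinement.</p>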
Is there a golang logging library around that doesn't lock the calling goroutine for logging?