Is there a way to supervise and restart goroutines if they crash?

blov · 664 views
<p>I'm building a program that will do 3 separate pieces of work; we'll call them a, b, and c. I want to perform those pieces of work at different intervals: a should fire every 1 second, b every 1 hour, and c every day.</p>
<p>My thinking is to start a goroutine for each of a, b, and c, each of which could be a for loop that just sleeps for 1 second, 1 hour, and 1 day respectively.</p>
<p>So now I'm wondering what happens if one of these goroutines crashes suddenly. What I need is a way to restart the goroutine if something happens. Is there any way to "supervise" these goroutines so that if one crashes I can automatically restart it?</p>
<hr/>**Comments:**<br/><br/>
keipra: <pre><p>Also, you should probably use a Ticker from the stdlib <code>time</code> package if you want a lightweight way to schedule recurring tasks.</p></pre>
keipra: <pre><p>What do you mean by crash suddenly? Generally all of your failure conditions would be expressed as an error.</p> <p>Go errors do not cause goroutines to "crash suddenly". If your goroutines do, there's probably nothing you're going to be able to do about it (panic due to allocation failure, for example). Or it's code you have written that uses panic where it shouldn't, in which case you can fix it.</p> <p>I think you should think a bit harder about what you are actually concerned might happen and try to express it more clearly.</p></pre>
diegobernardes: <pre><p>Take a look at <a href="https://github.com/thejerf/suture" rel="nofollow">suture</a>.</p></pre>
Jaeemsuh: <pre><p>Oh this is cool, thanks.</p></pre>
toolateforTeddy: <pre><p>IMO, you should not think of goroutines as long running. You should have one scheduler that launches a goroutine each time you want to fire one of these tasks.</p></pre>
toolateforTeddy: <pre><p><a href="https://play.golang.com/p/H_YEHfvA4Co" rel="nofollow">https://play.golang.com/p/H_YEHfvA4Co</a></p> <p>Something like this. 
Then you don't have to worry about "thread" management. Each time you start thinking of goroutines as weighty and needing their lifetime managed, consider whether you might be going against idiomatic Go.</p> <pre><code>package main

import (
	"fmt"
	"os"
	"time"
)

func a() { fmt.Println("a") }
func b() { fmt.Println("b") }
func c() { fmt.Println("c") }

func main() {
	fmt.Println("Hello, playground")

	// One ticker per task; each fires on its own interval.
	aTimer := time.NewTicker(time.Second)
	bTimer := time.NewTicker(time.Hour)
	cTimer := time.NewTicker(24 * time.Hour)

	// Just so that you can see the program finish.
	ender := time.NewTimer(10 * time.Second)

	// The scheduler loop launches a fresh goroutine for every run of a task.
	for {
		select {
		case &lt;-aTimer.C:
			go a()
		case &lt;-bTimer.C:
			go b()
		case &lt;-cTimer.C:
			go c()
		case &lt;-ender.C:
			os.Exit(0)
		}
	}
}
</code></pre></pre>
Jaeemsuh: <pre><p>Wow this is awesome! Thanks for sharing this.</p></pre>
therealfakemoot: <pre><p>What you're probably looking for is a "message queue" of some sort; a proper message queue will be able to handle acknowledgement/receipt of messages, verify completion of work, and redistribute messages if a recipient has failed in some way.</p> <p>I haven't worked with message queues at any length but I believe <a href="http://zeromq.org" rel="nofollow">zeromq</a> is a generally useful application. So basically, your "producers" will pump messages into zeromq. Your worker goroutines will consume from zeromq. Producers and consumers will submit side-messages saying "I got this message", "I finished working on this message, here's the result", things like that.</p> <p>Hypothetically, you could implement all of this from scratch in your own application but it's not easy to be fault-tolerant across the huge range of possible failures: what if zeromq crashes? what if a zeromq node experiences a disk failure? what if a zeromq node experiences a short-term network disconnect? 
what if zeromq experiences a long-term network disconnect? what if a worker fails to acknowledge receipt of a message? what if a worker acknowledges receipt but fails to communicate any further? how long do you wait for it to 'come back'?</p> <p>If your application is strictly single-machine, you could probably just use an extra channel or two to communicate this extra information: "goroutineX received workPayload1", "goroutineY received workPayload2", "goroutineX completed work on workPayload1". If you feel up to the challenge, give it a try; it will probably be very illuminating about the full nature of the problem space.</p></pre>
keipra: <pre><p>You're trying to sell this poor dude a race car just to drive to the grocery store.</p></pre>
therealfakemoot: <pre><p>Ooooor I'm presenting multiple options for a very underspecified question. "I have an application that will do work" doesn't include any details about where the work is coming from, what the nature of the work is, or how reliable the system needs to be.</p> <p>Note that at the end of my comment, I did propose a rough sketch of a hand-rolled solution involving local channels instead of an external message delivery service.</p></pre>
