Is there a way to supervise and restart goroutines if they crash?

blov · · 685 views


I'm building a program that will do 3 separate pieces of work, which we'll call a, b, and c. I want to perform those pieces of work on different intervals: a should fire every 1 second, b every 1 hour, and c every day. My thinking is to start a goroutine for each of a, b, and c, each of which could be a for loop that just sleeps for 1 second, 1 hour, or 1 day respectively. But what if one of these goroutines crashes suddenly? What I need is a way to restart the goroutine if something happens. Is there any way to "supervise" these goroutines so that if one crashes I can automatically restart it?
<hr/>**Comments:**
keipra: <pre>Also, you should probably use a Ticker from the stdlib <code>time</code> package if you want a lightweight way to schedule recurring tasks.</pre>
keipra: <pre>What do you mean by "crash suddenly"? Generally, all of your failure conditions would be expressed as an error. Go errors do not cause goroutines to "crash suddenly". If your goroutines do crash, there's probably nothing you're going to be able to do about it (a panic due to allocation failure, for example). Or it's code you have written that uses panic when it shouldn't, in which case you can fix it. I think you should think a bit harder about what you are actually concerned might happen and try to express it more clearly.</pre>
diegobernardes: <pre>Take a look at <a href="https://github.com/thejerf/suture" rel="nofollow">suture</a>.</pre>
Jaeemsuh: <pre>Oh this is cool, thanks.</pre>
toolateforTeddy: <pre>IMO, you should not think of goroutines as long running. You should have one scheduler that launches a goroutine each time you want to fire one of these tasks.</pre>
toolateforTeddy: <pre><a href="https://play.golang.com/p/H_YEHfvA4Co" rel="nofollow">https://play.golang.com/p/H_YEHfvA4Co</a> Something like this. Then you don't have to worry about "thread" management.
Each time you start thinking of goroutines as weighty and needing their lifetime managed, consider whether you might be going against idiomatic Go. <pre><code>package main

import (
	"fmt"
	"os"
	"time"
)

func a() { fmt.Println("a") }
func b() { fmt.Println("b") }
func c() { fmt.Println("c") }

func main() {
	fmt.Println("Hello, playground")

	// One ticker per task, each firing at that task's interval.
	aTimer := time.NewTicker(time.Second)
	bTimer := time.NewTicker(time.Hour)
	cTimer := time.NewTicker(24 * time.Hour)

	// Just so that you can see the program finish.
	ender := time.NewTimer(10 * time.Second)

	for {
		select {
		case <-aTimer.C:
			go a() // launch a short-lived goroutine per tick
		case <-bTimer.C:
			go b()
		case <-cTimer.C:
			go c()
		case <-ender.C:
			os.Exit(0)
		}
	}
}
</code></pre></pre>
Jaeemsuh: <pre>Wow this is awesome! Thanks for sharing this.</pre>
therealfakemoot: <pre>What you're probably looking for is a "message queue" of some sort; a proper message queue will be able to handle acknowledgement/receipt of messages, verify completion of work, and redistribute messages if a recipient has failed in some way. I haven't worked with message queues at any length, but I believe <a href="http://zeromq.org" rel="nofollow">zeromq</a> is a generally useful application. So basically, your "producers" will pump messages into zeromq. Your worker goroutines will consume from zeromq. Producers and consumers will submit side-messages saying "I got this message", "I finished working on this message, here's the result", things like that. Hypothetically, you could implement all of this from scratch in your own application, but it's not easy to be fault-tolerant across the huge range of possible failures: What if zeromq crashes? What if a zeromq node experiences a disk failure? What if a zeromq node experiences a short-term network disconnect? What if zeromq experiences a long-term network disconnect? What if a worker fails to acknowledge receipt of a message? What if a worker acknowledges receipt but fails to communicate any further? How long do you wait for it to "come back"?
If your application is strictly single-machine, you could probably just use an extra channel or two to communicate this extra stuff: "goroutineX received workPayload1", "goroutineY received workPayload2", "goroutineX completed work on workPayload1". If you feel up to the challenge, give it a try; it will probably be very illuminating on the full nature of the problem space.</pre>
keipra: <pre>You're trying to sell this poor dude a race car just to drive to the grocery store.</pre>
therealfakemoot: <pre>Ooooor I'm presenting multiple options for a very underspecified question. "I have an application that will do work" doesn't include any details about where the work is coming from, what the nature of the work is, or how reliable the system needs to be. Note that at the end of my comment, I did propose a rough sketch of a hand-rolled solution involving local channels instead of an external message delivery service.</pre>
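For readers who want something between a bare ticker loop and a full supervision library, here is a minimal sketch of the "restart on crash" idea from the question, assuming "crash" means a panic inside the task. The names `runSafely` and `supervise`, the simulated failure, and the 3.5-second demo timeout are all hypothetical and not from the thread. It relies only on `recover`, `time.NewTicker`, and `context` from the standard library, and is not a substitute for a tested supervisor such as suture.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"
)

// runSafely (hypothetical helper) calls task and converts a panic into an
// error, so a failing run cannot take down the goroutine that called it.
func runSafely(task func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("task panicked: %v", r)
		}
	}()
	task()
	return nil
}

// supervise (hypothetical helper) runs task once per interval until ctx is
// cancelled. A panic in one run is logged and the loop waits for the next tick.
func supervise(ctx context.Context, name string, interval time.Duration, task func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if err := runSafely(task); err != nil {
				log.Printf("%s: %v (will run again on the next tick)", name, err)
			}
		}
	}
}

func main() {
	// Stop everything after 3.5 seconds so the demo terminates.
	ctx, cancel := context.WithTimeout(context.Background(), 3500*time.Millisecond)
	defer cancel()

	runs := 0 // only touched by the goroutine supervising task a
	go supervise(ctx, "a", time.Second, func() {
		runs++
		if runs == 2 {
			panic("simulated failure") // assumption: stands in for a real bug
		}
		fmt.Println("a ran, run", runs)
	})
	go supervise(ctx, "b", time.Hour, func() { fmt.Println("b") })
	go supervise(ctx, "c", 24*time.Hour, func() { fmt.Println("c") })

	<-ctx.Done()
}
```

Two caveats apply to this sketch. `recover` only catches panics raised on the goroutine that runs the deferred function, so a panic in a goroutine started by the task itself would still terminate the program. It also cannot intercept fatal runtime errors such as running out of memory, which matches keipra's point that some failures cannot be handled in-process.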
