Advice about job queue solution for Go

blov · 477 views
<p>Hi</p> <p>I&#39;m writing a service which listens on REST endpoints and starts jobs accordingly: think something like processing 1000 image files and generating thumbnails for each one. I&#39;m gonna have a few tasks of different priorities and different parallelizability. Can someone recommend a way to implement this in Go? I don&#39;t have any experience with job queues, but one seems like the answer, though a lot of the big libraries seem like overkill.</p> <p>Basically I need a way to enqueue a job (a normal Go function) from my API endpoints, and then start executing those jobs ASAP from within the same server and binary. I don&#39;t need any distributed solution, so preferably something that isn&#39;t going to need me to run Redis or Celery etc. It&#39;s a low traffic server that&#39;s gonna experience bursts of job enqueuing, and then sit for the next few hours in the background finishing the list of jobs.</p> <p>Thanks</p> <hr/>**Comments:**<br/><br/>i47: <pre><p>Check out this article from someone at MalwareBytes <a href="http://marcio.io/2015/07/handling-1-million-requests-per-minute-with-golang/" rel="nofollow">http://marcio.io/2015/07/handling-1-million-requests-per-minute-with-golang/</a></p></pre>bbslimebeck: <pre><p>Perfect, thank you</p></pre>redlandmover: <pre><p>Just a heads up, that person is no longer with the company.</p></pre>mcastilho: <pre><p>That is correct. After a 6-month transition I have left all of our cloud projects in very good hands with the awesome team I have created there at Malwarebytes. 
I am now Chief Architect Officer of KnowBe4, a Cybersecurity Awareness Training company (<a href="http://www.knowbe4.com" rel="nofollow">http://www.knowbe4.com</a>), that I helped our CEO Stu Sjouwerman and Kevin Mitnick start back in 2010.</p></pre>jiuweigui: <pre><p>What are you trying to imply here?</p></pre>redlandmover: <pre><blockquote> <blockquote> <p>from someone at MalwareBytes </p> </blockquote> <p>Just a heads up, that person is no longer with the company.</p> </blockquote> <p>Absolutely nothing except what was explicitly stated. The author of that blog post is a great dev, but to say that he is still at Malwarebytes is incorrect. He left to do other (awesome) things.</p></pre>simonw: <pre><p><a href="http://contribsys.com/faktory/" rel="nofollow">http://contribsys.com/faktory/</a> is a relatively new job queue server written in Go by the author of Sidekiq, one of the most popular Ruby options. Looks very promising.</p></pre>014a: <pre><p>It really depends on how durable and scalable you want your solution to be.</p> <p>The simplest solution is seriously just a buffered channel with a separate goroutine reading from the channel. You can set the buffer size high so that it can absorb some burst without blocking writes. That being said:</p> <ul> <li>If you exceed the buffer, callers will be blocked, slowing your client API requests down.</li> <li>You can&#39;t distribute work over multiple servers; the server that handled the request will always also do the work.</li> <li>If the server dies, all the jobs are lost. And this might be expected to happen if your usage is especially bursty.</li> </ul> <p>But while these are bad, it&#39;s actually pretty decent for simple projects.</p> <p>Beyond that, I&#39;d highly suggest using SQS (if you&#39;re on AWS) or Google Cloud Pub/Sub (GCPS, if you&#39;re on GCloud). These are high scale queueing solutions that solve all of these problems. They&#39;re also practically free at low usage. 
They don&#39;t come without downsides, though:</p> <ul> <li>More infra to support, though it&#39;s <strong>significantly</strong> easier than running a turnkey queue system like Redis.</li> <li>You can enqueue messages asynchronously from the request, but then you can&#39;t guarantee to callers that the request was actually enqueued. So it&#39;s best to do the enqueue inside the thread that handled the request, which might add a few milliseconds to the request time.</li> <li>Reading the queues isn&#39;t instant; it requires network calls. But it&#39;s so fast this shouldn&#39;t matter.</li> <li>Neither of these products can guarantee exactly once delivery. They only do &#34;at least once&#34; delivery. So it&#39;s possible your messages might occasionally be delivered twice. To get exactly once, you can deduplicate incoming messages with the unique identifier of the message, either in-memory if you run one server or in a database like Redis if you run multiple servers. Or, preferably, you can architect your system to be tolerant of receiving the same message twice by being idempotent. </li> </ul> <p>I&#39;d stay away from Redis, Celery, etc. unless you know why you need them. SQS/GCPS are great and basically free. Azure probably has a similar solution, I just haven&#39;t looked into it. Use &#39;em. </p></pre>scottjbarr: <pre><p>Even if your queue client code isn&#39;t on AWS, I would recommend giving SQS a shot due to it being fast and reliable. With SQS a message will keep coming back to you until you ACK (in SQS you delete) the message. This requires your code to be idempotent in the context of the message, but that isn&#39;t a bad thing when you consider messages aren&#39;t just getting dropped on the floor if they fail to be processed by your handler. I know some people say just put it back on the queue (if you&#39;re using Redis), but now what happens when Redis is unavailable temporarily? Your message will be lost. 
I definitely prefer the more reliable option for critical messages. Given that it is so cheap as to be almost free, and you don&#39;t need to manage the queue itself, it is a great option.</p> <p>Having a reliable queue also means you need to think about what to do with messages that can never be processed. Instead of just erroring out for whatever reason and forgetting about the message, you&#39;ll still need to remove the message from the queue. You can leave it there but that seems pointless and can create a lot of thrashing for your handler. </p> <p>Just my thoughts. There are other options but SQS has worked every time, for little effort on my side.</p></pre>rotharius: <pre><p>Not really a <em>job</em> queue, but have you tried a message service like NATS or, if you need Kafka-like persistence, NATS Streaming?</p> <p>You can write your messages as structs representing commands or events, send them through the message service and let another service react to them. It is extremely easy to set up and does fan-out by default (all subscribed services get the message), but competing consumers (messages are delivered in a round robin fashion) can be implemented using a queue group id.</p></pre>BeardsAreWeird: <pre><p>Do you need any kind of persistence in case the server crashes? If so, RabbitMQ is easy to use and quick to get a prototype working with. If not, goroutines and channels would be the simplest solution. </p></pre>bbslimebeck: <pre><p>This is for a personal helper service so while I kinda do need persistence I&#39;m guessing that all the jobs will be executed quickly enough. I&#39;m going to go with the goroutines approach explained above, thanks</p></pre>hell_0n_wheel: <pre><blockquote> <p>I&#39;m guessing that all the jobs will be executed quickly enough. </p> </blockquote> <p>Then we can guess that all your users will experience errors soon enough.</p> <p>Unless you give us a few more details about your requirements (scale, uptime, available resources, environment, etc.) 
you&#39;re going to get answers all over the map.</p> <p>IMO, goroutines should be a tool of last resort: you&#39;ll be reinventing the wheel, and making all the mistakes others have made in the process.</p> <p>RabbitMQ / ZeroMQ are super quick and easy to set up and run with, and there are golang wrappers already available for each tool.</p></pre>rotharius: <pre><p>I agree with this. There are even services simpler than RabbitMQ or ZeroMQ. If you don&#39;t need persistence, you could even use Redis.</p></pre>hell_0n_wheel: <pre><p>I&#39;ve used all three as queues before, and ZeroMQ wins on simplicity. Redis is likely the most complicated. This is all as a single-node, single-client install...</p></pre>bbslimebeck: <pre><p>I should&#39;ve made it more clear but the program won&#39;t be leaving my tiny VPS and I&#39;m going to be the only user. But I think I&#39;ll end up using something like RabbitMQ even just as a learning experience. </p></pre>hell_0n_wheel: <pre><p>For practical purposes, yeah, ditch everything, go single threaded.</p> <p>For learning purposes, ditch everything, go 2x single threaded (producer and consumer), and use a message queue.</p> <p>I can&#39;t say this enough: YAGNI, YAGNI, YAGNI. Even if you&#39;re building a service to run at scale, build it simply and test it thoroughly before deciding how to &#34;fix&#34; the design for better throughput. More often than not, a simple design on modern hardware is sufficient to handle scale.</p></pre>
