Dealing with loads of server connections.

I have a REST API server, and what I'm trying to accomplish is to:

1. Limit the connections' (total) processing time efficiently.
2. Prevent DDoS by limiting the number of concurrent connections to n and adding the remaining m-n to a queue.
3. Let queued requests wait up to x milliseconds to start processing, or else cancel them completely.
4. Limit each connection to a processing-time limit, unless it's a file upload.

I don't really know how to accomplish all of these things efficiently; if you could point me in the right direction, that would be great.
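Purely as an illustration of what items 2-4 above could look like, here is a minimal sketch of a concurrency-limiting middleware for a plain net/http server. The names and numbers (maxConcurrent, queueWait, the 2-second handler budget) are made up for the example, not taken from the thread, and the replies below argue this approach is usually the wrong one.

```go
package main

import (
	"context"
	"net/http"
	"time"
)

// limitConcurrency wraps a handler so that at most maxConcurrent requests run
// at once. Excess requests wait up to queueWait for a slot; if none frees up,
// they are rejected with 503. Names and values are illustrative only.
func limitConcurrency(next http.Handler, maxConcurrent int, queueWait time.Duration) http.Handler {
	slots := make(chan struct{}, maxConcurrent) // buffered channel used as a semaphore
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case slots <- struct{}{}: // got a slot
		case <-time.After(queueWait): // waited too long in the "queue"
			http.Error(w, "server busy", http.StatusServiceUnavailable)
			return
		}
		defer func() { <-slots }()

		// Give the handler a processing budget via the request context (item 4).
		// Handlers must check this context themselves (e.g. pass it to database
		// calls) for the budget to have any effect, and a real server would
		// skip it for file uploads.
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	http.ListenAndServe(":8080", limitConcurrency(mux, 100, 250*time.Millisecond))
}
```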
---

**Comments:**

**jerf:** Unless you are trying to enforce limits on paying customers who have only paid for a certain number of calls, the correct answer is probably that you are on completely the wrong track, and everything you've suggested is literally worse than a waste of time... it can only make things *worse!*

At scale, services generally don't behave the way people expect until they've had experience with this sort of thing for a while. The way things tend to work is that the system pretty much just works, until it hits a pretty hard barrier, and then everything falls apart. The more powerful computers get and the better all the bits and pieces work, the more true this is... and nowadays, every computer is *extremely* powerful. Yes, there is an in-between state, but it's smaller than you think, and you get to "completely unacceptable performance" very quickly.

One of the in-between states is when you just *start* dipping into swap. It degrades the system, but since what usually gets swapped out first is precisely what you didn't need, you *might* get a chance to be alerted by your monitoring and fix it before it's a problem. *Might.* Don't count on it.

One of the in-between states is *not* when you suddenly need 110% of your CPU. Requests being what they are, you need a very, very particular situation for the dips in traffic to let you "catch back up". In almost every real situation you will encounter, if you run out of CPU the system simply comes apart at the seams, almost immediately. If you can handle 1000 req/s and you're getting 1100 req/s, the absolute best case is that, no matter how you slice it, you fall 100 req/s behind. It doesn't matter how you try to reschedule those requests; *something* is going to fail. And those requests are not only correlated with each other, they're correlated in exactly the way you don't want: if your service starts to choke, people generally respond by hitting it *harder*.

The human intuition that the system will slowly and gracefully bog down, and that we can salvage performance by taking just a bit of the load off, is generally completely incorrect. Load isn't that regular until you hit an enormous scale. Generally, all this sort of approach does is make the wall, when you finally hit it, *even harder* and *even more immediately fatal*.

You are ***far, far better off*** profiling your code, simply *speeding up* what you can so it runs more quickly and more efficiently, and then, if that is not enough, figuring out how to throw more hardware at the problem if you can't do it trivially with your current stack. I can hardly express strongly enough how much better this plan is than anything you've suggested in your post. This is not the lazy "just throw more hardware at it" suggestion you might be inclined to resist (I find it a bit icky myself)... this is the "if you have certain needs, you simply need a certain amount of hardware" ground reality. "Cleverly" trying to squeeze out a tiny percentage improvement, at the cost of making the wall even harder when you smash into it, is a bad use of time that you should instead spend increasing your capacity to handle more load in general.

Oh, and you cannot prevent a "DDoS" this way in any manner. A DDoS can simply overwhelm your network pipe. It doesn't matter what's on the other end of the pipe; in fact it can literally be the case that there is *nothing* on the other end, if the DDoS overwhelms the whole pipe. You may need to prepare for a DDoS if you consider that an unacceptable risk, but none of that prep involves making your service able to withstand the load; it is about making sure the load never hits your service at all, for example with Cloudflare. That said, if you're at the scale where you're talking about running your entire $WHATEVER on one server anyway, then unless you *know* you're doing something that will attract political attention, you are probably far more worried than you need to be, and thinking about targeted DDoS right now is probably a bad use of your time.

Source: I've worked at "cloud scale" for about 5 years now and have worked on the web for 18. What you've got up there is a recipe for *pain*.

**Bromlife:** Beautifully said.

**pcstyle:** Thanks for the elaborate answer, much appreciated! I guess I'll have to check my code and try to speed it up.

**anoobisus:** The short answer: don't even think about doing this in Go. There are very free, very high-performance web servers you should put in front of your Go app. I'm a happy nginx user.

**mwholt:** I'm confused by your #1 - do you want to limit the number of connections, or the total processing time?

Also, regarding #2, I'm not sure that would prevent DDoS attacks. You'd have to refuse connections entirely, because even the socket connection uses up ports and file descriptors. If you implemented a queue, it would have to be at the network layer, not the application layer, and I think the OS or runtime does this for us already. As for a processing queue, the scheduler will already hold goroutines (in your case, HTTP handlers) in a sort of queue until other goroutines finish, if there are too many to interleave at the same time. (I might be wrong here; I'm not an expert on the Go scheduler.)

With #2 you want to prevent DDoS, but with #3 you want to hold onto connections *longer* rather than just release them and free up resources?

For #3 and #4 there is a time.After() function in the standard library that you can use to perform an action - like cancelling a request - after some timeout.
**pcstyle:** Thanks for replying. I was hoping there'd be a library already available out there; I guess I'll have to get my hands dirty.

- For #1: overall, I'd like to use worker queues and avoid repeating identical requests, but since it's a REST API I assume that would be hard to do, because requests are only distinguished by the access token sent? And the data changes very often (no caching in mind for now).
- For #2: preventing DDoS is one of the desired outcomes; I'd like to add any unprocessed connections (m-n) to a queue.
- For #3: I'd like to hang on to unprocessed connections longer (+x milliseconds, to be exact) and release them if they're still in the queue and have not been processed.
- For #4: I'll look into time.After(), though I'm not sure it limits processing time; it limits read/write time. I'd like the computation in any HTTP handler itself to be limited to v time: if the server is under heavy load and would need a lot of time to process the request and perform the computation, just cancel it.

**anoobisus:**

> I'd like to use worker queues and avoid repeating identical requests, but since it's a REST API...

Oh dear god, that sounds like an incredibly misguided premature optimization.

And it sounds like a great way to introduce very, very hard-to-diagnose bugs. Oh, two GETs in the queue, cool, I'll send the same response for the second one, even though there were 100 POSTs in between that could have changed server state. Ouch. I wouldn't let anything like that near real code.

If you want caching, do proper caching. If you want DDoS protection, do proper DDoS prevention. This is a massive set of problems to tackle for just another layer on top of an app; entire applications are dedicated to solving them (as mentioned earlier: nginx, Apache, etc.).

**mwholt:** You're right on every count, although I'm pretty sure it takes even more than just nginx or Apache to stop DDoS attacks.

**pcstyle:** What about #2, #3, and #4?

**mwholt:** #1 - The solution to "not repeating identical requests" (in the way I think you mean) is, by definition, caching. As /u/anoobisus pointed out, the hard part - especially when serving dynamic content rather than static files - is determining what really counts as "identical"... this is not just a Go library issue; it's a well-documented area of research in computer science. If I recall correctly, it's not easy, and sometimes not possible, unless you can predict which kinds of requests are coming in what order.

#2 - There is a priority queue in the standard library (see the container/heap package), but I'm still not clear on why you'd want this over letting the scheduler queue goroutines as needed. Each handler could have its own timeout: if it's still running after x ms, it can terminate itself.

#3 - See #2.

#4 - time.After signals after a certain duration has passed, regardless of the operations that were happening during it (reads, writes, processing, whatever). time.After is precisely what I used to limit processing time in an autocomplete API that had to scan a massive index and return as many results as it could in less than 100 ms. After that, time.After signaled all the workers to stop and the response was sent to the client. Works like a charm.
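For reference, here is a rough sketch of the time.After pattern described above: a worker scans an index while the handler collects whatever results arrive before the deadline, then responds with what it has. The handler, the scanIndex helper, and the 100 ms figure are illustrative stand-ins, not code from that API.

```go
package main

import (
	"encoding/json"
	"net/http"
	"time"
)

// searchHandler fans the index scan out to a goroutine and gathers results
// until either the work finishes or the time budget runs out; whatever was
// collected by the deadline is returned to the client.
func searchHandler(w http.ResponseWriter, r *http.Request) {
	results := make(chan string)
	done := make(chan struct{})

	go func() {
		defer close(done)
		for _, item := range scanIndex(r.URL.Query().Get("q")) { // hypothetical index scan
			select {
			case results <- item:
			case <-r.Context().Done():
				return // handler already returned; stop so the goroutine doesn't leak
			}
		}
	}()

	var found []string
	deadline := time.After(100 * time.Millisecond)
loop:
	for {
		select {
		case item := <-results:
			found = append(found, item)
		case <-deadline:
			break loop // budget exhausted: respond with what we have
		case <-done:
			break loop // worker finished early
		}
	}
	json.NewEncoder(w).Encode(found)
}

// scanIndex is a stand-in for the real index lookup.
func scanIndex(q string) []string {
	return []string{q + "-a", q + "-b", q + "-c"}
}

func main() {
	http.HandleFunc("/autocomplete", searchHandler)
	http.ListenAndServe(":8080", nil)
}
```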
**pcstyle:**

> Each handler could have its own timeout: if it's still running after x ms, it can terminate itself.

Does that still work if there is middleware involved? How do they work together?

> #4 - time.After signals after a certain duration has passed, regardless of the operations that were happening during it... Works like a charm.

Sounds great. I'll try to figure #4 out and will be back here if I can't.

**mwholt:** Yep, it can; it depends on how you set it all up, but it can work.

**pcstyle:** Would it be okay with you if I sent you a PM for help later? I would really appreciate it.

**mwholt:** Sure.

**dwevlo:** Sounds like a lot of work.

You can set a deadline on connections with `conn.SetDeadline` (http://golang.org/pkg/net/#Conn). That will cause a read or write on the connection to fail after the time you specify.

You will have to account for this upstream in your code, though. Typical HTTP request processing means grabbing the data you need, running a bunch of processing, then writing a response back out; with a deadline, your code won't quit until you try to write the response.

If you use `time.After` you can fail fast, but your processing code will probably still be running in the background. You have to write it so it can be cancelled somehow. (For example, if you're making a database call, you somehow need to set the deadline on the underlying database connection.) There's no way to just kill an arbitrary goroutine (or some magical "kill any goroutine that I started from this goroutine").

Rate limiting may be easier to pull off: https://github.com/golang/go/wiki/RateLimiting. In your network listener loop, just receive from a `time.Ticker` before you call `Accept`. To do this properly you probably need to record who is making the request, though (the throttle needs to be per-user).

Typically your operating system handles the queuing for you. It's part of the TCP stack, and Go defaults to the maximum value for the listen backlog. A DDoS will break this, but I don't think there's much you can do to prevent that. (Perhaps use a CDN like CloudFlare?) Limiting the number of concurrent connections, when those connections come from multiple addresses, will probably just make your system easier to DDoS, not harder: you would be artificially limiting the throughput of your application.
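A small sketch of the two mechanisms mentioned in the last reply: receiving from a `time.Ticker` before each `Accept` (the pattern from the Go wiki rate-limiting page) and putting a deadline on every accepted connection with `SetDeadline`. The interval and deadline values are arbitrary, and the throttle here is global rather than per-user.

```go
package main

import (
	"log"
	"net"
	"time"
)

func main() {
	ln, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}

	// Global throttle: accept at most one new connection per tick.
	// (As noted above, a real throttle should probably be per-user.)
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()

	for {
		<-ticker.C // wait for the next tick before accepting
		conn, err := ln.Accept()
		if err != nil {
			log.Print(err)
			continue
		}
		// Any read or write on this connection fails after 5 seconds.
		conn.SetDeadline(time.Now().Add(5 * time.Second))
		go handle(conn)
	}
}

// handle echoes whatever the client sends until the deadline, EOF, or an error.
func handle(conn net.Conn) {
	defer conn.Close()
	buf := make([]byte, 4096)
	for {
		n, err := conn.Read(buf)
		if err != nil {
			return // deadline exceeded, EOF, or other error
		}
		if _, err := conn.Write(buf[:n]); err != nil {
			return
		}
	}
}
```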
