<p>I have an app built with Node.js and Socket.io. Communication between the clients and the server works fine until more and more clients join the room. The server has to update every connected client every 30ms with a message that is ~15KB on average. Once the number of connected users reaches about 300, a server with this CPU [Intel Xeon D-1520 - 4c/8t - 2.4GHz/2.7GHz] starts failing because its usage climbs to about 120% and beyond. After testing, I found that the increase in CPU usage comes from the server's 'emit' calls (which makes sense, since they go out to more and more clients).</p>
<p>Part of the problem is also that Node.js uses only one core by default, so it doesn't exploit the CPU's multiple cores. I'd like to prevent the emission from blocking the rest of the app: the app logic should keep running smoothly even when there are a lot of messages to send, and ideally the multiple cores would share the load of emitting them.</p>
<p>Clustering wasn't really helpful, since it runs one copy of the app in each child process (which is just ridiculous). Even if I could run the app logic on one core and have the remaining cores handle dispatching the messages, the 'app-logic' core would still need to communicate its state to the dispatchers, which would then update their clients; as the population grows, that first hop hits the same bottleneck I'm trying to escape. And Redis turned out to be more complicated than it should be, with outdated documentation.</p>
<p>I wanted to turn to Go because of its multithreading capability, but I'm wondering whether that will actually help in my case. So my question is: would Go let me send relatively heavy messages without dragging down overall CPU performance? Can the work of emitting a message over WebSockets be distributed across the available cores instead of weighing on just one? And are WebSockets the best choice for this at all?</p>
<hr/>**Comments:**<br/><br/>pobbly: <pre><p>Might be a design issue? 15KB seems really big for a message. For most apps it's best to send only messages that describe how to modify the state on the server or client, not big chunks of the state itself. Something like <code>move playerID x y</code> (serialised) would be a good small message, just a few bytes. Also check out MessagePack: <a href="http://msgpack.org/index.html" rel="nofollow">http://msgpack.org/index.html</a></p>
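<p>For illustration, here is a minimal sketch of such a diff message in Go (assuming the github.com/vmihailenco/msgpack/v5 encoder; the struct and field names are invented for the example, and any compact binary format would do):</p>

```go
package main

import (
	"fmt"

	"github.com/vmihailenco/msgpack/v5" // assumed encoder; protobuf, flatbuffers, etc. work too
)

// Move is a hypothetical state *change*, not a chunk of state:
// "player N started moving at angle A with speed S".
type Move struct {
	PlayerID uint16  `msgpack:"p"`
	Angle    float32 `msgpack:"a"`
	Speed    float32 `msgpack:"s"`
}

func main() {
	b, err := msgpack.Marshal(Move{PlayerID: 42, Angle: 1.57, Speed: 3.5})
	if err != nil {
		panic(err)
	}
	// Prints a size of a dozen or so bytes instead of ~15KB of full state.
	fmt.Printf("encoded move in %d bytes\n", len(b))
}
```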
<p>That said, I've had good results with <a href="https://github.com/olahol/melody" rel="nofollow">https://github.com/olahol/melody</a> and raw websockets in the browser. They're well supported these days, so there's little need for socket.io. I also used Go's concurrent map for state on the server, but I've only needed one instance.</p></pre>GoblinFruitSeed: <pre><p>Only sending diffs is a great idea, actually! In fact, the client only sends a few things: the position of the mouse, whether the mouse is clicked, and information about the size of the screen/browser (for ratio purposes, so that instead of sending the whole universe, the server only sends what fits on the client's screen). The players are therefore created and updated entirely on the server side.</p>
<p>I like the idea of sending only diffs to the client too. But that would imply updating every player on the client side (which I fear could worsen the experience for some users, depending on how many players have to be updated). Still, it's definitely a route worth exploring.</p>
<p>MessagePack and melody sound very interesting! I'll look more deeply into them, thanks!</p></pre>pobbly: <pre><p>You're welcome. Hearing more about your requirements, I'd say you should definitely invest in a redesign. As far as I can see, you have two major options: keep the state on each client and make it 'eventually consistent', or keep the state on the server and make it fast (Go will help here). In either case, you should only be sending very small messages. If you're sending state back and forth, something is wrong.</p></pre>ZetaHunter: <pre><p>Note that I'm not reading into your use case; this is about using WS in general.
But it sounds like you could benefit from <a href="https://github.com/centrifugal/centrifugo" rel="nofollow">centrifugo</a>; aside from that, you could use either gorilla/websocket or gobwas/ws for your own implementation.</p>
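<p>For reference, a bare-bones gorilla/websocket server with a single broadcast loop might look roughly like this (a sketch only; the 30ms ticker and the payload are placeholders):</p>

```go
package main

import (
	"log"
	"net/http"
	"time"

	"github.com/gorilla/websocket"
)

var (
	upgrader = websocket.Upgrader{} // default options
	register = make(chan *websocket.Conn)
)

// wsHandler upgrades an HTTP request to a WebSocket and hands the
// connection to the broadcast loop.
func wsHandler(w http.ResponseWriter, r *http.Request) {
	conn, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		log.Println("upgrade:", err)
		return
	}
	register <- conn
}

// broadcastLoop owns the client set, so no extra locking is needed.
func broadcastLoop() {
	clients := map[*websocket.Conn]bool{}
	tick := time.NewTicker(30 * time.Millisecond)
	for {
		select {
		case c := <-register:
			clients[c] = true
		case <-tick.C:
			msg := []byte(`{"t":"tick"}`) // placeholder payload
			for c := range clients {
				if err := c.WriteMessage(websocket.TextMessage, msg); err != nil {
					c.Close()
					delete(clients, c)
				}
			}
		}
	}
}

func main() {
	go broadcastLoop()
	http.HandleFunc("/ws", wsHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```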
<p>You have to keep in mind that socket.io adds a lot of overhead. <a href="https://hashrocket.com/blog/posts/websocket-shootout" rel="nofollow">Here</a> is a benchmark of different solutions; note that its Go entry uses the x/net websocket library, which is deprecated in favor of the gorilla one, and that gobwas/ws is a more efficient option. Socket.io itself isn't included in the comparison.</p></pre>GoblinFruitSeed: <pre><p>Precious info, thanks! Centrifugo looks awesome, as it wouldn't require me to rewrite my entire server in a different language. But since it would then still rely on a Node.js server, I wonder whether it wouldn't just hold me back in my quest to use all available cores.</p>
<p>C++ is looking very good as a server! I almost started drooling in awe when I saw the charts and how much more performant it can be than Node.js. It's just daunting that it's correspondingly more demanding code-wise than Node.js, and it doesn't help that coding in C++ generally feels like walking on eggshells because of all the potential memory leaks.</p>
<p>I'll give gorilla/websocket and C++/WebSocket++ a shot whenever I summon enough courage to start rewriting the back end.</p>
<p>Thanks a lot, btw!</p></pre>Olreich: <pre><p>C++ just forces you to catch and release memory. Shockingly enough, once you’ve built the algorithms and gotten things set up, you will generally know exactly when you need memory and when you don’t, so it’s pretty easy to new and delete or malloc and free at really good times. C++11 has also introduced quite a few things to simplify memory management like shared and unique pointers.</p>
<p>Go is a much nicer place to be for concurrency, though. C++ makes you do a lot of work to communicate safely between concurrent tasks.</p></pre>readonly123: <pre><p>I mean, I'm not sure that semaphores, mutexes, and mmap/malloc are "a lot of work", but making nice abstractions around them is a pain.</p></pre>koresho: <pre><p>What is “just ridiculous” about running multiple Node processes in a cluster?
Move your state out into Redis or another DB, then cluster your app. At scale you’ll have to do this regardless of the language you use: eventually your Go (or C++ or whatever) variant will require more resources than a single machine can provide as well.</p>
<p>Other than that, as others have mentioned, sending just diffs will help. Switching to a faster language like Go will help temporarily. Eventually you’ll have to solve clustering anyway.</p></pre>GoblinFruitSeed: <pre><p>True that! I'd be dumb to think I can run away from scaling by going to Go.</p>
<p>My problem is that I need to use all the cores, use them harmoniously to run one version of the app per server, and have them share the task of communicating with the clients without blocking anything. With Node, clustering makes me run one version of the game in each child process, which I don't want because it doesn't get me anywhere: if the app starts failing at 200 users on one core, then running one copy of the app on each of 4 cores allows 4 times as many players, but those 800 aren't interacting together; each game is still capped at roughly 200 players, even though there are 800 on the server.</p>
<p>From my understanding, using Redis wouldn't have helped either. Correct me if I'm wrong, but I got the sense that Redis stores the sessions and connections so that when one cluster worker emits, the message can also be emitted from all the other workers to their respective clients. The problem is that my scenario requires one worker to run the game logic while the other workers read messages from the client sockets, pass the inputs to the game logic, retrieve the outputs tailored to each client, and send those out. That way the heavy emissions wouldn't rest on one core; the load would be shared.</p>
<p>Now, if I understand the principles of Redis well enough, it isn't really made to shuttle data from worker to worker as if you were just copy-pasting files between folders on the same machine; rather, it ensures consistency in the messages that are emitted.</p>
<p>Also, the idea of simply emitting from the game-logic core to the dispatcher cores (and letting them emit to their respective clients) would make the CPU fail at even fewer than 200 users, because the game-logic core would first have to dispatch the full data set every 30ms to the 3 other cores (roughly 3 times the size of a normal emit).</p>
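<p>For illustration: in Go, that kind of split doesn't require separate processes at all. The game logic can run in one goroutine while each connection gets its own writer goroutine, so the runtime spreads the socket writes across every core and a slow client never stalls the simulation. A rough sketch of the idea (the channel size and payload are arbitrary, and the real socket write is left as a stub):</p>

```go
package main

import (
	"fmt"
	"time"
)

// client owns a small buffered queue that its writer goroutine drains.
type client struct {
	id     int
	frames chan []byte
}

// writeLoop is where a real server would call conn.WriteMessage.
func (c *client) writeLoop() {
	for f := range c.frames {
		_ = f // stub for the actual socket write
	}
}

func main() {
	clients := make([]*client, 300)
	for i := range clients {
		clients[i] = &client{id: i, frames: make(chan []byte, 4)}
		go clients[i].writeLoop() // one cheap goroutine per connection
	}

	tick := time.NewTicker(30 * time.Millisecond)
	defer tick.Stop()
	for n := 0; n < 10; n++ { // a few simulated ticks
		<-tick.C
		diff := []byte("state diff for this tick") // produced by the game logic
		for _, c := range clients {
			select {
			case c.frames <- diff: // hand off without blocking the game loop
			default: // queue full: drop the frame for a lagging client
			}
		}
	}
	fmt.Println("done")
}
```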
<p>And just to clarify, the reason I can't run independent versions of the game on each cluster worker is that the game features some randomness (objects placed here and there, CPU players deciding to take a certain action and therefore ending up at location [x,y] in state z). Having different versions running would be fine as far as seeing the other human players and their actions goes, but the randomly generated behaviors wouldn't be consistent.</p></pre>koresho: <pre><p>The idea behind clustered apps is that you store ALL state in the database. Redis works well for this because it’s very fast, although generally you’d have a two-stage storage system: Redis for temporary state and another slower but more classical DB for more permanent state.</p>
<p>Essentially the idea is, you have your state (the actions CPUs take, the random numbers, the position and actions of your players) and put it all in your database (like Redis).
Then, each server process is in charge of updating its clients with that state, and processing updates from clients into the global state. Each clustered server process doesn’t actually store any state itself. It just shuttles it between the database and its clients. </p>
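<p>As a rough illustration of that split in Go (assuming the github.com/redis/go-redis/v9 client; the key name and payload are invented): one process writes the authoritative state, and every clustered process reads it each tick and pushes it on to its own clients.</p>

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/redis/go-redis/v9" // assumed client library
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	// Simulation side: publish the latest game state with a short TTL.
	if err := rdb.Set(ctx, "game:state", `{"players":[1,2,3]}`, time.Second).Err(); err != nil {
		log.Fatal(err)
	}

	// Each clustered worker, once per tick: fetch the shared state and
	// forward it (ideally as a diff) to the clients connected to this process.
	state, err := rdb.Get(ctx, "game:state").Result()
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("fetched %d bytes of shared state", len(state))
}
```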
<p>You’re correct that you don’t get 100% efficiency: this approach won’t scale you to 800, probably a bit less. That’s why you can’t just stop at clustering; you still need other changes such as diffing. I was only addressing the point that clustering is what you’ll need to do regardless of server language, unless you’re confident you’ll never need to scale past a single server. If you are confident you’ll never scale a game instance to run on more than one server, you could switch to Go or some other language and avoid the issue, I suppose.</p></pre>pobbly: <pre><p>Excellent technical comment about the ephemeral vs. longer-lasting data storage requirement. So many apps run into this sooner or later, and knowing which data should be stored where is a very important skill.</p></pre>koresho: <pre><p>Thanks :)</p></pre>GoblinFruitSeed: <pre><p>I think I see your point! I had thought of Redis not as a sort of database but just as a place to store the sockets (which looks to be completely wrong). I haven't had much chance to mess around with it and see how well it works, because I'm using Express 4 and the documentation doesn't really cover its use with Redis. When I did figure out how to make it work, it ended up using only 1 of the 4 child processes. Many people have posted this same issue on blogs, but no one seems to have come up with a suitable answer; I'm posting the issue on Stack Overflow myself right now.</p>
<p>If Redis acts as a DB and stores the temporary state, it certainly won't scale by the exact factor of the number of cores, since a database hop is involved, but it will be significantly better than using one core. Rewriting my server in Go isn't so much a hassle as it is tedious; I have the feeling it wouldn't be too hard. But if I can save some time and settle for somewhat fewer than 800 players, that'd be something, provided a working solution can be found for the cluster-worker issue I described above.</p>
<p>I think you're right, scaling will keep its value. I'm not too sure I'll need to scale across multiple servers, but I never know what may befall me. Thanks for the insightful input!</p></pre>koresho: <pre><p>No problem, happy to help!</p></pre>tty5: <pre><p>Node is single-threaded, so you're only using a single core of that server. Go has no such problem, but it will not solve your issue, because the bottleneck is elsewhere:</p>
<p>15KB message, to 300 users, every 30ms (or 33 times per second) is 15KB * 300 * 33 = 145MB/s.</p>
<p>145MB/s is about 1.13Gbit/s. You can't push that much through a gigabit network connection, and it's wildly excessive for such a low number of users. To put it into perspective: a single user who stayed online for 30 days would receive over a terabyte of data.</p>
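<p>A quick back-of-the-envelope check of those figures (rounding differs slightly from the numbers above):</p>

```go
package main

import "fmt"

func main() {
	const (
		msgKB    = 15.0
		users    = 300
		ticksSec = 1000.0 / 30.0 // one update every 30ms ≈ 33 ticks/s
	)
	mbPerSec := msgKB * users * ticksSec / 1024             // MB/s leaving the server
	gbitPerSec := mbPerSec * 8 / 1024                       // Gbit/s on the wire
	tbPerUser := msgKB * ticksSec * 86400 * 30 / (1 << 30)  // TB per user over 30 days

	fmt.Printf("total: %.0f MB/s ≈ %.2f Gbit/s\n", mbPerSec, gbitPerSec)
	fmt.Printf("per user over 30 days: %.1f TB\n", tbPerUser)
}
```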
<p>Counter-Strike: Global Offensive uses a maximum packet size of 1200 bytes, with most packets being smaller, and with default settings its per-player bandwidth cap is 80KB/s, about 1/6 of your per-user average, while updates happen twice as often (64 ticks/s).</p>
<p>You will need to start using state diffs; there is no way around it. For continuous changes (like movement), you will most likely have to make the movement itself a state:</p>
<ul>
<li>player n state change 1: started moving at angle x, with speed y</li>
<li>player n state change 2: empty (figure out how to skip?)</li>
<li>player n state change 3: changed movement angle to x2</li>
<li>....</li>
</ul></pre>habarnam: <pre><p>As a general point: if you're having trouble scaling an app in an ecosystem you're familiar with, I doubt you'll have better luck in a new and unfamiliar one, be it Go or something else.
<p>Also, the answer to this problem lies in testing, not in asking randoms on the internet who have no idea about your application.</p></pre>