Is Go the right tool for my job?

xuanbao · · 528 views
<blockquote> <p>edit:</p> <p>In conclusion it looks like both Python and Go will address several (if not all) of my concerns. For my particular situation I will start with Python, since I am most familiar with that (and it has some helpful libraries). My biggest concern was async, but that is well addressed by Python. The following blog post was particularly helpful for me:</p> <p><a href="https://hackernoon.com/asynchronous-python-45df84b82434#.tujw8t770">https://hackernoon.com/asynchronous-python-45df84b82434#.tujw8t770</a></p> </blockquote> <p>We are developing a mobile app that will in time be backed by a sizeable database (likely x-million records). This would hold account data and item-data. The item-data set will grow from various sources, mostly from third parties through APIs and from user input. The main database will be relational and, if needed, supported by key-value storage.</p> <p>I am considering picking up Go to develop the back-end of the app. Currently I am somewhat proficient in Python, but I believe in using the right tools for the job. In this case I narrowed it down to Go. Does that make sense?</p> <p>The strengths I am looking for are:</p> <ul> <li>Async handling of API calls to third parties</li> <li>Checking user and third-party input (string handling, should scale up well)</li> <li>Robust backend for apps</li> <li>Good support of database operations</li> <li>Scalability to 100,000+ users</li> </ul> <p>Apart from the features, also:</p> <ul> <li>A language that is quick and easy to learn</li> <li>Helpful community (hi!)</li> <li>Enough developers to grow out the operations when needed.</li> </ul> <p>Should I go for Go or consider another language at this point?</p> <hr/>**Comments:**<br/><br/>elingeniero: <pre><p>Go can do this. Python is also pretty good at this and if developing quickly is a priority then it might be worth sticking with what you know. 
</p></pre>snirpie: <pre><p>I figured that Python&#39;s weakness (apart from general scalability, which may not be an immediate concern) would be in async operations.</p> <p>In my case that would be having a stack of API calls to third parties being executed and reported back when complete (or failed). It seems that Python (edit:-core) recently added an event-loop for async operations (<a href="https://docs.python.org/3/library/asyncio-task.html" rel="nofollow">https://docs.python.org/3/library/asyncio-task.html</a>).</p> <p>Would that be a viable option? This might not be the right place to ask :)</p></pre>flatMapds: <pre><p>If you don&#39;t want to leave 2.7, gevent, twisted (Python&#39;s equivalent to netty), thespian (actors for Python), and tornado are all options. But let&#39;s be honest here: no matter what you do, CPython&#39;s concurrency and parallelism story is never going to be very good, no matter how many workarounds for the GIL people come up with. </p> <p>I like Python; it&#39;s a great learning and prototyping tool, but it&#39;s just wasteful as hell to use at scale. Assuming that you don&#39;t have a lot of computation going on and everything is just IO, though, the options I gave you for Python scale well enough. </p></pre>__crackers__: <pre><blockquote> <p>async operations</p> </blockquote> <p>Python can do this very well, <em>but only for IO-bound tasks</em>. Py3 has <code>asyncio</code> built in, and it&#39;s even part of the syntax in 3.5 (<code>async</code> and <code>await</code>). There&#39;s also <code>Twisted</code>, which is a <em>much</em> more mature and feature-rich async framework for Python.</p> <p>As long as your program only makes light use of the CPU, Python will cope just fine.</p> <p>If your program will be CPU-heavy, however, Python has multiple limitations you&#39;ll need to work around. Because <code>asyncio</code>/<code>Twisted</code> handle all requests in the same thread, any time spent crunching numbers stalls request handling. 
As a result, you have to be careful with async in Python.</p> <p>And although you can combine threads with async, Python threads suck for CPU-bound operations because of the Global Interpreter Lock (just one <em>Python</em> thread can run at a time; only C-based extensions can release the GIL and allow other threads to run at the same time).</p> <p>So if there&#39;s a chance you&#39;ll be running any vaguely CPU-intensive code, Go is probably a safer bet. It has &#34;real&#34; threading, will automatically make use of all available cores, and a single thread/goroutine doing some CPU-heavy work won&#39;t lock up the entire process till it&#39;s done (unlike a Python thread/async handler).</p></pre>snirpie: <pre><p>Go might be the better bet for others reading this, and it is even something I may arrive at myself. For me the only truly CPU-intensive task would be occasional image resizing (I hope). This I may even spin off to a different service.</p> <p>Regarding Twisted: it is not something I am familiar with, but I looked into it. </p> <ul> <li>Subjective: at first glance it looks less clean than building on top of asyncio (exceptions for instance)</li> <li>I would rather get familiar with core features than a framework (spent too much time on Django already)</li> <li>Building something on Flask or Sanic and potentially switching out bottlenecks (API calls, websockets, REST, even database calls) with async will get me going quickly.</li> <li>It <em>does</em> look mature and proven though.</li> </ul> <p>(sorry if I am way offtopic for <a href="/r/golang" rel="nofollow">/r/golang</a> here, but my analysis may benefit others)</p></pre>__crackers__: <pre><blockquote> <p>occasional image resizing</p> </blockquote> <p>In Python, you&#39;d probably be using Pillow for that, which is written in C and releases the GIL. (Probably. I&#39;m not sure about the resize function specifically.) 
So that shouldn&#39;t be an issue.</p> <blockquote> <p>it looks less clean than building on top of asyncio</p> </blockquote> <p>It is compared to Py3.5, which has specific syntax (<code>async</code> and <code>await</code>) for <code>asyncio</code>. In any earlier version, there&#39;s not a whole lot of difference between <code>@asyncio.coroutine</code> and <code>@defer.inlineCallbacks</code>.</p> <blockquote> <p>Building something on Flask or Sanic and potentially switching out bottlenecks</p> </blockquote> <p>You can&#39;t simply replace bits of a Flask app with asyncio. You have to either run them separately or run Flask app instances in an asyncio thread pool.</p></pre>snirpie: <pre><blockquote> <p>You can&#39;t simply replace bits of a Flask app with asyncio. You have to either run them separately or run Flask app instances in an asyncio thread pool.</p> </blockquote> <p>Uhm yeah, I was wondering. Have to wrap my head around that one.</p> <p>edit: looks like there is something like that out there, but not sure about potential tradeoffs: <a href="http://flask-aiohttp.readthedocs.io/en/latest/coroutine.html" rel="nofollow">http://flask-aiohttp.readthedocs.io/en/latest/coroutine.html</a></p></pre>__crackers__: <pre><blockquote> <p>Have to wrap my head around that one.</p> </blockquote> <p>The fundamental issue is that Flask expects to handle one request at a time (a lot of request state is process-global) and that you&#39;ll run multiple instances of the app to handle multiple requests at the same time.</p> <p>asyncio (or any other async framework), by comparison, is designed to run hundreds or thousands of requests in the same instance, and thus needs request-scoped data.</p> <blockquote> <p><a href="http://flask-aiohttp.readthedocs.io/en/latest/coroutine.html" rel="nofollow">http://flask-aiohttp.readthedocs.io/en/latest/coroutine.html</a></p> </blockquote> <p>The <a href="https://github.com/Hardtack/Flask-aiohttp/" rel="nofollow">GitHub repo</a> says EXPERIMENTAL 
in rather large letters. I wouldn&#39;t…</p> <p>There is also <a href="https://github.com/twisted/klein" rel="nofollow">Klein</a>, a Flask-like framework based on Twisted (i.e. async). Unfortunately, its documentation follows the long-standing Twisted tradition of being, well, shit. To understand Twisted/Klein documentation, you must first understand Twisted/Klein…</p></pre>elingeniero: <pre><p>async/await syntax was added in Python 3.5. It works really nicely and solves the async callback nastiness of the past. </p> <p>Might also be worth pointing out that Go doesn&#39;t have an equivalent so async calls must be handled explicitly within your goroutines. </p> <p>Finally, two new Python libraries sanic (web server framework) and asyncpg (postgres database driver) base themselves on the new syntax and outperform native Go implementations in some benchmarks. </p></pre>cdoxsey: <pre><p>I&#39;m dubious of these claims. </p> <p>I&#39;ve never tried Python 3.5, but we use python all the time at work and are constantly running into scaling problems. Our only solution was a lot of multiprocessing/forking which causes its own set of headaches. (Frozen and orphaned processes, signal handling goes wonky, etc) It also ends up using an absurd amount of memory.</p> <p>Are coroutines in Python 3.5 multiplexed onto multiple CPUs?</p></pre>__crackers__: <pre><blockquote> <p>Are coroutines in Python 3.5 multiplexed onto multiple CPUs?</p> </blockquote> <p>Nope. And the GIL is still there. As before, <code>multiprocessing</code> or separate Python processes are the only ways to utilise multiple cores in Python.</p> <p>But if your program does little more than shuffle data back and forth between clients and your database, Python should be up to the job.</p></pre>elingeniero: <pre><p>Asyncio can use a thread pool in the event loop. 
Python does have its limitations and isn&#39;t ideal for all use cases, maybe it&#39;s not ideal for yours, but I think it is a good fit for the OP based on their requirements. </p> <p>That said, I don&#39;t think that orphan processes and high memory use are a necessary consequence of threading in Python... Perhaps it could do more to help developers avoid these issues, but I suspect that your problems are solvable.</p></pre>cdoxsey: <pre><p>I agree that existing knowledge may outweigh any performance benefits. It doesn&#39;t sound like the OP&#39;s load is going to be problematic.</p> <p>We ourselves use Python for our web application with gunicorn, and it does just fine.</p> <p>Throwing a couple more machines at the problem is perfectly reasonable.</p> <p>I was just speaking from experience having worked with Python under load. In my opinion if you know Go, you&#39;re better off using it from the beginning.</p> <p>Async programming also has the issue that it introduces an entirely different programming model to your code. As your program grows, the diversity of programming models makes for very complex code that ends up being largely unmaintainable for new developers. Ruby, Node.js, Java and C# all have the same problem. (i.e. am I in async world or blocking world? Do I just write two implementations of everything?)</p></pre>_n7fury_: <pre><p>While Python 3.5 has async support, WSGI is synchronous so all the mature stable frameworks aren&#39;t async.</p></pre>__crackers__: <pre><blockquote> <p>Might also be worth pointing out that Go doesn&#39;t have an equivalent so async calls must be handled explicitly within your goroutines.</p> </blockquote> <p>I&#39;d argue that goroutines <em>are</em> Go&#39;s equivalent of <code>asyncio</code>. 
They&#39;re super-lightweight, so you can easily start hundreds or thousands of them, which addresses the same problem with normal threads that async is supposed to solve.</p> <p>The main advantage async has over goroutines is that async programs are single-threaded, so you don&#39;t need to worry about locks and other cross-thread synchronisation issues.</p></pre>elingeniero: <pre><p>Yes, but when you write your goroutine itself, that has to handle the async operation in the traditional callback way. </p></pre>__crackers__: <pre><p>There <em>isn&#39;t</em> an async operation, though. OP is talking about making HTTP requests to third parties and returning when they succeed/fail, not calling some service and having it call you back later.</p> <p>If there were such an async operation, there&#39;s still no need for callback hell.</p> <p>You&#39;d create a channel, set your handler to write to the channel and then immediately <code>select</code> the channel, which blocks the goroutine till the result is written to the channel. When your handler is called, execution continues right where it stopped, exactly like with <code>await</code>/<code>yield from</code>. It&#39;s more verbose, but no more than Go generally is vs Python.</p></pre>snirpie: <pre><p>Great pointers, looks like I will stick to what I know for now and save Go for a later date. This was actually my biggest concern, but I have not followed Python development for a while. Seems like it may be the right tool for the job after all ;)</p> <p>Y&#39;all have really gone above and beyond what could be expected from <a href="/r/golang" rel="nofollow">/r/golang</a>. Really hope this thread will be helpful to others as well.</p> <p>Now if you&#39;ll excuse me, it seems like I have some catching up to do.</p></pre>elingeniero: <pre><p>No problem. 
I don&#39;t want to put you off Go; I would definitely recommend learning it at some point and there are reasons why you might choose Go over Python.</p> <p>It&#39;s just that async operations and scalability are not among those reasons, so it wouldn&#39;t be correct to recommend Go over Python when you already have a head start with Python knowledge. </p></pre>dobegor: <pre><p>I think you&#39;re on the right track. That&#39;s the domain Go was designed to shine in. </p> <p>Go for Go and never hesitate to ask. Start from the gorgeous Go Tour to get the basics of the language. </p></pre>snirpie: <pre><p>Thanks, doing that now!</p></pre>banana__hammock6: <pre><p>Go is probably a near-perfect match for your requirements.</p></pre>frikkasoft: <pre><p>I think you can&#39;t go wrong with picking Go here, it&#39;s really excellent for writing back-end services.</p> <p>I recommend you invest a day or two into looking into the language, write a simple web back-end and make a decision from there. A good start would be to follow the official tour at <a href="https://tour.golang.org/" rel="nofollow">https://tour.golang.org/</a></p></pre>flatMapds: <pre><p>Go is strong in all of these requirements except input validation, where, because it&#39;s a small language, you have to put in a bit of extra effort, but it&#39;s not too bad. Go is also the easiest language I have ever learned, and I like the chunk of the community focused on distributed systems. </p></pre>styluss: <pre><p>Word of advice, projects are already hard enough without needless changes. You won&#39;t start with 100000 users, so use languages that people are familiar with before adding a new language. Stay with Python until you see bottlenecks or other real limitations.</p></pre>snirpie: <pre><p>Good advice I think and something I am considering as well. There is no codebase yet and I am actually interested in learning Go (and even my Python skills are not up to scratch for this job). Somehow it looks good to start with a clean slate. 
It will take longer to ship out an MVP and that may or may not come back to hurt me.</p> <p>Fact of the matter is that this particular product hinges on scalability (even without taking 100,000 users into consideration) so Python might be the bottleneck really quickly.</p> <p>Hope I am making the right call, but walking away from this with a solid knowledge of Go would still be a (bittersweet) win.</p></pre>elingeniero: <pre><p>Well I don&#39;t know what your web app does in detail, but unless it&#39;s doing a lot of processing then I would expect your database to be the bottleneck for scalability. </p> <p>In general I would advise against premature optimisation. As long as your approach isn&#39;t fundamentally flawed (and choosing Go over Python or vice versa is not going to create a fundamental flaw), it is best to develop as quickly as possible and solve problems as they arise and not beforehand. </p></pre>snirpie: <pre><p>Still figuring out what it will do in detail myself ;) Basically registering items with users and some reporting based on that. There will not be that much processing and it will be roughly linear with growth. Sticking to Python/Postgres for now and will start optimising or switching out pieces as need arises.</p></pre>Doctuh: <pre><p>Learning Go is a great idea, however beginning what may be a critical app as your first Go codebase may not be. </p> <p>Take a look at your first few weeks of Python, do you want to be working with that guy&#39;s code for the next few years?</p></pre>snirpie: <pre><p>Fair point, although Python was my first language (not counting some duct-taped VBA) and I learned it organically. Still fear that I am missing some basic concepts and was looking at Go as a way to leapfrog that.</p> <p>That may not be a healthy approach although the clean slate always looks nice. I guess my takeaway should be that I really need to do some work and learning in Python first. 
If nothing else, it should build a nice reference.</p></pre>tmornini: <pre><blockquote> <p>without needless changes</p> </blockquote> <p>Counterpoint: In my experience the best time to learn a new language is at the beginning of a project.</p> <p>Learning a new language is a huge benefit -- even when you don&#39;t like it.</p> <p>Best way to learn about programming concepts, independent of language features.</p></pre>stormy11: <pre><p>Use node.js. Go is imperative programming.</p></pre>snirpie: <pre><p>Really not sure how this is supposed to be helpful. I think I was clear that this was supposed to go beyond programming paradigms to arrive at the right tool for the job.</p> <p>I did consider node.js actually but it seemed to have little merit. The JavaScript language seems like a risky bet for large scale projects and the callback hell is a real risk. All the while offering less performance than Go.</p> <p>To be honest with you, I do not have a great feeling about the node community and your comment does little to improve that.</p></pre>stormy11: <pre><p>Understand. Good luck to you.</p></pre>
