SpaceX is using Go for its telemetry system

blov · 675 views
<p>I recently stumbled upon a blog post by one of SpaceX's developers, in which he reveals that part of SpaceX's telemetry system is written in Go. The blog post is in Russian, so here is a translated excerpt:</p> <blockquote> <p>It has been 2 years since I started working at SpaceX. I could easily have forgotten about it, but a colleague reminded me.</p> <p>The first year passed under the aegis of Go. Go the programming language, I mean. We have part of our telemetry system written in Go, and I was working on that code. Funny, but when I was at Google, I didn’t even think about writing in Go. Go left the “typical” impression: a simple, comfortable C, and, as in C, without generics. Everything is great, but it lacks the most desired things.</p> <p>[rest of the post is about moving to another, Falcon, team]</p> </blockquote> <p>Original link: <a href="http://blog.not-a-kernel-guy.com/2015/10/16/1738">http://blog.not-a-kernel-guy.com/2015/10/16/1738</a></p> <hr/>**Comments:**<br/><br/>mwholt: <pre><p>Welp, so much for <a href="https://forum.golangbridge.org/t/is-golang-a-good-fit-for-robotics/1136/4?u=matt">my earlier comment</a> then:</p> <blockquote> <p>Unless your robotics are, like, rockets, Go is probably a fine candidate language.</p> </blockquote></pre>natefinch: <pre><p>Ahh crap... 
now I <em>really</em> want to go work for SpaceX.</p></pre>tehwankingwalruses: <pre><p>You should apply if you enjoy working for weeks with no days off and 12-14 hour days.</p></pre>dmikalova: <pre><p>Don&#39;t forget you&#39;re not allowed lunch breaks and they pay under market because you&#39;re working for the good of humanity and at all times you are fully replaceable by more qualified people.</p></pre>solvire: <pre><p>Was I the only one who immediately scanned for GC comments?</p></pre>lavezza: <pre><p>Based on a comment SpaceXJobs made on Twitter, Go is used for a platform called Borg.</p> <p>Based on an old job posting: &#34;As a member of the Borg team, you will:</p> <ul> <li>Work at all levels of the stack to build systems that manage large volumes of mission-critical data generated by SpaceX simulation, test and flight systems, along with the tools we use to analyze these datasets.</li> <li>Engage with other SpaceX engineers to discover their needs and code highly reliable software that revolutionizes the way data is stored, retrieved and manipulated.</li> <li>Own the complete life cycle of the software you create, from design, development, and testing to operation during a mission.&#34;</li> </ul> <p>&#34;The Borg team designs and codes parallelized algorithms for manipulating and analyzing numeric time series data, builds web interfaces and APIs for dozens of clients and hundreds of users, and automates the process of data analysis, with the ultimate goal of developing an expert system to enable SpaceX&#39;s objectives for rapid vehicle reusability and increased reliability.&#34;</p></pre>noydoc: <pre><p>I wonder if GC pauses were involved in any lost rockets.</p></pre>divan0: <pre><p>There is a comment with similar question, and author has responded that, obviously there is no GC on rockets. 
They use Go for telemetry, and GC isn&#39;t a problem there.</p></pre>prf_q: <pre><p>You know that Go 1.5 has deterministic GC pauses (and those are a lot less than earlier versions in terms of duration) and those are still visibly improving, right?</p></pre>Yojihito: <pre><p>Deterministic = 100% mathematical proven deterministic? Everything else is useless shit in rockets.</p> <p>Also there is a reason why the NASA uses 70s hardware.</p></pre>1r0n1c: <pre><p>So we lost the ability to produce deterministic hardware?</p></pre>hahainternet: <pre><p>In some respects yes. Certainly in regard to very tight scheduling of operations.</p></pre>1r0n1c: <pre><p>Can you elaborate a bit? I&#39;m really curious.</p></pre>hahainternet: <pre><p>I&#39;m no chip designer but you should look up &#39;out of order execution&#39; to get an idea of the mechanisms going on.</p></pre>herir: <pre><p>What 70s hardware produces 100% mathematical proven deterministic results?</p></pre>Yojihito: <pre><p>70s hardware = big cpus (can&#39;t find the english word that describes the size of the cpu) that are unlikely to fail due to cosmic rays and are tested for decades now. A modern 14nm cpu would go nuts in the space.</p></pre>howeman: <pre><p>At least as of a few years ago, skybox was also using go, I think on the data processing end (this is pre buyout)</p></pre>thepciet: <pre><p>This sounds reckless to me. 
The language runtime is too fresh to have a hand in life and death computing.</p> <p>But, if the goal of SpaceX is flying people around, the technology used must be well proven and understood over many years of use - they are stuck with the programming tools they pick now.</p> <p>This pressure may be exactly what Go needs to more widely compete with C++.</p> <p>[<strong>edit</strong>] In response to the number of downvotes (-11 now), I am sure there are many smart software scientists and engineers out there, and I am sure they would agree that large codebases break in strange and unexpected ways. The instability caused by a large number of people working on one set of code is a big enough problem, adding a third party (Google) without the same moral environment or history (no breaking process at Google has killed anyone yet) means a potentially higher possibility of something you care about disappearing at a bad moment.</p> <p>Big Disasters happen when a lot of small things break, often things that routinely break. I&#39;d argue that the style of Go is a proper direction (minimize bug sources by keeping a simplified strong foundation), but systems requiring dangerous decision making NOW should not use a language implementation not yet proven to never break in the timeframes that matter for missions.</p> <p>[<strong>edit 2</strong>] This is why I argue for development of an idiomatic style of assertion in Go. 
Writing out all assumptions in a function and crashing immediately on a mismatch, only in debug builds, is a big step toward building and maintaining robust and trusted code.</p> <pre><code>// here&#39;s what I use in my current code
// implementation still has formatting bugs and it doesn&#39;t do conditional compilation,
// there may be a clever interface that is useful here
package assert

func Require(assertion bool, format string, a ...interface{})
</code></pre></pre>jussij: <pre><blockquote> <p>This sounds reckless to me.</p> </blockquote> <p>It&#39;s used in the <em>telemetry system</em>, not the <em>control system</em>.</p></pre>thepciet: <pre><p>Does this mean Go is not used anywhere in the software chains leading up to making automated launch, flight, and landing choices? </p> <p>The words here are ambiguous and translated from Russian. I just point out that software with major real world impacts has a much lower tolerance for bugs than anything Google does (I wonder if they use Go for their car automation..).</p></pre>jussij: <pre><p>Telemetry is data sent from the rocket to the mission control center back on the ground. Basically it lets mission control monitor the rocket in flight.</p> <p>But the rocket control system actually controls the flight of the rocket.</p> <p>So while it would not be good if that telemetry system failed, as the rocket would then be flying blind, it would (or should) still fly.</p> <p>But if the rocket&#39;s control system failed, that would be catastrophic, as the rocket would crash, irrespective of whether the telemetry system was working or not.</p> <blockquote> <p>I just point out that software with major real world impacts has a much lower tolerance for bugs</p> </blockquote> <p>That is true. 
But some systems are more mission critical than others.</p></pre>thepciet: <pre><p>That case you describe caused the crash of <a href="https://en.wikipedia.org/wiki/Mars_Climate_Orbiter" rel="nofollow">https://en.wikipedia.org/wiki/Mars_Climate_Orbiter</a></p> <p>Those parts of the telemetry system are mission critical - mistakes, glitches, crashes, bugs, at any point in the technology stack can contribute to a disaster.</p> <p>I just present a warning to engineers. Engineering is all about tradeoffs, and there is little or no room for error for many systems, especially in aerospace. Understanding and proving the entire technology stack is critical scientifically, ethically, and monetarily.</p></pre>jussij: <pre><blockquote> <p>Those parts of the telemetry system are mission critical</p> </blockquote> <p>In the case of the orbiter, that might well be the case. But you could also argue the actual control system was at fault, since it sent the command to the orbiter (using the telemetry data) to fly into Mars.</p> <p>Now I have no idea how SpaceX rockets are designed, but I would assume the rocket would be autonomous. In that type of system, the control system would need input data from its environment and, as with the orbiter, that data would be mission critical.</p> <p>But I would hope that data would not be in any way connected to a secondary system designed to just send data back to ground control.</p> <p>In other words, the control system should not be getting its data from the telemetry system, and the telemetry system should not be talking to the control system to get its data.</p> <p>But if ground control is playing a role in flying the rocket and the rocket is not autonomous, then of course that telemetry system would indeed be mission critical. </p></pre>nexusbees: <pre><p>Do you have anything to show that Go code is more susceptible to mistakes, glitches, crashes and bugs? 
I can think of some drawbacks of Go that would be relevant to this use case such as </p> <ul> <li>marginally higher memory consumption</li> <li>periodic GC pauses of length &lt; 1ms</li> </ul> <p>If either of these had been an issue for them, I imagine they&#39;d have discarded the Go code and started over pretty quick. I want to know, does your confident assertion that Go code has more mistakes, glitches, crashes and bugs have anything to back it up, or did you simply pull it out of your ass? I strongly suspect it&#39;s the latter and judging by your downvotes, everyone else thinks so too.</p></pre>thepciet: <pre><blockquote> <p>Do you have anything to show that Go code is more susceptible to mistakes, glitches, crashes and bugs?</p> </blockquote> <p>This was not my assertion. I am concerned about the maturity of the Go runtime (NOT about code written in Go).</p> <blockquote> <p>The latest Go release, version 1.5, is a significant release, including major architectural changes to the implementation.</p> </blockquote> <p><a href="https://golang.org/doc/go1.5" rel="nofollow">Go 1.5</a> introduced a rewritten compiler and garbage collector, three months ago. Even though the compiler rewrite was just a style of automated translation, any changes introduce risk of something breaking, as any engineer knows (i.e. always verify your changes). </p> <p>In aerospace we&#39;re talking risk management. Cursory, automated, or even lengthy testing is never enough to say &#34;this will never break&#34;. Repeated use over time is the best judge of correctness, as far as we can take it.</p> <p>If part of the stack is still seeing major architectural or implementation changes then I wouldn&#39;t trust it anywhere near my rocket, no matter how top the designers and implementers are. An internet website? Alright, risk is acceptable. 
But not anywhere near something that could explode spectacularly.</p> <p>(the type of bug I&#39;m thinking of is something like a GC crash that happens once every 10k years of running Go code - this style of bug has existed in the world and is more likely to have been caught in a C compiler or other place with years of stability, instead of months).</p></pre>int32_t: <pre><p>The Go runtime is essentially a microkernel running in user space, providing services like scheduling (goroutines), memory management (GC), and IPC (channels). The runtime code is still messy, in that portable and OS/architecture-dependent code has not been cleanly modularized yet. For critical control systems like avionics, the runtime system itself must pass various certifications, e.g. <a href="https://en.wikipedia.org/wiki/DO-178B" rel="nofollow">DO-178</a> and <a href="https://en.wikipedia.org/wiki/IEC_61508" rel="nofollow">IEC_61508</a>. It would be amazing if the runtime can reach that level of standard one day. But for now, it is actually not even close.</p></pre>ratatask: <pre><p>That&#39;s what it means, yes. E.g. NASA has their telemetry systems written in pretty much anything - quite a few are done in Python.</p> <p>Normally you have a demarcation point somewhere on the ground where telemetry is presented on a (software) message bus. Everything from the rocket, through the radios at both ends, up to the point where data ends up on that message bus consists of systems designed with the highest reliability. 
Then you write software to process those messages, archive them, graph them, animate them and so on - which doesn&#39;t necessarily need the same stability.</p></pre>thepciet: <pre><blockquote> <p>Then you write software to process those messages, archive them, graph them, animate them and so on - which doesn&#39;t necessarily need the same stability.</p> </blockquote> <p>This is where I&#39;d argue Go will have a niche in NASA&#39;s systems today.</p> <p>BUT I also think Go or a dialect may be necessary for reliable systems in the future, especially if concurrency becomes a necessity (perhaps not in aerospace, but definitely in some forms of automation). It is ideal: built with big teams in mind, with hard-won knowledge in its design. Go does not have much of the cruft associated with well-established languages.</p></pre>int32_t: <pre><p>Many programmers of embedded systems come from an electronic engineering background, having spent most of their careers immersed in assembly programming, I²C, DSP and oscilloscopes. They have little or no training/experience in large-scale software system engineering. Remember the Toyota vehicle controller firmware with more than 10K global variables? That kind of code quality is not uncommon in firmware, as the people working on it typically excel at hardware troubleshooting and the domain knowledge of certain protocols/standards, rather than software design and abstraction, which is often even discouraged.</p> <p>IMHO, a programming language designed with readability, maintainability and testability in mind can outweigh the lack of mature libraries/runtimes. In many cases, those systems don&#39;t even have a full-fledged C runtime built in. At least, we can be freed from witnessing those interleaved #if-#elif-#endif everywhere.</p> <p>That being said, I am not very confident about the usability of Golang in critical/realtime control systems at this stage. 
It has to be proved first that it&#39;s able to implement systems like kernel-space device drivers or video/audio decoders.</p></pre>
