Gallium: An idea for a programming language that compiles to Go

<a href="https://github.com/weberc2/gallium">https://github.com/weberc2/gallium</a> I wanted to start a conversation about some ideas I've had for writing a language that compiles to Go. The rationale and notes are available in the readme. I want this thread to be a starting point for the conversation about how this language could be built and what it could look like (at a broad level--details notwithstanding) and maybe to find some like-minded people with whom to continue this conversation after this thread expires. I want a constructive discussion--I don't want to debate the rationale (maybe you think Go or language X is good enough--that's fine, but that's not what I'm here to talk about). As I mention in the readme, I've never built a language, so it probably won't go anywhere, but I'm particularly interested to hear any special reasons why this can't work in theory. <hr/>**Comments:** jerf: <pre>You give conflicting signals on whether you mean this as a language you seriously want other people to use for some definition of "real work" or whether this is a learning project. I'm going to give feedback as if it is the latter, because right now even if you had everything in hand that you have sketched in your README.md, you wouldn't get anywhere because you'd be overshadowed by Go itself. You would need substantially more differentiation than providing those features, many of which can be half-done right now either with existing packages for Go or half attained via semi-clever use of existing features. But jerf, having the full features for those things would be qualitatively different and what's on GitHub now are just hacks by comparison... 
Yes, I get it and I agree (Go is far from my only language, I've spent quite a bit of time with Haskell, I understand those things quite well), but I'm not talking about the desirability or goodness of the resulting language, I'm talking about the fact that without sufficient differentiation you're not going to escape from Go's gravitational attraction within the Go ecosystem, which is, unsurprisingly, fairly large. So instead I would give some suggestions on how to do something useful which will teach you some things, maybe even turn into an interesting Go code generator project, and be useful at almost every step along the way. <ol> <li>Create a Go-to-Go compiler. As a really hacky first pass you can use the existing Go parser and code generator, but that would just be a getting-your-feet-wet thing. What I mean here is that you need to write a parser for Go code that emits into your own AST, which does not use the ast package at all. Then you convert that AST into Go's existing AST types and emit Go code. It would be an ideal learning exercise to do this parser from scratch from the language implementation, but if you want to come at it more as an engineer you can copy/paste the existing one. However, while that will speed you up in this step, it will slow you down a lot in the next one. One suggestion I'd make: Consider ignoring and just stripping out comments, or at least making your peace with the fact they might not all make it through. The Go parser's handling of them is quite idiosyncratic and bizarre. (I understand why it does what it does, but it still makes it bizarre for your use case here.) The downside of this is that if you do this from the beginning you may have a really hard time retrofitting the comment behavior entirely and correctly. But you may consider just living with the fact that in your final compiler some comments may disappear during compilation. After all, this is ultimately a full compilation process, not a <code>gofmt</code> run. 
It will feel weird to "convert" your AST types into the Go AST types when all it is is a straight conversion. Bear with me. At this point you have (Your New Parser) -> (Your AST) -> (Go AST) -> (Go prettyprinter) -> (Go compiler).</li> <li>Start modifying the parser to generate the constructs for your new features. I would suggest the true sum types is the way to go. I'd suggest making it easy on yourself and using a unique symbol to differentiate them so the parser doesn't have to be too hard or ambiguous. Then, once you have parsed the symbols, you need to insert a new phase: (Your New Parser) -> (Your AST) -> (<a href="http://mattwarren.org/2017/05/25/Lowering-in-the-C-Compiler/">Lower</a> Sum Types) -> (Go AST) -> (Go prettyprinter) -> (Go compiler). At this point you now have a language that clearly enhances Go and can be dropped in anywhere you like because the output is real Go, and now it has Sum Types too.</li> <li>Next, you can try blocking zero values, though with this plan you will certainly still get them from the rest of the Go you're interoping with. By the time you close all those holes I think you'll find this is actually more complicated than adding Sum Types, which is why I put this second.</li> <li>Next, implement generics this way. It is ideal to try to work out a way to do this so that each of these features is a distinct lowering phase, rather than trying to do it all in one shot, but that's easier said than done.</li> <li>Next, implement the traits support.</li> <li>And should you get this far, you'll pretty much know what you want to do next.</li> </ol> In step 1, you may be tempted to try to beat on one or another layer of Go's existing code to try to avoid having to write your own parser. I strongly suggest resisting this temptation. I've tried it before, not in Go but in other places, and it just never seems to work out well. 
The parser always has more context in it than you realize and more hard-coded decisions, and usually you can't even make what you want really work right, and even if you can sort of get something, it will always be immensely compromised vs. having a real parser. The end result of this plan is that you don't really have a language per se, but you do have a freakishly powerful code processor that others may be interested in expanding on. Going full-on macro processor is not out of the question. (An alternative would be to make step 2 "implement a macro processor" and do the rest in terms of macros, except "blocking zero values" won't be able to be done that way. You can add things with macros but you can't really take them away like that.) In terms of convincing people to use your stuff, if you are interested in that, that may be an easier sell than a full "language", because you can always examine the output of the code processor and satisfy yourself that it's working correctly and/or fix bugs, whereas languages are intrinsically a huge risk to take on any serious project. Hypothetically you could keep iterating this way, with a working language the entire way past step 1 (an advantage of the plan I outline here over most of the alternatives), until you are meaningfully differentiated enough from Go. In particular, with no particular offense intended nor any impugnment of your dedication, having a functioning system means you can put it on GitHub and drop it with peace-of-mind at any time. This is not something to take lightly. :) (Oh, one other thing: Apologies if you already know this, but: TEST SUITE. You need one on DAY ONE. I strongly recommend setting up a pre-commit hook in Git to run them all on every commit. I'm not a fan of TDD but this isn't a bad domain to use it in. Crib any samples you can from the one the Go compiler has as they will already exercise a lot of corner cases. 
It won't be all your corner cases, but it's still a great start.)</pre>throwlikepollock: <pre>Can I just say that your described method of <code>Go -> Go</code> and then <code>My Single Adjustment -> Go</code> is an awesome way to think about writing a transpiler (which I've not done much really). If I had any idea to add features to Go, that sounds like a really fun way to work on them. E.g., you can add one feature, and you can actually use the language right away. It's not missing half of the Go language because "you haven't gotten to that yet". Anyway, I was just sort of impressed by that workflow .. clearly because I never think of transpilers in the <code>Go -> Go</code> sense, at least.</pre>weberc2: <pre>Hey jerf, thanks for taking the time to critique; your comments here and on HN are always informed and insightful, and particularly welcome in this thread. You're right that I was unclear; I would like for this language to become a "real work" language, but I also understand that a language is a huge amount of work, so I would be content if this doesn't evolve past a conversation phase. Thanks also for raising the point about being sufficiently different to escape Go's gravity--I hadn't considered that because it seems self-evident to me that generics and sum types on top of Go's runtime would be broadly appealing. I need to rethink that assumption, but I wonder if your opinion might also be changed by thinking about this less as Go + some functional features and more as a functional language that is syntactically and idiomatically familiar and benefits from Go's runtime (including no VM dependency). In other words, my target market isn't just Go programmers who want a couple of extra features, but also the broader market of programmers who want a pragmatic functional language and who find Haskell and Rust too strict (and other options failing for ecosystem/tooling reasons). 
Maybe this market is still too narrow, but at least this is the perspective I was coming from, which seems to be different than the one you addressed with your "escaping Go's gravity" critique. Hopefully this helps to clarify my intent. </pre>jerf: <pre>Best of luck to you. Another thing you might want to look at carefully, if you haven't already, is Scala, and how it interacts with Java. One of the problems you'll have with trying to wrap FP around Go is that if you want to interact with the underlying Go libraries, they aren't FP and that will leak out in a lot of ways. I know the Haskell way of doing it, but that is, in a weird sort of way, a really easy way. Trying to weave it deeply into a language is something I don't have much experience with. (Also consider that the general Haskell community consensus on Scala is that it is "too complex", so, you know, bear that in mind as you study it and don't take too many ideas from it at once. :) ) Like I said, if you keep going down this path and mutating the language one step at a time, you would eventually end up with something differentiated enough, but it's a long road.</pre>weberc2: <pre>Thanks for the advice. The interop between Gallium and Go is something I'm particularly concerned about. I'm thinking the compiler could have an unsafe-like mechanism for interacting with vanilla Go. I haven't thought this through very well yet, however. I'll definitely look to Scala/Java when the time comes.</pre>roxven: <pre>I disagree wholly with your premise that satisfying the rust compiler takes more time than debugging Go concurrency bugs. In my experience writing Go (2 years professionally at Google and at Square) the time spent identifying and resolving concurrency bugs dwarfs the time spent making something correct enough to compile in rust by orders of magnitude. I can't count how many times I have found nondeterministic behavior because someone closed over mutable state when launching goroutines for example. 
It's perfectly possible to just not learn why you're having problems, whether it's how you're writing concurrency bugs in Go, or why rustc won't give you a break. For that developer rustc will feel like a time hog because Go will let you think you got it done even if you didn't. Now I don't mean to say that one is necessarily better than the other; it just depends on how much bugs matter in the big picture. For a lot of organizations it matters more that they can launch fast and get productivity out of junior engineers than it does to be bug free, and that's a perfect situation for Go. But you should recognize the tradeoff you are making: gaining ease of use and development speed at the cost of bugs; you wouldn't be reducing time spent achieving correctness in your programs.</pre>weberc2: <pre>Fair enough, I had a different experience. Maybe I'm just not learning quickly as you suggest. But it's not a matter of Go being too permissive; as I mentioned, most code isn't concurrent, so in these cases, Rust's pedantry isn't keeping you safe. If you write a lot of concurrent code, then Rust is probably a good deal. If you're a slow learner who writes mostly sequential code, Go may be the better bargain.</pre>roxven: <pre>I mostly agree with that. My comment about not learning is not meant as a slight to anyone; in this particular context I think what someone learns is almost more an organizational responsibility than an individual responsibility. If you're in the position of choosing to learn something, you don't know it, and whether you learn it is largely a product of whether the more experienced people you trust tell you it is worth learning.</pre>jerf: <pre><blockquote> I disagree wholly with your premise that satisfying the rust compiler takes more time than debugging Go concurrency bugs. </blockquote> I suspect some of it is experience level, both with programming the "harder" languages in general, and Rust in particular. 
Once Rust is done rewiring how you think about data flow at all, it gets easier. (It's similar in Haskell; at first it seems like nobody could ever do anything with this language, but once you rewire yourself it gets a lot easier and can even become your preference.) <blockquote> I can't count how many times I have found nondeterministic behavior because someone closed over mutable state when launching goroutines for example. </blockquote> I know Go 2 == generics in a lot of people's minds, but I'd actually much rather see something helpful for concurrency come out. I recognize that it can't be something like "import Rust lifetimes", but something like "label this goroutine's interaction with the world as 'non-shared' and have the compiler verify that all communication in or out either correctly transfers ownership (i.e., once sent, it is no longer accessed by the original goroutine) or is fully deeply copied" or something would be awesome. In fact, going to the OP's question again, this is something I'd love to see come out of the wall-of-text suggestion I gave, much more so than the features listed on the GitHub readme. Like, I might seriously use a preprocessor that gave me <code>go_safe</code> that worked just like <code>go</code> except statically verified to the extent possible that I didn't accidentally overshare with a closure, or pass in messages with pointers that didn't get copied like I expected, or use things after I passed them in, etc. (Though in terms of difficulty it probably actually rates even higher than the generics + traits; removing things in the preprocessor is generally harder than adding them.) I've been using Go for a long time, and I have a bit of an advantage in that I came into it with ~5 years of Erlang experience already, so I personally have had less concurrency trouble than most people because I already knew how to think in this world, and even if Go didn't really help me, it doesn't stop me either. 
But I was disappointed when I first read about it when it came out and I saw that it was really just another shared-memory-with-threading language, and while I suppose I've made my peace with that, the disappointment still lingers.</pre>Sythe2o0: <pre>What's the motivation behind not having zero values?</pre>weberc2: <pre>Mostly because they seem out of place in a functional language, where an Option type could represent the same thing with more safety. That said, I haven't thought thoroughly about this, and I would love to hear any opinions.</pre>throwlikepollock: <pre>Go with Option types has been my dream for quite a while. While I also love the ownership model of Rust, it has a lot of baggage and I'm not sure how it could be implemented in Go .. nicely. Simply having quality Enums and Option types totally seems on the table, and would let developers use them as they see fit. Granted, that probably involves generics. Hmph.</pre>weberc2: <pre>For what it's worth, my proposal is roughly sum types plus generics and no ownership model (because Go has a GC, which is less a burden on the developer in most cases). With generics and sum types, you can trivially build an option type.</pre>losinggeneration: <pre>Have you looked at Haxe? It doesn't have a Go target that I know of, but could give you some ideas based on how it generates code for other targets (C++, Lua, PHP, etc)</pre>weberc2: <pre>I've heard of it. Maybe I'm naive, but I'm thinking it should be easy enough to target Go. Pretty much everything in the hypothetical Gallium AST should compile neatly into a Go AST.</pre>losinggeneration: <pre>Neatly may be the hardest part. Especially if you're expecting to use the Go code natively. 
Code generators typically aren't great at generating idiomatic code.</pre>weberc2: <pre>I think I have a good idea about how the generated code should look, since I implement these features in Go as patterns already (minus the unsafe tagged union, but this is a private detail even if the generated code is interfaced with from Go). I'm sure there will come a time when I realize my assumptions were bad, but without specific concerns to be wary of, I can hardly be proactive. :)</pre>tv64738: <pre>If you're proposing a new language, you might benefit from 1) omitting details of your plan for its initial implementation (or pushing them to an appendix) and 2) talking to <a href="/r/programming" rel="nofollow">/r/programming</a>.</pre>
