This is the text of my dotGo 2016 presentation. A recording and slide deck are also available.
Hello, welcome to dotGo.
Two years ago I stood on a stage, not unlike this one, and told you my opinion on how configuration options should be handled in Go. The cornerstone of my presentation was Rob Pike’s blog post, Self-referential functions and the design of options.
Since then it has been wonderful to watch this idea mature from Rob’s original blog post, to the gRPC project, who in my opinion have continued to evolve this design pattern into its best form to date.
But, when talking to Gophers at a conference in London a few months ago, several of them expressed a concern that while they understood the notion of a function that returns a function, the technique that powers functional options, they worried that other Go programmers—I suspect they meant less experienced Go programmers—wouldn’t be able to understand this style of programming.
And this made me a bit sad because I consider Go’s support of first class functions to be a gift, and something that we should all be able to take advantage of. So I’m here today to show you that you do not need to fear first class functions.
Functional options recap
To begin, I’ll very quickly recap the functional options pattern.
    type Config struct{ ... }

    func WithReticulatedSplines(c *Config) { ... }

    type Terrain struct {
        config Config
    }

    func NewTerrain(options ...func(*Config)) *Terrain {
        var t Terrain
        for _, option := range options {
            option(&t.config)
        }
        return &t
    }

    func main() {
        t := NewTerrain(WithReticulatedSplines)
        // [ simulation intensifies ]
    }
We start with some options, expressed as functions which take a pointer to a structure to configure. We pass those functions to a constructor, and inside the body of that constructor each option function is invoked in order, passing in a reference to the Config value. Finally, we call NewTerrain with the options we want, and away we go.
Okay, everyone should be familiar with this pattern. Where I believe the confusion comes from is when you need an option function which takes a parameter. For example, we have WithCities, which lets us add a number of cities to our terrain model.
    // WithCities adds n cities to the Terrain model
    func WithCities(n int) func(*Config) { ... }

    func main() {
        t := NewTerrain(WithCities(9))
        // ...
    }
Because WithCities takes an argument, we cannot simply pass WithCities to NewTerrain; its signature does not match. Instead we evaluate WithCities, passing in the number of cities to create, and use the result as the value to pass to NewTerrain.
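The recap above leaves the bodies of the option functions elided. As a minimal sketch of how the pieces could fit together, assume Config has two fields, ReticulatedSplines and Cities; both field names are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// Config holds the terrain settings. Both fields are hypothetical;
// the talk elides the real contents of Config.
type Config struct {
	ReticulatedSplines bool
	Cities             int
}

// WithReticulatedSplines already has the type func(*Config),
// so it can be passed to NewTerrain directly.
func WithReticulatedSplines(c *Config) {
	c.ReticulatedSplines = true
}

// WithCities returns a closure that remembers n and applies it later.
func WithCities(n int) func(*Config) {
	return func(c *Config) {
		c.Cities = n
	}
}

type Terrain struct {
	config Config
}

// NewTerrain applies each option, in order, to the Terrain's Config.
func NewTerrain(options ...func(*Config)) *Terrain {
	var t Terrain
	for _, option := range options {
		option(&t.config)
	}
	return &t
}

func main() {
	t := NewTerrain(WithReticulatedSplines, WithCities(9))
	fmt.Printf("%+v\n", t.config) // {ReticulatedSplines:true Cities:9}
}
```

Note that WithReticulatedSplines is passed by name while WithCities is called first: only the value it returns has the type NewTerrain expects.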
Functions as first class values
What’s going on here? Let’s break it down. Fundamentally, evaluating a function returns a value. We have functions that take two numbers and return a number.
    package math

    func Min(a, b float64) float64
We have functions that take a slice, and return a pointer to a structure.
    package bytes

    func NewReader(b []byte) *Reader
And now we have a function which returns a function.
    func WithCities(n int) func(*Config)
The type of the value that is returned from WithCities is a function which takes a pointer to a Config. This ability to treat functions as regular values leads to their name: first class functions.
interface.Apply
Another way to think about what is going on here is to try to rewrite the functional option pattern using an interface.
    type Option interface {
        Apply(*Config)
    }
Rather than a function type we declare an interface, we’ll call it Option, and give it a single method, Apply, which takes a pointer to a Config.
    func NewTerrain(options ...Option) *Terrain {
        var config Config
        for _, option := range options {
            option.Apply(&config)
        }
        // ...
    }
Whenever we call NewTerrain we pass in one or more values that implement the Option interface. Inside NewTerrain, just as before, we loop over the slice of options and call the Apply method on each.
This doesn’t look too different to the previous example. Rather than ranging over a slice of functions and calling them, we range over a slice of interface values and call a method on each. Let’s take a look at the other side, declaring the WithReticulatedSplines option.
    type splines struct{}

    func (s *splines) Apply(c *Config) { ... }

    func WithReticulatedSplines() Option {
        return new(splines)
    }
Because we’re passing around interface implementations, we need to declare a type to hold the Apply method. We also need to declare a constructor function to return our splines option implementation; you can already see that this is going to be more code.
To write WithCities using our Option interface we need to do a bit more work.
    type cities struct {
        cities int
    }

    // The parameter is named cfg because it cannot share the receiver's name, c.
    func (c *cities) Apply(cfg *Config) { ... }

    func WithCities(n int) Option {
        return &cities{
            cities: n,
        }
    }
In the previous, functional, version the value of n, the number of cities to create, was captured lexically for us in the declaration of the anonymous function. Because we’re using an interface we need to declare a type to hold the count of cities and we need a constructor to assign the field during construction.
    func main() {
        t := NewTerrain(WithReticulatedSplines(), WithCities(9))
        // ...
    }
Putting it all together, we call NewTerrain with the results of evaluating WithReticulatedSplines and WithCities.
At GopherCon last year Tomás Senart spoke about the duality of a first class function and an interface with one method. You can see this duality play out in our example; an interface with one method and a function are equivalent.
But, you can also see that using functions as first class values involves much less code.
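One way to see the duality in code is an adapter type that lets a plain function satisfy the one-method interface, in the style of net/http’s HandlerFunc. This is a sketch only: OptionFunc is not part of the talk, and the Cities field on Config is a hypothetical stand-in for the elided contents.

```go
package main

import "fmt"

// Config's single field is hypothetical; the talk elides its contents.
type Config struct {
	Cities int
}

// Option is the one-method interface from the example above.
type Option interface {
	Apply(*Config)
}

// OptionFunc is a hypothetical adapter (not from the talk) that lets an
// ordinary function satisfy Option.
type OptionFunc func(*Config)

// Apply calls the underlying function.
func (f OptionFunc) Apply(c *Config) { f(c) }

// WithCities needs no dedicated struct type: the closure captures n.
func WithCities(n int) Option {
	return OptionFunc(func(c *Config) { c.Cities = n })
}

func main() {
	var config Config
	WithCities(9).Apply(&config)
	fmt.Println(config.Cities) // 9
}
```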
Encapsulating behaviour
Let’s leave interfaces for a moment and talk about some other properties of first class functions.
When we invoke a function or a method, we do so passing around data. The job of that function is often to interpret that data and take some action. Function values allow you to pass behaviour to be executed, rather than data to be interpreted. In effect, passing a function value allows you to declare code that will execute later, perhaps in a different context.
To illustrate this, here is a simple calculator.
    type Calculator struct {
        acc float64
    }

    const (
        OP_ADD = 1 << iota
        OP_SUB
        OP_MUL
    )
It has a set of operations it understands.
    func (c *Calculator) Do(op int, v float64) float64 {
        switch op {
        case OP_ADD:
            c.acc += v
        case OP_SUB:
            c.acc -= v
        case OP_MUL:
            c.acc *= v
        default:
            panic("unhandled operation")
        }
        return c.acc
    }
It has one method, Do, which takes an operation and an operand, v. For convenience, Do also returns the value of the accumulator after the operation is applied.
    func main() {
        var c Calculator
        fmt.Println(c.Do(OP_ADD, 100)) // 100
        fmt.Println(c.Do(OP_SUB, 50))  // 50
        fmt.Println(c.Do(OP_MUL, 2))   // 100
    }
Our calculator only knows how to add, subtract, and multiply. If we wanted to implement division, we’d have to allocate an operation constant, then open up the Do method and add the code to implement division. Sounds reasonable, it’s only a few lines, but what if we wanted to add square root and exponentiation?
Each time we do this, Do grows longer and becomes harder to follow, because each time we add an operation we have to encode into Do knowledge of how to interpret that operation.
Let’s rewrite our calculator a little.
    type Calculator struct {
        acc float64
    }

    type opfunc func(float64, float64) float64

    func (c *Calculator) Do(op opfunc, v float64) float64 {
        c.acc = op(c.acc, v)
        return c.acc
    }
As before we have a Calculator, which manages its own accumulator. The Calculator has a Do method, which this time takes a function as the operation, and a value as the operand. Whenever Do is called, it calls the operation we pass in, using its own accumulator and the operand we provide.
So, how do we use this new Calculator? You guessed it, by writing our operations as functions.
    func Add(a, b float64) float64 { return a + b }
This is the code for Add. What about the other operations? It turns out they aren’t too hard either.
    func Sub(a, b float64) float64 { return a - b }

    func Mul(a, b float64) float64 { return a * b }

    func main() {
        var c Calculator
        fmt.Println(c.Do(Add, 5)) // 5
        fmt.Println(c.Do(Sub, 3)) // 2
        fmt.Println(c.Do(Mul, 8)) // 16
    }
As before we construct a Calculator and call it passing operations and an operand.
Extending the calculator
Now that we can describe operations as functions, we can try to extend our calculator to handle square root.
    func Sqrt(n, _ float64) float64 {
        return math.Sqrt(n)
    }
But, it turns out there is a problem. math.Sqrt takes one argument, not two. However our Calculator’s Do method’s signature requires an operation function that takes two arguments.
    func main() {
        var c Calculator
        c.Do(Add, 16)
        c.Do(Sqrt, 0) // operand ignored
    }
Maybe we just cheat and ignore the operand. That’s a bit gross, I think we can do better.
Let’s redefine Add from a function that is called with two values and returns a third, to a function which returns a function that takes a value and returns a value.
    func Add(n float64) func(float64) float64 {
        return func(acc float64) float64 {
            return acc + n
        }
    }

    func (c *Calculator) Do(op func(float64) float64) float64 {
        c.acc = op(c.acc)
        return c.acc
    }
Do now invokes the operation function passing in its own accumulator and recording the result back in the accumulator.
    func main() {
        var c Calculator
        c.Do(Add(10)) // 10
        c.Do(Add(20)) // 30
    }
Now in main we call Do not with the Add function itself, but with the result of evaluating Add(10). The type of the result of evaluating Add(10) is a function which takes a value, and returns a value, matching the signature that Do requires.
    func Sub(n float64) func(float64) float64 {
        return func(acc float64) float64 {
            return acc - n
        }
    }

    func Mul(n float64) func(float64) float64 {
        return func(acc float64) float64 {
            return acc * n
        }
    }
Subtraction and multiplication are similarly easy to implement. But what about square root?
    func Sqrt() func(float64) float64 {
        return func(n float64) float64 {
            return math.Sqrt(n)
        }
    }

    func main() {
        var c Calculator
        c.Do(Add(2))
        c.Do(Sqrt()) // 1.41421356237
    }
This implementation of square root avoids the awkward syntax of the previous calculator’s operation function, as our revised calculator now operates on functions which take and return only one value.
Hopefully you’ve noticed that the function returned by our Sqrt has the same signature as math.Sqrt, so we can make this code smaller by reusing any function from the math package that takes a single argument.
    func main() {
        var c Calculator
        c.Do(Add(2))    // 2
        c.Do(math.Sqrt) // 1.41421356237
        c.Do(math.Cos)  // roughly 0.1559, since math.Cos works in radians
    }
We started with a model of hard coded, interpreted logic. We moved to a more functional model, where we pass in the behaviour we want. Then, by taking it a step further, we generalised our calculator to work for operations regardless of their number of arguments.
Let’s talk about actors
Let’s change tracks a little and talk about why most of us are here at a Go conference; concurrency, specifically actors. To give due credit, the examples here are inspired by Bryan Boreham’s talk from GolangUK, you should check it out.
Suppose we’re building a chat server, we plan to be the next Hipchat or Slack, but we’ll start small for the moment.
    type Mux struct {
        mu    sync.Mutex
        conns map[net.Addr]net.Conn
    }

    func (m *Mux) Add(conn net.Conn) {
        m.mu.Lock()
        defer m.mu.Unlock()
        m.conns[conn.RemoteAddr()] = conn
    }
We have a way to register new connections.
    func (m *Mux) Remove(addr net.Addr) {
        m.mu.Lock()
        defer m.mu.Unlock()
        delete(m.conns, addr)
    }
Remove old connections.
    func (m *Mux) SendMsg(msg string) error {
        m.mu.Lock()
        defer m.mu.Unlock()
        for _, conn := range m.conns {
            // io.WriteString returns the byte count as well as the error.
            _, err := io.WriteString(conn, msg)
            if err != nil {
                return err
            }
        }
        return nil
    }
And a way to send a message to all the registered connections. Because this is a server, all of these methods will be called concurrently, so we need to use a mutex to protect the conns map and prevent data races. Is this what you’d call idiomatic Go code?
Don’t communicate by sharing memory, share memory by communicating.
Our first proverb: don’t mediate access to shared memory with locks and mutexes, instead share that memory by communicating. So let’s apply this advice to our chat server.
Rather than using a mutex to serialise access to the Mux’s conns map, we can give that job to a goroutine, and communicate with that goroutine via channels.
    type Mux struct {
        add     chan net.Conn
        remove  chan net.Addr
        sendMsg chan string
    }

    func (m *Mux) Add(conn net.Conn) {
        m.add <- conn
    }
Add sends the connection to add to the add channel.
    func (m *Mux) Remove(addr net.Addr) {
        m.remove <- addr
    }
Remove sends the address of the connection to the remove channel.
    func (m *Mux) SendMsg(msg string) error {
        m.sendMsg <- msg
        return nil
    }
And send message sends the message to be transmitted to each connection to the sendMsg channel.
    func (m *Mux) loop() {
        // conns is local to loop; Mux no longer has a conns field.
        conns := make(map[net.Addr]net.Conn)
        for {
            select {
            case conn := <-m.add:
                conns[conn.RemoteAddr()] = conn
            case addr := <-m.remove:
                delete(conns, addr)
            case msg := <-m.sendMsg:
                for _, conn := range conns {
                    io.WriteString(conn, msg)
                }
            }
        }
    }
Rather than using a mutex to serialise access to the conns map, loop will wait until it receives an operation in the form of a value sent over one of the add, remove, or sendMsg channels and apply the relevant case. We don’t need a mutex anymore because the shared state, our conns map, is local to the loop function.
But, there’s still a lot of hard coded logic here. loop only knows how to do three things: add, remove and broadcast a message. As with the previous example, adding new features to our Mux type will involve:
- creating a channel.
- adding a helper to send the data over the channel.
- extending the select logic inside loop to process that data.
Just like our Calculator example we can rewrite our Mux to use first class functions to pass around behaviour we want to be executed, not data to interpret. Now, each method sends an operation to be executed in the context of the loop function, using our single ops channel.
    type Mux struct {
        ops chan func(map[net.Addr]net.Conn)
    }

    func (m *Mux) Add(conn net.Conn) {
        m.ops <- func(m map[net.Addr]net.Conn) {
            m[conn.RemoteAddr()] = conn
        }
    }
In this case the signature of the operation is a function which takes a map of net.Addr’s to net.Conn’s. In a real program you’d probably have a much more complicated type to represent a client connection, but it’s sufficient for the purpose of this example.
    func (m *Mux) Remove(addr net.Addr) {
        m.ops <- func(m map[net.Addr]net.Conn) {
            delete(m, addr)
        }
    }
Remove is similar, we send a function that deletes its connection’s address from the supplied map.
    func (m *Mux) SendMsg(msg string) error {
        m.ops <- func(m map[net.Addr]net.Conn) {
            for _, conn := range m {
                io.WriteString(conn, msg)
            }
        }
        return nil
    }
SendMsg is a function which iterates over all connections in the supplied map and calls io.WriteString to send each a copy of the message.
    func (m *Mux) loop() {
        conns := make(map[net.Addr]net.Conn)
        for op := range m.ops {
            op(conns)
        }
    }
You can see that we’ve moved the logic from the body of loop into anonymous functions created by our helpers. So the job of loop is now to create a conns map, wait for an operation to be provided on the ops channel, then invoke it, passing in its map of connections.
But there are a few problems still to fix. The most pressing is the lack of error handling in SendMsg; an error writing to a connection will not be communicated back to the caller. So let’s fix that now.
    func (m *Mux) SendMsg(msg string) error {
        result := make(chan error, 1)
        m.ops <- func(m map[net.Addr]net.Conn) {
            // Inside the closure m is the map, so we range over it directly.
            for _, conn := range m {
                _, err := io.WriteString(conn, msg)
                if err != nil {
                    result <- err
                    return
                }
            }
            result <- nil
        }
        return <-result
    }
To handle the error being generated inside the anonymous function we pass to loop we need to create a channel to communicate the result of the operation. This also creates a point of synchronisation, the last line of SendMsg blocks until the function we passed into loop has been executed.
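The result channel can be exercised with a writer that always fails. In this sketch failingWriter, NewMux, and the string-keyed map of io.Writer values are all assumptions made so the example runs on its own; only the shape of SendMsg follows the talk.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// failingWriter is a hypothetical stand-in for a broken connection.
type failingWriter struct{}

func (failingWriter) Write(p []byte) (int, error) {
	return 0, errors.New("connection reset")
}

type Mux struct {
	ops chan func(map[string]io.Writer)
}

// NewMux is a hypothetical constructor that starts the loop goroutine.
func NewMux() *Mux {
	m := &Mux{ops: make(chan func(map[string]io.Writer))}
	go m.loop()
	return m
}

func (m *Mux) Add(name string, w io.Writer) {
	m.ops <- func(conns map[string]io.Writer) { conns[name] = w }
}

// SendMsg blocks until loop has run the operation, then returns its error.
func (m *Mux) SendMsg(msg string) error {
	result := make(chan error, 1)
	m.ops <- func(conns map[string]io.Writer) {
		for _, w := range conns {
			if _, err := io.WriteString(w, msg); err != nil {
				result <- err
				return
			}
		}
		result <- nil
	}
	return <-result
}

func (m *Mux) loop() {
	conns := make(map[string]io.Writer)
	for op := range m.ops {
		op(conns)
	}
}

func main() {
	m := NewMux()
	fmt.Println(m.SendMsg("hi")) // <nil>
	m.Add("broken", failingWriter{})
	fmt.Println(m.SendMsg("hi")) // connection reset
}
```

The result channel is buffered with capacity one so the anonymous function never blocks the loop goroutine while handing back its result.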
    func (m *Mux) loop() {
        conns := make(map[net.Addr]net.Conn)
        for op := range m.ops {
            op(conns)
        }
    }
Note that we didn’t have to change the body of loop at all to incorporate this error handling. And now we know how to do this, we can easily add a new function to Mux to send a private message to a single client.
    func (m *Mux) PrivateMsg(addr net.Addr, msg string) error {
        result := make(chan net.Conn, 1)
        m.ops <- func(m map[net.Addr]net.Conn) {
            result <- m[addr]
        }
        conn := <-result
        if conn == nil {
            // The standard library's errors package has no Errorf;
            // this presumably relies on github.com/pkg/errors.
            return errors.Errorf("client %v not registered", addr)
        }
        // io.WriteString returns (n, err); only the error is reported.
        _, err := io.WriteString(conn, msg)
        return err
    }
To do this we pass a “lookup function” to loop via the ops channel, which will look in the map provided to it (this is loop’s conns map) and return the value for the address we want on the result channel.
In the rest of the function we check to see if the result was nil; the zero value from the map lookup implies that the client is not registered. Otherwise we now have a reference to the client and we can call io.WriteString to send them a message.
And just to reiterate, we did this all without changing the body of loop, or affecting any of the other operations.
Conclusion
In summary
- First class functions bring you tremendous expressive power. They let you pass around behaviour, not just dead data that must be interpreted.
- First class functions aren’t new or novel. Many older languages have offered them, even C. In fact, it was only somewhere along the line, as pointers were removed, that programmers in the OO stream of languages lost access to first class functions. If you’re a JavaScript programmer, you’ve probably spent the last 15 minutes wondering what the big deal is.
- First class functions, like the other features Go offers, should be used with restraint. Just as it is possible to make an overcomplicated program with the overuse of channels, it’s possible to make an impenetrable program with an overuse of first class functions. But that does not mean you shouldn’t use them at all; just use them in moderation.
- First class functions are something that I believe every Go programmer should have in their toolbox. First class functions aren’t unique to Go, and Go programmers shouldn’t be afraid of them.
- If you can learn to use interfaces, you can learn to use first class functions. They aren’t hard, just a little unfamiliar, and unfamiliarity is something that I believe can be overcome with time and practice.
So next time you define an API that has just one method, ask yourself, shouldn’t it really just be a function?