Passing complex nested structures to functions: pointers, by-value, or deep copy?

agolangf · 793 views


I'm having a hard time figuring out the right way to design complex data structures in Go. I often come up with a design that stores some kind of state in a deep structure. Assuming this struct contains only other structs as values (not pointers), everything is easy: I can pass the structure by value to channels, goroutine safety is ensured, and when I need to modify something inside it I can pass it by pointer and I'm happy.

However, the problems start when I have some subset of my data that might not be present in every situation. If I use a pointer to store it, the pointer can either be nil or point to the data that is available.

<pre><code>type Substate struct {
	Foobar int
}

type State struct {
	Something *Substate
}
</code></pre>

Now the problems start right away: I can no longer pass a State by value to a channel unless I manually make a deep copy, or I risk breaking thread safety. Go doesn't have a built-in way to do deep copies, so I either need to write them myself (which gets complex as the structures evolve to contain more nested structures) or resort to dirty tricks like serialising the structure into an intermediate format (for example JSON).

The only other trick I have seen is to store the optional structure by value, with a boolean next to it indicating whether the structure is present:

<pre><code>type State struct {
	IsSomethingPresent bool
	Something          Substate
}
</code></pre>

Pretty? I don't think so, and it doesn't help if I need to store maps or slices. How have other programmers solved these kinds of problems?

<hr/>**Comments:**

egonelbre: <pre>What is the actual context where you are having this problem? There might be other solutions to the problem, e.g. using mutexes/goroutines in some way.</pre>

jerf: <pre>"It depends." First, you are clearly aware of this on at least some level, but it's worth bringing it out explicitly: Here's where the dragons live.
Fine-grained shared-state concurrency is how threading blows up in your face. Unlike some other languages like Erlang, Go won't actually stop you at the language level from using it; it "only" provides some tools to help with avoiding it.

The most "Go-like" answer is to figure out how to break the State up into pieces, give the pieces to the goroutines that need them, and do any necessary cross-communication between the pieces with simple messages. Messages should either be value-only or completely transfer ownership of the full message to the target goroutine. Bear in mind that as long as you fully transfer ownership, you can even move bits of what is now the *State around between goroutines. (Just don't make the mistake of handing something off, then keeping a reference in the original goroutine.) Big monolithic *State is a code smell.

That said, while that's not necessarily all that hard once you get the hang of it, it's hard to retrofit onto an existing design, and it does take some practice to learn to pull off. I learned in Erlang, where you haven't got much choice.

Further, code smell or not, sometimes that's not practical. The next most "Go-like" answer is to wrap your entire State behind a goroutine that owns it, and communicate with that state only with messages and channels. Any bit of the *State handed back out should be a copy of the relevant data. You can wrap that behind an object that offers all the methods, and contains nothing but enough state to do the channels/locking and a reference to the state in question.
I think a lot of programmers will react with concern about performance there, because now you can only manipulate that state with effectively one CPU at a time, but if you think about it, "one CPU" for a given state like this generally turns out to be wildly more than enough compared to everything else going on, and this falls into the "wait until the profiler actually says this is a problem" class of problem. If not, well, you may have to try harder to implement my first suggestion.

The final solution, which is one I've used before but is definitely the most dangerous, is the Big Lock solution: put a big lock around the *State. This is the last choice for me because, while taking one mutex in a goroutine is generally a safe operation, taking two is when you start summoning Cthulhu on your program, and if your API is built around taking one to start with, you're halfway there... your users need to be much more careful about what they do. A server goroutine that is responsible for your state basically functions as a goroutine that takes one implicit lock, and won't take any more if written correctly, so it's much harder to write deadlocks if you correctly isolate the goroutine. Generally I confine this solution to something where my methods all look like:

<pre><code>this.Lock()
defer this.Unlock()
// no more than two or three lines of operation on my state
return result
</code></pre>

The more complicated the commented-out code gets, the more you're playing with fire. Under no circumstances, if you value your sanity, should you accept any sort of closure into such an API. That's a Dark Side sort of solution, "forever dominating your destiny" etc.

If the performance chips are really down, fine-grained locking in the old style is still available. It's still as inadvisable as ever, but, if you need it, you need it.
Nevertheless, I'd strongly recommend at least trying the State server solution, and holding off on this until you've profiled everything and optimized everything else to within an inch of its life and it's still not enough for some reason.</pre>

Garo5: <pre>Thank you for a well-written response. Luckily my system isn't that bad, and it isn't so performance-critical that I would need to sacrifice elegance and code structure for speed. But you did a good job of reminding me of the benefits and the Go ideology of communicating with messages. I've already sliced my code into different goroutines doing different parts in subsystems, but I bet that I can easily do several refactoring rounds to see how the modules could be made better and the access to shared state could be minimized. I have had my share of problems and bugs in the dark times of old (pre-Boost) C++ multithreading, so I'm not keen on repeating any of those issues. I've actually been surprised how few libraries explain how/if they are thread/goroutine safe, so usually the best bet has been to assume that they're not and work around that limitation in the code with channels.</pre>

similarnuclear: <pre>I also ran into this problem and chose the serialization hack to create a Clone() method on my top-level data structures, using encoding/gob. I send Clone()'d copies of the same data structure to many goroutines from one.
<pre><code>package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

type Substate struct {
	Foobar int
}

type State struct {
	Something *Substate
}

// Clone deep-copies s by encoding it to gob and decoding it into a new State.
func (s *State) Clone() (*State, error) {
	var buf bytes.Buffer
	enc := gob.NewEncoder(&buf)
	dec := gob.NewDecoder(&buf)
	if err := enc.Encode(s); err != nil {
		return nil, err
	}
	var cp State
	if err := dec.Decode(&cp); err != nil {
		return nil, err
	}
	return &cp, nil
}

func main() {
	state := State{&Substate{200}}
	state2 := state // shallow copy: shares the same *Substate
	state3, err := state.Clone()
	if err != nil {
		panic(err)
	}
	fmt.Printf("state.Something is %p\n", state.Something)
	fmt.Printf("state2.Something is %p\n", state2.Something)
	fmt.Printf("state3.Something is %p\n", state3.Something)
}
</code></pre>

<a href="http://play.golang.org/p/aM8CZMHrQ1" rel="nofollow">http://play.golang.org/p/aM8CZMHrQ1</a>

I'd also like to know how to better share memory by communicating safely. I build all my programs in development with <code>-race</code> and have caught myself making errors (like map accesses from multiple goroutines), which is scary.</pre>

thockin: <pre>Gob gets it wrong sometimes. I think the case we hit was that it does not differentiate between a nil pointer and the zero value of that pointer's pointee. And reflection is really slow. Now we use code generation for deep-copy routines. It feels so dirty.</pre>

Garo5: <pre>What kind of code generator? I tried to google around but found nothing. And I agree, I don't like code generation at all; it just makes the entire compile phase more complex than it should be.</pre>

thockin: <pre>We wrote one. I think it is not committed yet, but will be soon.</pre>

Garo5: <pre>Yeah. I've also used the encoding/gob trick; it feels really stupid, but it does work. I haven't benchmarked how big the penalty of encoding/gob is compared to a manually written deep-copy routine without reflection, but I'd guess that it usually isn't too much.</pre>

klauspost: <pre>encoding/gob is not very performant, since it is primarily a streaming protocol.
Many other serializers are actually faster, and I would expect a direct copy to be at least an order of magnitude faster. Maybe this <a href="https://godoc.org/code.google.com/p/rog-go/exp/deepcopy" rel="nofollow">deepcopy</a> package can help you.</pre>
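
For comparison with the gob-based Clone() above, here is a minimal sketch of a hand-written deep copy that uses no reflection or serialization. Substate and State follow the question; the Tags and Counts fields are hypothetical and are included only to show how slices and maps might be handled. It is an illustration under those assumptions, not the code generator mentioned in the thread.

```go
package main

import "fmt"

// Substate and State mirror the types from the question.
type Substate struct {
	Foobar int
}

type State struct {
	Something *Substate      // optional: nil means "not present"
	Tags      []string       // hypothetical field, for illustration only
	Counts    map[string]int // hypothetical field, for illustration only
}

// Clone returns a deep copy that shares no memory with s.
// nil pointers, slices and maps stay nil in the copy.
func (s *State) Clone() *State {
	if s == nil {
		return nil
	}
	cp := &State{}
	if s.Something != nil {
		sub := *s.Something // Substate holds only values, so a plain copy suffices
		cp.Something = &sub
	}
	if s.Tags != nil {
		cp.Tags = make([]string, len(s.Tags))
		copy(cp.Tags, s.Tags)
	}
	if s.Counts != nil {
		cp.Counts = make(map[string]int, len(s.Counts))
		for k, v := range s.Counts {
			cp.Counts[k] = v
		}
	}
	return cp
}

func main() {
	orig := &State{
		Something: &Substate{Foobar: 200},
		Tags:      []string{"a"},
		Counts:    map[string]int{"x": 1},
	}
	cp := orig.Clone()

	// Mutating the copy must not affect the original.
	cp.Something.Foobar = 1
	cp.Tags[0] = "b"
	cp.Counts["x"] = 2

	fmt.Println(orig.Something.Foobar, orig.Tags[0], orig.Counts["x"]) // prints "200 a 1"
}
```

The cost of this approach is the maintenance burden raised in the question: every new pointer, slice, or map field needs a matching branch in Clone().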

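The "State server" idea from jerf's comment, where one goroutine owns the state and everyone else talks to it over channels, could look roughly like the sketch below. The names StateServer, SetFoobar, Something, and getReply are hypothetical and chosen for illustration; the thread does not prescribe an API. Note that readers receive a copy of the Substate, never the pointer held by the owning goroutine.

```go
package main

import "fmt"

type Substate struct {
	Foobar int
}

type State struct {
	Something *Substate
}

// getReply carries a copy of the optional Substate back to the caller.
type getReply struct {
	sub Substate
	ok  bool // false when State.Something is nil
}

// StateServer (hypothetical name) exposes the State only through channels.
type StateServer struct {
	set chan int
	get chan chan getReply
}

func NewStateServer() *StateServer {
	s := &StateServer{
		set: make(chan int),
		get: make(chan chan getReply),
	}
	go s.run()
	return s
}

// run is the only code that ever touches the State.
func (s *StateServer) run() {
	var state State
	for {
		select {
		case v := <-s.set:
			state.Something = &Substate{Foobar: v}
		case reply := <-s.get:
			if state.Something == nil {
				reply <- getReply{}
			} else {
				// Send a copy by value, not the internal pointer.
				reply <- getReply{sub: *state.Something, ok: true}
			}
		}
	}
}

// SetFoobar asks the owning goroutine to store a new Substate.
func (s *StateServer) SetFoobar(v int) { s.set <- v }

// Something returns a copy of the Substate and whether it is present.
func (s *StateServer) Something() (Substate, bool) {
	reply := make(chan getReply)
	s.get <- reply
	r := <-reply
	return r.sub, r.ok
}

func main() {
	srv := NewStateServer()

	_, ok := srv.Something()
	fmt.Println(ok) // prints "false"

	srv.SetFoobar(200)
	sub, ok := srv.Something()
	fmt.Println(sub.Foobar, ok) // prints "200 true"
}
```

Because every request passes through a single goroutine, callers never need a lock, and the deep-copy problem shrinks to copying only the piece of state that a given reply hands out. A real implementation would also want a way to stop the goroutine, which is omitted here.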