Error handling with streaming data

polaris · 462 views
<p>I'm working on a personal project with Go, and I'm relatively new to it. Part of what I'm doing involves repeatedly reading data out of a Reader in 2-8 byte chunks. The Go way of handling errors (as far as I've found) is to use a pattern like this:</p>
<pre><code>content := make([]byte, length)
_, err := reader.Read(content)
if err != nil {
	log.Fatal(err)
}</code></pre>
<p>To simplify reading out the data, I tried wrapping this in a function like this:</p>
<pre><code>func read(reader io.ReadCloser, length uint16) ([]byte){}</code></pre>
<p>The goal was to be able to immediately operate on the return value, like this:</p>
<pre><code>someval := MyStruct{source: read(reader, 4), destination: read(reader, 4)}</code></pre>
<p>However, because Go treats EOF as an error, the only way to know that the reader is exhausted is to evaluate the error it returns. If I want to break execution the Go way (by passing the error up the call stack), I can't do this at all. The only approach I can see is to give up on readable code entirely and litter dozens or hundreds of blocks like this throughout my code, one for every read from the Reader:</p>
<pre><code>someval := new(MyStruct)
content, err := read(reader, 4)
if err != nil {
	return err
}
someval.source = content
content, err = read(reader, 4)
if err != nil {
	return err
}
someval.destination = content</code></pre>
<p>I feel like I'm missing something here. I come from a higher-level language background, so being used to thinking in terms of exceptions is probably clouding my judgement here. However, the alternative is so verbose as to be useless.
The sheer size of the function to read in a simple struct like this would probably introduce more bugs than Go's error handling method prevents.</p>
<p>Without any further input, I see two realistic options:</p>
<ul>
<li>Just panic/recover in this area of the code to easily halt reads when an EOF is encountered</li>
<li>Give up on passing a Reader around, load the file entirely into memory, and pass around a byte array instead</li>
</ul>
<p>I'm leaning towards the latter as Go does not seem to have the proper semantics to deal with streaming data well. Without exceptions, it just seems like touching the Reader more than absolutely necessary is going to sharply increase the lines of code in your project (and the surface area for bugs).</p>
<p>Any advice?</p>
<hr/>**Comments:**<br/><br/>
djherbis: <pre><p>What about something like <a href="https://play.golang.org/p/NpjripMNTj" rel="nofollow">this</a>?</p> <p>By creating a structure to hold the error and execute work, we can drop the many err checks and get the return values out, then handle the error in one statement after we're done.</p></pre>
Thunder_Moose: <pre><p>I think this is probably the best approach I've seen so far for error checking a streaming read. I'm still on the fence about doing a streaming approach at all though. In my particular case, it really seems easier and cleaner to make all of the functions pure by just passing around a byte array instead of a Reader.</p></pre>
djherbis: <pre><p>Out of curiosity, what are you doing with the streamed bytes? I usually really like streaming in Go (I've built a number of libraries around streaming stuff). Usually I solve these problems by building a new reader/writer which wraps the underlying one and processes the data in each stream.
Or, sometimes there already exists a "Go" way of doing it, like encoding/decoding or serializing/deserializing etc.</p> <p>If you're processing in such small chunks, you'll at least want to wrap your stream in a <a href="https://golang.org/pkg/bufio/#NewReader" rel="nofollow">bufio.Reader</a> to cut the cost of many small reads.</p></pre>
Thunder_Moose: <pre><p>My background is primarily Java, but I wanted to work on a project to get further into Go than I had before.</p> <p>I'm using Go to do some analysis of Java class files, so a good chunk of code is just reading the file into structs. I'm not fully evaluating the file, so there are chunks that just get ignored from the stream.</p> <p>Not sure what you mean by deserialization in Go. What I'm doing is very similar to deserialization in Java, but Java doesn't need language constructs for it due to generics. Is there some Go construct for this that I didn't know about?</p></pre>
djherbis: <pre><blockquote> <p>similar</p> </blockquote> <p>Go has a bunch of "encoding/decoding" libraries which essentially serialize the data to another format: <a href="https://golang.org/pkg/encoding/#pkg-subdirectories" rel="nofollow">https://golang.org/pkg/encoding/#pkg-subdirectories</a></p> <p>My day job is all Java and I do quite a bit of AST parsing in Java [code generation]. Analyzing Java files is actually probably a lot easier in Java, though I understand if you're using Go as an exercise. (If you end up wanting to do it in Java, <a href="https://docs.oracle.com/javase/7/docs/api/javax/annotation/processing/AbstractProcessor.html" rel="nofollow">AbstractProcessor</a> can let you read the Java AST at compile time.)</p> <p>It would probably be helpful to at least have something like: read from File =&gt; using bufio.Reader =&gt; using <a href="https://golang.org/pkg/bufio/#Scanner" rel="nofollow">bufio.Scanner</a> to get "tokens" out of the data.
Then you would just write an algorithm which works on "tokens" output from the scanner.</p></pre>
Thunder_Moose: <pre><p>When I first had the idea for the project, I thought about using <a href="http://jboss-javassist.github.io/javassist/" rel="nofollow">javassist</a> to read the files out. I've had a fair amount of experience with that library and I'm sure it would have been faster and easier to use it. But I wanted to give Go a try on something more than a basic tutorial, so I used Go instead.</p> <p>The problem with class file parsing is that it's entirely binary, so normal AST approaches don't work. There's no way to tokenize <a href="https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.1" rel="nofollow">data like this</a>; you simply have to evaluate it as a stream of bytes. It's been fascinating getting this to work and learning how class files are put together.</p> <p>The complexity involved in reading the byte stream into these structs is why I made this post. If you introduce a tiny error in any part of reading the file by reading too many or too few bytes, the rest of the data is going to turn into garbage and there's no easy way to debug it. I've had to break out my hex editor several times to figure out where things went wrong. Clean code really helps debugging, so the less error handling I have to litter around my code the better.</p></pre>
djherbis: <pre><p>Have you considered defining a Go struct which looks like that object and implementing the BinaryUnmarshaler interface? <a href="https://golang.org/pkg/encoding/" rel="nofollow">https://golang.org/pkg/encoding/</a></p></pre>
Thunder_Moose: <pre><p>That looks like a formalization of reading the entire file in as a byte array and then transforming it. The more I work on this, the more I think that's the typical Go way of doing something like this.</p></pre>
djherbis: <pre><p>Go has stream encoders/decoders that operate on things like this.
That being said, it's more complicated to stream decode this, and a single file is unlikely to be so large that loading it into memory is impractical.</p></pre>
mrekucci: <pre><p>I recommend reading this blog post written by Rob Pike: <a href="https://blog.golang.org/errors-are-values" rel="nofollow">https://blog.golang.org/errors-are-values</a>. There are two examples that might help you. The first is about how Scanner (which processes streams) handles errors. The second is about simplifying repetitive error handling.</p></pre>
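<p>For readers following along, here is a minimal sketch of the "hold the error in a struct" idea that comes up in this thread and in the errors-are-values post. It is not the code from the playground link. The names <code>errReader</code> and <code>parse</code> and the field types of <code>MyStruct</code> are assumptions made for illustration. It uses <code>io.ReadFull</code> rather than a bare <code>Read</code>, since <code>Read</code> is allowed to return fewer bytes than requested.</p>

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// errReader is a hypothetical helper: it remembers the first error and
// turns every later read into a no-op.
type errReader struct {
	r   io.Reader
	err error
}

// read returns the next n bytes, or nil once any read has failed.
func (er *errReader) read(n int) []byte {
	if er.err != nil {
		return nil
	}
	buf := make([]byte, n)
	// io.ReadFull reports an error if fewer than n bytes are available.
	if _, er.err = io.ReadFull(er.r, buf); er.err != nil {
		return nil
	}
	return buf
}

// MyStruct mirrors the struct in the question; the field types are assumed.
type MyStruct struct {
	source, destination []byte
}

// parse is a hypothetical entry point that reads two 4-byte fields.
func parse(data []byte) (MyStruct, error) {
	er := &errReader{r: bytes.NewReader(data)}
	v := MyStruct{source: er.read(4), destination: er.read(4)}
	return v, er.err // one check covers every read above
}

func main() {
	v, err := parse([]byte{1, 2, 3, 4, 5, 6, 7, 8})
	fmt.Println(v.source, v.destination, err)

	// Only five bytes: the second read fails and the error surfaces once.
	_, err = parse([]byte{1, 2, 3, 4, 5})
	fmt.Println(err)
}
```

<p>The trade-off is that reads after a failure still "run" and return nil, so the caller must check <code>er.err</code> before trusting any of the values.</p>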

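<p>As a sketch of the stream-decoder route mentioned above, the standard <code>encoding/binary</code> package can fill a struct of fixed-size fields with a single call and a single error check. The thread does not spell this out. The struct name <code>classHeader</code> and the helpers <code>readHeader</code> and <code>parseHeader</code> are hypothetical; the field layout (u4 magic, u2 minor_version, u2 major_version, big-endian) follows the class file structure in the JVM specification linked in the thread.</p>

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// classHeader holds the first fixed-size fields of a class file.
// Fields are exported so that binary.Read can set them.
type classHeader struct {
	Magic uint32
	Minor uint16
	Major uint16
}

// readHeader decodes the header from any io.Reader with one error check.
func readHeader(r io.Reader) (classHeader, error) {
	var h classHeader
	if err := binary.Read(r, binary.BigEndian, &h); err != nil {
		return h, err
	}
	if h.Magic != 0xCAFEBABE {
		return h, errors.New("not a class file: bad magic number")
	}
	return h, nil
}

// parseHeader is a convenience wrapper for data already in memory.
func parseHeader(data []byte) (classHeader, error) {
	return readHeader(bytes.NewReader(data))
}

func main() {
	data := []byte{0xCA, 0xFE, 0xBA, 0xBE, 0x00, 0x00, 0x00, 0x34}
	h, err := parseHeader(data)
	fmt.Printf("%#x %d.%d %v\n", h.Magic, h.Major, h.Minor, err)
}
```

<p>This only covers fixed-size prefixes. Variable-length parts such as the constant pool still need hand-written reads.</p>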