Optimization Strategies: Slow Gob Decoding on interface{} types

polaris · 602 views
<p>Hi,</p>
<p>I&#39;m having a problem with gob encoding. I&#39;m fairly sure the solution is obvious, but it hasn&#39;t hit me yet, hence this public forum question.</p>
<p>Some background: I&#39;ve written a machine learning algorithm that uses maps as a sparse weight matrix (for a more interesting look at that, I <a href="https://speakerdeck.com/chewxy/deep-learning-in-go-or-shennanigans-with-matrices">gave a talk on it</a> once upon a time).</p>
<p>The weight matrix looks something like this:</p>
<pre><code>type something struct {
    weights map[feature]*[MAXCLASS]float64
    ... // there is clearly more stuff
}
</code></pre>
<p>where <code>MAXCLASS</code> is a predefined constant somewhere. The problem is the type <code>feature</code>.</p>
<p>When I started the project, <code>feature</code> was <code>type feature string</code>. As the project&#39;s requirements grew, the feature type changed as well. It is now defined as follows:</p>
<pre><code>type feature interface {
    FeatType() featType
    OracleIsValid() bool
    fmt.Stringer
}
</code></pre>
<p>This has led to a slowdown in gob encoding AND decoding of the weights. So far there are three types that satisfy the interface (<code>singleFeature</code>, <code>strStrTupleFeature</code>, <code>strIntTupleFeature</code>), each a struct tracking a different kind of feature. It has also slowed down gob encoding and decoding of another struct (a tuple that contains both <code>class</code> and <code>feature</code>).</p>
<p>My question is: is there any way I can speed up gobbing the <code>something</code> struct? What is the best strategy to make this faster? I have a 190MB+ model file that takes about 1 minute to load. Is there something my colleagues and I are missing? Are we entirely blind?</p>
<hr/>**Comments:**<br/><br/>tgulacsi: <pre><p>Have you tried providing GobEncode and GobDecode methods for your interface? 
For this number of items, dropping a few reflection calls could mean significant savings.</p></pre>
chewxy: <pre><p>Do you mean something like this:</p> <pre><code>type feature interface {
    FeatType() featType
    OracleIsValid() bool
    fmt.Stringer
    gob.GobEncoder
    gob.GobDecoder
}
</code></pre> <p>I&#39;ve not tried that. Trying it now.</p></pre>
elithrar_: <pre><p>No, the poster means implementing your own methods that satisfy the gob interfaces. Those methods could use type switches to recover the concrete types, and fall back to the default gob implementation if the type isn&#39;t handled explicitly. The reflection calls for pulling things out of the interface are slow, allocate, and will likely be your bottleneck.</p> <p>The trade-off you make when using gob is speed, in exchange for hands-off compatibility with any Go type.</p></pre>
chewxy: <pre><p>I have a <code>GobEncode()</code> and <code>GobDecode()</code> method for all my types...</p></pre>
elithrar_: <pre><p>And what do those methods look like? What do they do that&#39;s different from the default implementation?</p></pre>
barsonme: <pre><p>1 minute seems a bit long. 
Have you checked the CPU and memory profiles?</p> <p>I&#39;d also look into something like FlatBuffers or one of the other serialization formats that I can&#39;t remember off the top of my head.</p></pre>
chewxy: <pre><p>Here&#39;s the top 10 -cum:</p> <pre><code>     flat  flat%   sum%        cum   cum%
        0     0%     0%     48.41s 98.49%  runtime.goexit
    0.05s   0.1%   0.1%     39.06s 79.47%  encoding/gob.(*Decoder).Decode
    0.08s  0.16%  0.26%     39.06s 79.47%  encoding/gob.(*Decoder).DecodeValue
        0     0%  0.26%     39.06s 79.47%  github.com/chewxy/goddamnproject/tiger.(*NeuralNetwork).Load
        0     0%  0.26%     39.06s 79.47%  main.main
        0     0%  0.26%     39.06s 79.47%  runtime.main
    0.11s  0.22%  0.49%     39.05s 79.45%  encoding/gob.(*Decoder).decodeGobDecoder
    0.21s  0.43%  0.92%     39.05s 79.45%  encoding/gob.(*Decoder).decodeSingle
    0.26s  0.53%  1.44%     39.05s 79.45%  encoding/gob.(*Decoder).decodeValue
    0.08s  0.16%  1.61%     39.05s 79.45%  encoding/gob.(*Decoder).gobDecodeOpFor.func1
</code></pre> <p>The (*Decoder).Decode call graph looks like this (sorry for the PNG, I kinda screwed up the conversion from SVG to upload to imgur):</p> <p><a href="http://i.imgur.com/Lip2J0y.png" rel="nofollow">web Decode</a></p> <p>Most of the complexity comes from reflection and <code>decodeTypeSequence</code>, which is in the gob package. I suspect this has mostly to do with the interface type.</p> <p>The Feature-Class tuple annotated in the image above is defined as follows:</p> <pre><code>type fctuple struct {
    feature
    Class
}
</code></pre> <p>And lastly, from internal timing (rough idea): <code>2016/05/10 14:12:51 Loading from model took 42.005589309s</code></p> <p>This is a significantly smaller model though: 109MB.</p> <p>I have not looked at FlatBuffers or other serialization protocols.</p></pre>
barsonme: <pre><p>I would look into a higher-performance library like FlatBuffers or Cap&#39;n Proto if you want a more &#34;hands off&#34; approach. 
</p> <p>Otherwise, what the other poster in here said is good advice: write your own methods that serialize the data instead of letting gob reflect everything.</p></pre>
hayzeus: <pre><p>I think I&#39;ve read that gob is counter-intuitively slow.</p></pre>
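
The advice in the comments (hand-written gob methods built around a type switch, so that gob never has to handle interface values itself) could look roughly like the sketch below. It is only an illustration under stated assumptions: the field layouts of `singleFeature` and `strStrTupleFeature`, the `wire` struct, the `roundTrip` helper, and `maxClass = 3` are all hypothetical, and `feature` is cut down to `fmt.Stringer`. The idea is to sort the map entries into one concrete slice per key type before handing them to gob, then rebuild the map on load.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

const maxClass = 3 // stand-in for the thread's MAXCLASS

// feature is cut down to fmt.Stringer; the thread's interface has more methods.
type feature interface{ fmt.Stringer }

// Hypothetical field layouts: the thread gives only the type names.
type singleFeature struct{ Name string }
type strStrTupleFeature struct{ A, B string }

func (f singleFeature) String() string      { return f.Name }
func (f strStrTupleFeature) String() string { return f.A + "/" + f.B }

type weights map[feature]*[maxClass]float64

// wire is the serialized shape: one concrete slice per key type, so gob never
// sees an interface value. Each row belongs to the key at the same index.
type wire struct {
	Single     []singleFeature
	SingleRows [][maxClass]float64
	StrStr     []strStrTupleFeature
	StrStrRows [][maxClass]float64
}

// GobEncode sorts entries into the concrete slices with a type switch.
// It assumes no row pointer in the map is nil.
func (w weights) GobEncode() ([]byte, error) {
	var out wire
	for k, row := range w {
		switch f := k.(type) {
		case singleFeature:
			out.Single = append(out.Single, f)
			out.SingleRows = append(out.SingleRows, *row)
		case strStrTupleFeature:
			out.StrStr = append(out.StrStr, f)
			out.StrStrRows = append(out.StrStrRows, *row)
		default:
			return nil, fmt.Errorf("unsupported feature type %T", k)
		}
	}
	var buf bytes.Buffer
	err := gob.NewEncoder(&buf).Encode(out)
	return buf.Bytes(), err
}

// GobDecode rebuilds the map; row pointers alias the decoded slices.
func (w *weights) GobDecode(data []byte) error {
	var in wire
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&in); err != nil {
		return err
	}
	if len(in.Single) != len(in.SingleRows) || len(in.StrStr) != len(in.StrStrRows) {
		return fmt.Errorf("key/row count mismatch")
	}
	m := make(weights, len(in.Single)+len(in.StrStr))
	for i, k := range in.Single {
		m[k] = &in.SingleRows[i]
	}
	for i, k := range in.StrStr {
		m[k] = &in.StrStrRows[i]
	}
	*w = m
	return nil
}

// roundTrip pushes w through a real gob stream so the methods above are used.
func roundTrip(w weights) (weights, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(w); err != nil {
		return nil, err
	}
	var out weights
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	in := weights{
		singleFeature{"bias"}:        &[maxClass]float64{1, 2, 3},
		strStrTupleFeature{"w", "t"}: &[maxClass]float64{0.5, 0, -1},
	}
	out, err := roundTrip(in)
	fmt.Println(len(out), err)
}
```

Whether this actually shortens the load time depends on where the time really goes, so it is worth comparing CPU profiles before and after. A third key type such as `strIntTupleFeature` would need its own pair of slices in `wire` and its own `case`.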
