<p>I'm happy to announce that <a href="https://github.com/chewxy/gorgonia">Gorgonia</a> now supports CUDA operations! </p>
<h2>What is Gorgonia?</h2>
<p>Gorgonia is a graph computation library, in the vein of <a href="https://deeplearning.net/software/theano/">Theano</a> or <a href="https://tensorflow.org">TensorFlow</a>, except it's written in Go. You can use it to write deep learning applications. <a href="http://blog.chewxy.com/2016/09/19/gorgonia/">See the original announcement</a>.</p>
<p>I myself have written a number of NLP related applications using Gorgonia. </p>
<p>The basic gist of Gorgonia is that you write the expression graph (remember, neural networks are just a bunch of equations), which is then compiled and translated into executable code. For example, if your equation is <code>z = x + y</code>, you'd write it like this:</p>
<pre><code>g := NewGraph()
x := NewScalar(g, Float64, WithName("x"))
y := NewScalar(g, Float64, WithName("y"))
z := Must(Add(x, y))
</code></pre>
<p>The graph is then compiled into a series of instructions, which basically say: "okay, at run time, take the value in node <code>x</code> and the value in node <code>y</code>, perform the required operation (in this case "Add"), and put the result in node <code>z</code>."</p>
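<p>To make that concrete, here's a minimal sketch of actually executing the graph above with one of Gorgonia's VMs. It assumes the <code>NewTapeMachine</code>, <code>Let</code>, and <code>RunAll</code> API, and the import path of the repo linked above:</p>
<pre><code>package main

import (
	"fmt"

	// dot-import so the snippet reads like the one above
	. "github.com/chewxy/gorgonia"
)

func main() {
	g := NewGraph()
	x := NewScalar(g, Float64, WithName("x"))
	y := NewScalar(g, Float64, WithName("y"))
	z := Must(Add(x, y))

	// Compile the graph into instructions and get a VM to run them.
	machine := NewTapeMachine(g)

	// Bind concrete values to the input nodes before running.
	Let(x, 2.0)
	Let(y, 2.5)
	if err := machine.RunAll(); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(z.Value()) // the value computed for z: 4.5
}
</code></pre>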
<p>Now of course there are some operations that take much longer - <code>sigmoid</code>, for example. One way to speed things up is to use SIMD instructions (which you can use in Gorgonia - simply pass in the build tag <code>'avx'</code> or <code>'sse'</code>). But the most common way to do so in the deep learning world is to use CUDA. Because CUDA is quite awesome. </p>
<h2>CUDA</h2>
<p>Enter CUDA. Right now, the use of CUDA in Gorgonia is limited: we can specify that we want to perform the <code>Add</code> operation on the GPU, and it'll run on the GPU when the graph gets executed. </p>
<p>The main reason for the limitation is cgo overhead. If every <code>Op</code> used CUDA, the overhead from cgo calls would negate any benefit from CUDA. A solution that batches CUDA calls is in the pipeline, as is another method (the original method from way back when): JIT-compiling a <code>*.cu</code> file with <code>nvcc</code> and executing it just before the run.</p>
<p>But for now, Gorgonia will happily run any CUDA-enabled <code>Op</code>.</p>
<h2>The Ask</h2>
<p>I have a few asks:</p>
<ol>
<li>Play with Gorgonia. As they say, given enough eyeballs, all bugs are shallow. I'm specifically a bit concerned about the CUDA operations, as I have not figured out a good way to test the CUDA stuff.</li>
<li>I need help with CUDA, and its kernels. Write more CUDA kernels. If possible, use <code>cuDNN</code>. Contributors welcome. My goal is having a CUDA kernel for every <code>Op</code> in Gorgonia.</li>
<li>I need help with making CUDA work on the other VM as well.</li>
<li>I need help with documentation. At the past 2 Go meetups I've been to, I've been told that Gorgonia has a number of features that, without digging further, would go unnoticed. There is currently an <a href="https://github.com/chewxy/gorgonia/issues/80">open issue</a> with regards to this. Pull requests are welcome.</li>
<li>Ask me questions. This process clarifies a lot of things and can go into documentation.</li>
</ol>
<p>AMA, I guess.</p>
<p>EDIT: <a href="/u/naseerd">/u/naseerd</a> caught an error, which is fixed</p>