scipipe: a workflow system for Go (Golang)

polaris • 3885 views
A workflow system for Go (Golang), inspired by Flow-based Programming.

# SciPipe

SciPipe is an experimental library for writing scientific workflows in vanilla Go(lang). The architecture of SciPipe is based on a flow-based-programming-like pattern in pure Go, as presented in two Gopher Academy blog posts.

UPDATE June 23, 2016: See also the slides from a recent presentation of SciPipe for use in a bioinformatics setting.

### An example workflow

Before going into details, let's look at a toy example workflow, to get a feel for what writing workflows with SciPipe looks like:

```go
package main

import (
	sp "..."
)

func main() {
	// Initialize processes
	foo := sp.NewFromShell("foowriter", "echo 'foo' > {o:foo}")
	f2b := sp.NewFromShell("foo2bar", "sed 's/foo/bar/g' {i:foo} > {o:bar}")
	snk := sp.NewSink() // Will just receive file targets, doing nothing

	// Add output file path formatters for the components created above
	foo.SetPathStatic("foo", "foo.txt")
	f2b.SetPathExtend("foo", "bar", ".bar")

	// Connect network
	f2b.In["foo"].Connect(foo.Out["foo"])
	snk.Connect(f2b.Out["bar"])

	// Add to a pipeline runner and run
	pl := sp.NewPipelineRunner()
	pl.AddProcesses(foo, f2b, snk)
	pl.Run()
}
```

To see how we would run this code, let's assume we put it in a file `myfirstworkflow.go` and run it.
Then it can look like this:

```
[samuel test]$ go run myfirstworkflow.go
AUDIT   2016/06/09 17:17:41 Task:foowriter    Executing command: echo 'foo' > foo.txt.tmp
AUDIT   2016/06/09 17:17:41 Task:foo2bar      Executing command: sed 's/foo/bar/g' foo.txt >
```

As you can see, it displays all the shell commands it has executed based on the defined workflow.

### Benefits

Some benefits of SciPipe that are not always available in other scientific workflow systems:

* Easy-to-grasp behaviour: Data flowing through a network.
* Parallel: Apart from the inherent pipeline parallelism, SciPipe processes also spawn multiple parallel tasks when the same process has multiple inputs.
* Concurrent: Each process runs in its own lightweight thread, and is not blocked by operations in other processes, except when waiting for inputs from upstream processes.
* Inherently simple: Uses Go's concurrency primitives (goroutines and channels) to create an "implicit" scheduler, which requires very little additional infrastructure code. This keeps the code easy to modify and extend.
* Resource efficient: You can choose to stream selected outputs via Unix FIFO files, to avoid temporary storage.
* Flexible: Processes that wrap command-line programs and scripts can be combined with processes coded directly in Golang.
* Custom file naming: SciPipe gives you full control over how file names are produced, making it easy to understand and find your way among the output files of your computations.
* Highly debuggable(!): Since everything in SciPipe is plain Go(lang), you can easily use the gdb debugger (preferably with the cgdb interface for easier use) to step through your program at any level of detail, as well as all the other excellent debugging tooling for Go (see e.g. delve and godebug), or just use `println()` statements at any place in your code. In addition, you can easily turn on very detailed debug output from SciPipe's execution itself, by turning on debug-level logging with `scipipe.InitLogDebug()` in your `main()` function.
* Efficient: Workflows are compiled into statically compiled code that runs fast.
* Portable: Workflows can be distributed as Go code to be run with the `go run` command, or compiled into stand-alone binaries for basically any Unix-like operating system.

### Limitations

* There is not yet a really comprehensive audit log generation. It is currently being worked on.
* There is not yet support for the Common Workflow Language, but that is also something that we plan to support in the future.

### Connection to flow-based programming

From flow-based programming, SciPipe uses the ideas of a separate network (workflow dependency graph) definition, named in- and out-ports, sub-networks/sub-workflows, and bounded buffers (already available in Go's channels) to make writing workflows as easy as possible.

In addition, it adds convenience factory methods such as `scipipe.NewFromShell()`, which creates ad hoc processes on the fly based on a shell command pattern. Inputs, outputs and parameters are defined in-line in the shell command, with the syntax `{i:INPORT_NAME}` for in-ports, `{o:OUTPORT_NAME}` for out-ports and `{p:PARAM_NAME}` for parameters.
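The flow-based ideas described above (processes with in- and out-ports, a network wired up separately from the processes, and bounded buffers) can be sketched with nothing but goroutines and buffered channels. The sketch below does not use SciPipe and is not how SciPipe is implemented; `Source`, `Replace` and `RunNetwork` are hypothetical names, and the buffer size of 2 is an arbitrary assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Source is a hypothetical process with a single out-port. It sends every
// item downstream and closes the port when it has nothing more to send.
func Source(items []string, out chan<- string) {
	defer close(out)
	for _, it := range items {
		out <- it
	}
}

// Replace is a hypothetical process with one in-port and one out-port.
// It blocks only while waiting for upstream input or downstream capacity.
func Replace(old, repl string, in <-chan string, out chan<- string) {
	defer close(out)
	for s := range in {
		out <- strings.ReplaceAll(s, old, repl)
	}
}

// RunNetwork defines the network separately from the processes: it creates
// the connections, starts each process in its own goroutine, and acts as
// the sink by collecting everything that arrives on the last connection.
func RunNetwork(items []string) []string {
	// Buffered channels act as bounded buffers between processes.
	// The capacity of 2 is an arbitrary choice for this sketch.
	fooToReplace := make(chan string, 2)
	replaceToSink := make(chan string, 2)

	go Source(items, fooToReplace)
	go Replace("foo", "bar", fooToReplace, replaceToSink)

	var results []string
	for s := range replaceToSink {
		results = append(results, s)
	}
	return results
}

func main() {
	fmt.Println(RunNetwork([]string{"foo", "foo fighters", "baz"}))
	// prints: [bar bar fighters baz]
}
```

There is no explicit scheduler here: the Go runtime interleaves the goroutines, and the channel capacities provide back-pressure, which is the sense in which such a scheduler can be called "implicit".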
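To make the placeholder syntax more concrete, here is a minimal sketch in plain Go of how port and parameter names could be extracted from such a command pattern, and how placeholders could be replaced with concrete values. It is an illustration only, not SciPipe's implementation: `ParsePorts`, `Expand`, the `Ports` type, and the example file names are all hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholderRe matches {i:NAME}, {o:NAME} and {p:NAME} placeholders.
var placeholderRe = regexp.MustCompile(`\{(i|o|p):([^{}]+)\}`)

// Ports holds the names found in a shell command pattern.
// (Hypothetical type, introduced only for this sketch.)
type Ports struct {
	In, Out, Params []string
}

// ParsePorts collects in-port, out-port and parameter names in the order
// in which they appear in the command pattern.
func ParsePorts(cmd string) Ports {
	var p Ports
	for _, m := range placeholderRe.FindAllStringSubmatch(cmd, -1) {
		switch m[1] {
		case "i":
			p.In = append(p.In, m[2])
		case "o":
			p.Out = append(p.Out, m[2])
		case "p":
			p.Params = append(p.Params, m[2])
		}
	}
	return p
}

// Expand replaces each placeholder with the value stored under the key
// "KIND:NAME" (for example "i:foo"). Placeholders without a value are
// left untouched.
func Expand(cmd string, values map[string]string) string {
	return placeholderRe.ReplaceAllStringFunc(cmd, func(s string) string {
		m := placeholderRe.FindStringSubmatch(s)
		if v, ok := values[m[1]+":"+m[2]]; ok {
			return v
		}
		return s
	})
}

func main() {
	pattern := "sed 's/foo/bar/g' {i:foo} > {o:bar}"

	ports := ParsePorts(pattern)
	fmt.Println(ports.In, ports.Out, ports.Params)
	// prints: [foo] [bar] []

	// The file names below are assumed values for illustration.
	fmt.Println(Expand(pattern, map[string]string{
		"i:foo": "foo.txt",
		"o:bar": "foo.txt.bar",
	}))
	// prints: sed 's/foo/bar/g' foo.txt > foo.txt.bar
}
```

Keeping the port names inside the command string means the process definition and its interface live in one place, which is what lets a factory function build a process from a single pattern.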