scipipe: a workflow system for Go (Golang)

Shared by polaris • 4772 views


A workflow system for Go (Golang), inspired by Flow-based Programming.

# SciPipe

SciPipe is an experimental library for writing [scientific workflows](https://en.wikipedia.org/wiki/Scientific_workflow_system) in vanilla [Go(lang)](http://golang.org/). The architecture of SciPipe is based on a [flow-based programming](https://en.wikipedia.org/wiki/Flow-based_programming)-like pattern in pure Go, as presented in [this](http://blog.gopheracademy.com/composable-pipelines-pattern) and [this](https://blog.gopheracademy.com/advent-2015/composable-pipelines-improvements/) Gopher Academy blog post.

UPDATE June 23, 2016: See also [slides from a recent presentation of SciPipe for use in a Bioinformatics setting](http://www.slideshare.net/SamuelLampa/scipipe-a-lightweight-workflow-library-inspired-by-flowbased-programming).

### An example workflow

Before going into details, let's look at a toy example workflow, to get a feel for what writing workflows with SciPipe looks like:

<pre>
package main

import (
	sp "github.com/scipipe/scipipe"
)

func main() {
	// Initialize processes
	foo := sp.NewFromShell("foowriter", "echo 'foo' > {o:foo}")
	f2b := sp.NewFromShell("foo2bar", "sed 's/foo/bar/g' {i:foo} > {o:bar}")
	snk := sp.NewSink() // Will just receive file targets, doing nothing

	// Add output file path formatters for the components created above
	foo.SetPathStatic("foo", "foo.txt")
	f2b.SetPathExtend("foo", "bar", ".bar")

	// Connect network
	f2b.In["foo"].Connect(foo.Out["foo"])
	snk.Connect(f2b.Out["bar"])

	// Add to a pipeline runner and run
	pl := sp.NewPipelineRunner()
	pl.AddProcesses(foo, f2b, snk)
	pl.Run()
}
</pre>

... and to see how we would run this code, let's assume we put it in a file `myfirstworkflow.go` and run it.
Then it can look like this:

<pre>
[samuel test]$ go run myfirstworkflow.go
AUDIT 2016/06/09 17:17:41 Task:foowriter Executing command: echo 'foo' > foo.txt.tmp
AUDIT 2016/06/09 17:17:41 Task:foo2bar Executing command: sed 's/foo/bar/g' foo.txt > foo.txt.bar.tmp
</pre>

As you can see, it displays all the shell commands it has executed based on the defined workflow.

### Benefits

Some benefits of SciPipe that are not always available in other scientific workflow systems:

* Easy-to-grasp behaviour: Data flowing through a network.
* Parallel: Apart from the inherent pipeline parallelism, SciPipe processes also spawn multiple parallel tasks when the same process has multiple inputs.
* Concurrent: Each process runs in its own light-weight thread, and is not blocked by operations in other processes, except when waiting for inputs from upstream processes.
* Inherently simple: Uses Go's concurrency primitives (goroutines and channels) to create an "implicit" scheduler, which means very little additional infrastructure code. This makes the code easy to modify and extend.
* Resource efficient: You can choose to stream selected outputs via Unix FIFO files, to avoid temporary storage.
* Flexible: Processes that wrap command-line programs and scripts can be combined with processes coded directly in Go.
* Custom file naming: SciPipe gives you full control over how file names are produced, making it easy to understand and find your way among the output files of your computations.
* Highly debuggable(!): Since everything in SciPipe is plain Go(lang), you can easily use the [gdb debugger](http://golang.org/doc/gdb) (preferably with the [cgdb interface](https://www.youtube.com/watch?v=OKLR6rrsBmI) for easier use) to step through your program at any level of detail, as well as all the other excellent debugging tooling for Go (see e.g. [delve](https://github.com/derekparker/delve) and [godebug](https://github.com/mailgun/godebug)), or just use `println()` statements anywhere in your code. In addition, you can turn on very detailed debug output from SciPipe's execution itself by enabling debug-level logging with `scipipe.InitLogDebug()` in your `main()` function.
* Efficient: Workflows are compiled into static code that runs fast.
* Portable: Workflows can be distributed as Go code to be run with the `go run` command, or compiled into stand-alone binaries for basically any Unix-like operating system.

### Known limitations

* Comprehensive audit log generation is not yet available. It is currently being worked on.
* There is not yet support for the [Common Workflow Language](http://common-workflow-language.github.io/), but that is also something that we plan to support in the future.

### Connection to flow-based programming

From flow-based programming, SciPipe uses the ideas of a separate network (workflow dependency graph) definition, named in- and out-ports, sub-networks/sub-workflows and bounded buffers (already available in Go's channels) to make writing workflows as easy as possible.

In addition to that, it adds convenience factory methods such as `scipipe.NewFromShell()`, which creates ad hoc processes on the fly based on a shell command pattern. Inputs, outputs and parameters are defined in-line in the shell command, with the syntax `{i:INPORT_NAME}` for in-ports, `{o:OUTPORT_NAME}` for out-ports and `{p:PARAM_NAME}` for parameters.
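To make the placeholder syntax concrete, here is a minimal sketch in plain Go of how a command pattern with `{i:...}`, `{o:...}` and `{p:...}` placeholders could be expanded into a shell command. This is not SciPipe's actual implementation: the helper `expandPattern` and its map arguments are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholderRe matches {i:NAME}, {o:NAME} and {p:NAME} placeholders.
var placeholderRe = regexp.MustCompile(`\{(i|o|p):([^{}:]+)\}`)

// expandPattern is a hypothetical helper (not part of SciPipe). It replaces
// each placeholder with the value registered for that port or parameter name
// and returns an error if any placeholder has no value.
func expandPattern(pattern string, inPaths, outPaths, params map[string]string) (string, error) {
	var firstErr error
	cmd := placeholderRe.ReplaceAllStringFunc(pattern, func(m string) string {
		parts := placeholderRe.FindStringSubmatch(m)
		kind, name := parts[1], parts[2]

		// Pick the lookup table that corresponds to the placeholder kind.
		var table map[string]string
		switch kind {
		case "i":
			table = inPaths
		case "o":
			table = outPaths
		default:
			table = params
		}

		val, ok := table[name]
		if !ok {
			if firstErr == nil {
				firstErr = fmt.Errorf("no value for placeholder %s", m)
			}
			return m // leave the unresolved placeholder untouched
		}
		return val
	})
	return cmd, firstErr
}

func main() {
	cmd, err := expandPattern(
		"sed 's/foo/{p:replacement}/g' {i:foo} > {o:bar}",
		map[string]string{"foo": "foo.txt"},
		map[string]string{"bar": "foo.txt.bar"},
		map[string]string{"replacement": "bar"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(cmd)
}
```

Keeping the three placeholder kinds in separate maps mirrors the distinction between in-ports, out-ports and parameters, so a name such as `foo` can be used for both an in-port and an out-port without ambiguity.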
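The "implicit scheduler" idea from the Benefits list, where each process runs in its own goroutine and bounded channels act as the connections between ports, can also be sketched without SciPipe. The function `runPipeline` and the three toy processes below are hypothetical and only illustrate the general pattern, not SciPipe's internals.

```go
package main

import (
	"fmt"
	"strings"
)

// runPipeline wires a source, a transform and a sink together with bounded
// (buffered) channels. The source and transform each run in their own
// goroutine; the Go runtime schedules them, and a process only blocks when
// it waits for input or when its output buffer is full.
func runPipeline(inputs []string, bufSize int) []string {
	words := make(chan string, bufSize)
	upper := make(chan string, bufSize)

	// Source process: emits the inputs, then closes its out-port.
	go func() {
		defer close(words)
		for _, w := range inputs {
			words <- w
		}
	}()

	// Transform process: reads from its in-port, writes to its out-port.
	go func() {
		defer close(upper)
		for w := range words {
			upper <- strings.ToUpper(w)
		}
	}()

	// Sink process: runs in the calling goroutine and collects the results.
	var results []string
	for w := range upper {
		results = append(results, w)
	}
	return results
}

func main() {
	fmt.Println(runPipeline([]string{"foo", "bar"}, 16))
}
```

No explicit scheduler appears anywhere in this sketch: closing a channel signals downstream processes that no more data will arrive, and the channel buffer size bounds how far an upstream process can run ahead.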

License: MIT
Language: Go
Operating system: Cross-platform
