Call for help! YACC, Go, SVG, Feedback

polaris · 2017-08-14 10:00:09 · 646 views
This resource was shared on 2017-08-14 10:00:09; the information in it may have since evolved or changed.

Hi everyone,

I started a project a while ago (on github) and I feel kinda stuck at one stage, so I decided to fire off a call for help here to move the project forward, because I think it would be helpful for more people than just me.

This project intentionally has no documentation for now, because it's in a very early development stage and I didn't want to attract attention before it's usable. So here is a short description of what it's supposed to be.

About tdoc

tdoc is supposed to generate SVG pictures from text. The building blocks for the big picture are plain SVG files (like the official icon sets from AWS, Azure or whatever). If you want to generate AWS documentation, it should be as easy as this:

sns foo -> lambda bar

This takes the picture "sns.svg", labels it foo, and draws a relation to the picture "lambda.svg", labeled bar. Fair enough. The problem is that it should also be flexible enough to support input like this:

dc datacenter1 {
    ec2 foo {
        docker my_container1
    }
}

dc datacenter2 {
    ec2 bar {
        docker my_container2
    }
}

That's why I decided to use YACC.

Current state of the project

tdoc is able to render something like this correctly:

dc datacenter1 {
    ec2 foo
    lambda bar
    sns blubb
}

The biggest issue I'm currently facing is the grammar for the multi-nested components. Rendering images nested on many layers does work, but parsing the initial input is still an issue. For example, this input renders fine:

cloud foo {
    vpc bar
    client blubb
    APIGateway baz
    CloudWatch_alarm quo
    CloudSearch blubbb
    DynamoDB_item bazz
    MachineLearning test {
        EC2_instance bar1
        EC2_instance bar2
        EC2_instance bar3
        EC2_instances bar4
        EC2_instances bar5
        EC2_instances bar6
        OpsWorks_instances bar7
        RDS_RDSDBinstance bar8
        RDS_OracleDBinstancealternate bar9
    }
}

The smallest sample that doesn't work:

cloud foo {
    vpc bar {
        EC2_Instance blubb
    }

    vpc baz {
        EC2_Instance quo
    }
}

The source of this problem is actually the design of the grammar. Since I have absolutely no experience with YACC (besides what I've learned so far while working on tdoc), I'd really like to get some help and a review of the grammar. The relations are not yet built into the grammar, since the nesting is causing me so much pain.

Try it yourself

After you've go-get'ed tdoc, you can try it yourself by putting any random SVG in a folder (like test.svg) and running: tdoc -s /home/foo/svg input.tdoc

And input.tdoc would look like: test foo

Any feedback, review, support, PR, chat - whatever would be welcome!


Comments:

sin2pifx:

You might want to take a look at graphviz.

ducky_cloud:

I've already done a lot of research in this area, and something like this just doesn't exist. The problem is mostly that tools like graphviz or plantuml are not very good when it comes to drawing custom icons. The main purpose is to draw a picture the way you'd sketch something on a whiteboard, using the official icon sets.

sin2pifx:

My hunch is that "depth" is not correct. YACC (or probably bison) executes the code blocks after the rule has been recognized. declaration is left recursive, so it might be incrementing depth too late. It usually is not a good idea to shadow the state of a parser in a global variable. The cleanest solution is to build the parse tree first and then traverse the tree to perform the actions.
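To make the "traverse the tree afterwards" part concrete, here is a minimal Go sketch. The Node and Placement types and the Walk function are made up for illustration and are not taken from tdoc. The point is only that depth and sibling index can be passed down as arguments during the walk, so no global counter has to stay in sync with the parser:

```go
package main

import "fmt"

// Node is a hypothetical parse-tree node; tdoc's real types may differ.
type Node struct {
	Component string
	ID        string
	Children  []*Node
}

// Placement records a node's nesting depth and its index among its siblings.
type Placement struct {
	ID           string
	Depth, Index int
}

// Walk visits the tree depth-first. Depth and index travel as arguments,
// so there is no global state that could go stale.
func Walk(n *Node, depth, index int, out *[]Placement) {
	*out = append(*out, Placement{n.ID, depth, index})
	for i, c := range n.Children {
		Walk(c, depth+1, i, out)
	}
}

func main() {
	// Hand-built tree for the smallest failing sample from the post.
	tree := &Node{"cloud", "foo", []*Node{
		{"vpc", "bar", []*Node{{"EC2_Instance", "blubb", nil}}},
		{"vpc", "baz", []*Node{{"EC2_Instance", "quo", nil}}},
	}}
	var ps []Placement
	Walk(tree, 0, 0, &ps)
	for _, p := range ps {
		fmt.Printf("%s depth=%d index=%d\n", p.ID, p.Depth, p.Index)
	}
}
```

Whether depth and index are enough for the layout is a separate question; the sketch only shows where such values could come from.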

ducky_cloud:

Do you have any resource where I can look it up, or a hint for an implementation? The idea of a parse tree is not new to me (I tried it), but I ran into several issues because of the limited number of non-terminals and the highly recursive grammar ...

Depth is indeed my biggest problem. There's another branch (root_rewrite) which uses x and y (the depth, and the index of the component within that depth, that a new component gets added to). And as you said, under specific circumstances this state is just not valid anymore ...

sin2pifx:

I'll try to have a look tomorrow evening.

sin2pifx:

RemindMe! 20 hours

9nut:

based on a cursory review of tdoc.y, i'm not sure why root is an array/slice.

a generic root should be an element that contains all other top level elements (e.g. list) and all other elements (and blocks) should be descended from those (i.e. abstract syntax tree). evaluating the tree in order, depth-first, should produce the complete svg string.
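a minimal Go sketch of that idea, under assumptions: the Element type and the `<g>` placeholder output are invented for illustration (tdoc embeds real icon files, which is not modeled here). a single root holds the top level elements, and an in-order, depth-first walk produces the string:

```go
package main

import (
	"fmt"
	"strings"
)

// Element is a hypothetical AST node; tdoc's real types may differ.
type Element struct {
	Kind     string     // e.g. "cloud", "vpc"
	Label    string     // e.g. "foo"
	Children []*Element // nested elements, in source order
}

// Eval walks the tree depth-first, in order, and writes one placeholder
// SVG group per element.
func Eval(e *Element, sb *strings.Builder) {
	fmt.Fprintf(sb, "<g class=%q id=%q>", e.Kind, e.Label)
	for _, c := range e.Children {
		Eval(c, sb)
	}
	sb.WriteString("</g>")
}

// Render wraps the evaluated root in an <svg> element.
func Render(root *Element) string {
	var sb strings.Builder
	sb.WriteString("<svg>")
	Eval(root, &sb)
	sb.WriteString("</svg>")
	return sb.String()
}

func main() {
	root := &Element{Kind: "root", Label: "doc", Children: []*Element{
		{Kind: "cloud", Label: "foo", Children: []*Element{
			{Kind: "vpc", Label: "bar"},
		}},
	}}
	fmt.Println(Render(root))
}
```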

it's hard to tell from the grammar what is a definition and what is an invocation. i would consider making those more clear. for example, with something like:

bar : vpc {
    EC2_Instance %1
}
baz : vpc {
    EC2_Instance %1
}
foo : cloud {
    bar %1  '->'  baz %2
}
$main : {
    foo "blubb" "quo"
}

assuming i recorded every definition in the registry, registry["foo"] will contain the root element for definition of foo (i.e. a cloud), etc. assuming i have a "runner" that takes a definition and a lookup table (i.e. registry), i should be able to do runner.Eval(registry.Get("$main"), registry) to evaluate the top level program.
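a rough Go sketch of such a registry and runner, with several assumptions: the Call, Def, Registry and Runner types are invented for illustration, the '->' relation from the example is left out (the two invocations are listed separately), and the output is just indented text instead of svg. "%N" is taken to mean the N-th argument of the enclosing invocation:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Call is one line in a definition body: a name plus its arguments.
type Call struct {
	Name string
	Args []string
}

// Def is a hypothetical definition such as `bar : vpc { EC2_Instance %1 }`.
type Def struct {
	Kind string // component the definition is based on, e.g. "vpc"
	Body []Call
}

// Registry maps definition names to definitions.
type Registry map[string]*Def

// Get returns the definition for name, or nil if there is none.
func (r Registry) Get(name string) *Def { return r[name] }

// Runner evaluates definitions and collects indented output lines.
type Runner struct{ Out []string }

// Eval evaluates def as the top level program.
func (rn *Runner) Eval(def *Def, reg Registry) { rn.eval(def, reg, nil, 0) }

// eval expands one definition body. Calls to registered names recurse;
// anything else is treated as a primitive component.
// Note: cyclic definitions are not detected in this sketch.
func (rn *Runner) eval(def *Def, reg Registry, args []string, depth int) {
	indent := strings.Repeat("  ", depth)
	for _, c := range def.Body {
		resolved := make([]string, len(c.Args))
		for i, a := range c.Args {
			resolved[i] = a
			if strings.HasPrefix(a, "%") {
				if n, err := strconv.Atoi(a[1:]); err == nil && n >= 1 && n <= len(args) {
					resolved[i] = args[n-1] // substitute the N-th outer argument
				}
			}
		}
		if sub := reg.Get(c.Name); sub != nil {
			rn.Out = append(rn.Out, indent+sub.Kind+" "+c.Name)
			rn.eval(sub, reg, resolved, depth+1)
			continue
		}
		line := strings.Join(append([]string{c.Name}, resolved...), " ")
		rn.Out = append(rn.Out, indent+line)
	}
}

func main() {
	ec2 := []Call{{Name: "EC2_Instance", Args: []string{"%1"}}}
	reg := Registry{
		"bar": {Kind: "vpc", Body: ec2},
		"baz": {Kind: "vpc", Body: ec2},
		"foo": {Kind: "cloud", Body: []Call{
			{Name: "bar", Args: []string{"%1"}},
			{Name: "baz", Args: []string{"%2"}},
		}},
		"$main": {Body: []Call{{Name: "foo", Args: []string{"blubb", "quo"}}}},
	}
	rn := &Runner{}
	rn.Eval(reg.Get("$main"), reg)
	fmt.Println(strings.Join(rn.Out, "\n"))
}
```

with these definitions the runner expands $main into the same nesting as the smallest failing sample: cloud foo containing vpc bar and vpc baz, each with one EC2_Instance.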

here's a good reference for a C version of something similar, the UNIX and Plan 9 pic troff preprocessor grammar.

ducky_cloud:

Root only contains references to other components. It's just a helper that tells me which component I have to append the next component to.

vpc foo {
    EC2_Instance bar
}

The component foo is used to add another component, bar, which creates a tree in this case. The root_rewrite branch follows the same idea, but multi-dimensional. In the end I have the program component, which is itself one component and is the first node of the abstract syntax tree.

Thanks for the reference and your thoughts - will have a look!

sin2pifx:

I can't get it to compile. When I run make, it says: go tool: no such tool "yacc", so I've got no idea what to do.

Instead, I'll summarize what you might want to do. This is your basic grammar (I noticed missing semicolons BTW):

program: statement_list;
statement_list: statement | statement_list statement;
statement: declaration | relation_assignment;
relation_assignment:
    TEXT RELATION TEXT |
    TEXT RELATION declaration |
    declaration RELATION TEXT |
    relation_assignment RELATION declaration |
    declaration RELATION declaration;
declaration:
    COMPONENT IDENTIFIER |
    COMPONENT IDENTIFIER ALIAS TEXT |
    declaration SCOPEIN |
    SCOPEOUT;

There does not seem to be a relation between { (SCOPEIN) and } (SCOPEOUT). That's weird. Normally, when you parse, that's the most important anchor. I think your grammar would accept }}}}}.

From what I get from your examples, you actually want something like this (but note that I don't know what TEXT, RELATION, COMPONENT and IDENTIFIER mean):

document: declaration_list { $$ = $1; };
declaration_list: /* empty */ { $$ = nil; } |
    declaration declaration_list { $$ = concatenate($1, $2); };
declaration:
    COMPONENT IDENTIFIER { $$ = MakeTerminalNode($1, $2); } |
    COMPONENT IDENTIFIER SCOPEIN declaration_list SCOPEOUT { $$ = MakeEmbeddingNode($1, $2, $4); };

That should give you your tree, which you can then process.
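The helpers named in the actions could look roughly like this in Go. This is only a sketch: the Node fields and the Dump function are assumptions, and in a goyacc grammar the %union would also need fields for a node and for a list of nodes. The children passed to MakeEmbeddingNode have already been built by the time the enclosing rule is reduced, so the action never has to look up which component to append to:

```go
package main

import "fmt"

// Node is a hypothetical tree node; the field names are assumptions.
type Node struct {
	Component string
	ID        string
	Children  []*Node
}

// MakeTerminalNode builds a leaf for `COMPONENT IDENTIFIER`.
func MakeTerminalNode(component, id string) *Node {
	return &Node{Component: component, ID: id}
}

// MakeEmbeddingNode builds a container for
// `COMPONENT IDENTIFIER SCOPEIN declaration_list SCOPEOUT`.
func MakeEmbeddingNode(component, id string, children []*Node) *Node {
	return &Node{Component: component, ID: id, Children: children}
}

// concatenate prepends one declaration to the rest of the list.
func concatenate(first *Node, rest []*Node) []*Node {
	return append([]*Node{first}, rest...)
}

// Dump prints the tree with one indentation level per nesting level.
func Dump(n *Node, indent string) {
	fmt.Println(indent + n.Component + " " + n.ID)
	for _, c := range n.Children {
		Dump(c, indent+"    ")
	}
}

func main() {
	// These calls roughly mimic how the parser would reduce the smallest
	// failing sample: inner declarations are finished before outer ones.
	bar := MakeEmbeddingNode("vpc", "bar",
		concatenate(MakeTerminalNode("EC2_Instance", "blubb"), nil))
	baz := MakeEmbeddingNode("vpc", "baz",
		concatenate(MakeTerminalNode("EC2_Instance", "quo"), nil))
	foo := MakeEmbeddingNode("cloud", "foo",
		concatenate(bar, concatenate(baz, nil)))
	Dump(foo, "")
}
```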

ducky_cloud:

Thanks, I'm gonna change the Makefile. Go tool yacc has been removed, so goyacc has to be used in your version of go: https://godoc.org/golang.org/x/tools/cmd/goyacc

(Just go get golang.org/x/tools/cmd/goyacc and change go tool yacc to goyacc - my local version is a bit messed up compared to github due to all my experiments ...)

You can use sth like

EC2_Instance "This is an Instance" as ec2

In this case, EC2_Instance is the COMPONENT, "This is an Instance" is the TEXT and ec2 is the IDENTIFIER. A RELATION is what you get when you write:

EC2_Instance instance1 -> EC2_Instance instance2

Let's see if I got it right.

concatenate will return a new Node with $1 and $2

MakeTerminalNode just creates a new component

MakeEmbeddingNode creates a new component and adds it as a child - to what exactly? How do I know which component I should add the new Node to?

sin2pifx:

The "EmbeddingNode" would be a container for other components or "embedding nodes". Suppose you make the struct as simple as this:

type Node struct {
    component string
    relation  string
    children  []*Node
}

you would have all the information from the file. The parser then builds the tree bottom-up. When it recognizes a simple component, it returns a struct where children is nil; when it recognizes COMPONENT IDENTIFIER { ... } it returns a struct with a list of Nodes in children. Wouldn't that take care of the basic nesting problem?

I have to say I don't quite understand what you want to achieve with relation and text, but you can either shoehorn them in the same struct, or have an interface "ParseTreeNode" or something like that and create a list of those as children.
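As a sketch of the interface variant, assuming made-up type and method names (ParseTreeNode is only the name floated above, and Describe is invented for the demo): components and relations become separate types behind one interface, so a container can hold both in its children list. The text handling follows the EC2_Instance "This is an Instance" as ec2 example from earlier in the thread:

```go
package main

import "fmt"

// ParseTreeNode is the hypothetical common interface for everything
// that can appear inside a container.
type ParseTreeNode interface {
	Describe() string
}

// Component is a declaration such as `vpc bar { ... }`.
// Text is an optional display label.
type Component struct {
	Kind, ID, Text string
	Children       []ParseTreeNode
}

// Relation is a statement such as `instance1 -> instance2`.
type Relation struct {
	From, To string
}

// Describe summarizes the component, preferring Text over ID as the label.
func (c *Component) Describe() string {
	label := c.ID
	if c.Text != "" {
		label = c.Text
	}
	return fmt.Sprintf("%s %q (%d children)", c.Kind, label, len(c.Children))
}

// Describe renders the relation in source form.
func (r *Relation) Describe() string { return r.From + " -> " + r.To }

func main() {
	vpc := &Component{Kind: "vpc", ID: "bar", Children: []ParseTreeNode{
		&Component{Kind: "EC2_Instance", ID: "ec2", Text: "This is an Instance"},
		&Component{Kind: "EC2_Instance", ID: "instance2"},
		&Relation{From: "ec2", To: "instance2"},
	}}
	fmt.Println(vpc.Describe())
	for _, child := range vpc.Children {
		fmt.Println("  " + child.Describe())
	}
}
```

The trade-off compared to the single struct is a type switch or extra methods wherever the renderer needs to tell components and relations apart.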

ducky_cloud:

Relation and text are just implementation details. If you want to display a different text than the ID, you can use text. RELATION draws the lines between two related COMPONENTs and changes the placement.

I don't see for now how the recursion works, but I'll investigate some time tomorrow and see if I can make it work. If it's really that "simple", I'd bite my ass xD That would indeed solve my nesting problem. Thanks!!!

ducky_cloud:

I just pushed a change. If you want to compile and test, just use

make yacc
go run main.go foo.tdoc
