Extendable markdown parser

blov · 634 views


Hi everyone, I'm fairly new to Go and not yet very familiar with the ecosystem. Does anyone know of a markdown parser library that can be extended to include custom syntax? Thank you.

**Comments:**

**Emacs24:**

They are all shitty, although https://github.com/golang-commonmark/markdown does look relatively easy to extend. The most popular is https://github.com/russross/blackfriday, but looking at [this](https://sourcegraph.com/github.com/russross/blackfriday@6d1ef893fcb01b4f50cb6e57ed7df3e2e627b6b2/-/blob/markdown.go#L32:2$references) I doubt it is easily extendable. And don't waste your time on this piece of crap: https://github.com/a8m/mark

**habarnam:**

Thank you. Blackfriday sounds somewhat familiar, so maybe I toyed with it a while back. :)

**bear1728:**

I actually wrote one in TypeScript, based on Khan Academy's "simple-markdown" in JavaScript: https://gitlab.com/whatwhathuhhuh/bear-markdown

I bet it would be pretty easy to rewrite in Go, since it's mostly just a few regular expressions. Of course it would take more time than using one of the others mentioned here, but if you want to learn more about parsing, markdown, Go, or TypeScript, it could be an interesting project.

**habarnam:**

Thank you, but I'm already working on implementing a parser of my own, though I'm using a different approach. :) I was looking for a usable solution until I manage to get mine up to CommonMark compliance.

**jerf:**

Ah, that clarifies a lot. Are you looking to do a one-off extension of the syntax, or are you looking to offer a library that allows users to extend the syntax?

Either way, I suspect you'll find that the blackfriday library is basically the bare minimum you're going to get to support CommonMark. The standard betrays its origins: it is more complicated than a Markdown specification written to be clear from the get-go would have been, because it tries to harmonize multiple existing code bases.

If you're trying to offer the latter, you're going to find that, yeah, it's actually going to be quite the challenge. What seems to a user to be "look, I just want to add a `|_ _|` pair that means to wrap this bit of text in a span that puts a box around it" interacts with all the rest of the grammar; what does `hello |_ world *how _| are* you` mean? Either you hard-code answers to that question into the new extension, or you are going to eventually find yourself backed into an interface where the grammar-extending function users have to implement is basically

```go
type Extender interface {
	Extend(p MarkdownParseTree) MarkdownParseTree
}
```

and you just wash your hands of the consequences after that. (And I hope your Markdown nodes are themselves an interface that can be externally implemented. I'm not looking at blackfriday for this; too much work for a reddit post :) )

**habarnam:**

Thanks for the input, you make some good points.

To be completely honest, my effort is not too tied to Go, but I need it as a Go library to extend an existing product. The actual parsing I'm trying to do uses Ragel; the Go part just manipulates the resulting AST. But like you said, markdown is quite complicated, and "not regular enough", for it to work with the naive approach I'm using. I have parts of the standard working independently, but as soon as I combine them, the results are unpredictable. :)

Because the actual parser logic is written in Ragel instead of Go, the library wouldn't be extendable per se, but the state machines a Ragel document is composed of can be extended easily enough, as long as they play nicely as part of the whole, which is not yet the case. I suspect I need to break the parsing down into stages, but this would mostly invalidate the advantage of using Ragel, which is "one pass" parsing.

**jerf:**

Well, I can encourage you in the sense that, as far as I know, this problem just really stinks. You're not missing anything; it just really stinks.

Definitely steal as much of the test suite from one of the existing parsers as you can. I recognize that is easier said than done, but it might still be worth it, because it's just impossible to build a parser without a solid test suite.

**habarnam:**

> Definitely steal as much of the test suite from one of the existing parsers as you can.

Yes, I will do that; I was thinking about the same thing. I have a couple of tests, but the more the better. :)
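
For readers who want to see roughly what jerf's `Extender` idea could look like, here is a minimal sketch in Go. It is not taken from blackfriday or any other library mentioned in the thread: `Node`, `Text`, `Emph`, `Paragraph`, `Box`, and `BoxExtender` are all hypothetical names made up for illustration, with `Node` standing in for `MarkdownParseTree`. The sketch assumes the core parser has already produced a tree, and that the extension only rewrites plain text nodes directly under a paragraph.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is an interface (hypothetical) so extensions can add node types.
type Node interface {
	HTML() string
}

// renderAll concatenates the HTML of a list of child nodes.
func renderAll(children []Node) string {
	var b strings.Builder
	for _, c := range children {
		b.WriteString(c.HTML())
	}
	return b.String()
}

// Text is a leaf holding literal text (no HTML escaping in this sketch).
type Text struct{ Value string }

func (t Text) HTML() string { return t.Value }

// Emph and Paragraph stand in for nodes a core parser would produce.
type Emph struct{ Children []Node }

func (e Emph) HTML() string { return "<em>" + renderAll(e.Children) + "</em>" }

type Paragraph struct{ Children []Node }

func (p Paragraph) HTML() string { return "<p>" + renderAll(p.Children) + "</p>" }

// Box is defined by the extension, not by the core parser.
type Box struct{ Children []Node }

func (x Box) HTML() string {
	return `<span class="box">` + renderAll(x.Children) + "</span>"
}

// Extender mirrors the interface from the thread: tree in, tree out.
type Extender interface {
	Extend(tree Node) Node
}

// BoxExtender wraps "|_ ... _|" spans found inside a single Text node.
type BoxExtender struct{}

func (BoxExtender) Extend(tree Node) Node {
	p, ok := tree.(Paragraph)
	if !ok {
		return tree // this sketch only handles a top-level paragraph
	}
	var out []Node
	for _, c := range p.Children {
		if t, ok := c.(Text); ok {
			out = append(out, splitBoxes(t.Value)...)
		} else {
			out = append(out, c) // other node types pass through untouched
		}
	}
	return Paragraph{Children: out}
}

// splitBoxes turns each complete "|_ ... _|" pair in s into a Box node.
func splitBoxes(s string) []Node {
	var out []Node
	for {
		open := strings.Index(s, "|_")
		if open < 0 {
			break
		}
		end := strings.Index(s[open+2:], "_|")
		if end < 0 {
			break // unmatched opener: leave the rest as literal text
		}
		if open > 0 {
			out = append(out, Text{Value: s[:open]})
		}
		inner := strings.TrimSpace(s[open+2 : open+2+end])
		out = append(out, Box{Children: []Node{Text{Value: inner}}})
		s = s[open+2+end+2:]
	}
	if s != "" {
		out = append(out, Text{Value: s})
	}
	return out
}

func main() {
	var ext Extender = BoxExtender{}
	tree := Paragraph{Children: []Node{Text{Value: "hello |_ world _| again"}}}
	fmt.Println(ext.Extend(tree).HTML())
}
```

This sketch hard-codes one answer to jerf's question about `hello |_ world *how _| are* you`: if the core parser has already turned `*how _| are*` into an `Emph` node, the opener and closer end up in different nodes, no pair is matched, and the markers stay as literal text. A different extension could reasonably choose otherwise, which is exactly the kind of interaction with the rest of the grammar that the thread warns about.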
