Sentences v1.0.1: A multilingual command line sentence tokenizer

polaris · 456 views
<p><a href="https://github.com/neurosnap/sentences">Sentences</a> is a multilingual command line sentence tokenizer. This Go package converts a blob of text into a list of sentences. The ultimate goal is to become one of the fastest and most accurate sentence tokenizers, with an emphasis on making it easy to extend to fit developers&#39; needs.</p>
<p><strong>Any feedback is greatly appreciated.</strong></p>
<p>This project started out as a straight port of NLTK&#39;s Punkt sentence tokenizer, but since the original port I have made substantial changes to how the tokenizer works and to its performance. I&#39;d be happy to discuss the struggles of porting a Python package to Go with anyone interested.</p>
<ul> <li><a href="http://sentences.erock.io/">Demo</a></li> <li><a href="https://godoc.org/gopkg.in/neurosnap/sentences.v1">Docs</a></li> <li><a href="https://github.com/neurosnap/sentences">https://github.com/neurosnap/sentences</a></li> </ul>
<hr/><strong>Comments:</strong><br/><br/>john10x: <pre><p>Great stuff, I sometimes play around with NLTK. Nice to have this in Go.</p></pre>
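
For readers new to the problem, here is a rough illustration of what "converting a blob of text into a list of sentences" involves. This is a naive sketch, not the algorithm or API of the Sentences package: `splitSentences` is a hypothetical helper that simply cuts after `.`, `!` or `?` when whitespace or the end of the text follows.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitSentences is a hypothetical, naive splitter used only for illustration.
// It ends a sentence at '.', '!' or '?' when the next rune is whitespace or
// the text ends. It is NOT the algorithm used by the Sentences package.
func splitSentences(text string) []string {
	var out []string
	runes := []rune(text)
	start := 0
	for i, r := range runes {
		if r != '.' && r != '!' && r != '?' {
			continue
		}
		atEnd := i == len(runes)-1
		if atEnd || unicode.IsSpace(runes[i+1]) {
			if s := strings.TrimSpace(string(runes[start : i+1])); s != "" {
				out = append(out, s)
			}
			start = i + 1
		}
	}
	// Keep any trailing text that has no terminal punctuation.
	if s := strings.TrimSpace(string(runes[start:])); s != "" {
		out = append(out, s)
	}
	return out
}

func main() {
	text := "Go is fun. Is it fast? Yes! Dr. Smith agrees."
	for i, s := range splitSentences(text) {
		fmt.Printf("%d: %s\n", i+1, s)
	}
}
```

Note that this naive rule splits "Dr. Smith agrees." into two pieces. Abbreviations like this are the kind of case a Punkt-style tokenizer is meant to handle, which is why a dedicated package is useful.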
