<p><a href="https://github.com/jdkato/prose">prose is a Go library</a> for (English) text processing. Some of its features include:</p>
<ul>
<li>Splitting text on words, sentences, or arbitrary regexps.</li>
<li>Part-of-speech tagging and named-entity extraction.</li>
<li>Intelligently converting strings to title case.</li>
<li>Counting the number of syllables in a word.</li>
<li>Calculating readability metrics such as Flesch–Kincaid, SMOG, and Coleman–Liau.</li>
</ul>
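<p>To make the readability item concrete, here is a minimal sketch of the Flesch–Kincaid grade level, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59, using only the Go standard library. It is not prose's API: the names <code>countSyllables</code>, <code>countSentences</code>, and <code>fleschKincaidGrade</code> are hypothetical, and the vowel-group syllable heuristic and punctuation-based sentence split are rough assumptions that a real library would presumably handle far more carefully.</p>

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	sentenceEnd = regexp.MustCompile(`[.!?]+`)    // naive sentence boundary
	wordRE      = regexp.MustCompile(`[A-Za-z']+`) // naive word token
	vowelGroup  = regexp.MustCompile(`[aeiouy]+`)  // run of consecutive vowels
)

// countSyllables estimates syllables as the number of vowel groups,
// discounting a trailing silent "e" (but not a final "le", as in "table").
// This is a heuristic assumption, not a dictionary lookup.
func countSyllables(word string) int {
	w := strings.ToLower(word)
	n := len(vowelGroup.FindAllString(w, -1))
	if strings.HasSuffix(w, "e") && !strings.HasSuffix(w, "le") && n > 1 {
		n--
	}
	if n < 1 {
		n = 1
	}
	return n
}

// countSentences counts the non-empty chunks left after splitting on
// runs of '.', '!' and '?'. Abbreviations such as "e.g." will be miscounted.
func countSentences(text string) int {
	n := 0
	for _, s := range sentenceEnd.Split(text, -1) {
		if strings.TrimSpace(s) != "" {
			n++
		}
	}
	return n
}

// fleschKincaidGrade applies the standard Flesch-Kincaid grade-level formula.
// It returns 0 when the text contains no words or no sentences.
func fleschKincaidGrade(text string) float64 {
	words := wordRE.FindAllString(text, -1)
	sentences := countSentences(text)
	if len(words) == 0 || sentences == 0 {
		return 0
	}
	syllables := 0
	for _, w := range words {
		syllables += countSyllables(w)
	}
	wc := float64(len(words))
	return 0.39*(wc/float64(sentences)) + 11.8*(float64(syllables)/wc) - 15.59
}

func main() {
	text := "The cat sat. The dog ran!"
	fmt.Printf("sentences=%d grade=%.2f\n", countSentences(text), fleschKincaidGrade(text))
}
```

<p>Very short, simple text can score below zero on this scale, which is expected from the formula rather than a bug in the sketch.</p>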
<p>It's still under active development, but its core functionality is in place and fairly well tested.</p>
<p>Looking forward to hearing any feedback or general thoughts.</p>
<hr/>**Comments:**<br/><br/>tv64738: <pre><p>Nice. As someone who's not actively working on NLP tasks, two notes:</p>
<ul>
<li><p><code>summarize</code> sounds like a thing that returns the gist of a longer text, like <a href="https://www.reddit.com/user/autotldr/comments/" rel="nofollow">https://www.reddit.com/user/autotldr/comments/</a></p></li>
<li><p>it would be nice to see outputs of the examples for all of the code samples in the readme; reusing <code>go test</code> <code>Example</code>s would be worthwhile</p></li>
</ul></pre>jdkato: <pre><p>Thanks for the feedback!</p>
<ul>
<li>This is actually something I'm planning on adding to the package (my ultimate goal is readability + usage statistics, sentiment analysis, and some form of a TL;DR generator).</li>
<li>Good idea.</li>
</ul></pre>leadguit: <pre><p>Sounds interesting. Which languages does it support? Is it English-only?</p></pre>jdkato: <pre><p>Yes, essentially. The <code>PragmaticSegmenter</code> (a sentence splitter) currently supports English, Spanish, and French -- but everything else is English-only for now.</p></pre>
prose: A library for text processing that supports tokenization, part-of-speech tagging, named-entity extraction, and more.