<p><a href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a></p>
<hr/>**Comments:**<br/><br/>xuoe: <pre><p>Have a look at <a href="https://github.com/PuerkitoBio/goquery">goquery</a>.</p></pre>PaulCapestany: <pre><p>Another +1 for goquery, but depending on your task you may want to check out <a href="https://github.com/ericchiang/pup">https://github.com/ericchiang/pup</a> as well.</p></pre>cathalgarvey: <pre><p>GoQuery is exactly the thing. :)</p></pre>BOSS_OF_THE_INTERNET: <pre><p>I also use the heck out of goQuery for acceptance testing.</p></pre>wolf0403: <pre><p>golang.org/x/net/html and/or launchpad.net/xmlpath maybe?
<a href="http://stackoverflow.com/questions/24101721/parse-broken-html-with-golang" rel="nofollow">http://stackoverflow.com/questions/24101721/parse-broken-html-with-golang</a></p></pre>Streamweaver66: <pre><p>I don't know of one I'd actually call equivalent. People are doing some great work on implementing Go for arbitrary data structures, but the truth is that Python is just more suited for a task like that than Go. Why choose, though? Set up a small Python service to parse and pass back the data, and consume it in Go if you need.</p></pre>Yojihito: <pre><p>The problem is that Python is slow if you need to tag 300.000 websites.</p></pre>Streamweaver66: <pre><p>I doubt it, not any slower than Go for text processing anyway. Python is often faster on text processing than Go in many benchmarks. It's one of Go's relative weaknesses. Python3 has fine concurrency features as well. In particular, if you're doing that many websites you probably need very good HTML correction and a solid ability to handle malformed HTML. Beautiful Soup is one of the more mature libraries when it comes to that.</p></pre>
What's the Go equivalent of Python's Beautiful Soup (an HTML scraping library)?
agolangf · 852 views