What's the Go equivalent of Python's Beautiful Soup (an HTML scraping library)?

agolangf · · 455 views
<p><a href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a></p> <hr/>**Comments:**<br/><br/>
xuoe: <pre><p>Have a look at <a href="https://github.com/PuerkitoBio/goquery">goquery</a>.</p></pre>
PaulCapestany: <pre><p>Another +1 for goquery, but depending on your task you may want to check out <a href="https://github.com/ericchiang/pup">https://github.com/ericchiang/pup</a> as well.</p></pre>
cathalgarvey: <pre><p>goquery is exactly the thing. :)</p></pre>
BOSS_OF_THE_INTERNET: <pre><p>I also use the heck out of goquery for acceptance testing.</p></pre>
wolf0403: <pre><p>Maybe golang.org/x/net/html and/or launchpad.net/xmlpath? <a href="http://stackoverflow.com/questions/24101721/parse-broken-html-with-golang" rel="nofollow">http://stackoverflow.com/questions/24101721/parse-broken-html-with-golang</a></p></pre>
Streamweaver66: <pre><p>I don&#39;t know of one I&#39;d actually call equivalent. People are doing some great work on implementing Go for arbitrary data structures, but the truth is that Python is just better suited to a task like that than Go. Why choose, though? Set up a small Python service to parse the pages and pass back the data, then consume it in Go if you need to.</p></pre>
Yojihito: <pre><p>The problem is that Python is slow if you need to tag 300,000 websites.</p></pre>
Streamweaver66: <pre><p>I doubt it; it&#39;s not any slower than Go for text processing anyway. Python is often faster at text processing than Go in many benchmarks. It&#39;s one of Go&#39;s relative weaknesses. Python 3 has fine concurrency features as well. In particular, if you&#39;re processing that many websites, you probably need very good HTML correction and a solid ability to handle malformed HTML. Beautiful Soup is one of the more mature libraries when it comes to that.</p></pre>