Golang for crawling and parsing

xuanbao · · 435 views

This is a shared resource; the information in it may have since evolved or changed.

What libraries or frameworks do you know for these tasks? For crawling I plan to use a Selenium cluster, but I have not been able to find a convenient way to parse HTML. In Python I use BeautifulSoup, and I am looking for something similar for Go. What do you think?

**Comments:**

ishanjain28: I use GoQuery for parsing HTML in pretty much all of my scraping jobs, and it has worked really well for me.

monoxiphoid: +1 for GoQuery; the API should be very familiar to you if you've ever done any work with jQuery on the frontend.

dgryski: http://go-colly.org/ or https://github.com/PuerkitoBio/fetchbot

NilhEx: Thanks. Maybe you know of some alternatives to Selenium WebDriver? I would like to use browser control or emulation (PhantomJS) to better disguise the spider and its clicks.

tmlbl: I have used https://github.com/yhat/scrape for several scraping projects. But it's only good for HTML parsing; I'm not aware of anything headless-browser-wise in Go. I always go back to Node.js for those.

Ploobers: This is super well written and works better than Puppeteer #nomorenode :) https://github.com/chromedp/chromedp

tural-esger: http://surf.readthedocs.io/ Is this something you're looking for?

NilhEx: Very likely. I'll need to study it.
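For readers who want to see what "parsing HTML in Go" looks like before pulling in a dependency: the libraries recommended in the comments (GoQuery, colly, scrape) are third-party packages, so the sketch below swaps in the standard library's `encoding/xml` decoder in its documented non-strict HTML mode. It is not GoQuery's API. The function name `extractLinks` and the sample page are hypothetical, made up for illustration, and real-world HTML is often messier than this tokenizer tolerates, so a dedicated HTML parser is probably the sturdier choice for production scraping.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// extractLinks (hypothetical helper) returns the href value of every <a>
// element found in htmlText, in document order.
func extractLinks(htmlText string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(htmlText))
	// Settings the encoding/xml documentation suggests for HTML-like input.
	dec.Strict = false
	dec.AutoClose = xml.HTMLAutoClose
	dec.Entity = xml.HTMLEntity

	var links []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return links, nil
		}
		if err != nil {
			return links, err
		}
		// Only opening <a> tags are of interest here.
		start, ok := tok.(xml.StartElement)
		if !ok || !strings.EqualFold(start.Name.Local, "a") {
			continue
		}
		for _, attr := range start.Attr {
			if strings.EqualFold(attr.Name.Local, "href") {
				links = append(links, attr.Value)
			}
		}
	}
}

// samplePage is an assumed, well-formed snippet used only for this demo.
const samplePage = `<html><body>
<p>Libraries:</p>
<a href="http://go-colly.org/">colly</a>
<a href="https://github.com/chromedp/chromedp">chromedp</a>
</body></html>`

func main() {
	links, err := extractLinks(samplePage)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, link := range links {
		fmt.Println(link)
	}
}
```

A jQuery-style library such as GoQuery would express the same extraction as a CSS selector query rather than a hand-written token loop, which is the main convenience the commenters are pointing at.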
