<p>What libraries or frameworks do you know of for these tasks?</p>
<p>For crawling I plan to use a Selenium cluster. But I haven't found a convenient way to parse HTML.
In Python I use BeautifulSoup, and I'm looking for something similar for Go.</p>
<p>What do you think about this?</p>
<hr/>**Comments:**<br/><br/>ishanjain28: <pre><p>I use GoQuery for parsing HTML in pretty much all of my scraping jobs, and it has worked really well for me. </p></pre>monoxiphoid: <pre><p>+1 for GoQuery -- the API should be very familiar to you if you've ever done any work with jQuery on the frontend.</p></pre>dgryski: <pre><p><a href="http://go-colly.org/" rel="nofollow">http://go-colly.org/</a>
or
<a href="https://github.com/PuerkitoBio/fetchbot" rel="nofollow">https://github.com/PuerkitoBio/fetchbot</a></p></pre>NilhEx: <pre><p>Thanks. Do you know of any alternatives to Selenium WebDriver? I would like to use browser control or emulation (PhantomJS) to better disguise the spider and its clicks.</p></pre>tmlbl: <pre><p>I have used <a href="https://github.com/yhat/scrape" rel="nofollow">https://github.com/yhat/scrape</a> for several scraping projects. But it’s only good for HTML parsing; I'm not aware of anything headless-browser-wise in Go. I always go back to Node.js for that. </p></pre>Ploobers: <pre><p>This is super well written and works better than Puppeteer #nomorenode :)</p>
<p><a href="https://github.com/chromedp/chromedp" rel="nofollow">https://github.com/chromedp/chromedp</a></p></pre>tural-esger: <pre><p><a href="http://surf.readthedocs.io/" rel="nofollow">http://surf.readthedocs.io/</a> Is this what you are looking for?</p></pre>NilhEx: <pre><p>Very likely. I will need to study it.</p></pre>