i/o timeout (dial udp) on a crawler

I have been using PuerkitoBio's fetchbot to crawl some pages and receive a bunch of:

```
[ERR] HEAD http://bbc.com - Head http://www.bbc.com/: dial tcp: lookup www.bbc.com on 127.0.1.1:53: read udp 127.0.0.1:53755->127.0.1.1:53: i/o timeout
```

I opened an issue, but I'm not sure how active the project still is: https://github.com/PuerkitoBio/fetchbot/issues/23

I also noticed a closed issue, https://github.com/golang/go/issues/16865, and wanted to know whether this is being fixed, or whether someone smarter than me can enlighten me.

I've tried several different versions of Go (1.7, 1.7.1, 1.7.5 and 1.8 linux/amd64). I am running Ubuntu 16.04 (which I upgraded from 14.04; I got the same errors on both).

EDIT: The answer seems to be that my router was using Google's DNS servers. I removed that setting and now everything seems to be working fine.

---

**Comments:**

**adrian_blx:**

Your DNS (cache) seems to have issues.

**userofmostinterest:**

I have been messing around with some of the HTTP client's settings:

```go
f.HttpClient = &http.Client{
	Transport: &http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
		Dial: (&nett.Dialer{
			Timeout:  30 * time.Second,
			Resolver: &nett.CacheResolver{TTL: 10 * time.Minute},
		}).Dial,
		DisableKeepAlives: true,
	},
	Timeout: 40 * time.Second,
}
```

So I am caching DNS lookups for 10 minutes and setting some timeouts. Is there anything else I can try?

**userofmostinterest:**

The full gist can be found here: https://gist.github.com/kristen1980/9d689b6ae0ab9f8a330c4598060295e4

**userofmostinterest:**

Thanks! The DNS was the issue. It turns out my router was using Google's DNS servers. I removed that and can now crawl in peace. Thanks for pointing me in the right direction!

**Yojihito:**

What do you want to fetch from the BBC site?

**userofmostinterest:**

It isn't just the BBC site that fails. I get multiple failures.

**userofmostinterest:**

I get several thousand of these errors quickly, all for different domains and URLs. I just posted one example and didn't want to paste a repetitive-looking log.
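Since the crawler produces several thousand lookups in a short burst, the local stub resolver on 127.0.1.1 (typically dnsmasq on Ubuntu) can itself become a bottleneck regardless of the upstream server. Below is a minimal sketch of capping in-flight HEAD requests with a buffered-channel semaphore; the concurrency limit of 20, the sample URLs, and the timeouts are assumptions for illustration, not values from the thread.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

func main() {
	// Hypothetical sample URLs; in practice the crawl frontier comes from fetchbot.
	urls := []string{
		"http://www.bbc.com/",
		"http://example.com/",
	}

	client := &http.Client{Timeout: 40 * time.Second}

	// A buffered channel used as a counting semaphore: at most 20 HEAD
	// requests (and therefore roughly 20 DNS lookups) are in flight at once.
	sem := make(chan struct{}, 20)
	var wg sync.WaitGroup

	for _, u := range urls {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot before starting the request
		go func(u string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done

			resp, err := client.Head(u)
			if err != nil {
				fmt.Println("[ERR]", u, err)
				return
			}
			resp.Body.Close()
			fmt.Println("[OK]", u, resp.Status)
		}(u)
	}
	wg.Wait()
}
```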
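The linked golang/go issue concerns the standard library's resolver. As a point of reference, here is a minimal sketch of an http.Client that forces the pure-Go DNS resolver and sets explicit dial timeouts. It assumes Go 1.8 or later (the net.Dialer.Resolver field does not exist in earlier releases), the timeout values and test URL are illustrative guesses, and it will not fix a misbehaving upstream DNS server; it only takes the system (cgo) resolver out of the picture.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

func main() {
	// Client that resolves names with Go's built-in DNS resolver and uses
	// explicit dial timeouts. Requires Go 1.8+ for net.Dialer.Resolver.
	client := &http.Client{
		Transport: &http.Transport{
			DialContext: (&net.Dialer{
				Timeout:  30 * time.Second,              // covers DNS lookup + TCP connect
				Resolver: &net.Resolver{PreferGo: true}, // skip the cgo/system resolver
			}).DialContext,
		},
		Timeout: 40 * time.Second, // overall per-request deadline
	}

	resp, err := client.Head("http://www.bbc.com/")
	if err != nil {
		fmt.Println("[ERR]", err)
		return
	}
	resp.Body.Close()
	fmt.Println("[OK]", resp.Status)
}
```

A client built this way could be assigned to fetchbot's `f.HttpClient` field in the same way as the snippet quoted in the comments above.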
