<p>I'm trying to get my upvote information from HN, but their API doesn't support private user information, so I decided to try scraping; however, I can't seem to authenticate correctly (see my attempt here: <a href="https://gist.github.com/ns-cweber/20033179b75e3c0e301062c85b2d35e4" rel="nofollow">https://gist.github.com/ns-cweber/20033179b75e3c0e301062c85b2d35e4</a>). The login request returns 200, but passing the cookies to the /upvoted endpoint returns 39 bytes of non-HTML garbage (with http 200): sN/QH.�T(,q���. This also isn't the error message you get when auth fails--that is "Cannot display that" (try it yourself here: <a href="https://news.ycombinator.com/upvoted?id=dang&comments=t" rel="nofollow">https://news.ycombinator.com/upvoted?id=dang&comments=t</a>).</p>
<p>I've tried double and triple checking the headers against my browser and against curl, and I'm confirming that the output is the same via httputil.DumpRequest. Still, my Go client is getting different data than curl or my browser.</p>
<p>Could someone help me spot my error?</p>
<hr/>**Comments:**<br/><br/>xrstf: <pre><p>Your code works perfectly fine for me, provided I use the same username in both places (lines 69 and 114). I receive a gzip blob that decodes to <code>&lt;html op="upvoted"&gt;&lt;head&gt;&lt;meta name="referrer" content="origin"&gt;&lt;meta name=".....</code>.</p>
<p>Maybe I did not understand your issue?</p></pre>weberc2: <pre><blockquote>
<p><a href="https://gist.github.com/ns-cweber/20033179b75e3c0e301062c85b2d35e4" rel="nofollow">https://gist.github.com/ns-cweber/20033179b75e3c0e301062c85b2d35e4</a></p>
</blockquote>
<p>Maybe the "non-HTML garbage" I was receiving is simply gzipped HTML, and I was expecting the HTTP client to un-gzip it for me? I'll look into this when I get a moment. Thanks for taking the time to run it!</p></pre>xrstf: <pre><p>I've just added an <code>ioutil.WriteFile("foo.gz", data, 0644)</code> at the end of <code>main</code> and took a look at the file. Before changing the username in the <code>upvoted</code> function, I received the "Cannot display that" gzip blob; after changing it, it worked as intended.</p></pre>weberc2: <pre><p>Yeah, gzip was the problem. Thanks very much.</p></pre>Thaxll: <pre><p>You ask for a gzip response with <code>values.Set("accept-encoding", "gzip, deflate, br")</code> in the client, but the default Go http client doesn't support gzip response. Remove that header and you will be fine.</p>
<p>If you want to handle gzip response: <a href="https://github.com/NYTimes/gziphandler" rel="nofollow">https://github.com/NYTimes/gziphandler</a></p></pre>weberc2: <pre><p>Awesome, thanks!</p></pre>dchapes: <pre><blockquote>
<p>but the default Go http client doesn't support gzip response</p>
</blockquote>
<p>This isn't accurate. If you don't provide your own "Accept-Encoding" header and if you don't override <code>http.Transport.DisableCompression</code> then the default <code>http.Transport</code> will add the header and transparently decompress the response.</p>
<p>From <a href="https://golang.org/pkg/net/http#Transport.DisableCompression" rel="nofollow"><code>http.Transport.DisableCompression</code></a>:</p>
<blockquote>
<p>DisableCompression, if true, prevents the Transport from
requesting compression with an "Accept-Encoding: gzip"
request header when the Request contains no existing
Accept-Encoding value. If the Transport requests gzip on
its own and gets a gzipped response, it's transparently
decoded in the Response.Body. However, if the user
explicitly requested gzip it is not automatically
uncompressed.</p>
</blockquote></pre>
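<p>The behaviour dchapes describes can be reproduced locally. What follows is only a minimal sketch, not the code from the gist: it assumes a local <code>httptest</code> server standing in for news.ycombinator.com, and the names <code>newGzipServer</code>, <code>fetch</code>, <code>gunzip</code> and <code>page</code> are hypothetical helpers made up for illustration. It fetches the same URL twice, once letting the default <code>http.Transport</code> negotiate gzip on its own and once setting <code>Accept-Encoding</code> by hand, in which case the body arrives still compressed and has to be decoded with <code>compress/gzip</code>.</p>

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// page is a stand-in for the HTML the real endpoint would return.
const page = `<html op="upvoted"></html>`

// newGzipServer returns a local test server (an assumption of this sketch)
// that gzips its response whenever the request advertises gzip support.
func newGzipServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			io.WriteString(w, page)
			return
		}
		w.Header().Set("Content-Encoding", "gzip")
		zw := gzip.NewWriter(w)
		defer zw.Close()
		io.WriteString(zw, page)
	}))
}

// fetch performs a GET and returns the body exactly as the client hands it
// over. When explicit is true it sets Accept-Encoding itself, which turns
// off the Transport's transparent decompression.
func fetch(url string, explicit bool) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if explicit {
		req.Header.Set("Accept-Encoding", "gzip")
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

// gunzip decodes a gzip blob by hand.
func gunzip(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	srv := newGzipServer()
	defer srv.Close()

	// No Accept-Encoding set by us: the Transport asks for gzip and decodes it.
	auto, err := fetch(srv.URL, false)
	if err != nil {
		panic(err)
	}
	fmt.Printf("transparent: %s\n", auto)

	// Accept-Encoding set by us: the body is still compressed.
	raw, err := fetch(srv.URL, true)
	if err != nil {
		panic(err)
	}
	fmt.Printf("explicit, raw length: %d bytes\n", len(raw))

	decoded, err := gunzip(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("explicit, after gunzip: %s\n", decoded)
}
```

<p>For the scraper in the question, the simpler of the two options is probably to drop the hand-written <code>accept-encoding</code> header, as suggested above; manual decoding is only needed if the header has to stay. Note that the header in the question also lists <code>deflate</code> and <code>br</code>, which this sketch does not handle.</p>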