[help] Why is reddit returning 429 regardless of useragent?

agolangf · · 636 views
<p>No matter which User-Agent I use, I get a 429 Too Many Requests error from reddit.</p> <p>Here is my code:</p> <pre><code>package main

import (
	&#34;bytes&#34;
	&#34;fmt&#34;
	&#34;log&#34;
	&#34;net/http&#34;
	&#34;os&#34;
	&#34;regexp&#34;
	&#34;strings&#34;
)

// stripchars removes every rune found in chr from str.
func stripchars(str, chr string) string {
	return strings.Map(func(r rune) rune {
		if strings.IndexRune(chr, r) &lt; 0 {
			return r
		}
		return -1
	}, str)
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintf(os.Stderr, &#34;Usage: %s URL\n&#34;, os.Args[0])
		os.Exit(1)
	}
	re := regexp.MustCompile(&#34;https?://imgur.com/a......&#34;)
	client := &amp;http.Client{}
	url := fmt.Sprintf(&#34;https://reddit.com/r/%s/top.json&#34;, os.Args[1])
	resp, err := http.NewRequest(&#34;GET&#34;, url, nil)
	resp.Header.Set(&#34;User-Agent&#34;, &#34;linux:go-postgrabber:v0.1 (by /u/ineedmorealts)&#34;)
	fmt.Printf(&#34;Getting&#34; + url)
	response, err := client.Do(resp)
	if err != nil {
		log.Fatal(err)
	} else {
		defer response.Body.Close()
		buf := new(bytes.Buffer)
		buf.ReadFrom(response.Body)
		s := buf.String() // Does a complete copy of the bytes in the buffer.
		links := re.FindString(stripchars(s, &#34;\&#34;&#34;))
		fmt.Printf(&#34;%v\n&#34;, links)
		fmt.Printf(&#34;%v\n&#34;, s)
	}
}
</code></pre> <p>Sorry if this isn&#39;t the right place for this.</p> <p>Edit: I have tried switching both the User-Agent and my IP; neither fixed the issue.</p> <p>Edit_2: <a href="/u/gohacker" rel="nofollow">u/gohacker</a> found the issue!</p> <p><code>url := fmt.Sprintf(&#34;https://reddit.com/r/%s/top.json&#34;, os.Args[1])</code> should be <code>url := fmt.Sprintf(&#34;https://www.reddit.com/r/%s/top.json&#34;, os.Args[1])</code>. I have no idea why, but this completely fixes the issue.</p> <hr/>**Comments:**<br/><br/>
afhtech: <pre><p>HTTP 429 is Too Many Requests. Are you creating a new connection with every request instead of reusing just one? (New net/http instance on every request). 
On mobile but that&#39;s my guess.</p></pre>
ineedmorealts: <pre><p>The program makes just 1 request per run.</p></pre>
gohacker: <pre><blockquote> <p><code>resp, err := http.NewRequest(&#34;GET&#34;, url, nil)</code></p> </blockquote> <p>Following <a href="https://github.com/Droogans/unmaintainable-code" rel="nofollow">the best practices</a>?</p> <p>I fail to see how the user agent is responsible for this. The only explanation that comes to mind is what <a href="/u/afhtech" rel="nofollow">u/afhtech</a> said. Are you invoking the program repeatedly? What does it say if you open <a href="https://reddit.com/r/golang/top.json" rel="nofollow">https://reddit.com/r/golang/top.json</a> in a browser?</p></pre>
afhtech: <pre><p>Change <code>resp, err := http.NewRequest(&#34;GET&#34;, url, nil)</code></p> <p>To <code>resp, err := client.Get(url)</code></p> <p>Example here: <a href="https://golang.org/pkg/net/http/" rel="nofollow">https://golang.org/pkg/net/http/</a></p></pre>
ineedmorealts: <pre><blockquote> <p>Are you invoking the program repeatedly?</p> </blockquote> <p>Nope. I test it ~3 times an hour, but by the 2nd run I&#39;m getting 429.</p> <blockquote> <p>What does it say if you open <a href="https://reddit.com/r/golang/top.json" rel="nofollow">https://reddit.com/r/golang/top.json</a> in a browser?</p> </blockquote> <p>I see the JSON for the top posts.</p> <p>This is really starting to bug me because I have written almost the exact same program in Python and it works perfectly.</p></pre>
mikekreuzer: <pre><p>Have you tried the format for the user agent string suggested <a href="https://github.com/reddit/reddit/wiki/API" rel="nofollow">here</a>?</p></pre>
ineedmorealts: <pre><p>I tried it and ran into the same issue.</p></pre>
ratatask: <pre><p>Maybe you did it wrong? Update your code so we can see exactly what you&#39;re doing.</p></pre>
ineedmorealts: <pre><p>It&#39;s been updated.</p></pre>
vendakka: <pre><p>Have you considered creating a developer account and using OAuth? 
The same top posts query works fine for me when I do that.</p> <p>I&#39;ve exported my code as a library so you could try using that and see if it works. You&#39;ll need a developer account key.</p> <p><a href="https://github.com/sridharv/reddit-go" rel="nofollow">github.com/sridharv/reddit-go</a></p></pre>
0x6B: <pre><p>You just need to be authenticated (OAuth) and set a proper &#39;User-Agent&#39; header.</p> <p>If you PM me I can send you a reddit API project of mine (too ugly to be open sourced) which might help you ;-)</p> <p>Edit: added sample code</p> <pre><code>package main

import (
	&#34;fmt&#34;
	&#34;log&#34;
	&#34;net/http&#34;
	&#34;os&#34;
	&#34;strings&#34;
	&#34;time&#34;

	&#34;golang.org/x/net/context&#34;
	&#34;golang.org/x/oauth2&#34;
	&#34;golang.org/x/oauth2/clientcredentials&#34;
)

func main() {
	cfg := &amp;clientcredentials.Config{
		ClientID:     &#34;...&#34;,
		ClientSecret: &#34;...&#34;,
		TokenURL:     &#34;https://www.reddit.com/api/v1/access_token&#34;,
		Scopes:       []string{&#34;read&#34;},
	}
	ctx := context.Background()
	ua := &amp;UserAgent{
		Platform: &#34;linux&#34;,
		AppID:    &#34;com.your-domain.your-client-name&#34;,
		Version:  &#34;v0.0.0&#34;,
		Username: &#34;anon&#34;,
	}
	// This is the only way to set the User-Agent header for the
	// way the oauth2 package acquires the access_token.
	// Aaaand reddit is VERY PICKY about the useragent. Even for a single request.
	http.DefaultClient.Transport = ua.HTTPTransporter(http.DefaultTransport)
	client := &amp;http.Client{
		Transport: &amp;oauth2.Transport{
			Source: cfg.TokenSource(ctx),
			Base:   ua.HTTPTransporter(http.DefaultTransport),
		},
	}
	// ... client.Do(...) //...
}

// UserAgent contains the information/formatting needed for reddit&#39;s expectations
// on the &#39;User-Agent&#39; http header.
type UserAgent struct {
	// Platform of your client, e.g. &#34;linux&#34;
	Platform string
	// AppID for your program, e.g. &#34;com.mydomain.fun-client&#34;
	AppID string
	// Semantic Version string, e.g. &#34;v.0.1.0&#34;
	Version string
	// Reddit user name, e.g. &#34;/u/anon&#34;
	Username string
}

func (ua *UserAgent) String() string {
	return fmt.Sprintf(&#34;%s:%s:%s (by %s)&#34;, ua.Platform, ua.AppID, ua.Version, ua.Username)
}

// HTTPTransporter returns a UserAgentTransport which implements the
// http.RoundTripper interface and sets the &#39;User-Agent&#39; http header.
func (ua *UserAgent) HTTPTransporter(rt http.RoundTripper) *UserAgentTransport {
	return &amp;UserAgentTransport{
		UserAgent:    ua.String(),
		RoundTripper: rt,
	}
}

type UserAgentTransport struct {
	UserAgent    string
	RoundTripper http.RoundTripper
}

func (t *UserAgentTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	req.Header.Set(&#34;User-Agent&#34;, t.UserAgent)
	return t.RoundTripper.RoundTrip(req)
}
</code></pre></pre>
fungussa: <pre><p>Your code hasn&#39;t logged into reddit.</p> <pre><code>type RedditResponse struct {
	Json struct {
		Errors [][]string
		Data   struct {
			Modhash string
			Url     string
			Id      string
			Name    string
		}
	}
}

....

loginURL := fmt.Sprintf(&#34;%sapi/login/%s&#34;, s.rootURL, username)
data := url.Values{
	&#34;user&#34;:     {username},
	&#34;passwd&#34;:   {password},
	&#34;api_type&#34;: {&#34;json&#34;},
}
request, err := http.NewRequest(&#34;POST&#34;, loginURL, bytes.NewBufferString(data.Encode()))
if err != nil {
	return fmt.Errorf(&#34;Failed to create a request. err %s\n&#34;, err)
}
request.Header.Set(&#34;User-Agent&#34;, s.defaultUserAgent)
response, err := http.DefaultClient.Do(request)
if err != nil {
	return fmt.Errorf(&#34;Failed to get the response. %s\n&#34;, err)
}
defer response.Body.Close()
if response.StatusCode != http.StatusOK {
	return fmt.Errorf(&#34;Failed to get the login response: %v\n&#34;, response.Status)
}
redditResponse := &amp;RedditResponse{}
err = json.NewDecoder(response.Body).Decode(redditResponse)
if err != nil {
	respbytes, _ := ioutil.ReadAll(response.Body)
	return fmt.Errorf(&#34;Failed to decode the json. err %s. &#39;%v&#39;\n&#34;, err, string(respbytes))
}
</code></pre></pre>
ineedmorealts: <pre><p>Should it need to log in? I&#39;ve written pretty much the same program in Python and it works fine without having to log in.</p></pre>
fungussa: <pre><p>My app not only reads from, but also writes to Reddit, and I was encountering the same 429 error code. It was only after I&#39;d set the User-Agent header, during the login request, that the 429 error was resolved.</p> <p>With your app, since it only reads from Reddit, I wouldn&#39;t have thought that logging in would be necessary.</p></pre>
ineedmorealts: <pre><blockquote> <p>I wouldn&#39;t have thought that logging in would be necessary</p> </blockquote> <p>Luckily logging in was not required; the issue was fixed when I changed <code>https://reddit.com</code> to <code>https://www.reddit.com</code>.</p></pre>
gohacker: <pre><p>I&#39;ve looked into it. Reddit requires you to <code>Set</code> the <code>Host</code> header, too. No need to log in, as the users <a href="/u/fungussa" rel="nofollow">/u/fungussa</a> and <a href="/u/0x6B" rel="nofollow">/u/0x6B</a> claim.</p> <pre><code>req.Header.Set(&#34;User-Agent&#34;, &#34;linux:go-postgrabber:v0.1 (by /u/ineedmorealts)&#34;)
req.Header.Set(&#34;Host&#34;, &#34;reddit.com&#34;)
</code></pre></pre>
fungussa: <pre><p>I didn&#39;t need to set the host, although I had encountered the 429 error code when I hadn&#39;t set the User-Agent header.</p></pre>
ineedmorealts: <pre><p>I set the host header and the issue remains.</p></pre>
gohacker: <pre><p>Sorry, it appeared to fix the issue for me, but now it doesn&#39;t. Even setting the same set of headers wget sets doesn&#39;t help. But wget never gets 429. Most strange. Perhaps it&#39;s somehow related to <a href="https://github.com/golang/go/issues/4800" rel="nofollow">issue 4800</a>. 
Try changing <code>reddit.com</code> to <code>www.reddit.com</code>:</p> <pre><code>url := fmt.Sprintf(&#34;https://www.reddit.com/r/%s/top.json&#34;, os.Args[1])
</code></pre></pre>
ineedmorealts: <pre><p>That seems to have fixed it! I still want to do a bit more testing to make sure, but it seems to be fixed. Thanks!</p></pre>
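
For anyone landing here later, this is a minimal sketch of the fix that worked in the thread, not a confirmed diagnosis. It assumes that `reddit.com` answers with a redirect to `www.reddit.com`, and that the 429 came from the redirected request going out without the custom User-Agent (Go releases before 1.8 did not copy request headers on redirect, which is what the linked issue 4800 is about). The helper names `newListingRequest`, `keepUserAgent` and `findAlbumLink` are made up for this example, and the album regexp is a guess at what the original pattern was meant to match.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"regexp"
)

// userAgent uses the "<platform>:<app ID>:<version> (by /u/<user>)" shape from the thread.
const userAgent = "linux:go-postgrabber:v0.1 (by /u/ineedmorealts)"

// albumRe is an assumed, stricter version of the original imgur album pattern.
var albumRe = regexp.MustCompile(`https?://imgur\.com/a/[A-Za-z0-9]+`)

// newListingRequest builds the GET request against www.reddit.com directly,
// so no redirect is needed, and sets the User-Agent header.
func newListingRequest(subreddit string) (*http.Request, error) {
	url := fmt.Sprintf("https://www.reddit.com/r/%s/top.json", subreddit)
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", userAgent)
	return req, nil
}

// keepUserAgent re-applies the original User-Agent to every redirected
// request. Newer Go versions already copy it; this only matters on old ones.
func keepUserAgent(req *http.Request, via []*http.Request) error {
	if len(via) >= 10 {
		return errors.New("stopped after 10 redirects")
	}
	req.Header.Set("User-Agent", via[0].Header.Get("User-Agent"))
	return nil
}

// findAlbumLink returns the first imgur album link in body, or "" if none.
func findAlbumLink(body string) string {
	return albumRe.FindString(body)
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintf(os.Stderr, "Usage: %s SUBREDDIT\n", os.Args[0])
		os.Exit(1)
	}
	req, err := newListingRequest(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	client := &http.Client{CheckRedirect: keepUserAgent}
	resp, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	body, err := io.ReadAll(resp.Body)
	resp.Body.Close()
	if err != nil {
		log.Fatal(err)
	}
	// Surface a 429 (or any other non-200 status) instead of parsing an error page.
	if resp.StatusCode != http.StatusOK {
		log.Fatalf("unexpected status: %s", resp.Status)
	}
	fmt.Println(findAlbumLink(string(body)))
}
```

Checking the status code before parsing makes a 429 visible right away, and the `CheckRedirect` hook is only a safety net in case some other URL in the chain redirects.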
