select {} vs for {} breaking https requests on another thread

I'm a relative beginner with Go, so this might be obvious to some, but I've been struggling with this bug for two days and have only just worked around it, so I wonder if anyone can shed some light on what is really going on here.

I'm doing HTTP requests in a goroutine, and I made my main thread wait indefinitely using for {} so that the program doesn't exit. This works fine for http, but https requests block forever with no error or anything. Changing the main thread to wait using select {} fixes https requests. Here is a minimal example:

<pre><code>package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Issue the request on a separate goroutine.
	go func() {
		_, err := http.Get("https://en.wikipedia.org/wiki/Main_Page");
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println("Request done.");
	}();
	// Keep main alive by spinning forever.
	for {};
}
</code></pre>

This will (on my machine, anyway) not print anything to the console. Changing the main thread to wait using select {} fixes it, and it'll print "Request done.". Leaving the main thread waiting using for {} works fine if the url is http://. What's going on? I'm running go version go1.9.1 darwin/amd64 on a macbook, and running the program using go run file.go.

Edit: Lots of people are telling me that the goroutine isn't being scheduled, but that's not true: if you add a print statement before the http.Get request, you'll see that the goroutine gets scheduled and runs perfectly with for {} or with select {}. http requests work with either. https requests block forever if the main thread is running for {}, but not if it's running select {}. This behaviour is what I'm trying to understand.

Edit 2: Simply changing https:// to http:// in the wikipedia link doesn't test http, because wikipedia (and many other big sites) automatically redirects http to https. I should have put this in the original code. If you test a true http site, with no redirects (I've been using my own website <a href="http://kieranvs.com" rel="nofollow">http://kieranvs.com</a>), you'll see the behaviour I'm describing.
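A note on the "block forever with no error" symptom: one way to make a stuck request visible, rather than hanging silently, is to wait for the goroutine with a timeout instead of parking main in an endless loop. This is only a minimal sketch under stated assumptions: waitWithTimeout is a hypothetical helper, not part of net/http, and the closure passed to it in main merely stands in for the http.Get call above. It does not explain the http/https difference; it just avoids both for {} and an unbounded wait.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitWithTimeout (hypothetical helper) runs work in its own goroutine and
// waits for it to finish, giving up once limit has elapsed.
func waitWithTimeout(work func() error, limit time.Duration) error {
	// Buffered so the goroutine can still deliver its result and exit
	// even if the caller has already timed out and stopped listening.
	done := make(chan error, 1)
	go func() { done <- work() }()

	select {
	case err := <-done:
		return err
	case <-time.After(limit):
		return errors.New("timed out waiting for work to finish")
	}
}

func main() {
	// Stand-in for the request; an assumption made for illustration only.
	fakeRequest := func() error {
		time.Sleep(10 * time.Millisecond)
		return nil
	}
	err := waitWithTimeout(fakeRequest, time.Second)
	fmt.Println("Request finished, err =", err)
}
```

Because main blocks in a select with real cases, it yields to the scheduler while it waits, and a request that never returns surfaces as an error after the limit instead of as a silent hang.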
<hr/>**Comments:**
deusmetallum: <pre>Rather than do a massive for loop, consider using sync waitgroups, which you can read about here: <a href="http://goinbigdata.com/golang-wait-for-all-goroutines-to-finish/" rel="nofollow">http://goinbigdata.com/golang-wait-for-all-goroutines-to-finish/</a></pre>
kieranvs: <pre>Thanks, I will implement a better solution. But do you have any idea about why for {} breaks https?</pre>
deusmetallum: <pre>Haven't a clue, but see if the waitgroup thing fixes it!</pre>
kieranvs: <pre>I'm sure it will, because using select {} fixes it! I'm just so confused as to why select {} vs for {} affects the behaviour on a different thread, and wondering if this is some kind of bug in the implementation of net/http.</pre>
floatdouble: <pre>I'm not sure if that's the case, because even doing <pre><code>for { time.Sleep(time.Nanosecond) } </code></pre> fixes it for me. How long did you run it for before killing it?</pre>
nemith: <pre>Goroutines only context switch on channel sends and at function calls. An empty for loop will take 100% of your CPU and run without any context switching. The delay that someone else said "fixes it" is a function call that allows a context switch and also frees the CPU from spinning as fast as it can. TL;DR: empty for loops are always bad.</pre>
BurpsWangy: <pre>I suspect some goroutine isn't getting scheduled because of the infinite for {} loop. That goroutine is probably in the package that handles the https call. Try something like: <pre><code>for { time.Sleep(time.Millisecond * 4) } </code></pre> See if that runs your code as expected. Sleeping will yield some time to the scheduler to execute all your goroutines.</pre>
qu33ksilver: <pre>First of all, you don't need to put a ; at the end of each line. Use <code>go fmt</code> and <code>go vet</code> to automatically lint your code so that it matches the conventions. Now coming to your issue, I tried your code. The <code>for {}</code> doesn't work for either the http or the https version. 
And the <code>select {}</code> works for both http and https. The reason is because the <code>for {}</code> doesn't allow for pre-emption as there are no yield points in a blank infinite loop. Hence, your other goroutine doesn't get a chance to run. Whereas a <code>select {}</code> waits for some event to occur and always yields while this wait is happening. In these cases, using a waitgroup is ideal as somebody has already mentioned. EDIT: The reason it is working in your case for http and not for https is pure luck. As it does not happen in my machine. The Go scheduler just happens to run the other goroutine first. I'm pretty sure, if you try enough no. of times, the code will block for http case too. Rest assured, the issue is not at all with http and https.</pre>kieranvs: <pre>Thank you for actually trying my code, you're the first one who actually addressed the http/https issue. It's not working on your machine? I literally just tested it again and it works for me with http! The goroutine code runs perfectly. It only stops working with https, as in, every statement before the http.Get works perfectly and that line of code blocks forever if it's https. To be clear, you changed the URL to a plain http website (one that actually works on http, and doesn't redirect to https) and it didn't work?</pre>kieranvs: <pre><blockquote> EDIT: The reason it is working in your case for http and not for https is pure luck. </blockquote> It's not. Here is a much better example, where I've used a channel to make sure the goroutine gets started properly (which it does every time, by the way) before proceeding with the experiment. I've also included my website which works on http, and you can just comment out to switch between them. It's definitely not the scheduler. 
<pre><code>package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	c := make(chan int);
	go func(c chan int) {
		fmt.Println("The goroutine is definitely running.");
		c <- 0;
		_, err := http.Get("https://en.wikipedia.org/wiki/Main_Page");
		//_, err := http.Get("http://kieranvs.com/");
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println("Request done.");
	}(c);
	//Wait to make sure the goroutine got scheduled properly and is running.
	_ = <- c;
	//Wait forever
	for {};
}
</code></pre></pre>
qu33ksilver: <pre>Umm .. the goroutine gets started properly because you are waiting on the channel in the main function. :) And then it does the same thing all over again. I tried your examples. Yes, the normal http site works and the https site does not work. But it's luck again. Because I tried with "<a href="http://youtube.com" rel="nofollow">http://youtube.com</a>" and it does not work. You might want to try that and see what results you get.</pre>
kieranvs: <pre>I've been debugging this issue for two solid days, it's a little annoying for you to look at it for two seconds and tell me it's luck. "<a href="http://youtube.com" rel="nofollow">http://youtube.com</a>" is a https request because it automatically redirects to <a href="https://youtube.com" rel="nofollow">https://youtube.com</a>. The goroutine gets started properly every time, in every variation. That has never been the issue! This is my entire point - even when the goroutine gets started properly, you still see this behaviour, so it's not the scheduling.</pre>
divan0: <pre>You're both right. The anonymous function in main has enough time to start (easy to check by printing "Start request" at the beginning), but http.Get itself launches more goroutines. Getting the https request is a bit more complicated than http, and takes more time. The forever loop, as already commented, doesn't allow preemption, so the scheduler never runs and http.Get is stuck trying to launch new goroutines. 
I think this can be properly shown by using the Go execution tracer tool - it's perfect for such cases. But you can also get indirect proof: <ul> <li>add a delay before running for{}: <pre><code>time.Sleep(500 * time.Millisecond)
for {}
</code></pre></li> <li>add runtime.Gosched (<a href="https://golang.org/pkg/runtime/#Gosched" rel="nofollow">https://golang.org/pkg/runtime/#Gosched</a>) to yield to the scheduler: <pre><code>for { runtime.Gosched() } </code></pre></li> </ul> So, in some way, it can be called "luck", as the https request on some machines/connections can be fast enough to execute before the forever loop launches.</pre>
divan0: <pre>Hmm, actually using the tracer is complicated in this case, as in order to write the trace into the file we need to run <code>trace.Stop()</code> in its own goroutine, which also cannot be executed due to for{}. Here is a screenshot of the trace output for my example above with runtime.Gosched() called: <a href="https://imgur.com/a/s5MaO" rel="nofollow">https://imgur.com/a/s5MaO</a> You may note many small interruptions in Proc3 (which runs the main goroutine in my case) - that's what allows other goroutines to execute.</pre>
joushou: <pre>Neither "http" nor "https" works for me on go1.9.1 darwin/amd64 on a macbook pro. I have three things to say about this: <ol> <li>What you are doing is all sorts of wrong. "for {}" means "burn 100% CPU on a single core doing absolutely nothing". Even for techniques that don't do this ("select {}"?), I still can't find a good reason to ever block forever. In this case, the problem is that you have an unnecessary goroutine. Of course, in some cases, you might end up with goroutines that need to wait for each other's completion, but a WaitGroup is the way to go then.</li> <li>Despite you not wanting to do what you are doing, I do believe that you might have encountered a bug in the runtime. 
What we see appears to be the runtime being out of machine threads to schedule goroutines on (you're consuming one permanently with the for loop), and while there are limits to how many machine threads will run Go code at any given time, you should be able to burn one and still execute a HTTP request.</li> <li>You're not supposed to use semicolons in Go.</li> </ol> Try installing go1.8 or go1.7 and see if it worked there. Open an issue on the go github issue tracker, and post you test code.</pre>kieranvs: <pre><ol> <li>Yeah, I guess there is no real reason to. Instead of firing off a load of goroutines and then waiting forever, I could just remove the word "go" from the last goroutine invocation, and have that run on the main thread instead. However, this is what I did, and then I came across this discrepancy between what I understand and what's happening - so I just want to understand. This isn't production code, this is nothing but a learning exercise.</li> <li>It's not that the goroutine doesn't get a thread to run on. The goroutine works. I can do whatever I want there. I can do http requests. I just can't do https requests. There's clearly a bug in some piece of code that is running on my machine and not on yours...</li> </ol> Did you run my code verbatim? Did you try a true http website, not a website that redirects http requests to https? If you insert a print statement before the http request, you can see that the goroutine is alive and working.</pre>joushou: <pre>I initially ran your code raw, and then modified the example a bit to experiment (is the goroutine even running, does it only happen if the for loop is in the main goroutine, does it happen if GOMAXPROCS is lifted, ...). I just removed the "s" from your URL to try HTTP, so it would appear that I probably got redirected to https. I had forgotten to consider that the default http client handles redirects automatically. Testing against a pure HTTP site lets the code run. 
So far, every combination of a blocked machine thread ("for {}") and an http.Get against an HTTPS site, or an HTTP site that redirects to HTTPS, results in a block (pure HTTP does not), and the goroutine that issues the http request is always alive before the request is issued. That suggests that the runtime does indeed trip due to net/http's use of crypto/tls. Lifting GOMAXPROCS did not help, suggesting that it is not just simple machine thread starvation... Trying on a Linux box would be interesting, but I do admit that I am a bit too lazy to do that myself right now.</pre>
Kraigius: <pre>Testing on Windows on my machine with his own website, because he said it doesn't redirect to https automatically. It will complete the request and print for both http and https (in the case of https it will fatal because his website doesn't have a valid https certificate). If I limit GOMAXPROCS to 1, it will never yield, and it will never print anything for either http or https. Setting GOMAXPROCS to 2 will print. This is the behavior that I'm expecting. <code>for{}</code> will not yield. The only reason it prints on my machine when I don't limit GOMAXPROCS must be pure luck; the goroutine probably executes before main enters the loop. I've used the scheduler trace and yeah, the goroutine is always queued. If GOMAXPROCS is set to 1 and I replace the <code>for</code> with a <code>select</code>, it will complete the request and print. This is also the behavior that I'm expecting since <code>select</code> will yield. edit: If GOMAXPROCS is set to 1 and I keep the <code>for{}</code> but put <code>time.Sleep(time.Millisecond * 4)</code> in the body of the loop, it will do the request and print. This is also expected since the sleep allows it to yield.</pre>
joushou: <pre>Hmm, that means that it works entirely as intended on Windows (the for loop of course consuming a machine thread, so everything works with GOMAXPROCS > 1). 
This suggests that we're dealing with a platform-specific issue, eliminating basically everything in net/http and crypto/tls. GOMAXPROCS default to 8 on my Mac where I tested, but I tried upping it to 16 to no avail, which suggest that the issue is not machine thread starvation. Also, I tested with HTTP without redirect as well using my own site, and there it works. This specific reproduction does indeed require that HTTPS (thereby crypto/tls) is involved. EDIT: It seems like you expect it to only work out of luck when you don't limit GOMAXPROCS. This is not luck. The "go" keyword puts the goroutine in the queue. If a machine thread is idle, it will be woken up. If it is busy, it will pop the new task off the work queue when done. The only time where the goroutine won't run is if all machine threads are busy with non-yielding work. It is behaving correctly, and deterministically—no races or luck involved.</pre>Kraigius: <pre>Ohh..this is interesting.</pre>Kraigius: <pre>It looks obvious to me. Let's stop and think for a moment on what is happening. The program start a goroutine, it then continue and block inside an infinite loop. The infinite loop basically doesn't give any rest to the system. While in the infinite loop, I believe the http.Get request will be fired and returned at some point. What am certain of, is that the infinite loop doesn't give any window of opportunity for the goroutine to use your system I/O to print "Request done.". It's not 100% because of the <code>for</code>, but on how you used it, you made it completely empty devoid of all work. <blockquote> if you add a print statement before the http.Get request, you'll see that the goroutine gets scheduled and runs perfectly with for {} or with select {} </blockquote> The nature of asynchronous code is that this behavior isn't guaranteed. It's not guaranteed that a print on line #2.5 will execute before entering the <code>for{}</code>. 
As for the difference between <code>select{}</code> and <code>for{}</code>, <code>for{}</code> consume cpu resources, the select is like <code>STOP</code> <a href="https://stackoverflow.com/questions/18661602/what-does-an-empty-select-do" rel="nofollow">source</a></pre>kieranvs: <pre>Hi, thanks for taking the time to look into my issue. I have done extensive testing, and on my machine (which allows the go runtime to have more than one OS thread and more than one hardware CPU core), starting a goroutine and then entering an infinite loop on the main thread does not block the goroutine from doing whatever it wants. Prints, general computation, http requests all work perfectly. The only time it stops working is with a https request. <blockquote> What am certain of, is that the infinite loop doesn't give any window of opportunity for the goroutine to use your system I/O to print "Request done.". </blockquote> It works just fine on my machine, maybe because it's got more hardware cores or something. The goroutine can do whatever computation or IO it wants, except a https request. The effects you're describing where a for {} completely blocks the other goroutines happens at exactly four such threads - that's the number of logical hardware cores that are present on this machine. The only logical conclusion I can come to is that the goruntime assigns goroutines to OS threads in such a way that the main go thread is on some OS thread #0, and the http.Get implementation contains a bug which means that it only works for https requests if it can get assigned to OS thread #0, or in some way requires the main thread to yield. 
By the way, in all my extensive testing, the results have been 100% the same on each invocation - the nondeterminism of goroutine scheduling seems to have little to no impact on this issue with http vs https.</pre>
Kraigius: <pre>While it's interesting to look into the compiler to see why it gets scheduled that way on your machine, the core of the problem is in your code. It would be faster to fix it (use a waitgroup or a <code>select{}</code>) than to verify whether or not there is in fact a bug in some low-level code of Go. http, https, and http redirects all perform different operations, and a different number of operations. It's entirely possible that on your hardware it is more likely to yield at a certain time for https and less likely for http when querying a specific website, depending on the content length and the time it takes for the response to come back to you. Better not go down the rabbit hole.</pre>
sethammons: <pre>As others have pointed out, for{} never yields control back to the CPU for scheduling other activity. You can leverage runtime.Gosched() (<a href="https://golang.org/pkg/runtime/#Gosched" rel="nofollow">https://golang.org/pkg/runtime/#Gosched</a>) to yield control back to the processor. That said, there are much better ways to block that don't hog the CPU. Probably the best is just using select {}.</pre>
nhooyr: <pre>Hey, I can reproduce your exact issue and it does seem to have to do with scheduling, as the following program works with https: <pre><code>package main

import (
	"fmt"
	"log"
	"net/http"
	"runtime"
)

func main() {
	go func() {
		_, err := http.Get("https://en.wikipedia.org/wiki/Main_Page")
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println("Request done.")
	}()
	// Yield to the scheduler on every iteration instead of spinning.
	for {
		runtime.Gosched()
	}
}
</code></pre> Without the runtime.Gosched() call, I don't get the "Request done." message. I'm not sure why exactly; it definitely deserves an issue on the Go github repo.</pre>
slantview: <pre>You can keep arguing with everybody about why it breaks, eventually you will find the answer. 
However, stop doing a for {} loop as everyone here has told you it’s bad practice and burns cpu unnecessarily. If you ask for help and argue with everyone about it, it just makes you look bad. Plus you have burned through two days of your employers money for something that is well documented. Tl;dr: use a select, stop wasting time and money.</pre>kieranvs: <pre>Firstly, I’m not doing this for work so I’m not wasting anyone’s money. I’m trying to understand what’s going on, so while changing it to select makes my code work, it doesn’t really achieve the real goal here of getting to the bottom of why http requests work in the goroutine and https requests fail. Contrary to being well documented, I believe I may be the first person to ever write about this on the internet... It’s completely my fault that most of the responders to this post didn’t respond about http vs https. See, while I thought I had created a minimal example and carefully written about everything you need to know to replicate the bug, I left out one crucial thing: redirects. It’s not enough to change https to http in my example code and see the problem first hand, you have to change it to a completely different website that won’t redirect you, like Wikipedia or YouTube would. Most people probably thought okay, let’s try it... nah, what an idiot, neither works, it’s obviously the for loop and hence I got all those replies. Your tldr basically amounts to ignore it and don’t be curious. I had already discovered that select made the code work when I posted this question. The question isn’t about how to make the code work. It’s about why http works but https doesn’t. Some responders did start talking about that, and I think the general consensus is that the implementation in net/http is a bit dodgy. On a machine with two or more hardware cores, it is supposed to be okay to hog one system thread with a loop and use the other one to do a https request. I’m really sorry you think I’ve been rude. 
:( I’m just a frustrated student trying to get to the bottom of this, this is the first time I’ve really encountered a bug in the standard library/implementation of a language, and most of the community here seemed more keen to lecture me on the busy-waiting of for loops rather than discuss the question. </pre>nhooyr: <pre>Don't worry about those responses. You asked a reasonable question and some people are dodging it and telling you to give up on understanding the underlying issue. Very poor advice imo. What you are doing is the best way to learn. Keep at it!</pre>dlsniper: <pre>You can read here how to block forever in Go: <a href="http://blog.sgmansfield.com/2016/06/how-to-block-forever-in-go/" rel="nofollow">http://blog.sgmansfield.com/2016/06/how-to-block-forever-in-go/</a> As for breaking the https, that's not possible. Fix your code, not pseudo-problem you cannot reproduce because of bugs.</pre>kieranvs: <pre>What do you mean pseudo-problem that I can't reproduce? I've even reduced it to a minimal example which works reliably every time. Running the code with for {} doesn't work, but select {} does, for https. Both work for http. This is super weird. Also, I've already been on that page - it sheds no light on this. for {} is listed as one of the (not recommended) ways to wait forever, and yet there is no mention of side effects on other threads trying to use https.</pre>kabloom195: <pre>The answer is in that article. The for loop keeps spinning, monopolizing the core you started it on, and doesn't let the other goroutine schedule. The select construct tells the goroutine to yield the processor indefinitely, so the other goroutine schedules.</pre>joushou: <pre>The for loop monopolize a single machine thread. The smallest Macbooks have physical hyperthreaded cores, so he should have at least 4 machine threads to schedule goroutines on (unless he tampers with GOMAXPROCS). 
Assuming the http client doesn't busy-loop on non-yielding code, he should be able to have 3 "for {}"'s running, and still issue a http request. It's still a terrible idea to use "for {}". Why would one ever want to "block forever", after the applications usefulness has expired?</pre>kieranvs: <pre>I wrote a quick test program to see how many goroutines I could have running for {} at the same time, and still spawn new ones, and you're right, I can have the main thread and 3 more before it breaks. Why is this? I am under the impression that the OS scheduler can swap out system threads on a time-share basis whenever it wants, without the thread yielding. I mean, just dump all the CPU registers into memory, put in the other thread's register values and off you go! Is this a limitation with the Go runtime?</pre>joushou: <pre>In Go, you never interact with threads. You make goroutines, or "green threads"/"stackless threads" as they're often called in other languages, which are then internally scheduled by the runtime on one or more "machine threads". To manage this, the Go runtime implements its own scheduler. This scheduler is not preemptive, but cooperative, only yielding the machine threads at certain calls which enters the runtime (I/O, locks, ...). Some of these calls might create additional machine threads to service your request, but only GOMAXPROCS (defaults to logical CPU count) threads will execute your goroutines at any given time. Any code that doesn't enter the runtime to yield the machine thread will block the machine thread forever (although that machine thread is preempted to execute other OS processes). This might seem complicated, silly and even limiting compared to normal, preemptive OS threads. However, it's a trick done to make goroutines a cheap resource which can be spawned in the hundreds of thousands. 
OS threads are (relatively) expensive to create (enter the kernel, create a process/thread with a few MB of stack, exit kernel) and run (preemptive multitasking incurs expensive context switches, which take time and render caches useless; the fewer preemptions, the better), whereas goroutines are almost free (more goroutines does not mean more context switches interfering with work, because they do not preempt each other, and they have tiny dynamic stacks). If you want to learn more, then the Go model is effectively called "M:N threading" (M application-level tasks for N kernel-level threads, where M>N), or "hybrid threading". The common OS model is called "1:1 threading" (1 application-level task for 1 kernel-level thread). Another alternative is "N:1 threading" (N application-level tasks for 1 kernel-level thread). There can be some slight confusion of terminology here: some would refer to the goroutine as the "thread", while calling the OS thread a "scheduling entity" or something along those lines.</pre>
kieranvs: <pre>Thank you, that makes so much sense! Still no closer to understanding why http works but https breaks though. I think I'll have to submit a bug report on the issue tracker.</pre>
joushou: <pre>Please do. One report too many is better than one too few. If you wouldn't mind, please add the info from the other comments as well; Kraigius, for example, points out that it seems to work on Windows.</pre>
kieranvs: <pre>Look, using for {} to stall the main thread works for HTTP requests on the second thread, but mysteriously stops working for HTTPS. Changing to select {} on the main thread fixes this. I've done my research, but I don't think anyone has addressed this exact issue here. I get that for {} is busy-waiting and select {} yields.</pre>
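Several replies above suggest sync.WaitGroup instead of blocking forever. A minimal sketch of that idea follows, assuming a hypothetical runAll helper and a fakeRequest closure standing in for the real http.Get calls (neither appears in the thread). It is not a diagnosis of the http/https difference, only an illustration of waiting for exactly the goroutines that were started.

```go
package main

import (
	"fmt"
	"sync"
)

// runAll (hypothetical helper) starts every task in its own goroutine and
// returns once all of them have finished. The error returned by task i is
// stored at index i, so no two goroutines write to the same slot.
func runAll(tasks []func() error) []error {
	errs := make([]error, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func() error) {
			defer wg.Done()
			errs[i] = task()
		}(i, task)
	}
	wg.Wait() // blocks without spinning until every task has called Done
	return errs
}

func main() {
	// Stand-in for a request; an assumption made for illustration only.
	fakeRequest := func() error { return nil }

	errs := runAll([]func() error{fakeRequest, fakeRequest})
	for i, err := range errs {
		fmt.Printf("request %d done, err = %v\n", i, err)
	}
}
```

With this shape main neither spins like for {} nor parks indefinitely like select {}: wg.Wait() yields while the goroutines run, and the program exits once they are done.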
