I've just started looking at go a few days ago, and I wanted to try rewriting an unwieldy set of python scripts that I have, in go.
The object of the code is to read some files from Amazon S3, and process them line by line.
This is a function I wrote, using the AWS GO SDK, to list a bucket, and return them in a channel. I'm using channels to emulate python's generators, since the list can be very large
func ListS3(bucket, prefix string, client *s3.S3) chan string{
c := make(chan string)
input := s3.ListObjectsInput{
Bucket: aws.String(bucket),
Prefix: aws.String(prefix)}
go func() {
client.ListObjectsPages(
&input,
func(p *s3.ListObjectsOutput, last bool) (shouldContinue bool) {
for _, obj := range p.Contents{
c <- *obj.Key
}
if last {
close(c)
}
return true
})
}()
return c
}
And I'm using the function like this:
list := ListS3(bucket, path, svc)
for key := range list {
fmt.Println(key)
}
**评论:**
dotaheor:
ilogik:Is client.ListObjectsPages a synced or asycned call? If it is a synced call, please put close(c) after it, for it is possible the callback passed to client.ListObjectsPages will never be called so that close(c) will be never called.
Epolevne:fair point, I've added it
ilogik:If in your usage it's important to know that you got all the items then I would suggest adding some error handling since this is a network operation and will fail at some point. The current code may not return all the items of the S3 bucket and the caller cannot tell.
Epolevne:ListObjectsPages returns an error object. I'm not really sure how I'm supposed to pass that back, since it's being called in a goroutine.
I had started by adding a second error return value to ListS3, but that doesn't make sense, because that only returns the channel, which happens immediately.
Should I return the err through the channel?
Do you have any examples code for this?
danredux:You could use a struct on the channel that has an error and a string instead of just a string.
Veonik:Never start a goroutine without a plan to end it. If something goes wrong while iterating, the goroutine will stay forever. That's a leak.
jerf:I'd say this is valid. The text/template package's lexer does much the same, emitting a token via a channel to the parser as it reads them.
ilogik:The usage you have literally given us above is not "invalid", but it is a useless use of channels. You could just as easily put the Println in the callback in directly.
However, if you spawn a set of worker goroutines who all read from that channel and then do the work for the given bucket, you'd have something useful. In this case you see why most Go network APIs are written to be synchronous; it keeps the code simple, and users of those APIs can easily add any other asynchronous/concurrent behaviors they want via goroutines.
jerf:Yes, that's the plan, to have workers processing the the resulting files.
ilogik:Cool, just wanted to be sure. It is not uncommon to see code posted by relatively new Go programmers that uselessly uses channels or spawns goroutines to worse than no effect. :)
Yojihito:that's exactly why I posted the question. I was mostly just using it as a replacement for generators for now.
I've picked this problem as my first go program because I thought it was a good use case for goroutines
ilogik:Do you need a code sample of a worker pool listening on a channel and doing func_x()?
I've done that for my last project, works quite well.
Yojihito:yeah, I'd love that :)
Morgahl:
ilogik:Small critique. Use a for range loop on the channel if there is only one channel you are listening to:
https://play.golang.org/p/xihPx1RWMb
This allows you to to close the channel, the workers will finish processing the workQueue and then exit. Always write your Goroutines with a way to exit when they are done :)
awesome!, thank you :)
