I have three slices of ints — 1 million, 2 million, and 3 million values respectively. I want to insert them into Redis so that the three inserts happen concurrently and the total execution time is no greater than the time it takes to insert the 3 million ints alone. I tried using sync.WaitGroup, but it doesn't speed up the execution. Here's my code.
package main

import (
	"log"
	"strconv"
	"time"

	"gopkg.in/redis.v5"
)

func main() {
	oneMillion := makeRange(1, 1000000)
	twoMillion := makeRange(1000001, 3000000)
	threeMillion := makeRange(3000001, 6000000)

	elements := [][]int{oneMillion, twoMillion, threeMillion}

	client := redis.NewClient(&redis.Options{
		Addr:         "localhost:6379",
		Password:     "",
		DB:           0,
		DialTimeout:  60 * time.Second,
		WriteTimeout: 60 * time.Second,
		ReadTimeout:  60 * time.Second,
		PoolTimeout:  60 * time.Second,
	})

	pipeline := client.Pipeline()

	for _, elem := range elements {
		for i := 0; i < len(elem); i++ {
			key := "KEY:" + strconv.Itoa(elem[i])
			val := "VAL:" + strconv.Itoa(elem[i])

			cmd := pipeline.Set(key, val, 0)
			if cmd.Err() != nil {
				log.Fatal("cmd error: ", cmd.Err())
			}
		}

		_, err := pipeline.Exec()
		if err != nil {
			log.Fatal("error: ", err)
		}
	}
}

func makeRange(min, max int) []int {
	a := make([]int, max-min+1)
	for i := range a {
		a[i] = min + i
	}
	return a
}
**Comments:**
cdoxsey: Hi newbgopher,
Redis is a single-threaded database, so although concurrency may help a little here, in the end you're limited by the database itself. To improve performance you need to run multiple Redis processes and partition your data.
Some other suggestions:
- If you are truly inserting a range of integers, a Lua script will likely perform far better (all the computation happens on the server itself, so there's no network load); see the first sketch below this list.
- You should limit the size of the pipeline and execute it periodically. Doing all 3 million SETs in one Exec is probably asking too much; see the second sketch below this list.
- For the goroutines, make sure you are using a separate Redis connection for each one.
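For the Lua suggestion, this is a minimal sketch of what it could look like (the script body, the key format, and the 1..1000000 range are illustrative, not from the question; Eval is the go-redis v5 call):

package main

import (
	"log"

	"gopkg.in/redis.v5"
)

func main() {
	client := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	defer client.Close()

	// The whole loop runs inside Redis, so only the script and two numbers
	// cross the network instead of millions of SET commands.
	script := `
local from = tonumber(ARGV[1])
local to = tonumber(ARGV[2])
for i = from, to do
    redis.call('SET', 'KEY:' .. i, 'VAL:' .. i)
end
return to - from + 1`

	// In practice you would split the full range into smaller pieces, since a
	// single huge EVAL blocks the server for its whole duration.
	if _, err := client.Eval(script, nil, 1, 1000000).Result(); err != nil {
		log.Fatal("eval error: ", err)
	}
}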
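On the second and third points, here is a minimal sketch of chunked pipelines plus one client per goroutine (chunkSize and insertSlice are made-up names; the redis.v5 calls are the same ones the question already uses):

package main

import (
	"log"
	"strconv"
	"sync"

	"gopkg.in/redis.v5"
)

const chunkSize = 10000 // illustrative flush threshold; tune for your setup

// insertSlice writes one slice through its own client, flushing the pipeline
// every chunkSize commands instead of queueing millions of SETs at once.
func insertSlice(elems []int, wg *sync.WaitGroup) {
	defer wg.Done()
	client := redis.NewClient(&redis.Options{Addr: "localhost:6379"}) // dedicated connection pool per goroutine
	defer client.Close()

	pipe := client.Pipeline()
	for i, v := range elems {
		pipe.Set("KEY:"+strconv.Itoa(v), "VAL:"+strconv.Itoa(v), 0)
		if (i+1)%chunkSize == 0 {
			if _, err := pipe.Exec(); err != nil {
				log.Fatal("exec error: ", err)
			}
		}
	}
	if len(elems)%chunkSize != 0 { // flush the remainder
		if _, err := pipe.Exec(); err != nil {
			log.Fatal("exec error: ", err)
		}
	}
}

func main() {
	elements := [][]int{{1, 2, 3}, {4, 5}, {6}} // stand-ins for the three big slices
	var wg sync.WaitGroup
	for _, elem := range elements {
		wg.Add(1)
		go insertSlice(elem, &wg)
	}
	wg.Wait()
}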
newbgopher: Thanks for the response! I'm not actually inserting ints, I just used them for simplicity's sake. I'll try using a dedicated Redis connection per data element.
sgmansfield: It really doesn't matter how many different connections you add. Redis literally only does one thing at a time. Your total time will be the sum of all three; the best you can do is eliminate a couple of round trips by sending the data concurrently. If your long pole is Redis inserting those huge lists, you'll be waiting regardless.
newbgopher: Thanks for the response!
Morgahl: The package and configuration used in the code sample have a connection pool of 10 (the default), so your third point is at least somewhat covered :)
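For reference, the pool size is also configurable on the options struct if the default of 10 ever becomes the limit; the value here is purely illustrative:

client := redis.NewClient(&redis.Options{
	Addr:     "localhost:6379",
	PoolSize: 30, // default is 10
})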
lumost: In addition to the single-threaded nature of Redis producing a bottleneck (even with proper batching, Redis will only push 60-400k ops per second), it's important to consider the number of bytes on the network: 3 million ints have a minimum footprint of 12 MB before serialization and connection overhead. A stock Linux host will also consume a minimum of 4 KB of RAM per connection.
Depending on your needs for this data, you may want to consider serializing the 3 million ints into a single record. This would avoid most of the per-op overhead of Redis and, with some cleverness, would still allow per-record access using the GETRANGE command.
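A minimal sketch of that packed-record idea, assuming fixed 4-byte big-endian slots so the i-th int sits at byte offset i*4 (the key name and sample values are made up):

package main

import (
	"encoding/binary"
	"fmt"
	"log"

	"gopkg.in/redis.v5"
)

func main() {
	client := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	defer client.Close()

	values := []int{10, 20, 30, 40} // stand-in for the millions of ints
	buf := make([]byte, 4*len(values))
	for i, v := range values {
		binary.BigEndian.PutUint32(buf[i*4:], uint32(v))
	}

	// one SET for the whole blob instead of one per int
	if err := client.Set("packed:ints", buf, 0).Err(); err != nil {
		log.Fatal(err)
	}

	// read back the value at index 2 without fetching the whole record
	raw, err := client.GetRange("packed:ints", 2*4, 2*4+3).Result()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(binary.BigEndian.Uint32([]byte(raw))) // prints 30
}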
throwawayguy123xd: I think the other posters answered this more clearly, but to answer your specific question: I would think you need 3 clients (or 3 insert interfaces), so even though you may not be able to beat single-threaded Redis, you would try something like:

func makeAndInsert(wg *sync.WaitGroup, elems []int) {
	defer wg.Done()
	client := redis.NewClient(...) // one client per goroutine
	defer client.Close()
	for _, v := range elems {
		client.Set("KEY:"+strconv.Itoa(v), "VAL:"+strconv.Itoa(v), 0) // or whatever
	}
}

func main() {
	elements := ... // the three slices from the question
	var wg sync.WaitGroup
	for _, elem := range elements {
		wg.Add(1)
		go makeAndInsert(&wg, elem)
	}
	wg.Wait()
}

It's just a different way of doing it multi-threaded (aka refactoring).
