48x faster! Optimizing bulk writes of large amounts of data to Redis with Python

梨花菜 · 2020-09-14 12:32:43

1. The original version: calling hset directly, which is very inefficient

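Writing 300,000 records took 365 seconds. This approach has two problems:

When several fields are written to the same key, hmset should be used instead of repeated hset calls
A pipeline should also be used, so that commands are not sent to the Redis server one at a time, which greatly reduces network I/O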
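
To get a feel for how much the round trips cost, here is a rough back-of-the-envelope sketch based on the command counts in the original loop (17 hset calls and 8 expire calls per device). The function name `round_trip_stats` is only illustrative, and the resulting average includes Python overhead, not just network time.

```python
# Command counts taken from the original loop body:
# 10 hset on the device hash, 6 on the dated hashes, 1 on 'fy:be:de:ma'
HSET_CALLS_PER_DEVICE = 17
# 1 expire on the device hash, 6 on the dated hashes, 1 on 'fy:be:de:ma'
EXPIRE_CALLS_PER_DEVICE = 8

DEVICES = 300_000        # number of loop iterations in the article
TOTAL_SECONDS = 365.0    # measured run time reported in the article


def round_trip_stats(devices, hset_calls, expire_calls, total_seconds):
    """Return (commands per device, total commands, average microseconds per command)."""
    per_device = hset_calls + expire_calls
    total = devices * per_device
    avg_us = total_seconds / total * 1_000_000
    return per_device, total, avg_us


per_device, total, avg_us = round_trip_stats(
    DEVICES, HSET_CALLS_PER_DEVICE, EXPIRE_CALLS_PER_DEVICE, TOTAL_SECONDS
)
print(f"{per_device} commands per device, {total:,} round trips, "
      f"about {avg_us:.1f} microseconds each")
```

Each command is individually cheap, but without a pipeline every one of them waits for its own reply, and there are 7.5 million of them.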

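import datetime
import json
import time

import redis


def get_conn():
    # Connect to a local Redis server; decode_responses=True returns str instead of bytes
    r = redis.Redis(host='localhost', port=6379, decode_responses=True)
    return r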


def test_set_redis():
    conn = get_conn()
    machineId = 43696000000000  
    device_no = 88800000  
    work_in = time.time()
    source = "1"
    factory_no = "factory"
    today = datetime.date.today()
    oneday = datetime.timedelta(days=1)
    tomorrow = str(today + oneday).replace("-", "")
    afterTomorrow = str(today + oneday + oneday).replace("-", "")
    todayZero = int(time.mktime(today.timetuple()))
    today = str(today).replace("-", "")
    for i in range(300000):  
        upAxisId = "uxi" + str(device_no)
        axisVarietyId = "axi" + str(device_no)
        varietyId = "vi" + str(device_no)
        axisNum = "axn" + str(device_no)
        try:
            conn.hset('mykey_prefix' + str(device_no), "machineId", str(machineId))
            conn.hset('mykey_prefix' + str(device_no), "machineNum", str(machineId))
            conn.hset('mykey_prefix' + str(device_no), "factoryId", factory_no)
            conn.hset('mykey_prefix' + str(device_no), "groupId", "group_id")
            conn.hset('mykey_prefix' + str(device_no), "groupName", "groupName11")
            conn.hset('mykey_prefix' + str(device_no), "workshopId", "workshopId11")
            conn.hset('mykey_prefix' + str(device_no), "workshopName", "workshopName11")
            conn.hset('mykey_prefix' + str(device_no), "source", source)
            conn.hset('mykey_prefix' + str(device_no), "errorTimeLimit", str(20))
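            conn.expire('mykey_prefix' + str(device_no), 864000)  # expire after 10 days
            # axisInfo and fbfcalue/fbfcalue2/fbfcalue3 are payload objects whose
            # definitions are not shown in the original post
            conn.hset('mykey_prefix' + str(device_no), "axisInfo", json.dumps(axisInfo))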
            conn.hset('mykey_another_prefix:' + today, str(machineId), json.dumps(fbfcalue))
            conn.hset('mykey_another_prefix:' + tomorrow, str(machineId), json.dumps(fbfcalue2))
            conn.hset('mykey_another_prefix:' + afterTomorrow, str(machineId), json.dumps(fbfcalue3))
            conn.hset('mykey_another_prefix1:' + today, str(machineId), json.dumps(fbfcalue))
            conn.hset('mykey_another_prefix1:' + tomorrow, str(machineId), json.dumps(fbfcalue2))
            conn.hset('mykey_another_prefix1:' + afterTomorrow, str(machineId), json.dumps(fbfcalue3))

            conn.expire('mykey_another_prefix:' + today, 259200)  # 3 days
            conn.expire('mykey_another_prefix:' + tomorrow, 259200)
            conn.expire('mykey_another_prefix:' + afterTomorrow, 259200)
            conn.expire('mykey_another_prefix1:' + today, 259200)
            conn.expire('mykey_another_prefix1:' + tomorrow, 259200)
            conn.expire('mykey_another_prefix1:' + afterTomorrow, 259200)

            conn.hset('fy:be:de:ma', str(device_no), str(machineId))
            conn.expire('fy:be:de:ma', 864000)
            machineId += 1
            device_no += 1
        except Exception as e:
            print("Failed to set keys, error:", e)

2. Use a pipeline instead of sending one request for every key that is set

The approach is simple: it takes only two small changes

Using `pipeline` makes a very noticeable difference: the run time dropped from 365 seconds to 126 seconds, 239 seconds less, or nearly 4 minutes!
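
The screenshot that showed the two changes did not survive, so the following is only a minimal sketch of what they presumably were: obtain a pipeline from the connection, then call `execute()` to send the queued commands. The function name `write_with_pipeline`, its parameters, the reduced set of fields and the batch size of 1000 (borrowed from the Go version later in this article) are assumptions, not the author's code.

```python
def write_with_pipeline(conn, start_device_no, start_machine_id, count, batch_size=1000):
    """Queue commands on a pipeline and flush them every `batch_size` devices.

    `conn` is expected to be a redis.Redis client, e.g. the one returned by
    get_conn(). Returns the number of times the pipeline was flushed.
    """
    # Change 1: queue commands on a pipeline instead of calling conn.hset directly.
    # transaction=False skips MULTI/EXEC, since only batching is wanted here.
    pipe = conn.pipeline(transaction=False)
    flushes = 0
    for i in range(count):
        key = 'mykey_prefix' + str(start_device_no + i)
        pipe.hset(key, "machineId", str(start_machine_id + i))
        pipe.hset(key, "source", "1")
        pipe.expire(key, 864000)  # 10 days
        # Change 2: send everything queued so far in a single round trip
        if (i + 1) % batch_size == 0:
            pipe.execute()
            flushes += 1
    # Flush the last, partially filled batch
    if count % batch_size:
        pipe.execute()
        flushes += 1
    return flushes


# Hypothetical usage against a local server:
#   conn = get_conn()
#   write_with_pipeline(conn, 88800000, 43696000000000, 300000)
```

Flushing in batches rather than once at the very end keeps the amount of data buffered on the client and the size of each reply bounded.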

3. Use pipeline + hmset

Assemble the fields and values that belong to the same key into a dictionary, and write them all at once with hmset

With hmset the time shrank again, from 126 seconds to 98 seconds: 28 seconds less, nearly half a minute
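
As a sketch of this step (the original screenshot is missing), the nine scalar fields from the first version can be collected into one dictionary and queued as a single command. The helper names `build_device_fields` and `queue_device` are hypothetical. Note one substitution: the sketch calls `hset(key, mapping=...)`, which newer redis-py releases offer as the replacement for the deprecated `hmset`; with an older client, `pipe.hmset(key, fields)` plays the same role.

```python
def build_device_fields(machine_id, factory_no="factory", source="1"):
    """Collect the per-device fields from the first version into one dict.

    The axisInfo field is left out because its payload is not shown in the article.
    """
    return {
        "machineId": str(machine_id),
        "machineNum": str(machine_id),
        "factoryId": factory_no,
        "groupId": "group_id",
        "groupName": "groupName11",
        "workshopId": "workshopId11",
        "workshopName": "workshopName11",
        "source": source,
        "errorTimeLimit": "20",
    }


def queue_device(pipe, device_no, machine_id):
    """Queue one multi-field hash write plus the expiry for a single device.

    `pipe` is expected to be a redis-py pipeline (conn.pipeline()).
    """
    key = 'mykey_prefix' + str(device_no)
    fields = build_device_fields(machine_id)
    # One command carrying nine fields instead of nine separate hset commands
    pipe.hset(key, mapping=fields)
    pipe.expire(key, 864000)  # 10 days
    return key
```

The pipeline already removed most of the network waiting, so the gain here comes from the server and client handling fewer, larger commands.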

To cut the time even further, I reimplemented it in `golang`, and the performance was impressive

The time went from 98 seconds in Python to 7.5 seconds, a full 13x improvement, and about 48x faster than the original 365 seconds!
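
The ratios quoted in this article can be checked with a few lines of arithmetic. The timings are the ones reported above; the `speedup` helper is just an illustrative name.

```python
# Run times in seconds, as reported in the article
timings = {
    "hset": 365.0,
    "pipeline": 126.0,
    "pipeline + hmset": 98.0,
    "golang": 7.5,
}


def speedup(before, after):
    """How many times faster `after` is compared with `before`."""
    return before / after


print(f"pipeline vs hset:          {speedup(timings['hset'], timings['pipeline']):.1f}x")
print(f"pipeline + hmset vs hset:  {speedup(timings['hset'], timings['pipeline + hmset']):.1f}x")
print(f"golang vs pipeline + hmset: {speedup(timings['pipeline + hmset'], timings['golang']):.1f}x")
print(f"golang vs hset:            {speedup(timings['hset'], timings['golang']):.1f}x")
```

The last two lines give roughly 13.1x and 48.7x, which matches the rounded 13x and 48x figures.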

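// setDevice writes every device to Redis through a pipeline.
// It relies on package-level values that the original post does not show:
// rdb (a go-redis client with the context-aware API), ctx1, devices,
// deviceInfoKey, shiftToday, fystTodayKey, fymstTodayKey and countFail.
// Required imports: encoding/json, fmt, log, time.
func setDevice() {
    var deviceNo string
    var deviceInfo map[string]interface{}
    // Get a Redis pipeline
    pipe := rdb.Pipeline()

    for i := 0; i < len(devices); i++ {
        // Each element is presumably a single-entry map: device number -> device info
        device := devices[i]
        for k, v := range device {
            deviceNo = k
            deviceInfo = v
        }

        deviceKey := fmt.Sprintf("%s:%s", deviceInfoKey, deviceNo)

        machineId := deviceInfo["machineId"].(string)
        // Set the shift schedule
        shiftInfo, _ := json.Marshal(shiftToday)
        pipe.HSetNX(ctx1, fystTodayKey, machineId, shiftInfo)
        pipe.Expire(ctx1, fystTodayKey, time.Hour*24)
        pipe.HSetNX(ctx1, fymstTodayKey, machineId, shiftInfo)
        pipe.Expire(ctx1, fymstTodayKey, time.Hour*24)

        // Use HMSet instead of HSet to write the whole map at once
        pipe.HMSet(ctx1, deviceKey, deviceInfo)
        pipe.Expire(ctx1, deviceKey, time.Hour*72)
        // Flush the queued commands every 1000 devices
        if i%1000 == 0 && i >= 1000 {
            failCmd, err1 := pipe.Exec(ctx1)
            log.Printf("Setting collector #%d\n", i)
            if err1 != nil {
                // Exec returns every command in the batch, so this counts the whole failed batch
                countFail += len(failCmd)
            }
        }
    }

    // Flush whatever is still queued after the loop
    if _, err := pipe.Exec(ctx1); err != nil {
        log.Printf("final pipeline flush failed: %v\n", err)
    }
}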

4. Summary

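For bulk writes, using a pipeline improves performance substantially
For fields and values that share the same key, hmset can replace hset and also improves performance noticeably
When handling large amounts of data, using golang instead of python is a great choice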

Source: 简书 (Jianshu)

Thanks to the author: 梨花菜
