Parallel grep over compressed files: pcgrep

agolangf · · 577 views


https://github.com/natefinch/pcgrep

I wrote this in half an hour or so, after a friend said he'd spent "several hours" writing something similar in Python. Then someone told him about GNU parallel, and he figured he'd just do it in bash. But I thought it was an interesting problem, so I coded up a little solution. It's far from complete, but it's functional and pretty quick (both to write and to run).

It's currently parallel per file. I tried adding code to run the regex matching in goroutines, which made it faster to search a single file, but the overhead of copying the line data to the goroutines actually slowed it down when num_files >= num_cores. One could possibly add more sophisticated logic to choose how parallel to go based on the number of files, but I was about done spending time on it, since it wasn't really filling a need I had.

Figured I'd just post this here in case anyone is interested. Feel free to suggest optimizations; I didn't really spend much time optimizing beyond trying out the aforementioned matching in goroutines.

**Comments:**

natefinch: Oh, and a key thing I learned: when you call bufio.Scanner.Bytes(), you're given a slice of bytes without any allocation, and if you try to use that slice asynchronously, it may get overwritten the next time you call Scan(). Took me a while to figure out why I was getting odd output. Always read the docs, folks!

volker48: That has bitten me as well.

barsonme: Same. I was building a trie and decided to use the Bytes method instead of Text and couldn't figure out why nothing was inside my trie. RTFM is a good idea.
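The Bytes() pitfall described in the comments can be illustrated with a minimal sketch. This is not code from pcgrep: `matchLines`, its parameters, and the worker count are hypothetical names chosen for illustration, and the input is a plain string rather than a compressed file. It shows one way to hand scanned lines to matching goroutines safely, by copying each line before it leaves the scanning loop. That copy is the kind of per-line overhead the post mentions.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"sort"
	"strings"
	"sync"
)

// matchLines is a hypothetical helper (not part of pcgrep). It scans input
// line by line and matches each line against pattern in worker goroutines.
func matchLines(input, pattern string, workers int) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}

	lines := make(chan []byte)
	var (
		mu      sync.Mutex
		matches []string
		wg      sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for line := range lines {
				if re.Match(line) {
					mu.Lock()
					matches = append(matches, string(line))
					mu.Unlock()
				}
			}
		}()
	}

	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		// The slice returned by sc.Bytes() may be overwritten by the next
		// call to Scan, so copy it before another goroutine reads it.
		line := append([]byte(nil), sc.Bytes()...)
		lines <- line
	}
	close(lines)
	wg.Wait()

	// Workers finish in arbitrary order; sort for stable output.
	sort.Strings(matches)
	return matches, sc.Err()
}

func main() {
	got, err := matchLines("alpha\nbeta\nbravo\ngamma\n", "^b", 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got) // prints [beta bravo]
}
```

Sending `sc.Bytes()` directly on the channel would skip the allocation, but the workers could then read data that the scanner has already replaced. Using `sc.Text()` instead also yields an independent copy, as a string.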
