I created a library for Right/Left string padding which has variable allocation caching
for common use cases:
https://github.com/Rhymond/gopad/blob/master/gopad.go#L8
Is it good practice to do that? I'm looking for an advice, thanks in advance!
评论:
metamatic:
tscs37:Given how cheap appending one byte to a Buffer is, I'm very skeptical that you'll get any speedup by having pre-build 2, 3, 4 and 5 byte sequences.
In fact, given modern CPU architectures, I suspect that appending one byte to a buffer 5 times is going to be faster than going out to main memory to fetch a sequence of 5 bytes you created at compile time.
raima220:appending is only cheap when the capacity is sufficient.
From experience, pre-allocating a buffer in memory instead of allocating til max capacity is a factor of 1.5 to 2 times in performance until the cache is fully warmed.
The reason is that if the capacity is not sufficient, you will most likely have to hit the runtime or even worse, malloc.
In a tight loop, going into runtime or doing a syscall can shit all over the cache and worsen performance.
metamatic:I though the same before, but I made benchmarks:
With "Cache": BenchmarkPad-8 300000 3673 ns/op
Without: BenchmarkPad-8 300000 4579 ns/op
You can find my benchmark test here:
xienze:Interesting. I'm surprised that (a) it's faster and (b) by that much.
Would be interesting to look at the generated code, if I knew X86 assembler...
I did some more tests, and it's even faster if you just keep the buffer around and reset it, rather than creating a new buffer every time. I'm also inclined to optimize for the common case of only wanting to pad with a single rune; I don't think I've ever wanted to pad with more than one.
var buff bytes.Buffer func spad(s string, l int, ch rune, isl bool) string { buff.Reset() sl := utf8.RuneCountInString(s) n := l - sl if n <= 1 || l < 1 { return s } if !isl { buff.WriteString(s) } for ; n > 0; n-- { buff.WriteRune(ch) } if isl { buff.WriteString(s) } return buff.String() }
metamatic:and it's even faster if you just keep the buffer around and reset it
But then you're not thread-safe unless you control access to the buffer, which will slow you right back down. Or you can make the caller provide the buffer, but that's an inconvenience and a bit of a leaky abstraction. Tradeoffs.
xienze:Indeed. And the whole thing's an exercise in premature optimization to start with.
hell_0n_wheel:Well it depends. If this string padding function is going to get called all the time then OP's idea is a good move. If it's just used occasionally and not in a hot path? Who cares.
raima220:What problem did you solve by writing that code?
Did you profile the code before and after adding the solution, to confirm it did what you expected?
hell_0n_wheel:Yes, and it looks like that
cached
is fasterWith "Cache": BenchmarkPad-8 300000 3673 ns/op
Without: BenchmarkPad-8 300000 4579 ns/op
You can find my benchmark test here:
Snabel3:Well then you answered your own question.
As you stated it here, there's no context for your question, so we can't understand it, let alone give you a good answer.
Consider this next time you post: https://cs.stackexchange.com/help/how-to-ask
The cache is not optimal as a array since the code will have to look up the pointer to the value every time. I don't think you are earning any performance with all these lookups.
The cache should be a constant (if used at all).
Try using string.Repeat() with a single space instead of a for-loop and use a slice to get the length you want. Should be cleaner and safer code.
