<format>: Optimize container insertion #1894
Should we also add a specialization for |
@AdamBucior we could, but that's essentially the same code in triple. I'm thinking it might be better to completely get rid of specializations and use a traits-like class for appending the buffer to the iterator, so we'd have much less duplication. |
I pushed a commit which enables it for any container with This matters to me because I'd like to keep using a stack-allocated memory buffer that grows into the heap like |
I think there may be a Clang bug with CTAD here - all of the failures seem to be with clang-cl and the code appears to be well-formed, but Clang doesn't think that As an aside, I recommend not defining |
I fixed the clang bug with a temporary workaround |
While parties have yet to agree with the most precise approach, just want to share some numbers. This is the performance with the current stable And this is with this PR (f2d6854): As you can see, while the difference for One interesting observation though: |
It's the main reason, I believe. Anyway, these changes are real performance improvements and I hope they get merged one way or another. Number formatting is something that exists in one form or another in almost every app on this planet. I think it's worth spending some time to improve its fundamental properties for long-term use. |
It's been a while since last activity on the subject. I really wouldn't be so worried if it wasn't about the upcoming horror of ABI freezing 😝 |
Fortunately, there's still plenty of time to get this into 17.0. Maintainer activity has indeed been slow recently - as you guessed, partly due to deferred internal work that we need to catch up on (apparently there are other MSVC libraries to maintain - although I can't imagine why anyone would ever need more than the STL 😹). There's also necessary STL work beyond code reviews (e.g. I spent the past several days updating the CI toolset and profiling/bug-reporting for modules). Also, maintainers including myself have been out due to vacations and vaccinations. Making progress on the PR backlog is now one of our priorities, so the activity should pick up again. 📈 |
@StephanTLavavej |
Looks great, I'll validate and push changes for test nitpicks.
Thanks for significantly improving |
This nukes the partial specialization of `_Fmt_iterator_buffer` for containers and instead uses a traits-like class for appending in the generic `_Fmt_iterator_buffer`. This traits-like class is then partially specialized for any container which has `append` or `insert`, preferring to pick `append` if possible.

There are a few benefits to this:
- `format_to` with a brand new string, our performance doesn't suffer drastically from it
- `basic_string`, since the generic version of `_Fmt_iterator_buffer` suffers from the same issue as above.

One drawback, however, is that if the string's capacity is big enough to fit the output, we'll have to do copies that could be avoided.
It brings us to essentially the same performance as fmtlib, according to @jovibor's benchmarking code:
![image](https://user-images.githubusercontent.com/6440374/116803601-38150100-aae7-11eb-8a79-cd18b41d6e3f.png)
And I'd wager potentially even faster with long strings, since fmt's implementation falls back to `push_back` character per character once `fmt::memory_buffer` runs out.
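
For readers unfamiliar with the pattern, the traits-like appending described above can be sketched roughly as follows. This is a minimal illustration, not the STL's internal code: every name here (`has_append`, `has_insert`, `get_container`, `append_traits`, `append_range`) is hypothetical, and it is written in C++17-compatible SFINAE rather than whatever the PR actually uses. The primary template copies character by character through the output iterator; the partial specialization kicks in for a `std::back_insert_iterator` over a container that offers `append` or `insert`, and prefers `append`.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical detector: does C support c.append(first, last)?
template <class C, class = void>
struct has_append : std::false_type {};
template <class C>
struct has_append<C, std::void_t<decltype(std::declval<C&>().append(
                         std::declval<const char*>(), std::declval<const char*>()))>>
    : std::true_type {};

// Hypothetical detector: does C support c.insert(c.end(), first, last)?
template <class C, class = void>
struct has_insert : std::false_type {};
template <class C>
struct has_insert<C, std::void_t<decltype(std::declval<C&>().insert(
                         std::declval<C&>().end(), std::declval<const char*>(),
                         std::declval<const char*>()))>> : std::true_type {};

// back_insert_iterator keeps its container pointer in a protected member;
// a derived local class can expose it.
template <class C>
C& get_container(std::back_insert_iterator<C> it) {
    using base = std::back_insert_iterator<C>;
    struct accessor : base {
        explicit accessor(base b) : base(b) {}
        using base::container;
    };
    return *accessor(it).container;
}

// Primary template: generic output iterator, one element at a time.
template <class OutIt, class = void>
struct append_traits {
    static OutIt append(OutIt out, const char* first, const char* last) {
        for (; first != last; ++first) {
            *out++ = *first;
        }
        return out;
    }
};

// Partial specialization: bulk-append into the underlying container,
// preferring append over insert when both are available.
template <class C>
struct append_traits<std::back_insert_iterator<C>,
                     std::enable_if_t<has_append<C>::value || has_insert<C>::value>> {
    static std::back_insert_iterator<C> append(std::back_insert_iterator<C> out,
                                               const char* first, const char* last) {
        C& c = get_container(out);
        if constexpr (has_append<C>::value) {
            c.append(first, last);
        } else {
            c.insert(c.end(), first, last);
        }
        return out;
    }
};

// Convenience wrapper that dispatches through the traits class.
template <class OutIt>
OutIt append_range(OutIt out, const char* first, const char* last) {
    return append_traits<OutIt>::append(out, first, last);
}

static_assert(has_append<std::string>::value, "string has append");
static_assert(!has_append<std::vector<char>>::value, "vector has no append");
static_assert(has_insert<std::vector<char>>::value, "vector has insert");
```

With this sketch, `append_range(std::back_inserter(str), buf, buf + n)` ends up as a single `str.append(buf, buf + n)`, a `std::vector<char>` target goes through `insert`, and any other output iterator falls back to the element-wise loop.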