Improving performance for `Copy` keys and values #12

dicej · 2018-04-23T22:16:45Z

I wrote a few simple benchmarks to determine what, if anything, might be gained by storing keys and values directly in OrdMap rather than wrapping them in Arcs:

https://github.com/dicej/ordmap_performance/blob/master/src/lib.rs

(note that I had to hack the im-rs lib.rs to make the nodes module public in order to make the above work)

Here are the results on my machine:

test add_and_remove         ... bench:   2,394,205 ns/iter (+/- 78,129)
test add_and_remove_mut     ... bench:     449,903 ns/iter (+/- 54,218)
test add_and_remove_raw     ... bench:   1,259,173 ns/iter (+/- 64,433)
test add_and_remove_raw_mut ... bench:     257,425 ns/iter (+/- 16,412)

Considering that performance is roughly doubled by removing the Arcs, would it be appropriate to modify the APIs so that the Arc wrappers are optional?

The text was updated successfully, but these errors were encountered:

bodil · 2018-04-27T13:33:20Z

If there were a good way to make the Arcs optional... it should be a lot easier with associated type constructors, but without them I've not managed to find a satisfactory solution yet that doesn't mean reimplementing the public API for each variant, and until then I'd prefer to optimise for ease of use over performance. Not that I'm necessarily convinced that Arc is the better default, but that's my working assumption right now. Research is ongoing, ideas welcome etc.

tov · 2018-05-23T19:38:23Z

I'd like to think about this too. Could you give an example of the ergonomics provided by the Arcs around elements/keys/values?

I'm also curious as to why accessors such as Conslist::head return an Option<Arc<A>> rather than Option<&Arc<A>>. Is that for similar reasons?

Thanks, and sorry if I'm butting in where I don't belong.

bodil · 2018-05-24T12:39:31Z

Some might be for historical reasons, I distinctly recall moments in the early days when I just couldn't figure out how to return a reference. Right now it probably ought to be returning &Arc<A> at a minimum, not Arc<A>, but I have a feeling that just returning &A is sufficient in all practical cases. The only reason you'd need the Arc is if you really need to clone the reference, and that might be better served with its own method/iterator.

tov · 2018-05-24T12:47:02Z

Thanks for explaining. I agree it's unlikely to need to clone the reference. I figured &Arc<A> is the best compromise between Arc<A> (which bumps the refcount) and &A (which lets the client do fewer useful things). (But then I still don't understand the purpose of the element Arcs in the first place. Is the assumption just that clients will most often want the elements of structures like this in Arcs, so that smooths out the common case?)

I was thinking that it would also be possible to get a reference to the tail without bumping a refcount, at the cost of a slight increase in overall complexity. But it would make it easy to do arbitrary manual traversals without any refcount changes, which is nice. I can write it up if you like.

bodil · 2018-05-24T12:51:39Z

The Arcs are there because the assumption is that you're storing objects which aren't efficiently copyable. This too may be less true in practice.

I'd love to see your take on it, but it may need to be merged as part of a bigger overhaul...

tov · 2018-05-24T17:05:37Z

I see. And is the assumption also that you need to copy elements for some of the operations? I wonder what would happen if you just placed a Clone constraint on the element type for those operations. For example, for ConsList, you could get rid of the automatic Arcs, but then ConsList::reverse and ConsList::append would require that the element type be Clone. That means that Arcs would still get cloned properly, but Copy or cheap-to-clone elements wouldn't have to be wrapped in Arcs. I'm not sure how well this strategy applies to the other data structures.

I attempted to work up my take, and it didn't work as expected, but I've created an issue detailing what I did manage to do.

bodil · 2018-05-30T11:12:00Z

Update on this: I'm currently rewriting Vector as a chunked RRB tree, and I'm doing it without Arcs, just a Clone constraint on the value type. Speedup is at roughly 25% now, which isn't as impressive as I'd hoped, but still clearly an improvement, and, more importantly, I think the API is coming out a lot cleaner without the Arc wrapped values everywhere. Unless I run into something unexpected before I'm done, I think I'll go back and do the same for the other data structures before the next release.

tov · 2018-05-30T17:39:49Z

This is good to hear—I think the Clone constraint instead of element Arcs is an improvement. I tried that for the current Vector on a branch and was pleased with the results. Yes, the speed-up is only modest, but the API seems cleaner. (It's sort of like, say Mutex, where 80% of the time beginners might want to wrap it in a Arc, but the library doesn't wrap it automatically, in order to support more advanced usages. I don't think anyone is going to be using a Mutex without understanding Arcs, so including the Arc in the Mutex wouldn't really increase usability. I think the situation is similar here—one is unlikely to be using these data structures without enough understanding of Arcs to insert ones own.)

There’s an additional thing I investigated that might interest you. For methods that insert an element, I had them take R where R: Into<A> rather than just A. This works to add Arcs automatically where necessary, as your Shared trait did before. The problem is that it makes inferring the type of a new Vector nearly impossible. Even the vector! macro had to be revised (obfuscated a bit, really), in order to figure out the element type. So you probably don't want to do things that way, but it's worth thinking about.

bodil · 2018-05-31T11:59:26Z

@tov I was using Into<A> previously, in fact - Shared comes from PR #2 which addressed a change in the standard library which broke type inference in a lot of cases. I'm hoping to get rid of the need for Shared in general, but as it is it makes for a lot nicer code than with Into and explicit type signatures everywhere.

Also, time to get rid of `ConsList`. Closes #28, #21, #12.

bodil mentioned this issue Apr 27, 2018

Cache locality (comparison with Immer) #11

Open

Eh2406 mentioned this issue Jun 6, 2018

cargo could use a persistent hash map in it's resolver #26

Closed

bodil added a commit that referenced this issue Jun 21, 2018

Un-Arc everything.

b0099f1

Also, time to get rid of `ConsList`. Closes #28, #21, #12.

bodil closed this as completed Jun 21, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improving performance for `Copy` keys and values #12

Improving performance for `Copy` keys and values #12

dicej commented Apr 23, 2018

bodil commented Apr 27, 2018

tov commented May 23, 2018

bodil commented May 24, 2018

tov commented May 24, 2018

bodil commented May 24, 2018

tov commented May 24, 2018

bodil commented May 30, 2018

tov commented May 30, 2018

bodil commented May 31, 2018

Improving performance for Copy keys and values #12

Improving performance for Copy keys and values #12

Comments

dicej commented Apr 23, 2018

bodil commented Apr 27, 2018

tov commented May 23, 2018

bodil commented May 24, 2018

tov commented May 24, 2018

bodil commented May 24, 2018

tov commented May 24, 2018

bodil commented May 30, 2018

tov commented May 30, 2018

bodil commented May 31, 2018

Improving performance for `Copy` keys and values #12

Improving performance for `Copy` keys and values #12