Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Object matching is *slow*. #11

Closed
dead-claudia opened this issue Sep 21, 2016 · 12 comments
Closed

Object matching is *slow*. #11

dead-claudia opened this issue Sep 21, 2016 · 12 comments
Labels

Comments

@dead-claudia
Copy link
Owner

dead-claudia commented Sep 21, 2016

It's literally 10 times faster to check an equivalent Map than an object literal, and similarly (5-10x) for Set vs array literals. This should not be the case, especially when they are both prioritized.

@dead-claudia
Copy link
Owner Author

@zubuzon Sorry for getting it uploaded so late. I've been a bit busy myself.

@ghost
Copy link

ghost commented Sep 21, 2016

Its ok. And I agree that Map is faster. I figured that out in a childnode
diff algo I developed yesterday. Still working to find a good object match solution myself and will share it when I know the answer to it.

@isiahmeadows It depends on how much content inside the object, but a way faster method is this

function demo(a) {
switch(a.length) {
case 3: return a === 'foo' || a === 'bar';
case 5: return a === 'isiah';
}
}

This is around 40% faster then a lookup table etc. At least for V8.

On Sep 21, 2016 18:43, "Isiah Meadows" [email protected] wrote:

@zubuzon https://github.com/zubuzon Sorry for getting it uploaded so
late. I've been a bit busy myself.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#11 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AVG552eMfads5uzqLagCR3v9TRu2p8YNks5qsQppgaJpZM4KCpcH
.

@dead-claudia
Copy link
Owner Author

I'd love to do that, except it's pretty much impossible to use in any meaningful way beyond what I already have been, simply because I'm checking arbitrary keys on arbitrary objects. The objects' properties may not seem arbitrary, but the matcher algorithm has to account for any number of those non-arbitrary combinations of objects, so it'll seem arbitrary in practice.

I suspect what it comes down to is that V8 is setting up those objects for frequent reads and writes with near minimal insertions and deletions, so it's the optimized form (as opposed to the dictionary form) that's probably what's killing the performance (and why maps are so much faster to check despite requiring twice the code and more objects/memory).

@dead-claudia
Copy link
Owner Author

@zubuzon I fixed some busted tests, so it might be worth taking another look at the matcher algorithm.

@ghost
Copy link

ghost commented Sep 22, 2016

Im reading your source code now file by file so i will read it soon.

I started out with your package.json and you need to update your
dependencies. Your Bluebird version is old now.

On Sep 22, 2016 12:56, "Isiah Meadows" [email protected] wrote:

@zubuzon https://github.com/zubuzon I fixed some busted tests, so it
might be worth taking another look at the matcher algorithm.


You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub
#11 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AVG554pYDlI9EIy4lKM0b2VJRF2uQzm8ks5qsgqGgaJpZM4KCpcH
.

@ghost
Copy link

ghost commented Sep 22, 2016

Btw. How many languages are thallium written in? I see javascript, shell
script, coffeeScript and typescript.

I have no problem with it. Just wondering

On Sep 22, 2016 13:13, "Kenny Flashlight" [email protected] wrote:

Im reading your source code now file by file so i will read it soon.

I started out with your package.json and you need to update your
dependencies. Your Bluebird version is old now.

On Sep 22, 2016 12:56, "Isiah Meadows" [email protected] wrote:

@zubuzon https://github.com/zubuzon I fixed some busted tests, so it
might be worth taking another look at the matcher algorithm.


You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub
#11 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AVG554pYDlI9EIy4lKM0b2VJRF2uQzm8ks5qsgqGgaJpZM4KCpcH
.

@ghost
Copy link

ghost commented Sep 22, 2016

I suggest use Map() and polyfill it for legacy browsers if needed. IE9 is soon dead. So worry about IE10 and newer.

@ghost
Copy link

ghost commented Sep 22, 2016

@isiahmeadows Finally I got a chance to benchmark your improved matcher code. I used two different benchmarks, and for primitives you win !! 👍 But for objects, well

lodash x 73,502,883 ops/sec ±0.83% (90 runs sampled)
wirvo x 70,664,247 ops/sec ±0.29% (94 runs sampled)
thallium x 47,007,237 ops/sec ±0.62% (91 runs sampled)

Wirvo is my experimental code:)

In my other benchmark...

string literal              x 51,963,933 ops/sec ±1.34% (85 runs sampled)
array literal               x 401,612 ops/sec ±2.15% (85 runs sampled)
boolean literal             x 56,340,454 ops/sec ±0.75% (89 runs sampled)
object literal              x 2,517,687 ops/sec ±1.93% (87 runs sampled)
object from null            x 4,319,427 ops/sec ±1.07% (88 runs sampled)
regex literal               x 3,602,752 ops/sec ±0.41% (90 runs sampled)
number literal              x 56,516,447 ops/sec ±1.06% (89 runs sampled)
null                        x 43,424,897 ops/sec ±1.17% (91 runs sampled)
undefined                   x 44,001,595 ops/sec ±1.61% (84 runs sampled)
date                        x 3,107,621 ops/sec ±0.92% (91 runs sampled)
map                         x 1,243,875 ops/sec ±1.61% (88 runs sampled)
regex constructor           x 1,950,122 ops/sec ±0.79% (88 runs sampled)

Impressive numbers!!!

The only bad thing I see is this:

array literal               x 401,612 ops/sec ±2.15% (85 runs sampled)

In the first benchmark, the result was higher. And now lower? Seems to be something with that code.

So if you plan to add the missing features you need to be carefull :)

Conclusion on this - based on my own experience - is that this is very good results. I also benchmarked other deepEqual algos, and yours is so far the fastest now.

And ofc you can optimize this further, but is there any good reason for doing that?

@ghost
Copy link

ghost commented Sep 22, 2016

@isiahmeadows I ran a few more benchmarks, and there are a few bottlenecks. Your buffer code is very slow.

// yours
buffer (differing)          x 74,288 ops/sec ±1.73% (87 runs sampled)

// simple deepEqual algo
buffer (differing)          x 7,295,563 ops/sec ±2.70% (85 runs sampled)

And for normal usecase you fail on number and regEx differing. For numbers you fail on both of this:

1, 2 // differing
[ new Number(123), new Number(456) // differing

@ghost
Copy link

ghost commented Sep 23, 2016

@isiahmeadows I did a few more comparisons, and I don't think the main focus should be to improve performance on this algo. There are only a few places where your code is slower then my experimental code.

// yours
string literal              x 52,332,017 ops/sec ±2.24% (88 runs sampled)
boolean literal             x 58,686,660 ops/sec ±0.59% (90 runs sampled)
object from null            x 4,058,788 ops/sec ±1.44% (87 runs sampled)

// experiment
string literal              x 75,662,261 ops/sec ±1.42% (89 runs sampled)
boolean literal             x 68,715,825 ops/sec ±2.29% (64 runs sampled)
object from null            x 8,244,838 ops/sec ±0.84% (89 runs sampled)

@dead-claudia
Copy link
Owner Author

@zubuzon

I started out with your package.json and you need to update your dependencies. Your Bluebird version is old now.

Actually, it's fine. I set that as a minimum version. It'll technically work all the way to 4; I just set the minimum version so I knew I had the getNewLibraryCopy method.

And for normal usecase you fail on number and regEx differing.

First, it's usually an antipattern anyways to use the primitive object wrappers, so I find it easier to not support that. Second, regexps should work, assuming they have the same source (i.e. /w+/ and /w{1,}/ don't match, despite being equivalent, and it's hard to detect equivalence, anyways). Anything to the contrary is a bug.

I ran a few more benchmarks, and there are a few bottlenecks. Your buffer code is very slow.

That's because of all the boilerplate that was in the middle. My latest patch removed Symbol.iterator support as well as expando property support, which improved performance significantly across the board. It should compare a little better now.

I did a few more comparisons, and I don't think the main focus should be to improve performance on this algo. There are only a few places where your code is slower then my experimental code.

Yeah...my next push will likely be my last for a while. It should bring it pretty close.

@dead-claudia
Copy link
Owner Author

With this last patch, I'll consider this fixed for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant