fix: #39 - Improve constructor performance #45

realharry · 2018-03-05T02:29:59Z

REF: #39

Updated the isCID() implementation to use a (private) intermediate method to validate CID without relying on the error being thrown during validation.

The validateCID() method also uses this intermediate method, and its behavior has been preserved (e.g., with regards to throwing an error for invalid CIDs).

Updated the isCID() implementation to use a (private) intermediate method to validate CID without relying on the error being thrown during validation. The validateCID() method also uses this intermediate method, and its behavior has been preserved (e.g., with regards to throwing an error for invalid CIDs).

daviddias · 2018-03-05T08:49:42Z

package.json

@@ -1,6 +1,6 @@
 {
  "name": "cids",
-  "version": "0.5.2",
+  "version": "0.5.3",


@realharry version gets updated on publish. Please revert this change.

daviddias · 2018-03-05T08:49:59Z

package.json

@@ -43,6 +43,10 @@
    "aegir": "^12.0.8",
    "chai": "^4.1.2",
    "dirty-chai": "^2.0.1",
+    "karma-mocha": "^1.3.0",
+    "karma-mocha-webworker": "^1.3.0",


This gets installed with aegir. Did it not work for you?

@diasdavid It was working on my machine, but the CI builds were failing (especially, Jenkins) as shown in my checkin history. I added these to make the CI builds work. I'll revert it back.

daviddias · 2018-03-05T08:50:06Z

package.json

@@ -43,6 +43,10 @@
    "aegir": "^12.0.8",
    "chai": "^4.1.2",
    "dirty-chai": "^2.0.1",
+    "karma-mocha": "^1.3.0",
+    "karma-mocha-webworker": "^1.3.0",
+    "mocha": "^5.0.1",


Same as above

daviddias · 2018-03-05T08:50:23Z

package.json

+    "karma-mocha": "^1.3.0",
+    "karma-mocha-webworker": "^1.3.0",
+    "mocha": "^5.0.1",
+    "multihashing": "^0.3.2",


We only use multihashing-async to use WebCrypto in the browser.

I was "experimenting", and likely I forgot to remove this package after (accidentally) adding it. I'll remove it.

Oh, I must have misunderstood your comment. I'll remove multihashing-async, but I will still need multihashing in dev-dependencies. I use multihashing in my test case cidperf-a.js.

dignifiedquire · 2018-03-05T11:48:51Z

src/index.js

@@ -219,12 +219,11 @@ class CID {
   */
  static isCID (other) {
    try {


why is this try block still here? shouldn't this be just return !CID._checkCIDComponents(other) now?

@dignifiedquire If you look at the original implementation of validateCID(), it relies on the behavior of mh.validate(other.multihash), which is supposed to throw an error for invalid multihash. That behavior is preserved in the updated validateCID(). And, in order to be able to use/share a common implementation between isCID() and validateCID(), I had to preserve that behavior in _checkCIDComponents() as well. Therefore isCID() still needs to catch the error to return false/true in case an error is thrown (e.g., from mh.validate(other.multihash)). Contractually (in terms of interface), isCID() never throws an error since we now have an "outer" try/catch containing its implementation (even through future updates). I hope my answer makes sense.

Ah I forgot that. This is unfortunate, mh.validate should at least have a counter part that doesn't throw.

Yes. but, as I indicated in my comment on #39, the try/catch itself is not an issue. It's the cost of creating an error object in the heap that takes time. Depending on the use cases, if we rarely hit mh.validate part in the "if chaining", the cost of having that is not much. The question is, out of 100 times or a million times calling isCID() in a typical use case (e.g., are CIDs generally expected to be valid in most cases?, etc.), how many times we end up creating the error object? Having said that, I do agree with you. The code will be cleaner if we don't have to include try/catch. Thanks.

realharry · 2018-03-05T18:16:35Z

I removed the karma/mocha dev-dependencies. Getting the same build error as before. @diasdavid Any suggestions?

realharry · 2018-03-05T19:32:31Z

Sorry for the mixup. All CI builds pass at this point. Thanks.

daviddias · 2018-03-06T10:40:35Z

src/index.js

    }

    if (!Buffer.isBuffer(other.multihash)) {
-      throw new Error('multihash must be a Buffer')
+      return 'multihash must be a Buffer'


These should be thrown errors.

daviddias · 2018-03-06T10:40:57Z

src/index.js

+   * @param {any} other
+   * @returns {string}
+   */
+  static _checkCIDComponents (other) {


This function can be extracted to a separate file given it is an utility.

I can create a file cid-util.js or util.js or util/cid-util.js. Any preferences?

daviddias · 2018-03-06T10:42:04Z

test/profiling/alt-impl/cid1.js

+
+CID1.codecs = codecs
+
+module.exports = CID1


Can we call it CID_A and CID_B given that we already use 0 and 1 for version numbers? Pretty cool to include profiling tests!

Thinking more about this. Instead of duplicating the code. This could just be installing the CID version on npm.

Doing it this way will require creating a new npm project folder (with package.json) for testing/profiling (I think). This only works for the "reference" impl. We may still need a separate/duplicated CID for comparison. (See the comment below.)

daviddias · 2018-03-06T10:43:01Z

test/profiling/alt-impl/cid2.js

+
+CID2.codecs = codecs
+
+module.exports = CID2


Why duplicate instead of using the version in src?

@diasdavid Thanks for pointing that out. I wasn't really happy with this solution, but I couldn't come up with a better one. The issue was: (1) I wanted to use, as a reference, a "snapshot" of the CID implementation as of, say, Sunday. The CID implementation will (likely) change over time, and using the "current" version (in src) will not make sense for this particular test (say, 2 months from now). (2) I wanted to try a different "alternative" implementations for comparison. And, putting alternative (trial) implementations in the prod source code (e.g., as an extra method, etc.) would not have made sense. (See the comment above.)

That's why I ended up duplicating the code for both. Obviously, there is something "funny" about doing it this way. I'm not sure what would be the best way though. Any suggestions?

Having thought about it a bit more, I think your suggestion, if I understand correctly, seems to make more sense. I now have to reverse reference and alternative (in my mind):

Reference: CID version X (in NPM)

Alternative: CID in src (with alternative/trial implementation)

(When I started out, the CID in src was the reference, and "my version" was the alternative.)

BTW, if we do this, then we cannot have just one test runner (which has an advantage of being simple). We'll need two separate js files/runners, I think. (Correct me if I'm wrong.) And, run two test cases separately to compare the results, which might be an overkill for this particular task. Please let me know what you would prefer, of if you have any alternative suggestions.

You can npm install a locked version on the profiling test folder (therefore one version) and require the other from the src folder (and you get the other version to compare). Makes sense?

(1) Put checkCIDComponents() function in a separate file. (2) Added a test spec for the new module. (3) Simplify the perf test runner. (Now it tests one CID implementation at a time.)

realharry · 2018-03-06T19:34:35Z

Updated per feedback:
(1) Created a new cid-util.js and its corresponding test spec.
(2) Simplified the test runner so that it now takes only one CID implementation. Will need to run the test multiple times for each alternative implementation to compare.

realharry · 2018-03-06T19:48:15Z

@diasdavid Jenkins build fails on Mac. Is there anything I can do to fix this?

daviddias · 2018-03-06T19:47:56Z

src/cid-util.js

+   */
+  checkCIDComponents: function (other) {
+    if (other == null) {
+      return 'null values are not valid CIDs'


This should also be a throw Error

"not anymore, since throwing errors makes everything slower.

daviddias · 2018-03-06T19:48:07Z

src/cid-util.js

+    }
+
+    if (!(other.version === 0 || other.version === 1)) {
+      return 'Invalid version, must be a number equal to 1 or 0'


plus this one

same as https://github.com/ipld/js-cid/pull/45/files#r173727533

daviddias · 2018-03-06T19:48:13Z

src/cid-util.js

+    }
+
+    if (typeof other.codec !== 'string') {
+      return 'codec must be string'


plus this one

same as https://github.com/ipld/js-cid/pull/45/files#r173727533

daviddias · 2018-03-06T19:48:19Z

src/cid-util.js

+    }
+
+    if (!Buffer.isBuffer(other.multihash)) {
+      throw new Error('multihash must be a Buffer')


Just like this one :)

daviddias · 2018-03-06T19:49:16Z

src/cid-util.js

+  /**
+   * Test if the given input is a valid CID object.
+   * Returns an error message if it is not, or
+   * Throws an error its multihash is invalid.


Let's simplify it and just make it throw an error if it is doesn't have valid CID components, the error message will tell which component is invalid.

@diasdavid I'm a little bit confused.

The new implementation of validateCID() already does that. And, this behavior has been preserved from the existing implementation. The purpose of checkCIDComponents() was to use the same logic in isCID() without relying on the error being thrown. The idea was that isCID() was slow because of the error objects being created on the heap (which obviously we need to verify in the long run). If we change checkCIDComponents() to throw errors for all these conditions, we are effectively going back to the old implementation where isCID() directly uses validateCID(). The purpose of checkCIDComponents() was to avoid using error objects in CID validation in the constructor (via isCID()) without having to duplicate the code.

Please advise.

Oh, I see.

If that is the case. Let's wrap the mh.validate on a tryCatch block inside the checkCIDComponents itself and only make that function return strings or undefined (should enable V8 to optimize it even more).

OK. I'll check in the updated changes shortly.

daviddias

LGTM :)

realharry added 3 commits March 4, 2018 18:17

Added dev dependency

7eaaa02

Updated dev dependencies

fd65693

realharry mentioned this pull request Mar 5, 2018

Improve constructor performance #39

Closed

daviddias suggested changes Mar 5, 2018

View reviewed changes

dignifiedquire reviewed Mar 5, 2018

View reviewed changes

Updated based on the feedback

4d423a7

Fixed checkin error

bdf3404

daviddias suggested changes Mar 6, 2018

View reviewed changes

Updated per review/feedback

c2cb1c7

(1) Put checkCIDComponents() function in a separate file. (2) Added a test spec for the new module. (3) Simplify the perf test runner. (Now it tests one CID implementation at a time.)

daviddias suggested changes Mar 6, 2018

View reviewed changes

checkCIDComponents() method now does not throw error.

0f11f0c

daviddias requested a review from vmx March 7, 2018 08:07

daviddias approved these changes Mar 7, 2018

View reviewed changes

daviddias approved these changes Mar 12, 2018

View reviewed changes

daviddias merged commit dc0bfd3 into multiformats:master Mar 12, 2018


		CID1.codecs = codecs

		module.exports = CID1


		CID2.codecs = codecs

		module.exports = CID2

fix: #39 - Improve constructor performance #45

fix: #39 - Improve constructor performance #45

Conversation

realharry commented Mar 5, 2018 • edited by daviddias Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dignifiedquire Mar 5, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

realharry commented Mar 5, 2018

realharry commented Mar 5, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

realharry Mar 6, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

realharry Mar 6, 2018 • edited Loading

Choose a reason for hiding this comment

realharry Mar 6, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

realharry commented Mar 6, 2018 • edited Loading

realharry commented Mar 6, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

realharry Mar 6, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

daviddias left a comment

Choose a reason for hiding this comment

realharry commented Mar 5, 2018 •

edited by daviddias

Loading

dignifiedquire Mar 5, 2018 •

edited

Loading

realharry Mar 6, 2018 •

edited

Loading

realharry Mar 6, 2018 •

edited

Loading

realharry Mar 6, 2018 •

edited

Loading

realharry commented Mar 6, 2018 •

edited

Loading

realharry Mar 6, 2018 •

edited

Loading