Inmemory store does not support values more than 127 bytes in size for properties with Cardinality.SET (and doesn't support key sizes > 127 bytes) #2273
Comments
I can reproduce it on 0.5.2.
@dk-github Do you know why this might happen?
We routinely store strings hundreds of MBs in size, so this looks like a "special" case that wasn't covered by any of the tests or use cases so far and so went unnoticed.
I had a quick look, and the above code works fine if we remove the following: cardinality(Cardinality.SET)
Also, the comment above the failing precondition line says that it assumes the length of the key (not the value) in an Entry is always less than 127, and uses that assumption to reduce overhead (1 byte to store the position where the value starts, instead of 4 bytes).
So my hypothesis (not based on an actual code dive so far) is that JG normally stores a property as a value under a key equal to the property name, but in the case of Cardinality.SET it switches to a structure which effectively stores values as keys (and thus achieves uniqueness).
Summary (NOTE that it is actually premature, because it would be great to verify it by diving into the code; I will try to do this later next week and update this post if it is not correct):
The options I can see here are:
b) invest some time in an "adaptive" storage strategy, so that in 99.99% of cases it uses 1-byte indexes and switches to 4-byte ones only if required, on a per-page or per-entry basis. For example, use the sign bit in the first byte to signify whether it is a 1-byte or a 4-byte length.
c) check whether there is a practical limit to the key size in other backends, and if so, decide whether we should document these limits as limitations of Cardinality.SET or look for a more involved but less limited implementation of SET on the frontend side.
I will try to set aside some time to verify the hypothesis and to look at how difficult option b would be. In the meantime, any thoughts are welcome.
…page storage, to allow for Entry keys of arbitrary length, while not incurring unnecessary overhead (as 99% of keys are typically quite short)
…page storage, to allow for Entry keys of arbitrary length, while not incurring unnecessary overhead (as 99% of keys are typically quite short) Signed-off-by: Dmitry Kovalev <[email protected]>
I opened PR #2294, which implements option b): it uses 1 byte to store the key length unless the length exceeds 127, in which case it switches to a full 4-byte integer and stores it negated, so that by reading the first byte and checking whether it is negative, it can detect whether it needs to read 3 more. It relies on big-endian byte order being the cross-platform default in java.nio. I haven't made any performance or memory footprint comparisons yet, but I expect the impact on both to be negligible. Please review.
I have also converted the repro code above to Java. I'm not sure whether we want to make it into a test specifically for storing big values with Cardinality.SET, or where to put that test if we do, so I'm just posting it here for now:
|
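To illustrate the variable-length key-length encoding described in the PR comment above, here is a minimal Java sketch. It is neither the repro test mentioned there nor the code from #2294: the class name `KeyLengthCodec` and its methods are hypothetical, and the sketch assumes only what the comment states (1 byte for lengths up to 127, otherwise the negated length as a 4-byte integer, with `java.nio.ByteBuffer` in its default big-endian order).

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the actual JanusGraph implementation.
final class KeyLengthCodec {

    // Writes a key length using 1 byte when it fits in 0..127,
    // otherwise writes the negated length as a 4-byte int.
    static void writeLength(ByteBuffer buf, int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        if (length <= Byte.MAX_VALUE) {
            buf.put((byte) length);
        } else {
            // A negative int has its sign bit in the most significant byte.
            // ByteBuffer is big-endian by default, so that byte is written first.
            buf.putInt(-length);
        }
    }

    // Reads a key length written by writeLength and advances the buffer
    // by 1 or 4 bytes accordingly.
    static int readLength(ByteBuffer buf) {
        int pos = buf.position();
        byte first = buf.get(pos); // absolute read, does not move the position
        if (first >= 0) {
            buf.position(pos + 1); // short form: the byte is the length itself
            return first;
        }
        return -buf.getInt(); // long form: re-read all 4 bytes and undo the negation
    }
}
```

With this layout, a length of 127 occupies one byte and a length of 128 occupies four, so short keys keep the 1-byte overhead while longer keys (such as large values stored as keys under Cardinality.SET, per the hypothesis above) are still representable.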
It would be nice if you could put it in JanusGraphTest.java.
…ge, to allow for Entry keys of arbitrary length, while not incurring unnecessary overhead (as 99% of keys are typically quite short) (#2294) Signed-off-by: Dmitry Kovalev <[email protected]>
Fixed in #2294
For confirmed bugs, please report:
With the berkeleyje storage backend, this works as expected.
But with the inmemory store, it fails on commit
with the exception