intermittent corrupt output from FSTObjectOutput #106
Comments
Thanks for the report. Things like this can also happen due to incorrect multithreading/concurrency: you cannot write to an FSTObjectOutput from different threads. Do you have a reproducing test case? |
Our program runs on 1 thread only. |
I have developed a stand-alone reproducible case. :-) I'm not sure about my theory regarding SoftReference, but I know for sure that this program behaves unexpectedly. It's a stress-test of serializing/deserializing what I think should be the same object 100 times. For me, the deserializer will fail 0-10 times per run. Usually around 3 or 4 times. I'm surprised that the program isn't completely deterministic. Working theory is that GC is the non-deterministic element. To try it: and you should see a few failures. If you don't, run it a few more times. It's very intermittent. I'm hopeful that you are able to reproduce and take a look. Apologies in advance for any wrong theories here... I really appreciate your help. Your library is awesome! We are using it to really speed up loading JavaScripts in Rhino on Android. Our preprocessor fails sometimes, and this investigation Thank you, |
appreciate your help very much. Will have a look as soon as possible .. |
thank you! |
curious - were you able to unzip and run the example? did it reproduce the intermittent failure for you? |
haven't had time for open source work, stay tuned :) |
I've been trying to debug this, but haven't been successful. Would love to chat about debugging strategies, if you were free for a chat session somewhere. Thanks! |
Hm .. maybe just add a flag that disables SoftReference caching entirely, using a public static. The current implementation, afaik, contains a fallback that does an alloc, so only adding to the cache must be omitted. |
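A minimal sketch of the idea suggested above, assuming a pool of reusable buffers held through soft references. This is not FST's actual code: the class `SoftCacheSketch`, the flag `cachingEnabled`, and the methods `giveBack` and `take` are hypothetical names chosen for illustration. It only shows that when a fallback allocation already exists, skipping the "add to cache" step is enough to switch caching off.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch, not FST's implementation.
class SoftCacheSketch {
    // Hypothetical global switch: set to false to bypass the cache entirely.
    static volatile boolean cachingEnabled = true;

    private final Deque<SoftReference<byte[]>> pool = new ArrayDeque<>();

    // Hand a buffer back for reuse; does nothing when caching is disabled.
    void giveBack(byte[] buffer) {
        if (cachingEnabled) {
            pool.push(new SoftReference<>(buffer));
        }
    }

    // Reuse a cached buffer if one is still reachable and large enough,
    // otherwise fall back to a fresh allocation.
    byte[] take(int size) {
        while (!pool.isEmpty()) {
            byte[] cached = pool.pop().get(); // null if the GC cleared the reference
            if (cached != null && cached.length >= size) {
                return cached;
            }
        }
        return new byte[size]; // fallback allocation
    }
}
```

With `cachingEnabled` set to false, nothing is ever added to the pool, so `take` always allocates and the garbage collector can no longer influence which object is handed out.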
Still working to debug the issue I'm seeing with serialization. I no longer think it is related to SoftReference. My latest theory is that there's a bug in the resize method of FSTIdentity2IdMap. After hours of analysis of my bigger sample, here's a program that, for me, reliably reproduces a failure, just by using FSTIdentity2IdMap in isolation. Note that if I make the initial size big enough that it's never resized, then the failure doesn't happen, thus my suspicion of the resize method. I would be very grateful if you could take a look / help me understand/fix/update FSTIdentity2IdMap. Thanks!
|
More info: The above failure goes away with a modification to the FSTIdentity2IdMap.rePut method, so that it also works with a max-depth object that is using the linear list:
I am now investigating why I end up with a max depth FSTIdentity2IdMap faster than I might expect. It's still quite odd that in my real use case using the serializer, the failure is so intermittent... I've been able to prove that the Java System.identityHashCode results for specific instances cause problems, whereas others do not... trying to connect the dots from that to this... |
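To illustrate the kind of resize bug described above, here is a simplified, hypothetical sketch. It is not the FSTIdentity2IdMap source and not the modification posted in this comment: `OverflowMapSketch` is a single-level identity map with a bounded probe and a linear overflow list, whereas the real class is described in this thread as nesting maps up to a maximum depth. The point it shows is that a resize has to re-insert the overflow entries as well as the main table, otherwise keys stored in the list are lost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy map; duplicate keys are not handled.
class OverflowMapSketch {
    private Object[] keys;
    private int[] vals;
    private final List<Object> overflowKeys = new ArrayList<>();
    private final List<Integer> overflowVals = new ArrayList<>();
    private int size;

    OverflowMapSketch(int capacity) {
        keys = new Object[capacity];
        vals = new int[capacity];
    }

    void put(Object key, int val) {
        if (size * 2 >= keys.length) {
            resize(keys.length * 2);
        }
        insert(key, val);
        size++;
    }

    // Returns the stored value, or -1 if the key is unknown.
    int get(Object key) {
        int idx = index(key, keys.length);
        for (int p = 0; p < 3; p++) {
            int slot = (idx + p) % keys.length;
            if (keys[slot] == key) return vals[slot];
        }
        for (int i = 0; i < overflowKeys.size(); i++) {
            if (overflowKeys.get(i) == key) return overflowVals.get(i);
        }
        return -1;
    }

    // Try three consecutive slots, then spill into the linear list.
    private void insert(Object key, int val) {
        int idx = index(key, keys.length);
        for (int p = 0; p < 3; p++) {
            int slot = (idx + p) % keys.length;
            if (keys[slot] == null) {
                keys[slot] = key;
                vals[slot] = val;
                return;
            }
        }
        overflowKeys.add(key);
        overflowVals.add(val);
    }

    private void resize(int newCapacity) {
        Object[] oldKeys = keys;
        int[] oldVals = vals;
        List<Object> oldOverflowKeys = new ArrayList<>(overflowKeys);
        List<Integer> oldOverflowVals = new ArrayList<>(overflowVals);
        keys = new Object[newCapacity];
        vals = new int[newCapacity];
        overflowKeys.clear();
        overflowVals.clear();
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != null) insert(oldKeys[i], oldVals[i]);
        }
        // Omitting this loop would drop every entry that lived in the linear list.
        for (int i = 0; i < oldOverflowKeys.size(); i++) {
            insert(oldOverflowKeys.get(i), oldOverflowVals.get(i));
        }
    }

    private static int index(Object key, int length) {
        return (System.identityHashCode(key) & 0x7fffffff) % length;
    }
}
```

Because the slot depends on `System.identityHashCode`, whether a given key ends up in the overflow list varies from run to run, which would fit the intermittent failures reported here.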
Ok, I figured out why the max depth is intermittently reached more rapidly than intended. putOrGetHash uses calcIndexFromHash to get an index for the key array. If the slot is already used, it tries the next two, but if all 3 are used, it gives up and calls putOrGetNext, which increases the depth. If I track in putOrGetNext what percentage of the mKey array actually got used, I can see that it varies a lot: sometimes as low as 14%, quite often as low as 33%. So most of the allocated array is wasted, and it gets to the max depth faster (depending on how unlucky the hash codes get), which can then lead to the resize bug (in my previous comment) where data in the linear lists isn't copied forward.
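The effect described above can be approximated with a small simulation. This is a hedged sketch, not the FST code: `BoundedProbeSketch` and `fillAtFirstOverflow` are hypothetical names, the hashes are plain pseudo-random numbers rather than identity hash codes, and the sketch stops at the first key that cannot be placed, which is a simplification of the nesting behaviour described in the comment.

```java
import java.util.Random;

// Hypothetical simulation of a table that probes at most 3 consecutive slots.
class BoundedProbeSketch {
    // Inserts pseudo-random hashes until one key finds all 3 candidate slots
    // occupied, then returns the fraction of the table that was filled.
    static double fillAtFirstOverflow(int tableSize, long seed) {
        boolean[] used = new boolean[tableSize];
        Random rnd = new Random(seed);
        int filled = 0;
        while (true) {
            int idx = rnd.nextInt(tableSize);
            boolean placed = false;
            for (int probe = 0; probe < 3; probe++) {
                int slot = (idx + probe) % tableSize;
                if (!used[slot]) {
                    used[slot] = true;
                    filled++;
                    placed = true;
                    break;
                }
            }
            if (!placed) {
                return (double) filled / tableSize;
            }
        }
    }

    public static void main(String[] args) {
        for (long seed = 0; seed < 5; seed++) {
            System.out.printf("seed %d: fill %.2f%n", seed, fillAtFirstOverflow(1024, seed));
        }
    }
}
```

Running it with several seeds gives a rough feel for how much the fill level at the first spill can vary with the hash values alone.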
Is HashMap really that much slower? Here's a drop-in replacement class, FSTIdentity2IdMapSimple:
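The class posted with this comment is not reproduced in this copy of the thread. As a rough illustration of the general approach only, here is a minimal sketch built on `java.util.IdentityHashMap`. The name `IdentityIdMapSketch`, its method signatures, and the convention of returning -1 for "not present" are assumptions made for this example, not the API of FSTIdentity2IdMap or of the original replacement class.

```java
import java.util.IdentityHashMap;

// Hypothetical sketch: maps object identities to int ids using the JDK map.
class IdentityIdMapSketch {
    private final IdentityHashMap<Object, Integer> ids = new IdentityHashMap<>();

    // Returns the id already stored for key, or stores newId and returns -1.
    int putOrGet(Object key, int newId) {
        Integer existing = ids.get(key);
        if (existing != null) {
            return existing;
        }
        ids.put(key, newId);
        return -1;
    }

    // Returns the id stored for key, or -1 if the key is unknown.
    int get(Object key) {
        Integer existing = ids.get(key);
        return existing == null ? -1 : existing;
    }

    void clear() {
        ids.clear();
    }

    int size() {
        return ids.size();
    }
}
```

The trade-off is the boxing of each id into an `Integer`, in exchange for well-tested resize behaviour.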
|
So, in short, I believe FSTIdentity2IdMap has a bug and an unintended behavior. Your library is really great / fast, has been huge plus for our project. I really appreciate your work on it, and I'm hopeful my effort here leads to improvements. Please review and let me know if you have some "official fixes". Thanks! |
Thanks for your valuable help. Benchmarks have shown (some time ago), that Regarding memory waste: as an object requires >16 bytes of object header, a As each dereferencing has a risk of triggering a cache miss, a lookup to a Summary: The design of j.u.HashMap is one of the main causes for GC Anyway I'll try to fix and release your observation this weekend if 2016-02-12 8:33 GMT+01:00 bradedelman [email protected]:
|
Great find+fix ! |
Checked the length of the lists and none of them exceeds 20 elements, so it seems hash keys still distribute well. Will release this with 2.44. Many thanks for your patient investigation and the solution. Basically the linlist was just a quick fix some years ago and I obviously forgot to adapt the resize/reput for it :). |
Unfortunately several test cases (e.g. basicfsttest) fail with your changes. The reason is that for multidimensional arrays the original position determination was correct. I changed it in a way that hopefully makes both cases work now. Will release with 2.45. |
Sorry about that. Thanks for the fix to the fix. Are there instructions on how to run the tests? Sorry, I didn't pick up on that. |
I'm seeing very intermittent issues with FSTObjectOutput (when iterating over and serializing hundreds of objects). I kept thinking: shouldn't it be completely deterministic? In the course of trying to debug it, I stumbled upon the use of SoftReference in FSTConfiguration. So then it occurred to me: I wonder if the intermittent nature stems from the behavior of the garbage collector?
So, I experimented with 2 things:
I noticed that you had this comment:
/**
* reuse heavy weight objects. If a FSTStream is closed, objects are returned and can be reused by new stream instances.
* the objects are held in soft references, so there should be no memory issues. FIXME: point of contention !
* @param cached
*/
Which suggests (perhaps) some uncertainty about the use of SoftReference?
I don't know your code well enough to decide whether it's safe for me to use my modification (this is a utility that runs and exits, so I'm not concerned about leaking memory, but I am very concerned about reliability).
Have you seen problems with this? My test case can't be shared directly, but I've spent enough time on this problem that I could work towards developing a standalone example. But before I spend more time on it, I thought it was worth asking you about it.
Thanks,
Brad