help with solution for big properties in an object space #76
Comments
It looks like the best thing to do for the TS properties in this store is to remove duplicate entries, if usage allows that.

Taking _freeFloatShares as an example, it has 220 million NAs in it. The NAs take no space themselves, but they require 880MB to be encoded in the Partition Map. (Including the double values as well, the property takes up 950MB, which is huge.) _sharesOut is similar but a little smaller. gicIndustry has no NAs, but it has 166 million pointers to a store that has just 300 instances. There is no way the instances in CLIENTSecurity are changing gicIndustry that often! Removing duplicates would help tremendously here. Finally, held is just TRUEs and FALSEs, which take up no room but need 670MB to map them to their dates.

Once the duplicates are removed, it would make sense to modify the update methods to check whether the value asOf the date is different from what the update is poised to store at that date. If it is the same, don't store it.

I predict removing the duplicates from these four properties will save over 3GB, probably more.
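To make the two suggestions above concrete, here is a minimal sketch in Python rather than Vision. The `TimeSeries` class, its `as_of`, `update`, and `remove_duplicates` methods, and the ISO date strings are all hypothetical stand-ins, not part of the Vision API. It only illustrates the idea: drop points that repeat the preceding value, and have updates skip a store when the value as of that date is already the same.

```python
from bisect import bisect_right


class TimeSeries:
    """Toy date-indexed series; a hypothetical stand-in for a Vision TS property."""

    def __init__(self):
        self.dates = []   # sorted ISO date strings
        self.values = []  # value effective from the matching date onward

    def as_of(self, date):
        """Return the value in effect on `date`, or None (NA) if there is none."""
        i = bisect_right(self.dates, date)
        return self.values[i - 1] if i else None

    def update(self, date, value):
        """Store `value` at `date` only if it differs from the value as of `date`.

        Returns True if a point was written, False if the update was skipped.
        """
        if self.as_of(date) == value:
            return False  # same as what is already in effect: don't store it
        i = bisect_right(self.dates, date)
        if i and self.dates[i - 1] == date:
            self.values[i - 1] = value  # overwrite the existing point
        else:
            self.dates.insert(i, date)
            self.values.insert(i, value)
        return True

    def remove_duplicates(self):
        """Drop points that repeat the preceding point's value; return the count removed."""
        kept_dates, kept_values = [], []
        for d, v in zip(self.dates, self.values):
            if kept_values and kept_values[-1] == v:
                continue  # redundant: the earlier point already covers this date
            kept_dates.append(d)
            kept_values.append(v)
        removed = len(self.dates) - len(kept_dates)
        self.dates, self.values = kept_dates, kept_values
        return removed
```

Note that the guard in `update` only compares against the value in effect at that date. A backdated insert can still leave a later point redundant, so an occasional `remove_duplicates` pass would still be useful in this sketch.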
Managed to run some of these duplicate removals without any REP. Attaching the log of that effort.

There is some fear, uncertainty, and doubt ("FUD") as to whether this is a safe action. In times past, the number of deletes on these series caused some fatal problems that required the vdb to be rolled back mid-production. For this client, that would set them back more than a business day of usage. We would like to avoid that happening again.

Is it an indication that all is good if the cleanStore message worked? The test here took about an hour to finish, and cleanStore did return TRUE at that time. After that we ran some displayClusterProfile calls. The property in question is gicSubIndustry. Attaching the entire process log.
It looks safe to me. I suspect that you are remembering issues that occurred with a Product Mapped Time Series. Because Vision defers cleanup, it would be possible to delete dates from the index and leave a situation where there would be too many adjustments to make in the cartesian space once an alignment was done. So I can see dates being deleted, no alignment being done, and a successful commit to the network. Then later, you access the time series, an alignment is attempted, and it fails. The only remedy then would be a rollback. In the Product Mapped case, your intuition to clean the thing would have turned up that error before the commit.

Since these structures are Partition Mapped, they are not vulnerable to that scenario. It is best to clean them in the same session as the delete, so that you save small structures and spare the sessions before the next cleanup from having to do that big alignment. But if you did forget to clean, a rollback would not be required. Probably there would be no issue. There is a small possibility that a malloc error might occur, depending on what else the affected session had been doing, but that failure would just require a clean, and then things would work again.

I was glad to see you used 'displayMemoryStats' early in your test, but disappointed that later it was just a totalNetworkAllocation print. I encourage you to always use displayMemoryStats, because the high water mark is usually more informative than the allocation value at the time of the print.

Anyway, dropping the cluster size of gicIndustry from 1.3GB to 26MB seems like a really nice win to me!
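As an illustration of why a high water mark can tell you more than a point-in-time allocation figure, here is a small Python analogy that uses the standard `tracemalloc` module. It is not Vision's displayMemoryStats, and the helper names `measure` and `transient_spike` are made up for this sketch. A session that briefly needs a large scratch buffer shows almost nothing in the "current" figure once the buffer is released, while the peak still records the spike.

```python
import tracemalloc


def measure(work):
    """Run `work` and return (current, peak) traced bytes observed afterwards."""
    tracemalloc.start()
    work()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current, peak


def transient_spike():
    """Allocate a large scratch buffer and release it before returning."""
    scratch = bytearray(50_000_000)  # roughly 50MB, held only briefly
    del scratch


if __name__ == "__main__":
    current, peak = measure(transient_spike)
    # `current` is taken after the buffer is freed; `peak` still reflects it.
    print(f"current={current} bytes, peak={peak} bytes")
```

A print of the current allocation alone would suggest the work was cheap, which is the same trap as relying on an allocation print taken after the heavy step has finished.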
profile_big_properties.txt
The attached file is the output for a particular object space where some time series properties are attached to an Entity class. The class is not a subclass of Security, and the data values are not data records; they are simple numbers and boolean entries.
Hoping someone can advise what to do to alleviate the pressure on memory at upload time, when new data is added to these properties.
Thanks