Empty lists while creating parents #15367
Comments
comment:1
These are the empty buckets of the various
OK, we have now all the new lists of length 1. Let's see who's referring to them
and count, among the referrers that are lists themselves, what the lengths are.
The first one must be
so it looks like we have 265/53 = 5 (empty). If you think this is a real issue, we could open-code the hashtables for …
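(The snippets originally shown in this comment were lost in the migration. A hedged reconstruction of this kind of gc bookkeeping, not the original code; names and the choice of Qp(7) are illustrative only.)

```python
# Rough reconstruction of the bookkeeping described above.
import gc
from sage.all import Qp

before = set(id(o) for o in gc.get_objects() if isinstance(o, list))
K = Qp(7)                                  # create one new parent
new_lists = [o for o in gc.get_objects()
             if isinstance(o, list) and id(o) not in before]
empty = [l for l in new_lists if len(l) == 0]
print(len(new_lists), len(empty))

# Among the referrers that are themselves lists (the bucket tables of
# MonoDict/TripleDict), look at their lengths.
referrer_lengths = [len(r) for l in empty for r in gc.get_referrers(l)
                    if isinstance(r, list)]
print(sorted(set(referrer_lengths)))
```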
comment:2
The code:
Allocates about 64M of RAM. This suggests that we are using about 64K of RAM per p-adic field. There is a similar phenomenon for quadratic fields (about 50K each). The side issue that it is very easy to make p-adic fields or number fields un-garbage-collectable (hopefully fixed forever someday soon, #14711) makes it easy for a user to accidentally end up running out of RAM.
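(The code block referred to above did not survive the migration; presumably it was a loop of roughly this shape. The count of 1000 fields is an assumption, consistent with roughly 64M total at 64K per field.)

```python
# Hedged reconstruction: create many p-adic fields and measure the growth
# in memory; 1000 fields at ~64K each would account for the ~64M quoted above.
from sage.all import Qp, primes_first_n
from sage.misc.getusage import get_memory_usage

memus = get_memory_usage()                     # baseline, in MB
fields = [Qp(p) for p in primes_first_n(1000)]
print(get_memory_usage(memus))                 # growth over the baseline, in MB
```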
comment:3
There is a comment in:
that suggests that we can't rely on init to actually be responsible for allocating anything. Consequently, throughout parent.pyx and parent_old.pyx, most of the time we use any of these lists we already guard against their absence. This suggests the following fix: do not allocate the lists up front. This may even speed up execution if these lists are typically empty, as looking to see if an empty list contains something will be slower than checking that the list doesn't even exist yet (especially as we already check if the list doesn't exist yet).
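(A minimal sketch of the lazy-allocation idea; the attribute name `_coerce_from_list` is used for illustration and is not necessarily the exact field touched by the fix.)

```python
# Minimal sketch of lazy allocation of a parent's coercion bookkeeping list.
class LazyParent:
    def __init__(self):
        # Do not allocate the list up front.
        self._coerce_from_list = None

    def register_coercion(self, mor):
        if self._coerce_from_list is None:     # allocate only on first use
            self._coerce_from_list = []
        self._coerce_from_list.append(mor)

    def has_registered_coercions(self):
        # Testing for None is cheaper than asking an empty list a question.
        return bool(self._coerce_from_list)
```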
comment:4
Memory footprint:
So we see that if we assume that …
It also makes a difference in initialization time:
As you can see, allocating the empty lists is a pretty significant cost as well.

For an open-coded hash table we would expect, on a 64-bit machine, 16 bytes per entry in MonoDict and 32 bytes per entry in TripleDict (compared to 24 bytes for a Python dict). Furthermore, allocation time would be fairly independent of size: the store would simply be a slab of memory that can be initialized to null by a straight memory-zeroing call.

So I'm pretty sure we could make MonoDict and TripleDict quite a bit faster and more memory-efficient by basically just copying the open hash-table design of Python's dict. We'd be working with table sizes that are a power of two, so we'd probably want to shift the addresses that we use as keys (given that the bottom 3 bits probably carry no information).

It's quite a bit of work to get right, so we should probably first establish that this is actually a bottleneck somewhere. I don't think the runtime characteristics of the implementations would be hugely different (but I cannot exclude something like a factor of 2 in runtime).
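(A toy illustration of the design sketched above: a power-of-two table indexed by object id with the low, uninformative bits shifted away, linear probing, and resizing below a 2/3 load factor. This is a sketch, not the code that eventually went into coerce_dict.pyx.)

```python
# Toy open-addressed table keyed on object identity.
class IdTable:
    def __init__(self, power=3):
        self.mask = (1 << power) - 1            # table size is a power of two
        self.table = [None] * (self.mask + 1)   # one flat slab, no bucket lists
        self.used = 0

    def _first_index(self, key):
        # Addresses are aligned, so the bottom bits carry little information;
        # shift them away before masking.
        return (id(key) >> 3) & self.mask

    def set(self, key, value):
        if 3 * (self.used + 1) > 2 * len(self.table):
            self._resize()                      # keep the load factor under 2/3
        i = self._first_index(key)
        while self.table[i] is not None and self.table[i][0] is not key:
            i = (i + 1) & self.mask             # linear probing
        if self.table[i] is None:
            self.used += 1
        self.table[i] = (key, value)

    def get(self, key):
        i = self._first_index(key)
        while self.table[i] is not None:
            if self.table[i][0] is key:
                return self.table[i][1]
            i = (i + 1) & self.mask
        raise KeyError(key)

    def _resize(self):
        old = [e for e in self.table if e is not None]
        self.mask = 2 * self.mask + 1           # double the (power-of-two) size
        self.table = [None] * (self.mask + 1)
        self.used = 0
        for k, v in old:
            self.set(k, v)
```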
comment:5
Replying to @nbruin:
Indeed. Thus, I think it would be a good idea to start with small steps that can easily be done and may improve the situation. Namely:
comment:6
Replying to @simon-king-jena:
Given that we should bound the fill ratio by about 60%-70%, that would not be enough. I expect, however, that many caches have very few entries, so going with 11 and reshaping when required is likely the best.
Hm, I wonder how the cost of that works out. Another option: make MonoDict etc. lazy in actually allocating the buckets (i.e., upon creation, just set the bucket store to nothing, and only allocate it on first use).
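(A minimal sketch of that option, with made-up names rather than the actual MonoDict internals:)

```python
# Chained dict that is lazy about allocating its bucket table.
class LazyBuckets:
    def __init__(self, nbuckets=11):
        self.nbuckets = nbuckets
        self.buckets = None                     # nothing allocated yet

    def set(self, key, value):
        if self.buckets is None:                # allocate on first insertion
            self.buckets = [[] for _ in range(self.nbuckets)]
        self.buckets[id(key) % self.nbuckets].append((key, value))

    def get(self, key):
        if self.buckets is None:
            raise KeyError(key)
        for k, v in self.buckets[id(key) % self.nbuckets]:
            if k is key:
                return v
        raise KeyError(key)
```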
comment:7
To test whether the dimension of the dicts is too big at initialisation, it makes sense to see how often a resize of … occurs.
First findings:
I will now see what happens during doctests. Question: Is it really a good idea to have 107 buckets for 38 items?
comment:8
Replying to @nbruin:
What is the theory behind this choice?
Good idea! So, do not be lazy in coercion initialisation of Parent, but be lazy in the underlying data structure.
comment:9
Replying to @simon-king-jena:
The answer to this question is basically the same as the answer to "Is it really worth using a dictionary with lightning-fast lookup and bad memory density rather than a linear search in a dense list?", except that we get to tune between the two. Given cache locality, a linear search in a list of 38 items could be quite competitive! Python dicts are even more aggressive in resizing: for small dictionary sizes, they aim for the next power of two that is at least FOUR times the number of items (and trigger it at a 2/3 occupation rate). But then, their price per entry is 24 bytes.
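(A small sketch of the growth rule described above, as it worked in older CPython; current CPython uses different constants.)

```python
# Growth rule: resize when occupancy reaches 2/3, to the next power of two
# that is at least four times the number of items.
def needs_resize(used, size):
    return 3 * used >= 2 * size

def new_table_size(used, minsize=8):
    size = minsize
    while size <= 4 * used:
        size <<= 1
    return size

# Example: 38 items end up with a table of new_table_size(38) == 256 slots.
```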
comment:10
Next findings:
Anyway, I am not sure if these figures are hinting at something. Do I understand correctly: your suggestion is to start with a list (an actual initialised list) of buckets, but initially each bucket will be `None` rather than an empty list?
comment:11
Replying to @simon-king-jena:
Uh, no. I was more thinking of starting out with a bucket list of None. Your interpretation is also possible, but addresses a slightly different issue. It could be worth doing both, one of the two, or neither. For your proposal: I don't know what the cost difference is between making an empty list and putting another reference to None in, i.e., the cost of
versus
If the difference is significant (and it may well be, because increffing doesn't involve memory allocation) then your interpretation may well have great benefits (and surely a reduced memory footprint). If you're feeling gutsy, you could just leave the entries in … Come to think of it, I expect that your interpretation would have the biggest impact.
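(The two snippets compared above were lost in the migration; presumably they contrasted building a table of fresh empty lists with a table of references to None. A hedged sketch of such a timing comparison, with an arbitrary bucket count:)

```python
# Compare allocating n fresh empty lists with storing n references to None.
import timeit

n = 23   # arbitrary small bucket count for illustration
t_lists = timeit.timeit("[[] for _ in range(n)]", globals={"n": n}, number=100000)
t_nones = timeit.timeit("[None] * n", globals={"n": n}, number=100000)
print(t_lists, t_nones)
```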
comment:12
Replying to @simon-king-jena:
A reasonable start is Wikipedia. We're currently using "separate chaining", which degrades fairly gracefully with increasing load factor, so perhaps the 60-70% load factor isn't as strict as the phase transition that open addressing undergoes around that point. You could experiment and tune a reasonable trade-off (the idea is that any time you don't hit what you're looking for on the first try, you've basically lost; hash conflict resolution should rarely be necessary). The other design limitation is that reshaping is expensive (O(size of dict)), so it should happen rarely. That means that if you increase the number of bins, you have to do so by a lot.
comment:13
After figuring out how Python dicts work in order to improve WeakValueDictionary, it wasn't so complicated to replicate the design for MonoDict and TripleDict either. Things of note:
Some little problems:
Attachment: coerce_dict.pyx.gz (MonoDict and TripleDict implemented by open addressing)
comment:14
Attachment: coerce_dict.pxd.gz
With the attached files the doctests succeed (modulo some trivial failures).
comment:15
With the old code, it would have been relatively easy to create a dictionary with a fixed key size (e.g. a dictionary such that all keys are 5-tuples of objects that are weakref'd if possible and compared by identity). One question we might address: how difficult would this be with the new code you are suggesting? You are defining
and then a …
So, do you think we should have …?
comment:16
Replying to @simon-king-jena:
I do not think so. With the previous code you had a prototype that performed well and offered this feature, and we didn't put it in. That suggests to me we don't have a need for it. In any case, if you would do this, I don't think you'd want to add an extra layer of indirection in the key fields. Hence, you'd have a dynamic-length structure,
which of course you cannot define, but you only need its sizeof, which should fairly safely be computable,
and a way of addressing entry i within one contiguous slab; a hedged sketch follows.
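(This variant was never implemented; all names in the sketch below are invented for illustration.)

```cython
# Dynamic-length entry for a fixed-key-size identity dict (sketch only).
from cpython.ref cimport PyObject

cdef struct fixed_key_entry:
    PyObject* value
    PyObject* keys[1]        # really key_length slots, laid out inline

# The struct cannot be declared with a runtime-dependent length, but its
# effective size can be computed as
#     sizeof(fixed_key_entry) + (key_length - 1) * sizeof(PyObject*)
# and entry i then lives at offset i * entry_size in one contiguous slab,
# which a single memory-zeroing call can clear.
```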
One problem with open addressing is that it has no cache locality at all. That's not a huge problem if the table is compact (and your hits are almost always immediate). Since the weakrefs and value are only used for validation, not searching, you could consider storing *those* under an indirection. I think we should only generalize the design if we know we have a need for it. I think first we should establish whether this implementation is really an advantage, whether the hash functions perform well, and whether the resizing amount is reasonable. One way would be to let lookup keep statistics:
See http://code.activestate.com/recipes/578375/ for a proposal that makes a much more compact search table with compact iteration. We would also want profiling data on how much time one typically spends in this code: if that's 2%, even doubling the speed would not have much effect. We may be hitting this code fairly often, but it's still pretty specialized: not like a …
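(One way to collect such statistics, sketched for an open-addressed, identity-keyed table; the real coerce_dict code would count probes inside its own lookup routine.)

```python
# Record how many slots each lookup had to touch; `table` is a power-of-two
# sized list whose entries are (key, value) pairs or None.
from collections import Counter

probe_counts = Counter()

def lookup(table, key):
    mask = len(table) - 1
    i = (id(key) >> 3) & mask
    probes = 1
    while table[i] is not None:
        if table[i][0] is key:
            probe_counts[probes] += 1
            return table[i][1]
        i = (i + 1) & mask
        probes += 1
    probe_counts[probes] += 1
    raise KeyError(key)

# If probe_counts is dominated by 1 after a realistic workload, the hash
# spreads keys well and conflict resolution is rarely needed.
```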
comment:17
I can confirm that all doctests pass, except some in sage.structure.coerce, which explicitly use a method that shows the distribution of values in the buckets and is thus not relevant. Suggestion: I turn your two files into a git branch, remove the useless stuff in coerce.pyx, and then we can start reviewing (and testing performance).
Branch: u/SimonKing/ticket/15367
comment:19
Replying to @simon-king-jena:
Well, I simply did... The first commit is the result of your drop-in replacement; the second commit removes the get_stats() method, which is a debug method that is no longer functional. I think the second commit is nothing but a review patch. Filling the Author and Reviewer fields accordingly, but the review is not completed yet! New commits:
Reviewer: Simon King
Commit:
Author: Nils Bruin
comment:20
With the attached branch, I repeated the example from the ticket description:
And from comment:2 (starting a new session):
With sage-5.12.beta5 plus #13394, I get only "0" for gc.collect(), and in the end get_memory_usage(memus) returns 30.2265625. I have to leave now, but will say something about the memory footprint later.
Work Issues: Improve timings
Branch pushed to git repo; I updated commit sha1. New commits:
Branch pushed to git repo; I updated commit sha1. This was a forced push. Recent commits:
comment:44
Ping. I think this straightforward replacement is worth merging:
So, if someone can finish the review... It's pretty straightforward code (apart from being rather performance-sensitive).
comment:45
Once more, my impression is that the git workflow helps us solve problems that we didn't have without git. That's to say:
So, how can I get your code?
comment:46
Nils rewrote history with the forced pushes (bad), so some of the code in Simon's branch conflicts, as it is on a different history. Replying to @simon-king-jena:
Noticing conflicts is not a problem when you manually copy patches, correct. However, not noticing conflicts is a problem ;-)
If you haven't made any commits that you want to keep: delete your current branch (…).
comment:47
Replying to @vbraun:
As far as I know, my local repository contains nothing that depends on this ticket. So, how can there be a conflict? I am not talking about conflicts that are just artefacts of how git works.
If I have a local branch associated with this ticket, and if I pull from this ticket and there are merge conflicts with my local branch, I'd expect that my old local branch (ticket/15367) be preserved and a new local branch (say, ticket/15367-1) be created that is equal to the branch force-pushed by Nils. This should be done automatically by either the dev script or git. And then, I can still decide whether I want to delete my old branch or merge the new and the old branch, or I can explain to Nils why I don't like his new branch and switch back to the old one. In contrast, you say that in the new workflow I have to decide whether or not to delete my old branch before I pull Nils' changes. Why? This approach seems plain wrong to me. I guess this is a theoretical discussion that belongs on the sage-git list. Since I don't see stuff depending on this ticket (yet), I did as you suggested.
comment:48
It seems that …
Sorry about the rebase, but in this case it seemed warranted because:
Keeping the trivial edits to …
comment:49
Replying to @simon-king-jena:
The commits from all of those branches are of course preserved in git. If you'd like another local branch to give a name to the different histories, then that is fine, but don't force your naming convention on others. To uniquely reference any of the past branch heads you only need the commit SHA1, whose historical values are conveniently listed in the ticket comments.
comment:50
Replying to @simon-king-jena:
I think you can "unlink" by doing something along the lines of …
comment:51
Replying to @nbruin:
AFAIK, git knows that this was a forced push. So, the dev script should at least offer a "forced pull". Anyway, this ticket is not about annoyances and hold-ups caused by the switch to the new workflow.
comment:52
To start reviewing, I merged master (freshly pulled from trac) into the branch, which succeeded.
comment:53
Second step of the review: I checked that the patchbot is right and all tests pass (after merging master).
comment:54
Third step: Re-test timings. I do "branch versus sage-5.13.b4".
In the branch:
In sage-5.13.b4 (plus #14912, but this doesn't touch coerce dict):
So, everything became faster, and the memory consumption is a bit less (but the latter could very well be because of other tickets). The effect on startup time is probably not significant, but I get (one run only) … So, the winner is: THIS BRANCH. Last step of the review: read the new code.
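(The timed statements themselves were lost in the migration; a hedged sketch of the kind of parent-creation micro-benchmark that exercises the MonoDict/TripleDict initialisation this branch changes:)

```python
import timeit
from sage.all import Qp, primes_first_n

# Constructing many fresh parents stresses exactly the coercion-cache
# initialisation touched by this ticket.
print(timeit.timeit(lambda: [Qp(p) for p in primes_first_n(500)], number=1))
```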
comment:55
In
without saying that …
First question: why does it not crash? Second question, even if there is a good answer to the first: shouldn't it be …?
comment:56
Replying to @simon-king-jena:
Cython has grown type inference.
No, because Cython infers the correct type by itself. I'm not commenting on whether it's good style to rely on said type inference, but it is very convenient when you're programming.
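(A hedged illustration of what is meant, not the actual coerce_dict.pyx code; the function and variable names are invented.)

```cython
# Since Cython 0.13, the type of `cursor` is inferred from the C-typed
# operands on the right-hand side, so no explicit "cdef size_t cursor"
# declaration is required.
cdef size_t first_free_slot(list table, size_t start, size_t mask):
    cursor = start & mask            # inferred as a C integer by Cython
    while table[cursor] is not None:
        cursor = (cursor + 1) & mask
    return cursor
```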
comment:57
Replying to @nbruin:
Since when? Amazing. Anyway, back to needs review then...
comment:58
Replying to @simon-king-jena:
According to https://github.com/cython/cython/wiki/ReleaseNotes-0.13, since August 25, 2010 :-).
comment:59
Replying to @nbruin:
So far I have always tried to tell Cython which type to expect. Anyway. The code looks good to me, tests pass, the patchbot has nothing to complain about, and as we have seen performance has improved. Positive review!
comment:61
While looking at this code, I noticed that the …
Is this intentional? If yes, can we deprecate …?
This is crazy:
Much of that is empty lists:
I'm not sure where these 525 empty lists are created, but it seems a bit excessive.
It's not just p-adic fields:
CC: @nbruin @simon-king-jena @vbraun
Component: coercion
Author: Nils Bruin
Branch/Commit: u/nbruin/ticket/15367 @ 719cdec
Reviewer: Simon King
Issue created by migration from https://trac.sagemath.org/ticket/15367