Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up in coerce_dict #24135

Closed
jdemeyer opened this issue Oct 31, 2017 · 41 comments
Closed

Clean up in coerce_dict #24135

jdemeyer opened this issue Oct 31, 2017 · 41 comments

Comments

@jdemeyer
Copy link
Contributor

  1. Deprecate various arguments which should have been deprecated in Empty lists while creating parents #15367.

  2. Use a tuple instead of a list to throw away. This is very slightly faster.

  3. Use safe memory functions from cysignals instead of PyMem functions.

  4. Split __init__ into __cinit__ and __init__.

  5. Rename the iteritems() method to items().

  6. Introduce a new inline function valid(ptr) to replace the very common ptr != NULL and ptr != dummy.

  7. Change type of key_id from void* to PyObject*. This avoids a lot of casts.

  8. Use a custom class with @cython.freelist instead of a PyCapsule to wrap a PyObject*.

  9. In tp_clear, only delete references to elements contained in the dict. Move deallocation of the data structure to __dealloc__.

  10. Use random 31-bit multipliers (instead of 13 and 503) in the hash function for TripleDict. This might give better mixing for large sizes (in any case, it can't hurt).

  11. Rename dummy -> deleted_key to make it more clear what it means. Also, there was no reason that this was of type bytes. Now it is simply created by object().

  12. Change the logic of the lookup() methods a bit to make them easier to understand.

  13. Generic code cleanup: PEP 8, is instead of == for pointers, ...

Timings:

All the changes above lead to a modest speed-up:

MonoDict lookup:

sage: from sage.structure.coerce_dict import MonoDict; D = MonoDict()
sage: L = [Integer(x) for x in range(1000)]
sage: for k in L: D[k] = k
sage: timeit('[D[k] for k in L]', repeat=20, number=20000)

Before: 20000 loops, best of 20: 72.7 µs per loop

After: 20000 loops, best of 20: 64.5 µs per loop

TripleDict lookup:

sage: from sage.structure.coerce_dict import TripleDict; D = TripleDict()
sage: L = [(None, None, Integer(x)) for x in range(1000)]
sage: for k in L: D[k] = None
sage: timeit('[D[k] for k in L]', repeat=20, number=20000)

Before: 20000 loops, best of 20: 97.3 µs per loop

After: 20000 loops, best of 20: 87.3 µs per loop

CC: @simon-king-jena @nbruin @tscrim @fchapoton

Component: coercion

Author: Jeroen Demeyer

Branch/Commit: 7c83e05

Reviewer: Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/24135

@jdemeyer jdemeyer added this to the sage-8.1 milestone Oct 31, 2017
@jdemeyer

This comment has been minimized.

@jdemeyer
Copy link
Contributor Author

Branch: u/jdemeyer/clean_up_in_coerce_dict

@tscrim
Copy link
Collaborator

tscrim commented Oct 31, 2017

Commit: 5abd7b8

@tscrim
Copy link
Collaborator

tscrim commented Oct 31, 2017

New commits:

5abd7b8Clean up in coerce_dict

@jdemeyer

This comment has been minimized.

@jdemeyer

This comment has been minimized.

@jdemeyer

This comment has been minimized.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 1, 2017

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

dbf3831Clean up in coerce_dict

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 1, 2017

Changed commit from 5abd7b8 to dbf3831

@jdemeyer

This comment has been minimized.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 1, 2017

Changed commit from dbf3831 to 15fe12d

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 1, 2017

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

15fe12dClean up in coerce_dict

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Nov 1, 2017

comment:10

cc chapoton because this is also relevant for Python 3.

@tscrim
Copy link
Collaborator

tscrim commented Nov 1, 2017

comment:11

Can you also justify 8 a little too? I was only able to get a loose understanding from a quick search.

Maybe I am too traditional, but why remove the explicit return values?

diff --git a/src/sage/structure/coerce_dict.pyx b/src/sage/structure/coerce_dict.pyx
index bbd1ca7..cc2d04f 100644
--- a/src/sage/structure/coerce_dict.pyx
+++ b/src/sage/structure/coerce_dict.pyx
@@ -821,16 +869,15 @@ cdef class MonoDict:
 #on cyclic GC)
 
 cdef int MonoDict_traverse(MonoDict op, visitproc visit, void *arg):
-    if op.table == NULL:
+    if op.table is NULL:
         return 0
     Py_VISIT3(<PyObject*>op.eraser, visit, arg)
     cdef size_t i
     for i in range(op.mask + 1):
         cursor = &op.table[i]
-        if cursor.key_id != NULL and cursor.key_id != dummy:
+        if valid(cursor.key_id):
             Py_VISIT3(cursor.key_weakref, visit, arg)
             Py_VISIT3(cursor.value, visit, arg)
-    return 0
@@ -859,16 +906,17 @@ cdef int MonoDict_clear(MonoDict op):
     del tmp
     for i in range(mask+1):
         cursor = &(table[i])
-        if cursor.key_id != NULL and cursor.key_id != dummy:
+        if valid(cursor.key_id):
             cursor.key_id = dummy
             Py_XDECREF(cursor.key_weakref)
             Py_XDECREF(cursor.value)
-    PyMem_Free(table)
-    return 0
+    sig_free(table)
+
 
 (<PyTypeObject*>MonoDict).tp_traverse = <traverseproc>(&MonoDict_traverse)
 (<PyTypeObject*>MonoDict).tp_clear = <inquiry>(&MonoDict_clear)
 
+
 cdef class TripleDictEraser:
     """
     Erases items from a :class:`TripleDict` when a weak reference becomes

I am assuming 5 is for better consistency with Python3. Although I think that items() should have a least a little one-line description saying that it is an iterator (and not a list/view).

Trivial and I know you essentially simply moved it, but only 2 spaces instead of 4:

    sage: K.<t> = GF(2^55)
    sage: for i in range(50):
    ....:   a = K.random_element()
    ....:   E = EllipticCurve(j=a)
    ....:   P = E.random_point()
    ....:   Q = 2*P

Also with its new placement, it does not need an #indirect doctest.

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Nov 2, 2017

comment:12

Replying to @tscrim:

Maybe I am too traditional, but why remove the explicit return values?

Conceptually, these functions do not return anything. It is really analogous to a plain Python function which always returns None. The int return value exists only for a technical reason, namely exception handling.

@jdemeyer

This comment has been minimized.

@jdemeyer

This comment has been minimized.

@jdemeyer

This comment has been minimized.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 2, 2017

Changed commit from 15fe12d to 347a7c5

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 2, 2017

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

347a7c5Clean up in coerce_dict

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Nov 2, 2017

comment:17

Replying to @tscrim:

Can you also justify 8 a little too?

The new class ObjectWrapper is a simple replacement for a PyCapsule. It's simpler because it doesn't need a name, a context or a destructor. It literally just stores a pointer and nothing else. This simplicity and the @cython.freelist() should make it extremely fast too.

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Nov 2, 2017

comment:18

Replying to @tscrim:

I am assuming 5 is for better consistency with Python3.

Obviously. If we are implementing a kind of dict, we should use items() instead of iteritems().

Although I think that items() should have a least a little one-line description saying that it is an iterator (and not a list/view).

Done.

@jdemeyer

This comment has been minimized.

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Nov 2, 2017

comment:19

I ended up doing a lot more cleanup in this version. Please review.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 2, 2017

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

87a0193Clean up in coerce_dict

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 2, 2017

Changed commit from 347a7c5 to 87a0193

@tscrim
Copy link
Collaborator

tscrim commented Nov 2, 2017

comment:21

Thank you for the explanations. All of the changes make sense to me.

However, multiple patchbots are reporting multiple random seg faults. So something seems to be subtly breaking.

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Nov 2, 2017

comment:22

Replying to @tscrim:

However, multiple patchbots are reporting multiple random seg faults. So something seems to be subtly breaking.

I know, that was because I made a mistake in __dealloc__. I fixed that now, so we need to wait for new patchbot reports.

By the way, I realize that I'm making quite a lot of changes in this ticket. What started as a plan to fix the iteritems() calls ended up rewriting most of the coerce_dict.pyx file. However, most changes are pretty simple and do not change the functionality at all.

If there are certain changes that you cannot accept (either because you disagree or you don't understand), I can try to revert those.

@jdemeyer

This comment has been minimized.

@tscrim
Copy link
Collaborator

tscrim commented Nov 2, 2017

comment:23

Replying to @jdemeyer:

Replying to @tscrim:

However, multiple patchbots are reporting multiple random seg faults. So something seems to be subtly breaking.

I know, that was because I made a mistake in __dealloc__. I fixed that now, so we need to wait for new patchbot reports.

Ah, I see. I couldn't quite tell if the patchbot reports were current or not.

By the way, I realize that I'm making quite a lot of changes in this ticket. What started as a plan to fix the iteritems() calls ended up rewriting most of the coerce_dict.pyx file. However, most changes are pretty simple and do not change the functionality at all.

It's good to get things done.

If there are certain changes that you cannot accept (either because you disagree or you don't understand), I can try to revert those.

I think I understand all of the changes and agree with them (or at least do not have any reason to disagree). So if the latest patchbots come back clean, then it will be a positive review.

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Nov 2, 2017

comment:24

Replying to @tscrim:

Ah, I see. I couldn't quite tell if the patchbot reports were current or not.

I believe that the patchbot icon which is displayed on the Trac page (i.e. this page) only considers the up-to-date reports. As I'm writing this comment, I see a status of "pending", meaning that no patchbot has tested this latest version.

@tscrim
Copy link
Collaborator

tscrim commented Nov 2, 2017

comment:25

Replying to @jdemeyer:

Replying to @tscrim:

Ah, I see. I couldn't quite tell if the patchbot reports were current or not.

I believe that the patchbot icon which is displayed on the Trac page (i.e. this page) only considers the up-to-date reports. As I'm writing this comment, I see a status of "pending", meaning that no patchbot has tested this latest version.

I've seen it as pending before when one of them is still working but another has finished and came back green. Although that might have been on a much older version of the patchbot.

@tscrim
Copy link
Collaborator

tscrim commented Nov 3, 2017

comment:26

Greenbot.

@tscrim
Copy link
Collaborator

tscrim commented Nov 3, 2017

Reviewer: Travis Scrimshaw

@jdemeyer

This comment has been minimized.

@jdemeyer

This comment has been minimized.

@jdemeyer

This comment has been minimized.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 3, 2017

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:

7c83e05Add comments on MonoDict/TripleDict attributes

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 3, 2017

Changed commit from 87a0193 to 7c83e05

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Nov 3, 2017

comment:31

Added some harmless comments.

@vbraun
Copy link
Member

vbraun commented Dec 11, 2017

Changed branch from u/jdemeyer/clean_up_in_coerce_dict to 7c83e05

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants