Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing iterators in PyMappingProtocol #611

Closed
sebpuetz opened this issue Oct 3, 2019 · 6 comments
Closed

Implementing iterators in PyMappingProtocol #611

sebpuetz opened this issue Oct 3, 2019 · 6 comments
Labels

Comments

@sebpuetz
Copy link
Contributor

sebpuetz commented Oct 3, 2019

🐛 Bug Reports

When reporting a bug, please provide the following information. If this is not a bug report you can just discard this template.

🌍 Environment

  • Your operating system and version: Ubuntu 18.04
  • Your python version: 3.73
  • How did you install python (e.g. apt or pyenv)? Did you use a virtualenv?: pyenv
  • Your rust version (rustc --version): rustc 1.38.0-nightly (311376d30 2019-07-18)
  • Are you using the latest pyo3 version? Have you tried using latest master (replace version = "0.x.y" with git = "https://github.com/PyO3/pyo3")? yes

💥 Reproducing

Implementing PyIterProtocol for a type implementing PyMappingProtocol seems to overwrite __iter__ in Python. Invoking an iterator through any kind of statement calls PyIterProtocol::__iter__. PyMappingProtocol::__iter__ is not accessible. Checking whether an element is in mapping through the in keyword runs an iterator and compares items instead of using __contains__. __contains__ can be invoked explicitly through PyMapping.__contains__.

If the implementation of PyIterProtocol is removed for PyMapping, statements such as:

import pymapping
mapping = pymapping.Mapping(["a", "b", "c", "d"]) 
for k in mapping:
    print(k)

print("a" in mapping)

fail by stating that mapping is not iteratable. Although, the iterator can be called explicitly through PyMapping.__iter__ but using iter() on a PyMapping object fails:

In [1]: import pymapping                                                                                                                                                                             

In [2]: mapping = pymapping.Mapping(["a", "b", "c", "d"])                                                                                                                                            

In [3]: iter(mapping)                                                                                                                                                                                
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-73d010d9f2e9> in <module>
----> 1 iter(mapping)

TypeError: 'Mapping' object is not iterable

In [4]: mapping.__iter__()                                                                                                                                                                           
own_iter
Out[4]: <MappingIter at 0x7f234178de40>
#![feature(specialization)]

#[pymodule]
fn pymapping(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_class::<PyMapping>()?;
    Ok(())
}

use pyo3::prelude::*;
use pyo3::{PyIterProtocol, PyMappingProtocol};

use std::collections::HashMap;
use std::vec;

#[pyclass(name = Mapping)]
pub struct PyMapping {
    map: HashMap<String, usize>,
}

#[pymethods]
impl PyMapping {
    #[new]
    #[args(mmap = false)]
    fn __new__(obj: &PyRawObject, keys: Vec<String>) -> PyResult<()> {
        let map = keys.into_iter().enumerate().map(|(x, y)| (y, x)).collect();

        obj.init(PyMapping { map });

        Ok(())
    }
}

#[pyproto]
impl PyMappingProtocol for PyMapping {
    fn __len__(&self) -> PyResult<usize> {
        Ok(self.map.len())
    }

    fn __getitem__(&self, key: String) -> PyResult<usize> {
        Ok(self.map.get(&key).cloned().unwrap_or_default())
    }

    fn __iter__(&self, py: Python) -> PyResult<PyObject> {
        println!("mapping iter");
        let iter = IntoPy::into_py(
            Py::new(py, PyMappingIter::new(self.map.iter().map(|(k, _)| k.to_owned()).collect()))?,
            py,
        );

        Ok(iter)
    }

    fn __contains__(&self, word: String) -> PyResult<bool> {
        println!("contains");
        Ok(self.map.contains_key(&word))
    }
}

#[pyproto]
impl PyIterProtocol for PyMapping {
    fn __iter__(slf: PyRefMut<Self>) -> PyResult<PyObject> {
        let mapping = &*slf;
        println!("iter protocol");
        let gil = Python::acquire_gil();
        let py = gil.python();
        let iter = IntoPy::into_py(
            Py::new(py, PyMappingIter::new(mapping.map.iter().map(|(k, _)| k.to_owned()).collect()))?,
            py,
        );

        Ok(iter)
    }
}

#[pyclass(name = MappingIter)]
pub struct PyMappingIter {
    v: vec::IntoIter<String>,
}

impl PyMappingIter {
    pub fn new(v: Vec<String>) -> Self {
        PyMappingIter { v: v.into_iter() }
    }
}

#[pyproto]
impl PyIterProtocol for PyMappingIter {
    fn __iter__(slf: PyRefMut<Self>) -> PyResult<Py<PyMappingIter>> {
        Ok(slf.into())
    }

    fn __next__(mut slf: PyRefMut<Self>) -> PyResult<Option<String>> {
        let slf = &mut *slf;
        Ok(slf.v.next())
    }
}

Please also write what exact flags are required to reproduce your results.

@kngwyu
Copy link
Member

kngwyu commented Oct 12, 2019

Thank you for reporting.

@kngwyu kngwyu added the bug label Oct 12, 2019
@Alexander-N
Copy link
Member

I guess the problem is that tp_iter has to be implemented for any iterable type, but is only present in PyIterIterProtocol.

@sebpuetz
Copy link
Contributor Author

Would that explain __contains__ being ignored tho? As far as I can tell, it's only accessible on PyMapping types through explicit calls to __contains__().

@sebpuetz
Copy link
Contributor Author

sebpuetz commented Oct 24, 2019

After spending quite some time digging around CPython, it looks like in and not in don't call __contains__ but PySequence_Contains.

https://github.com/python/cpython/blob/96b06aefe23521b61e4e9cdd44f5d30b00c7eb95/Python/ceval.c#L5099-L5109

PySequence_Contains is also used in CPython to check membership in PyDicts:

https://github.com/python/cpython/blob/96b06aefe23521b61e4e9cdd44f5d30b00c7eb95/Objects/dictobject.c#L4364-L4376

This means that both PySequenceProtocol and PyMappingProtocol need to be implemented if I want statements such as x in mapping and the mapping functions to work as expected. Which actually solves the issue:

#[pyproto]
impl PyMappingProtocol for PyMapping {
    fn __getitem__(&self, item: String) -> PyResult<String> {
        Ok(String::new())
    }
}

#[pyproto]
impl PySequenceProtocol for PyMapping {
    fn __contains__(&self, word: String) -> PyResult<bool> {
        Ok(self.map.contains_key(&word))
    }
}

#[pyproto]
impl PyIterProtocol for PyMapping {
    fn __iter__(slf: PyRefMut<Self>) -> PyResult<PyObject> {
        let mapping = &*slf;
        let gil = Python::acquire_gil();
        let py = gil.python();
        let iter = IntoPy::into_py(
            Py::new(py, PyMappingIter::new(mapping.map.iter().map(|(k, _)| k.to_owned()).collect()))?,
            py,
        );

        Ok(iter)
    }
}
import mapping
m = mapping.Mapping(["Test", "test", "t"])
"Test" in m # calls PySequenceMapping::__contains__
print([x for x in m]) # calls PyIterProtocol::__iter__

So, it seems like PyMappingProtocol::__contains__ isn't actually expected by Python and just gets added as a regular instance method which is why it was only accessible through explicit calls. in seems to look for sq_contains which falls back to a traversal if the slot is null. This fallback in turn explains why x in mapping raised TypeError: 'Mapping' object is not iterable.

PyMappingProtocol::__iter__ looks similar, since I get the expected iteration support by implementing PyIterProtocol::__iter__.

sebpuetz added a commit to finalfusion/finalfusion-python that referenced this issue Oct 24, 2019
In order to get arbitrary keys, PyMappingProtocol::__getitem__
needs to be implemented. To get O(1) __contains__,
PySequenceProtocol::__contains__ needs to be implemented. To get
proper Iteration support, PyIterProtocol::__iter__ needs to be
implemented.

PyO3/pyo3#611

This commit adds the correct implementation of the three traits
to PyVocab.
sebpuetz added a commit to finalfusion/finalfusion-python that referenced this issue Oct 24, 2019
In order to get arbitrary keys, PyMappingProtocol::__getitem__
needs to be implemented. To get O(1) __contains__,
PySequenceProtocol::__contains__ needs to be implemented. To get
proper Iteration support, PyIterProtocol::__iter__ needs to be
implemented.

PyO3/pyo3#611

This commit adds the correct implementation of the three traits
to PyVocab.
@Alexander-N
Copy link
Member

Thanks, your analysis looks spot-on to me! I guess we should remove __iter__ and __contains__ from PyMappingProtocol.

sebpuetz added a commit to sebpuetz/pyo3 that referenced this issue Oct 25, 2019
The methods are not expected by CPython and are only explicitly
callable. To get iteration support, PyIterProtocol should be
implemented and to get support for `x in mapping`,
PySequenceProtocol's __contains__ should be implemented.

PyO3#611
sebpuetz added a commit to sebpuetz/pyo3 that referenced this issue Oct 25, 2019
The methods are not expected by CPython and are only explicitly
callable. To get iteration support, PyIterProtocol should be
implemented and to get support for `x in mapping`,
PySequenceProtocol's __contains__ should be implemented.

PyO3#611
sebpuetz added a commit to sebpuetz/pyo3 that referenced this issue Oct 25, 2019
The methods are not expected by CPython and are only explicitly
callable. To get iteration support, PyIterProtocol should be
implemented and to get support for `x in mapping`,
PySequenceProtocol's __contains__ should be implemented.

PyO3#611
sebpuetz added a commit to finalfusion/finalfusion-python that referenced this issue Oct 25, 2019
In order to get arbitrary keys, PyMappingProtocol::__getitem__
needs to be implemented. To get O(1) __contains__,
PySequenceProtocol::__contains__ needs to be implemented. To get
proper Iteration support, PyIterProtocol::__iter__ needs to be
implemented.

PyO3/pyo3#611

This commit adds the correct implementation of the three traits
to PyVocab.
sebpuetz added a commit to finalfusion/finalfusion-python that referenced this issue Oct 25, 2019
In order to get arbitrary keys, PyMappingProtocol::__getitem__
needs to be implemented. To get O(1) __contains__,
PySequenceProtocol::__contains__ needs to be implemented. To get
proper Iteration support, PyIterProtocol::__iter__ needs to be
implemented.

PyO3/pyo3#611

This commit adds the correct implementation of the three traits
to PyVocab.
sebpuetz added a commit to sebpuetz/pyo3 that referenced this issue Oct 25, 2019
The methods are not expected by CPython and are only explicitly
callable. To get iteration support, PyIterProtocol should be
implemented and to get support for `x in mapping`,
PySequenceProtocol's __contains__ should be implemented.

PyO3#611
sebpuetz added a commit to sebpuetz/pyo3 that referenced this issue Oct 25, 2019
The methods are not expected by CPython and are only explicitly
callable. To get iteration support, PyIterProtocol should be
implemented and to get support for `x in mapping`,
PySequenceProtocol's __contains__ should be implemented.

PyO3#611
sebpuetz added a commit to finalfusion/finalfusion-python that referenced this issue Oct 25, 2019
In order to get arbitrary keys, PyMappingProtocol::__getitem__
needs to be implemented. To get O(1) __contains__,
PySequenceProtocol::__contains__ needs to be implemented. To get
proper Iteration support, PyIterProtocol::__iter__ needs to be
implemented.

PyO3/pyo3#611

This commit adds the correct implementation of the three traits
to PyVocab.
sebpuetz added a commit to finalfusion/finalfusion-python that referenced this issue Oct 26, 2019
In order to get arbitrary keys, PyMappingProtocol::__getitem__
needs to be implemented. To get O(1) __contains__,
PySequenceProtocol::__contains__ needs to be implemented. To get
proper Iteration support, PyIterProtocol::__iter__ needs to be
implemented.

PyO3/pyo3#611

This commit adds the correct implementation of the three traits
to PyVocab.
sebpuetz added a commit to finalfusion/finalfusion-python that referenced this issue Oct 26, 2019
In order to get arbitrary keys, PyMappingProtocol::__getitem__
needs to be implemented. To get O(1) __contains__,
PySequenceProtocol::__contains__ needs to be implemented. To get
proper Iteration support, PyIterProtocol::__iter__ needs to be
implemented.

PyO3/pyo3#611

This commit adds the correct implementation of the three traits
to PyVocab.
@Alexander-N
Copy link
Member

Closed by #644.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants