From 4f25877633763be50d89069452782226060607cc Mon Sep 17 00:00:00 2001 From: Forest Date: Tue, 30 Jul 2024 17:20:41 -0700 Subject: [PATCH] gh-55454: Add IMAP4 IDLE support to imaplib This extends imaplib with support for the rfc2177 IMAP IDLE command, as requested in #55454. It allows events to be pushed to a client as they occur, rather than having to continually poll for mailbox changes. The interface is a new idle() method, which returns an iterable context manager. Entering the context starts IDLE mode, during which events (untagged responses) can be retrieved using the iteration protocol. Exiting the context sends DONE to the server, ending IDLE mode. An optional time limit for the IDLE session is supported, for use with servers that impose an inactivity timeout. The context manager also offers a burst() method, designed for programs wishing to process events in batch rather than one at a time. Notable differences from other implementations: - It's an extension to imaplib, rather than a replacement. - It doesn't introduce additional threads. - It doesn't impose new requirements on the use of imaplib's existing methods. - It passes the unit tests in CPython's test/test_imaplib.py module (and adds new ones). - It works on Windows, Linux, and other unix-like systems. - It makes IDLE available on all of imaplib's client variants (including IMAP4_stream). - The interface is pythonic and easy to use. Caveats: - Due to a Windows limitation, the special case of IMAP4_stream running on Windows lacks a duration/timeout feature. (This is the stdin/stdout pipe connection variant; timeouts work fine for socket-based connections, even on Windows.) I have documented it where appropriate. - The file-like imaplib instance attributes are changed from buffered to unbuffered mode. This could potentially break any client code that uses those objects directly without expecting partial reads/writes. However, these attributes are undocumented. 
As such, I think (and PEP 8 confirms) that they are fair game for changes. https://peps.python.org/pep-0008/#public-and-internal-interfaces Usage examples: https://github.com/python/cpython/issues/55454#issuecomment-2227543041 Original discussion: https://discuss.python.org/t/gauging-interest-in-my-imap4-idle-implementation-for-imaplib/59272 Earlier requests and suggestions: https://github.com/python/cpython/issues/55454 https://mail.python.org/archives/list/python-ideas@python.org/thread/C4TVEYL5IBESQQPPS5GBR7WFBXCLQMZ2/ --- Doc/library/imaplib.rst | 101 ++++- Doc/whatsnew/3.14.rst | 6 + Lib/imaplib.py | 368 +++++++++++++++++- Lib/test/test_imaplib.py | 50 +++ Misc/ACKS | 1 + ...4-08-01-01-00-00.gh-issue-55454.wy0vGw.rst | 1 + 6 files changed, 507 insertions(+), 20 deletions(-) create mode 100644 Misc/NEWS.d/next/Library/2024-08-01-01-00-00.gh-issue-55454.wy0vGw.rst diff --git a/Doc/library/imaplib.rst b/Doc/library/imaplib.rst index a2dad58b00b9fa1..55bb66aad1efa7b 100644 --- a/Doc/library/imaplib.rst +++ b/Doc/library/imaplib.rst @@ -10,6 +10,7 @@ .. changes for IMAP4_SSL by Tino Lange , March 2002 .. changes for IMAP4_stream by Piers Lauder , November 2002 +.. changes for IDLE by Forest August 2024 **Source code:** :source:`Lib/imaplib.py` @@ -187,7 +188,7 @@ However, the *password* argument to the ``LOGIN`` command is always quoted. If you want to avoid having an argument string quoted (eg: the *flags* argument to ``STORE``) then enclose the string in parentheses (eg: ``r'(\Deleted)'``). -Each command returns a tuple: ``(type, [data, ...])`` where *type* is usually +Most commands return a tuple: ``(type, [data, ...])`` where *type* is usually ``'OK'`` or ``'NO'``, and *data* is either the text from the command response, or mandated results from the command. Each *data* is either a ``bytes``, or a tuple. 
If a tuple, then the first part is the header of the response, and the @@ -307,6 +308,48 @@ An :class:`IMAP4` instance has the following methods: of the IMAP4 QUOTA extension defined in rfc2087. +.. method:: IMAP4.idle([dur]) + + Return an iterable context manager implementing the ``IDLE`` command + as defined in :rfc:`2177`. + + The optional *dur* argument specifies a maximum duration (in seconds) to + keep idling. It defaults to ``None``, meaning no time limit. + To avoid inactivity timeouts on servers that impose them, callers are + advised to keep this <= 29 minutes. See the note below regarding + :class:`IMAP4_stream` on Windows. + + The context manager sends the ``IDLE`` command upon entry, produces + responses via iteration, and sends ``DONE`` upon exit. + It represents responses as ``(type, datum)`` tuples, rather than the + ``(type, [data, ...])`` tuples returned by other methods, because only + one response is represented at a time. + + Example:: + + with M.idle(dur=29*60) as idler: + for response in idler: + typ, datum = response + print(typ, datum) + + It is also possible to process a burst of responses all at once instead + of one at a time. See `IDLE Context Manager`_ for details. + + Responses produced by the iterator will not be returned by + :meth:`IMAP4.response`. + + .. note:: + + Windows :class:`IMAP4_stream` connections have no way to accurately + respect *dur*, since Windows ``select()`` only works on sockets. + However, if the server regularly sends status messages during ``IDLE``, + they will wake our selector and keep iteration from blocking for long. + Dovecot's ``imap_idle_notify_interval`` is two minutes by default. + Assuming that's typical of IMAP servers, subtracting it from the 29 + minutes needed to avoid server inactivity timeouts would make 27 + minutes a sensible value for *dur* in this situation. + + .. method:: IMAP4.list([directory[, pattern]]) List mailbox names in *directory* matching *pattern*. 
*directory* defaults to @@ -612,6 +655,62 @@ The following attributes are defined on instances of :class:`IMAP4`: .. versionadded:: 3.5 +.. _idle context manager: + +IDLE Context Manager +-------------------- + +The object returned by :meth:`IMAP4.idle` implements the context management +protocol for the :keyword:`with` statement, and the :term:`iterator` protocol +for retrieving untagged responses while the context is active. +It also has the following method: + +.. method:: IdleContextManager.burst([interval]) + + Yield a burst of responses no more than *interval* seconds apart. + + This generator retrieves the next response along with any + immediately available subsequent responses (e.g. a rapid series of + ``EXPUNGE`` responses after a bulk delete) so they can be efficiently + processed as a batch instead of one at a time. + + The optional *interval* argument specifies a time limit (in seconds) + for each response after the first. It defaults to 0.1 seconds. + (The ``IDLE`` context's maximum duration is respected when waiting for the + first response.) + + Represents responses as ``(type, datum)`` tuples, just as when + iterating directly on the context manager. + + Example:: + + with M.idle() as idler: + + # get the next response and any others following by < 0.1 seconds + batch = list(idler.burst()) + + print(f'processing {len(batch)} responses...') + for typ, datum in batch: + print(typ, datum) + + Produces no responses and returns immediately if the ``IDLE`` context's + maximum duration (the *dur* argument to :meth:`IMAP4.idle`) has elapsed. + Callers should plan accordingly if using this method in a loop. + + .. note:: + + Windows :class:`IMAP4_stream` connections will ignore the *interval* + argument, yielding endless responses and blocking indefinitely for each + one, since Windows ``select()`` only works on sockets. It is therefore + advised not to use this method with an :class:`IMAP4_stream` connection + on Windows. + +.. 
note:: + + The context manager's type name is not part of its public interface, + and is subject to change. + + .. _imap4-example: IMAP4 Example diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst index 088f70d9e9fad4a..58f5ec82b03af62 100644 --- a/Doc/whatsnew/3.14.rst +++ b/Doc/whatsnew/3.14.rst @@ -117,6 +117,12 @@ Added support for converting any objects that have the :meth:`!as_integer_ratio` method to a :class:`~fractions.Fraction`. (Contributed by Serhiy Storchaka in :gh:`82017`.) +imaplib +------- + +* Add :meth:`~imaplib.IMAP4.idle`, implementing the ``IDLE`` command + as defined in :rfc:`2177`. (Contributed by Forest in :gh:`55454`.) + json ---- diff --git a/Lib/imaplib.py b/Lib/imaplib.py index e576c29e67dc0a1..7bcbe4912191d83 100644 --- a/Lib/imaplib.py +++ b/Lib/imaplib.py @@ -19,10 +19,22 @@ # GET/SETQUOTA contributed by Andreas Zeidler June 2002. # PROXYAUTH contributed by Rick Holbert November 2002. # GET/SETANNOTATION contributed by Tomas Lindroos June 2005. - -__version__ = "2.58" - -import binascii, errno, random, re, socket, subprocess, sys, time, calendar +# IDLE contributed by Forest August 2024. 
+ +__version__ = "2.59" + +import binascii +import calendar +import errno +import functools +import platform +import random +import re +import selectors +import socket +import subprocess +import sys +import time from datetime import datetime, timezone, timedelta from io import DEFAULT_BUFFER_SIZE @@ -74,6 +86,7 @@ 'GETANNOTATION':('AUTH', 'SELECTED'), 'GETQUOTA': ('AUTH', 'SELECTED'), 'GETQUOTAROOT': ('AUTH', 'SELECTED'), + 'IDLE': ('AUTH', 'SELECTED'), 'MYRIGHTS': ('AUTH', 'SELECTED'), 'LIST': ('AUTH', 'SELECTED'), 'LOGIN': ('NONAUTH',), @@ -192,10 +205,13 @@ def __init__(self, host='', port=IMAP4_PORT, timeout=None): self.tagged_commands = {} # Tagged commands awaiting response self.untagged_responses = {} # {typ: [data, ...], ...} self.continuation_response = '' # Last continuation response + self._idle_responses = [] # Response queue for idle iteration + self._idle_capture = False # Whether to queue responses for idle self.is_readonly = False # READ-ONLY desired state self.tagnum = 0 self._tls_established = False self._mode_ascii() + self._readbuf = b'' # Open socket to server. @@ -315,14 +331,58 @@ def open(self, host='', port=IMAP4_PORT, timeout=None): def read(self, size): """Read 'size' bytes from remote.""" - return self.file.read(size) + # Read from an unbuffered input, so our select() calls will not be + # defeated by a hidden library buffer. Use our own buffer instead, + # which can be examined before calling select(). 
+ if isinstance(self, IMAP4_stream): + read = self.readfile.read + else: + read = self.sock.recv + + parts = [] + while True: + if len(self._readbuf) >= size: + parts.append(self._readbuf[:size]) + self._readbuf = self._readbuf[size:] + break + parts.append(self._readbuf) + size -= len(self._readbuf) + self._readbuf = read(DEFAULT_BUFFER_SIZE) + if not self._readbuf: + break + return b''.join(parts) def readline(self): """Read line from remote.""" - line = self.file.readline(_MAXLINE + 1) + # Read from an unbuffered input, so our select() calls will not be + # defeated by a hidden library buffer. Use our own buffer instead, + # which can be examined before calling select(). + if isinstance(self, IMAP4_stream): + read = self.readfile.read + else: + read = self.sock.recv + + LF = b'\n' + parts = [] + length = 0 + while length < _MAXLINE: + try: + pos = self._readbuf.index(LF) + 1 + parts.append(self._readbuf[:pos]) + length += len(parts[-1]) + self._readbuf = self._readbuf[pos:] + break + except ValueError: + parts.append(self._readbuf) + length += len(parts[-1]) + self._readbuf = read(DEFAULT_BUFFER_SIZE) + if not self._readbuf: + break + + line = b''.join(parts) if len(line) > _MAXLINE: - raise self.error("got more than %d bytes" % _MAXLINE) + raise self.error(f'got more than {_MAXLINE} bytes') return line @@ -588,6 +648,44 @@ def getquotaroot(self, mailbox): return typ, [quotaroot, quota] + def idle(self, dur=None): + """Return an iterable context manager implementing the IDLE command + + :param dur: Maximum duration (in seconds) to keep idling, + or None for no time limit. + To avoid inactivity timeouts on servers that impose + them, callers are advised to keep this <= 29 minutes. + See the note below regarding IMAP4_stream on Windows. + :type dur: int|float|None + + The context manager sends the IDLE command upon entry, produces + responses via iteration, and sends DONE upon exit. 
+ It represents responses as (type, datum) tuples, rather than the + (type, [data, ...]) tuples returned by other methods, because only one + response is represented at a time. + + Example: + + with imap.idle(dur=29*60) as idler: + for response in idler: + typ, datum = response + print(typ, datum) + + Responses produced by the iterator are not added to the internal + cache for retrieval by response(). + + Note: Windows IMAP4_stream connections have no way to accurately + respect 'dur', since Windows select() only works on sockets. + However, if the server regularly sends status messages during IDLE, + they will wake our selector and keep iteration from blocking for long. + Dovecot's imap_idle_notify_interval is two minutes by default. + Assuming that's typical of IMAP servers, subtracting it from the 29 + minutes needed to avoid server inactivity timeouts would make 27 + minutes a sensible value for 'dur' in this situation. + """ + return _Idler(self, dur) + + def list(self, directory='""', pattern='*'): """List mailbox names in directory matching pattern. @@ -944,6 +1042,14 @@ def xatom(self, name, *args): def _append_untagged(self, typ, dat): if dat is None: dat = b'' + + # During idle, queue untagged responses for delivery via iteration + if self._idle_capture: + self._idle_responses.append((typ, dat)) + if __debug__ and self.debug >= 5: + self._mesg(f'idle: queue untagged {typ} {dat!r}') + return + ur = self.untagged_responses if __debug__: if self.debug >= 5: @@ -1279,6 +1385,236 @@ def print_log(self): n -= 1 +class _Idler: + # Iterable context manager: start IDLE & produce untagged responses + # + # This iterator produces (type, datum) tuples. They slightly differ + # from the tuples returned by IMAP4.response(): The second item in the + # tuple is a single datum, rather than a list of them, because only one + # untagged response is produced at a time. 
+ + def __init__(self, imap, dur=None): + if 'IDLE' not in imap.capabilities: + raise imap.error("Server does not support IDLE") + self._dur = dur + self._imap = imap + self._tag = None + self._sock_timeout = None + self._old_state = None + + def __enter__(self): + imap = self._imap + assert not (imap._idle_responses or imap._idle_capture) + + if __debug__ and imap.debug >= 4: + imap._mesg('idle start' + + ('' if self._dur is None else f' dur={self._dur}')) + + try: + # Start capturing untagged responses before sending IDLE, + # so we can deliver via iteration any that arrive while + # the IDLE command continuation request is still pending. + imap._idle_capture = True + + self._tag = imap._command('IDLE') + # Process responses until the server requests continuation + while resp := imap._get_response(): # Returns None on continuation + if imap.tagged_commands[self._tag]: + raise imap.abort(f'unexpected status response: {resp}') + + if __debug__ and imap.debug >= 4: + prompt = imap.continuation_response + imap._mesg(f'idle continuation prompt: {prompt}') + except: + imap._idle_capture = False + raise + + self._sock_timeout = imap.sock.gettimeout() if imap.sock else None + if self._sock_timeout is not None: + imap.sock.settimeout(None) # Socket timeout would break IDLE + + self._old_state = imap.state + imap.state = 'IDLING' + + return self + + def __iter__(self): + return self + + def _wait(self, timeout=None): + # Block until the next read operation should be attempted, either + # because data becomes available within 'timeout' seconds or + # because the OS cannot determine whether data is available. 
+ # Return True when a blocking read() is worth trying + # Return False if the timeout expires while waiting + + imap = self._imap + if timeout is None: + return True + if imap._readbuf: + return True + if timeout <= 0: + return False + + if imap.sock: + fileobj = imap.sock + elif platform.system() == 'Windows': + return True # Cannot select(); allow a possibly-blocking read + else: + fileobj = imap.readfile + + if __debug__ and imap.debug >= 4: + imap._mesg(f'idle _wait select({timeout})') + + with selectors.DefaultSelector() as sel: + sel.register(fileobj, selectors.EVENT_READ) + readables = sel.select(timeout) + return bool(readables) + + def _pop(self, timeout, default=('', None)): + # Get the next response, or a default value on timeout + # + # :param timeout: Time limit (in seconds) to wait for response + # :type timeout: int|float|None + # :param default: Value to return on timeout + # + # Note: This method ignores 'dur' in favor of the timeout argument. + # + # Note: Windows IMAP4_stream connections will ignore the timeout + # argument and block until the next response arrives, because + # Windows select() only works on sockets. 
+ + imap = self._imap + if imap.state != 'IDLING': + raise imap.error('_pop() only works during IDLE') + + if imap._idle_responses: + resp = imap._idle_responses.pop(0) + if __debug__ and imap.debug >= 4: + imap._mesg(f'idle _pop({timeout}) de-queued {resp[0]}') + return resp + + if __debug__ and imap.debug >= 4: + imap._mesg(f'idle _pop({timeout})') + + if not self._wait(timeout): + if __debug__ and imap.debug >= 4: + imap._mesg(f'idle _pop({timeout}) done') + return default + + if __debug__ and imap.debug >= 4: + imap._mesg(f'idle _pop({timeout}) reading') + imap._get_response() # Reads line, calls _append_untagged() + resp = imap._idle_responses.pop(0) + + if __debug__ and imap.debug >= 4: + imap._mesg(f'idle _pop({timeout}) read {resp[0]}') + return resp + + def burst(self, interval=0.1): + """Yield a burst of responses no more than 'interval' seconds apart + + :param interval: Time limit for each response after the first + (The IDLE context's maximum duration is + respected when waiting for the first response.) + :type interval: int|float + + This generator retrieves the next response along with any + immediately available subsequent responses (e.g. a rapid series of + EXPUNGE responses after a bulk delete) so they can be efficiently + processed as a batch instead of one at a time. + + Represents responses as (type, datum) tuples, just as when + iterating directly on the context manager. + + Example: + + with imap.idle() as idler: + batch = list(idler.burst()) + print(f'processing {len(batch)} responses...') + + Produces no responses and returns immediately if the IDLE + context's maximum duration (the 'dur' argument) has elapsed. + Callers should plan accordingly if using this method in a loop. + + Note: Windows IMAP4_stream connections will ignore the interval + argument, yielding endless responses and blocking indefinitely + for each one, because Windows select() only works on sockets. 
+ It is therefore advised not to use this method with an IMAP4_stream + connection on Windows. + """ + try: + yield next(self) + except StopIteration: + return + + start = time.monotonic() + + yield from iter(functools.partial(self._pop, interval, None), None) + + if self._dur is not None: + elapsed = time.monotonic() - start + self._dur = max(self._dur - elapsed, 0) + + def __next__(self): + imap = self._imap + start = time.monotonic() + + typ, datum = self._pop(self._dur) + + if self._dur is not None: + elapsed = time.monotonic() - start + self._dur = max(self._dur - elapsed, 0) + + if not typ: + if __debug__ and imap.debug >= 4: + imap._mesg('idle iterator exhausted') + raise StopIteration + + return typ, datum + + def __exit__(self, exc_type, exc_val, exc_tb): + imap = self._imap + + if __debug__ and imap.debug >= 4: + imap._mesg('idle done') + imap.state = self._old_state + + if self._sock_timeout is not None: + imap.sock.settimeout(self._sock_timeout) + self._sock_timeout = None + + # Stop intercepting untagged responses before sending DONE, + # since we can no longer deliver them via iteration. + imap._idle_capture = False + + # If we captured untagged responses while the IDLE command + # continuation request was still pending, but the user did not + # iterate over them before exiting IDLE, we must put them + # someplace where the user can retrieve them. The only + # sensible place for this is the untagged_responses dict, + # despite its unfortunate inability to preserve the relative + # order of different response types. 
+ if leftovers := len(imap._idle_responses): + if __debug__ and imap.debug >= 4: + imap._mesg(f'idle quit with {leftovers} leftover responses') + while imap._idle_responses: + typ, datum = imap._idle_responses.pop(0) + imap._append_untagged(typ, datum) + + try: + imap.send(b'DONE' + CRLF) + status, [msg] = imap._command_complete('IDLE', self._tag) + if __debug__ and imap.debug >= 4: + imap._mesg(f'idle status: {status} {msg!r}') + + except OSError: + if not exc_type: + raise + + return False # Do not suppress context body exceptions + + if HAVE_SSL: class IMAP4_SSL(IMAP4): @@ -1348,26 +1684,20 @@ def open(self, host=None, port=None, timeout=None): self.sock = None self.file = None self.process = subprocess.Popen(self.command, - bufsize=DEFAULT_BUFFER_SIZE, + bufsize=0, stdin=subprocess.PIPE, stdout=subprocess.PIPE, shell=True, close_fds=True) self.writefile = self.process.stdin self.readfile = self.process.stdout - def read(self, size): - """Read 'size' bytes from remote.""" - return self.readfile.read(size) - - - def readline(self): - """Read line from remote.""" - return self.readfile.readline() - def send(self, data): """Send data to remote.""" - self.writefile.write(data) - self.writefile.flush() + # Write with buffered semantics to the unbuffered output, avoiding + # partial writes. 
+ sent = 0 + while sent < len(data): + sent += self.writefile.write(data[sent:]) def shutdown(self): diff --git a/Lib/test/test_imaplib.py b/Lib/test/test_imaplib.py index 1fd75d0a3f4c7b3..374a07f2e59108e 100644 --- a/Lib/test/test_imaplib.py +++ b/Lib/test/test_imaplib.py @@ -497,6 +497,56 @@ def test_with_statement_logout(self): # command tests + def test_idle_capability(self): + client, _ = self._setup(SimpleIMAPHandler) + with self.assertRaisesRegex(imaplib.IMAP4.error, + 'does not support IDLE'): + with client.idle(): + pass + + class IdleCmdHandler(SimpleIMAPHandler): + capabilities = 'IDLE' + def cmd_IDLE(self, tag, args): + self._send_textline('+ idling') + self._send_line(b'* 2 EXISTS') + self._send_line(b'* 0 RECENT') + time.sleep(1) + self._send_line(b'* 1 RECENT') + r = yield + if r == b'DONE\r\n': + self._send_tagged(tag, 'OK', 'Idle completed') + else: + self._send_tagged(tag, 'BAD', 'Expected DONE') + + def test_idle_iter(self): + client, _ = self._setup(self.IdleCmdHandler) + client.login('user', 'pass') + with client.idle() as idler: + # iteration should produce responses + typ, datum = next(idler) + self.assertEqual(typ, 'EXISTS') + self.assertEqual(datum, b'2') + typ, datum = next(idler) + self.assertEqual(typ, 'RECENT') + self.assertEqual(datum, b'0') + # iteration should have consumed untagged responses + _, data = client.response('EXISTS') + self.assertEqual(data, [None]) + # responses not iterated should remain after idle + _, data = client.response('RECENT') + self.assertEqual(data, [b'1']) + + def test_idle_burst(self): + client, _ = self._setup(self.IdleCmdHandler) + client.login('user', 'pass') + # burst() should yield immediately available responses + with client.idle() as idler: + batch = list(idler.burst()) + self.assertEqual(len(batch), 2) + # burst() should not have consumed later responses + _, data = client.response('RECENT') + self.assertEqual(data, [b'1']) + def test_login(self): client, _ = self._setup(SimpleIMAPHandler) typ, 
data = client.login('user', 'pass') diff --git a/Misc/ACKS b/Misc/ACKS index b031eb7c11f73f5..c4605c8de2016c7 100644 --- a/Misc/ACKS +++ b/Misc/ACKS @@ -572,6 +572,7 @@ Benjamin Fogle Artem Fokin Arnaud Fontaine Michael Foord +Forest Amaury Forgeot d'Arc Doug Fort Daniel Fortunov diff --git a/Misc/NEWS.d/next/Library/2024-08-01-01-00-00.gh-issue-55454.wy0vGw.rst b/Misc/NEWS.d/next/Library/2024-08-01-01-00-00.gh-issue-55454.wy0vGw.rst new file mode 100644 index 000000000000000..58fc85963217c93 --- /dev/null +++ b/Misc/NEWS.d/next/Library/2024-08-01-01-00-00.gh-issue-55454.wy0vGw.rst @@ -0,0 +1 @@ +Add IMAP4 ``IDLE`` support to the :mod:`imaplib` module. Patch by Forest.
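Usage note on the patch above: the burst() documentation says callers "should plan accordingly" when calling it in a loop, because it returns immediately with no responses once the IDLE duration has elapsed. A minimal sketch of one way to do that follows. The names `watch`, `handle_batch`, and `sessions` are hypothetical and not part of the patch; `conn` is assumed to be any connection object exposing the idle() method added here. An empty batch is treated as the signal to leave the context (sending DONE) and start a fresh IDLE session.

```python
def watch(conn, handle_batch, sessions=1, dur=29 * 60):
    """Run `sessions` IDLE sessions, handing each burst to handle_batch.

    Hypothetical helper, not part of the patch. `conn` is assumed to
    provide idle(dur=...) returning the iterable context manager
    described in the documentation hunk above.
    """
    for _ in range(sessions):
        with conn.idle(dur=dur) as idler:
            # burst() yields nothing once `dur` has elapsed, so an empty
            # batch ends this session; exiting the context sends DONE.
            while batch := list(idler.burst()):
                handle_batch(batch)
```

Per the caveat in the patch, this pattern is not suitable for IMAP4_stream connections on Windows, where burst() blocks on every response and never yields an empty batch.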