Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-119511: Fix OOM vulnerability in imaplib #119514

Merged
merged 6 commits into from
Jan 27, 2025

Conversation

serhiy-storchaka
Copy link
Member

@serhiy-storchaka serhiy-storchaka commented May 24, 2024

The IMAP4 client could consume an arbitrary amount of memory when trying to connent to a malicious server, because it read a "literal" data with a single read(size) call, and BufferedReader.read() allocates the bytes object of the specified size before reading. Now the IMAP4 client reads data by chunks, therefore the amount of used memory is limited by the amount of the data actually been sent by the server.

The IMAP4 client could consume an arbitrary amount of memory when trying
to connent to a malicious server, because it read a "literal" data with a
single read(size) call, and BufferedReader.read() allocates the bytes
object of the specified size before reading. Now the IMAP4 client reads data
by chunks, therefore the amount of used memory is limited by the
amount of the data actually been sent by the server.
@serhiy-storchaka serhiy-storchaka added type-security A security issue needs backport to 3.8 needs backport to 3.9 only security fixes needs backport to 3.10 only security fixes stdlib Python modules in the Lib dir release-blocker needs backport to 3.11 only security fixes needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels May 24, 2024
@serhiy-storchaka serhiy-storchaka requested a review from a team as a code owner May 24, 2024 18:15
@gpshead gpshead marked this pull request as draft May 24, 2024 19:59
@gpshead
Copy link
Member

gpshead commented May 24, 2024

I've marked this Draft for now as discussion on this on the security response team list is not complete. (we'll summarize that in a public issue once it has settled)

Lib/imaplib.py Outdated
delta = min(cursize, size - cursize)
data += self.file.read(delta)
cursize += delta
return data
return self.file.read(size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you've missed erasing this part.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Lib/imaplib.py Show resolved Hide resolved
Lib/imaplib.py Outdated
delta = min(cursize, size - cursize)
data += self.file.read(delta)
cursize += delta
return data
return self.file.read(size)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Lib/imaplib.py Outdated Show resolved Hide resolved
@encukou
Copy link
Member

encukou commented Jun 7, 2024

Since this did not block a release, I'm switching it to deferred blocker.

@encukou
Copy link
Member

encukou commented Jan 14, 2025

I've marked this Draft for now as discussion on this on the security response team list is not complete.

@gpshead, is the discussion still going on?

@encukou
Copy link
Member

encukou commented Jan 20, 2025

@SethMichaelLarson Could you confirm that the PSRT is still discussing this?

@miss-islington-app
Copy link

Thanks @serhiy-storchaka for the PR, and @encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9, 3.10, 3.11, 3.12, 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jan 27, 2025
…GH-119514)

The IMAP4 client could consume an arbitrary amount of memory when trying
to connect to a malicious server, because it read a "literal" data with a
single read(size) call, and BufferedReader.read() allocates the bytes
object of the specified size before reading. Now the IMAP4 client reads data
by chunks, therefore the amount of used memory is limited by the
amount of the data actually been sent by the server.
(cherry picked from commit 735f25c)

Co-authored-by: Serhiy Storchaka <[email protected]>
Co-authored-by: Gregory P. Smith <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Jan 27, 2025

GH-129355 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Jan 27, 2025
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jan 27, 2025
…GH-119514)

The IMAP4 client could consume an arbitrary amount of memory when trying
to connect to a malicious server, because it read a "literal" data with a
single read(size) call, and BufferedReader.read() allocates the bytes
object of the specified size before reading. Now the IMAP4 client reads data
by chunks, therefore the amount of used memory is limited by the
amount of the data actually been sent by the server.
(cherry picked from commit 735f25c)

Co-authored-by: Serhiy Storchaka <[email protected]>
Co-authored-by: Gregory P. Smith <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Jan 27, 2025

GH-129356 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Jan 27, 2025
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jan 27, 2025
…GH-119514)

The IMAP4 client could consume an arbitrary amount of memory when trying
to connect to a malicious server, because it read a "literal" data with a
single read(size) call, and BufferedReader.read() allocates the bytes
object of the specified size before reading. Now the IMAP4 client reads data
by chunks, therefore the amount of used memory is limited by the
amount of the data actually been sent by the server.
(cherry picked from commit 735f25c)

Co-authored-by: Serhiy Storchaka <[email protected]>
Co-authored-by: Gregory P. Smith <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Jan 27, 2025

GH-129357 is a backport of this pull request to the 3.11 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.11 only security fixes label Jan 27, 2025
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jan 27, 2025
…GH-119514)

The IMAP4 client could consume an arbitrary amount of memory when trying
to connect to a malicious server, because it read a "literal" data with a
single read(size) call, and BufferedReader.read() allocates the bytes
object of the specified size before reading. Now the IMAP4 client reads data
by chunks, therefore the amount of used memory is limited by the
amount of the data actually been sent by the server.
(cherry picked from commit 735f25c)

Co-authored-by: Serhiy Storchaka <[email protected]>
Co-authored-by: Gregory P. Smith <[email protected]>
@miss-islington-app
Copy link

Sorry, @serhiy-storchaka and @encukou, I could not cleanly backport this to 3.9 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 735f25c5e3a0f74438c86468ec4dfbe219d93c91 3.9

@bedevere-app
Copy link

bedevere-app bot commented Jan 27, 2025

GH-129358 is a backport of this pull request to the 3.10 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.10 only security fixes label Jan 27, 2025
gpshead added a commit that referenced this pull request Jan 27, 2025
…9514) (GH-129355)

gh-119511: Fix a potential denial of service in imaplib (GH-119514)

The IMAP4 client could consume an arbitrary amount of memory when trying
to connect to a malicious server, because it read a "literal" data with a
single read(size) call, and BufferedReader.read() allocates the bytes
object of the specified size before reading. Now the IMAP4 client reads data
by chunks, therefore the amount of used memory is limited by the
amount of the data actually been sent by the server.
(cherry picked from commit 735f25c)

Co-authored-by: Serhiy Storchaka <[email protected]>
Co-authored-by: Gregory P. Smith <[email protected]>
gpshead added a commit that referenced this pull request Jan 27, 2025
…9514) (GH-129356)

gh-119511: Fix a potential denial of service in imaplib (GH-119514)

The IMAP4 client could consume an arbitrary amount of memory when trying
to connect to a malicious server, because it read a "literal" data with a
single read(size) call, and BufferedReader.read() allocates the bytes
object of the specified size before reading. Now the IMAP4 client reads data
by chunks, therefore the amount of used memory is limited by the
amount of the data actually been sent by the server.
(cherry picked from commit 735f25c)

Co-authored-by: Serhiy Storchaka <[email protected]>
Co-authored-by: Gregory P. Smith <[email protected]>
foresto added a commit to foresto/cpython that referenced this pull request Feb 4, 2025
Our read() implementation designed to support IDLE replaces the one from
PR python#119514, fixing the same problem it was addressing. The tests that it
added are preserved.
gvanrossum added a commit that referenced this pull request Feb 7, 2025
* gh-55454: Add IMAP4 IDLE support to imaplib

This extends imaplib with support for the rfc2177 IMAP IDLE command,
as requested in #55454.  It allows events to be pushed to a client as
they occur, rather than having to continually poll for mailbox changes.

The interface is a new idle() method, which returns an iterable context
manager.  Entering the context starts IDLE mode, during which events
(untagged responses) can be retrieved using the iteration protocol.
Exiting the context sends DONE to the server, ending IDLE mode.

An optional time limit for the IDLE session is supported, for use with
servers that impose an inactivity timeout.

The context manager also offers a burst() method, designed for programs
wishing to process events in batch rather than one at a time.

Notable differences from other implementations:

- It's an extension to imaplib, rather than a replacement.
- It doesn't introduce additional threads.
- It doesn't impose new requirements on the use of imaplib's existing methods.
- It passes the unit tests in CPython's test/test_imaplib.py module
  (and adds new ones).
- It works on Windows, Linux, and other unix-like systems.
- It makes IDLE available on all of imaplib's client variants
  (including IMAP4_stream).
- The interface is pythonic and easy to use.

Caveats:

- Due to a Windows limitation, the special case of IMAP4_stream running
  on Windows lacks a duration/timeout feature. (This is the stdin/stdout
  pipe connection variant; timeouts work fine for socket-based
  connections, even on Windows.) I have documented it where appropriate.

- The file-like imaplib instance attributes are changed from buffered to
  unbuffered mode. This could potentially break any client code that
  uses those objects directly without expecting partial reads/writes.
  However, these attributes are undocumented. As such, I think (and
  PEP 8 confirms) that they are fair game for changes.
  https://peps.python.org/pep-0008/#public-and-internal-interfaces

Usage examples:

#55454 (comment)

Original discussion:

https://discuss.python.org/t/gauging-interest-in-my-imap4-idle-implementation-for-imaplib/59272

Earlier requests and suggestions:

#55454

https://mail.python.org/archives/list/[email protected]/thread/C4TVEYL5IBESQQPPS5GBR7WFBXCLQMZ2/

* gh-55454: Clarify imaplib idle() docs

- Add example idle response tuples, to make the minor difference from other
  imaplib response tuples more obvious.
- Merge the idle context manager's burst() method docs with the IMAP
  object's idle() method docs, for easier understanding.
- Upgrade the Windows note regarding lack of pipe timeouts to a warning.
- Rephrase various things for clarity.

* docs: words instead of <=

Co-authored-by: Peter Bierma <[email protected]>

* docs: improve style in an example

Co-authored-by: Peter Bierma <[email protected]>

* docs: grammatical edit

Co-authored-by: Peter Bierma <[email protected]>

* docs consistency

Co-authored-by: Peter Bierma <[email protected]>

* comment -> docstring

Co-authored-by: Peter Bierma <[email protected]>

* docs: refer to imaplib as "this module"

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: simplify & clarify idle debug message

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: elaborate in idle context manager comment

* imaplib: re-raise BaseException instead of bare except

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: convert private doc string to comment

* docs: correct mistake in imaplib example

This is a correction to 8077f2e, which
changed a variable name in only one place and broke the subsequent
reference to it, departed from the naming convention used in the rest of
the module, and shadowed the type() builtin along the way.

* imaplib: simplify example code in doc string

This is for consistency with the documentation change in 8077f2e
and subsequent correction in 013bbf1.

* imaplib: rename _Idler to Idler, update its docs

* imaplib: add comment in Idler._pop()

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: remove unnecessary blank line

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: comment on use of unbuffered pipes

* docs: imaplib: use the reStructuredText :class: role

Co-authored-by: Peter Bierma <[email protected]>

* Revert "docs: imaplib: use the reStructuredText :class: role"

This reverts commit f385e44, because it
triggers CI failures in the docs by referencing a class that is
(deliberately) undocumented.

* docs: imaplib: use the reST :class: role, escaped

This is a different approach to f385e44, which was reverted for
creating dangling link references.

By prefixing the reStructuredText role target with a ! we disable
conversion to a link, thereby passing continuous integration checks
even though the referenced class is deliberately absent from the
documentation.

* docs: refer to IMAP4 IDLE instead of just IDLE

This clarifies that we are referring to the email protocol, not the editor with the same name.

Co-authored-by: Guido van Rossum <[email protected]>

* imaplib: IDLE -> IMAP4 IDLE in exception message

Co-authored-by: Peter Bierma <[email protected]>

* docs: imaplib idle() phrasing and linking tweaks

* docs: imaplib: avoid linking to an invalid target

This reverts and rephrases part of a3f21cd
which created links to a method on a deliberately undocumented class.
The links didn't work consistently, and caused sphinx warnings that
broke cpython's continuous integration tests.

* imaplib: update test after recent exception change

This fixes a test that was broken by changing an exception in
b01de95

* imaplib: rename idle() dur argument to duration

* imaplib: bytes.index() -> bytes.find()

This makes it more obvious which statement triggers the branch.

* imaplib: remove no-longer-necessary statement

Co-authored-by: Martin Panter <[email protected]>

* docs: imaplib: concise & valid method links

The burst() method is a little tricky to link in restructuredText, due
to quirks of its parent class.  This syntax allows sphinx to generate
working links without generating warnings (which break continuous
integration) and without burdening the reader with unimportant namespace
qualifications.  It makes the reST source ugly, but few people read
the reST source, so it's a tolerable tradeoff.

* imaplib: note data types present in IDLE responses

* docs: imaplib: add comma to reST changes header

Co-authored-by: Bénédikt Tran <[email protected]>

* imaplib: sync doc strings with reST docs

* docs: imaplib: minor Idler clarifications

* imaplib: idle: emit (type, [data, ...]) tuples

This allows our iterator to emit untagged responses that contain literal
strings in the same way that imaplib's existing methods do, while still
emitting exactly one whole response per iteration.

* imaplib: while/yield instead of yield from iter()

* imaplib: idle: use deadline idiom when iterating

This simplifies the code, and avoids idle duration drift from time spent
processing each iteration.

* docs: imaplib: state duration/interval arg types

* docs: imaplib: minor rephrasing of a sentence

* docs: imaplib: reposition a paragraph

This might improve readability, especially when encountering Idler.burst()
for the first time.

* docs: imaplib: wrap long lines in idle() section

* docs: imaplib: note: Idler objects require 'with'

* docs: imaplib: say that 29 minutes is 1740 seconds

* docs: imaplib: mark a paragraph as a 'tip'

* docs: imaplib: rephrase reference to MS Windows

* imaplib: end doc string titles with a period

* imaplib: idle: socket timeouts instead of select()

IDLE timeouts were originally implemented using select() after
checking for the presence of already-buffered data.
That allowed timeouts on pipe connetions like IMAP4_stream.
However, it seemed possible that SSL data arriving without any
IMAP data afterward could cause select() to indicate available
application data when there was none, leading to a read() call
that would block with no timeout. It was unclear under what
conditions this would happen in practice. This change switches
to socket timeouts instead of select(), just to be safe.

This also reverts IMAP4_stream changes that were made to support IDLE
timeouts, since our new implementation only supports socket connections.

* imaplib: Idler: rename private state attributes

* imaplib: rephrase a comment in example code

* docs: imaplib: idle: use Sphinx code-block:: pycon

* docs: whatsnew: imaplib: reformat IMAP4.idle entry

* imaplib: idle: make doc strings brief

Since we generally rely on the reST/html documentation for details, we
can keep these doc strings short. This matches the module's existing doc
string style and avoids having to sync small changes between two files.

* imaplib: Idler: split assert into two statements

* imaplib: Idler: move assignment out of try: block

* imaplib: Idler: move __exit__() for readability

* imaplib: Idler: move __next__() for readability

* imaplib: test: make IdleCmdHandler a global class

* docs: imaplib: idle: collapse double-spaces

* imaplib: warn on use of undocumented 'file' attr

* imaplib: revert import reformatting

Since we no longer import platform or selectors, the original import
statement style can be restored, reducing the footprint of PR #122542.

* imaplib: restore original exception msg formatting

This reduces the footprint of PR #122542.

* docs: imaplib: idle: versionadded:: next

* imaplib: move import statement to where it's used

This import is only needed if external code tries to use an attribute
that it shouldn't be using. Making it a local import reduces module
loading time in supported cases.

* imaplib test: RuntimeWarning on IMAP4.file access

* imaplib: use stacklevel=2 in warnings.warn()

* imaplib test: simplify IMAP4.file warning test

* imaplib test: pre-idle-continuation response

* imaplib test: post-done untagged response

* imaplib: downgrade idle-denied exception to error

This makes it easier for client code to distinguish a temporary
rejection of the IDLE command from a server responding incorrectly to
IDLE.

* imaplib: simplify check for socket object

* imaplib: narrow the scope of IDLE socket timeouts

If an IDLE duration or burst() was in use, and an unsolicited response
contained a literal string, and crossed a packet boundary, and the
subsequent packet was delayed beyond the IDLE feature's time limit, the
timeout would leave the incoming protocol stream in a bad state (with
the tail of that response appearing where the start of a response is
expected).

This change moves the IDLE socket timeout to cover only the start
of a response, so it can no longer cause that problem.

* imaplib: preserve partial reads on exception

This ensures that short IDLE durations / burst() intervals
won't risk corrupting response lines that span multiple packets.

* imaplib: read/readline: save multipart buffer tail

For resilience if read() or readline() ever complete with more than one
bytes object remaining in the buffer. This is not expected to happen,
but it seems wise to be prepared for a future change making it possible.

* imaplib: use TimeoutError subclass only if needed

* doc: imaplib: elaborate on IDLE response delivery

* doc: imaplib: elaborate in note re: IMAP4.response

* imaplib: comment on benefit of reading in chunks

Our read() implementation designed to support IDLE replaces the one from
PR #119514, fixing the same problem it was addressing. The tests that it
added are preserved.

* imaplib: readline(): treat ConnectionError as EOF

---------

Co-authored-by: Gregory P. Smith <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
Co-authored-by: Guido van Rossum <[email protected]>
Co-authored-by: Guido van Rossum <[email protected]>
Co-authored-by: Martin Panter <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull request Feb 7, 2025
* pythongh-55454: Add IMAP4 IDLE support to imaplib

This extends imaplib with support for the rfc2177 IMAP IDLE command,
as requested in python#55454.  It allows events to be pushed to a client as
they occur, rather than having to continually poll for mailbox changes.

The interface is a new idle() method, which returns an iterable context
manager.  Entering the context starts IDLE mode, during which events
(untagged responses) can be retrieved using the iteration protocol.
Exiting the context sends DONE to the server, ending IDLE mode.

An optional time limit for the IDLE session is supported, for use with
servers that impose an inactivity timeout.

The context manager also offers a burst() method, designed for programs
wishing to process events in batch rather than one at a time.

Notable differences from other implementations:

- It's an extension to imaplib, rather than a replacement.
- It doesn't introduce additional threads.
- It doesn't impose new requirements on the use of imaplib's existing methods.
- It passes the unit tests in CPython's test/test_imaplib.py module
  (and adds new ones).
- It works on Windows, Linux, and other unix-like systems.
- It makes IDLE available on all of imaplib's client variants
  (including IMAP4_stream).
- The interface is pythonic and easy to use.

Caveats:

- Due to a Windows limitation, the special case of IMAP4_stream running
  on Windows lacks a duration/timeout feature. (This is the stdin/stdout
  pipe connection variant; timeouts work fine for socket-based
  connections, even on Windows.) I have documented it where appropriate.

- The file-like imaplib instance attributes are changed from buffered to
  unbuffered mode. This could potentially break any client code that
  uses those objects directly without expecting partial reads/writes.
  However, these attributes are undocumented. As such, I think (and
  PEP 8 confirms) that they are fair game for changes.
  https://peps.python.org/pep-0008/#public-and-internal-interfaces

Usage examples:

python#55454 (comment)

Original discussion:

https://discuss.python.org/t/gauging-interest-in-my-imap4-idle-implementation-for-imaplib/59272

Earlier requests and suggestions:

python#55454

https://mail.python.org/archives/list/[email protected]/thread/C4TVEYL5IBESQQPPS5GBR7WFBXCLQMZ2/

* pythongh-55454: Clarify imaplib idle() docs

- Add example idle response tuples, to make the minor difference from other
  imaplib response tuples more obvious.
- Merge the idle context manager's burst() method docs with the IMAP
  object's idle() method docs, for easier understanding.
- Upgrade the Windows note regarding lack of pipe timeouts to a warning.
- Rephrase various things for clarity.

* docs: words instead of <=

Co-authored-by: Peter Bierma <[email protected]>

* docs: improve style in an example

Co-authored-by: Peter Bierma <[email protected]>

* docs: grammatical edit

Co-authored-by: Peter Bierma <[email protected]>

* docs consistency

Co-authored-by: Peter Bierma <[email protected]>

* comment -> docstring

Co-authored-by: Peter Bierma <[email protected]>

* docs: refer to imaplib as "this module"

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: simplify & clarify idle debug message

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: elaborate in idle context manager comment

* imaplib: re-raise BaseException instead of bare except

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: convert private doc string to comment

* docs: correct mistake in imaplib example

This is a correction to 8077f2e, which
changed a variable name in only one place and broke the subsequent
reference to it, departed from the naming convention used in the rest of
the module, and shadowed the type() builtin along the way.

* imaplib: simplify example code in doc string

This is for consistency with the documentation change in 8077f2e
and subsequent correction in 013bbf1.

* imaplib: rename _Idler to Idler, update its docs

* imaplib: add comment in Idler._pop()

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: remove unnecessary blank line

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: comment on use of unbuffered pipes

* docs: imaplib: use the reStructuredText :class: role

Co-authored-by: Peter Bierma <[email protected]>

* Revert "docs: imaplib: use the reStructuredText :class: role"

This reverts commit f385e44, because it
triggers CI failures in the docs by referencing a class that is
(deliberately) undocumented.

* docs: imaplib: use the reST :class: role, escaped

This is a different approach to f385e44, which was reverted for
creating dangling link references.

By prefixing the reStructuredText role target with a ! we disable
conversion to a link, thereby passing continuous integration checks
even though the referenced class is deliberately absent from the
documentation.

* docs: refer to IMAP4 IDLE instead of just IDLE

This clarifies that we are referring to the email protocol, not the editor with the same name.

Co-authored-by: Guido van Rossum <[email protected]>

* imaplib: IDLE -> IMAP4 IDLE in exception message

Co-authored-by: Peter Bierma <[email protected]>

* docs: imaplib idle() phrasing and linking tweaks

* docs: imaplib: avoid linking to an invalid target

This reverts and rephrases part of a3f21cd
which created links to a method on a deliberately undocumented class.
The links didn't work consistently, and caused sphinx warnings that
broke cpython's continuous integration tests.

* imaplib: update test after recent exception change

This fixes a test that was broken by changing an exception in
b01de95

* imaplib: rename idle() dur argument to duration

* imaplib: bytes.index() -> bytes.find()

This makes it more obvious which statement triggers the branch.

* imaplib: remove no-longer-necessary statement

Co-authored-by: Martin Panter <[email protected]>

* docs: imaplib: concise & valid method links

The burst() method is a little tricky to link in restructuredText, due
to quirks of its parent class.  This syntax allows sphinx to generate
working links without generating warnings (which break continuous
integration) and without burdening the reader with unimportant namespace
qualifications.  It makes the reST source ugly, but few people read
the reST source, so it's a tolerable tradeoff.

* imaplib: note data types present in IDLE responses

* docs: imaplib: add comma to reST changes header

Co-authored-by: Bénédikt Tran <[email protected]>

* imaplib: sync doc strings with reST docs

* docs: imaplib: minor Idler clarifications

* imaplib: idle: emit (type, [data, ...]) tuples

This allows our iterator to emit untagged responses that contain literal
strings in the same way that imaplib's existing methods do, while still
emitting exactly one whole response per iteration.

* imaplib: while/yield instead of yield from iter()

* imaplib: idle: use deadline idiom when iterating

This simplifies the code, and avoids idle duration drift from time spent
processing each iteration.

* docs: imaplib: state duration/interval arg types

* docs: imaplib: minor rephrasing of a sentence

* docs: imaplib: reposition a paragraph

This might improve readability, especially when encountering Idler.burst()
for the first time.

* docs: imaplib: wrap long lines in idle() section

* docs: imaplib: note: Idler objects require 'with'

* docs: imaplib: say that 29 minutes is 1740 seconds

* docs: imaplib: mark a paragraph as a 'tip'

* docs: imaplib: rephrase reference to MS Windows

* imaplib: end doc string titles with a period

* imaplib: idle: socket timeouts instead of select()

IDLE timeouts were originally implemented using select() after
checking for the presence of already-buffered data.
That allowed timeouts on pipe connetions like IMAP4_stream.
However, it seemed possible that SSL data arriving without any
IMAP data afterward could cause select() to indicate available
application data when there was none, leading to a read() call
that would block with no timeout. It was unclear under what
conditions this would happen in practice. This change switches
to socket timeouts instead of select(), just to be safe.

This also reverts IMAP4_stream changes that were made to support IDLE
timeouts, since our new implementation only supports socket connections.

* imaplib: Idler: rename private state attributes

* imaplib: rephrase a comment in example code

* docs: imaplib: idle: use Sphinx code-block:: pycon

* docs: whatsnew: imaplib: reformat IMAP4.idle entry

* imaplib: idle: make doc strings brief

Since we generally rely on the reST/html documentation for details, we
can keep these doc strings short. This matches the module's existing doc
string style and avoids having to sync small changes between two files.

* imaplib: Idler: split assert into two statements

* imaplib: Idler: move assignment out of try: block

* imaplib: Idler: move __exit__() for readability

* imaplib: Idler: move __next__() for readability

* imaplib: test: make IdleCmdHandler a global class

* docs: imaplib: idle: collapse double-spaces

* imaplib: warn on use of undocumented 'file' attr

* imaplib: revert import reformatting

Since we no longer import platform or selectors, the original import
statement style can be restored, reducing the footprint of PR python#122542.

* imaplib: restore original exception msg formatting

This reduces the footprint of PR python#122542.

* docs: imaplib: idle: versionadded:: next

* imaplib: move import statement to where it's used

This import is only needed if external code tries to use an attribute
that it shouldn't be using. Making it a local import reduces module
loading time in supported cases.

* imaplib test: RuntimeWarning on IMAP4.file access

* imaplib: use stacklevel=2 in warnings.warn()

* imaplib test: simplify IMAP4.file warning test

* imaplib test: pre-idle-continuation response

* imaplib test: post-done untagged response

* imaplib: downgrade idle-denied exception to error

This makes it easier for client code to distinguish a temporary
rejection of the IDLE command from a server responding incorrectly to
IDLE.

* imaplib: simplify check for socket object

* imaplib: narrow the scope of IDLE socket timeouts

If an IDLE duration or burst() was in use, and an unsolicited response
contained a literal string, and crossed a packet boundary, and the
subsequent packet was delayed beyond the IDLE feature's time limit, the
timeout would leave the incoming protocol stream in a bad state (with
the tail of that response appearing where the start of a response is
expected).

This change moves the IDLE socket timeout to cover only the start
of a response, so it can no longer cause that problem.

* imaplib: preserve partial reads on exception

This ensures that short IDLE durations / burst() intervals
won't risk corrupting response lines that span multiple packets.

* imaplib: read/readline: save multipart buffer tail

For resilience if read() or readline() ever complete with more than one
bytes object remaining in the buffer. This is not expected to happen,
but it seems wise to be prepared for a future change making it possible.

* imaplib: use TimeoutError subclass only if needed

* doc: imaplib: elaborate on IDLE response delivery

* doc: imaplib: elaborate in note re: IMAP4.response

* imaplib: comment on benefit of reading in chunks

Our read() implementation designed to support IDLE replaces the one from
PR python#119514, fixing the same problem it was addressing. The tests that it
added are preserved.

* imaplib: readline(): treat ConnectionError as EOF

---------

Co-authored-by: Gregory P. Smith <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
Co-authored-by: Guido van Rossum <[email protected]>
Co-authored-by: Guido van Rossum <[email protected]>
Co-authored-by: Martin Panter <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
cmaloney pushed a commit to cmaloney/cpython that referenced this pull request Feb 8, 2025
* pythongh-55454: Add IMAP4 IDLE support to imaplib

This extends imaplib with support for the rfc2177 IMAP IDLE command,
as requested in python#55454.  It allows events to be pushed to a client as
they occur, rather than having to continually poll for mailbox changes.

The interface is a new idle() method, which returns an iterable context
manager.  Entering the context starts IDLE mode, during which events
(untagged responses) can be retrieved using the iteration protocol.
Exiting the context sends DONE to the server, ending IDLE mode.

An optional time limit for the IDLE session is supported, for use with
servers that impose an inactivity timeout.

The context manager also offers a burst() method, designed for programs
wishing to process events in batch rather than one at a time.

Notable differences from other implementations:

- It's an extension to imaplib, rather than a replacement.
- It doesn't introduce additional threads.
- It doesn't impose new requirements on the use of imaplib's existing methods.
- It passes the unit tests in CPython's test/test_imaplib.py module
  (and adds new ones).
- It works on Windows, Linux, and other unix-like systems.
- It makes IDLE available on all of imaplib's client variants
  (including IMAP4_stream).
- The interface is pythonic and easy to use.

Caveats:

- Due to a Windows limitation, the special case of IMAP4_stream running
  on Windows lacks a duration/timeout feature. (This is the stdin/stdout
  pipe connection variant; timeouts work fine for socket-based
  connections, even on Windows.) I have documented it where appropriate.

- The file-like imaplib instance attributes are changed from buffered to
  unbuffered mode. This could potentially break any client code that
  uses those objects directly without expecting partial reads/writes.
  However, these attributes are undocumented. As such, I think (and
  PEP 8 confirms) that they are fair game for changes.
  https://peps.python.org/pep-0008/#public-and-internal-interfaces

Usage examples:

python#55454 (comment)

Original discussion:

https://discuss.python.org/t/gauging-interest-in-my-imap4-idle-implementation-for-imaplib/59272

Earlier requests and suggestions:

python#55454

https://mail.python.org/archives/list/[email protected]/thread/C4TVEYL5IBESQQPPS5GBR7WFBXCLQMZ2/

* pythongh-55454: Clarify imaplib idle() docs

- Add example idle response tuples, to make the minor difference from other
  imaplib response tuples more obvious.
- Merge the idle context manager's burst() method docs with the IMAP
  object's idle() method docs, for easier understanding.
- Upgrade the Windows note regarding lack of pipe timeouts to a warning.
- Rephrase various things for clarity.

* docs: words instead of <=

Co-authored-by: Peter Bierma <[email protected]>

* docs: improve style in an example

Co-authored-by: Peter Bierma <[email protected]>

* docs: grammatical edit

Co-authored-by: Peter Bierma <[email protected]>

* docs consistency

Co-authored-by: Peter Bierma <[email protected]>

* comment -> docstring

Co-authored-by: Peter Bierma <[email protected]>

* docs: refer to imaplib as "this module"

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: simplify & clarify idle debug message

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: elaborate in idle context manager comment

* imaplib: re-raise BaseException instead of bare except

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: convert private doc string to comment

* docs: correct mistake in imaplib example

This is a correction to 8077f2e, which
changed a variable name in only one place and broke the subsequent
reference to it, departed from the naming convention used in the rest of
the module, and shadowed the type() builtin along the way.

* imaplib: simplify example code in doc string

This is for consistency with the documentation change in 8077f2e
and subsequent correction in 013bbf1.

* imaplib: rename _Idler to Idler, update its docs

* imaplib: add comment in Idler._pop()

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: remove unnecessary blank line

Co-authored-by: Peter Bierma <[email protected]>

* imaplib: comment on use of unbuffered pipes

* docs: imaplib: use the reStructuredText :class: role

Co-authored-by: Peter Bierma <[email protected]>

* Revert "docs: imaplib: use the reStructuredText :class: role"

This reverts commit f385e44, because it
triggers CI failures in the docs by referencing a class that is
(deliberately) undocumented.

* docs: imaplib: use the reST :class: role, escaped

This is a different approach to f385e44, which was reverted for
creating dangling link references.

By prefixing the reStructuredText role target with a ! we disable
conversion to a link, thereby passing continuous integration checks
even though the referenced class is deliberately absent from the
documentation.

* docs: refer to IMAP4 IDLE instead of just IDLE

This clarifies that we are referring to the email protocol, not the editor with the same name.

Co-authored-by: Guido van Rossum <[email protected]>

* imaplib: IDLE -> IMAP4 IDLE in exception message

Co-authored-by: Peter Bierma <[email protected]>

* docs: imaplib idle() phrasing and linking tweaks

* docs: imaplib: avoid linking to an invalid target

This reverts and rephrases part of a3f21cd
which created links to a method on a deliberately undocumented class.
The links didn't work consistently, and caused sphinx warnings that
broke cpython's continuous integration tests.

* imaplib: update test after recent exception change

This fixes a test that was broken by changing an exception in
b01de95

* imaplib: rename idle() dur argument to duration

* imaplib: bytes.index() -> bytes.find()

This makes it more obvious which statement triggers the branch.

* imaplib: remove no-longer-necessary statement

Co-authored-by: Martin Panter <[email protected]>

* docs: imaplib: concise & valid method links

The burst() method is a little tricky to link in restructuredText, due
to quirks of its parent class.  This syntax allows sphinx to generate
working links without generating warnings (which break continuous
integration) and without burdening the reader with unimportant namespace
qualifications.  It makes the reST source ugly, but few people read
the reST source, so it's a tolerable tradeoff.

* imaplib: note data types present in IDLE responses

* docs: imaplib: add comma to reST changes header

Co-authored-by: Bénédikt Tran <[email protected]>

* imaplib: sync doc strings with reST docs

* docs: imaplib: minor Idler clarifications

* imaplib: idle: emit (type, [data, ...]) tuples

This allows our iterator to emit untagged responses that contain literal
strings in the same way that imaplib's existing methods do, while still
emitting exactly one whole response per iteration.

* imaplib: while/yield instead of yield from iter()

* imaplib: idle: use deadline idiom when iterating

This simplifies the code, and avoids idle duration drift from time spent
processing each iteration.

* docs: imaplib: state duration/interval arg types

* docs: imaplib: minor rephrasing of a sentence

* docs: imaplib: reposition a paragraph

This might improve readability, especially when encountering Idler.burst()
for the first time.

* docs: imaplib: wrap long lines in idle() section

* docs: imaplib: note: Idler objects require 'with'

* docs: imaplib: say that 29 minutes is 1740 seconds

* docs: imaplib: mark a paragraph as a 'tip'

* docs: imaplib: rephrase reference to MS Windows

* imaplib: end doc string titles with a period

* imaplib: idle: socket timeouts instead of select()

IDLE timeouts were originally implemented using select() after
checking for the presence of already-buffered data.
That allowed timeouts on pipe connetions like IMAP4_stream.
However, it seemed possible that SSL data arriving without any
IMAP data afterward could cause select() to indicate available
application data when there was none, leading to a read() call
that would block with no timeout. It was unclear under what
conditions this would happen in practice. This change switches
to socket timeouts instead of select(), just to be safe.

This also reverts IMAP4_stream changes that were made to support IDLE
timeouts, since our new implementation only supports socket connections.

* imaplib: Idler: rename private state attributes

* imaplib: rephrase a comment in example code

* docs: imaplib: idle: use Sphinx code-block:: pycon

* docs: whatsnew: imaplib: reformat IMAP4.idle entry

* imaplib: idle: make doc strings brief

Since we generally rely on the reST/html documentation for details, we
can keep these doc strings short. This matches the module's existing doc
string style and avoids having to sync small changes between two files.

* imaplib: Idler: split assert into two statements

* imaplib: Idler: move assignment out of try: block

* imaplib: Idler: move __exit__() for readability

* imaplib: Idler: move __next__() for readability

* imaplib: test: make IdleCmdHandler a global class

* docs: imaplib: idle: collapse double-spaces

* imaplib: warn on use of undocumented 'file' attr

* imaplib: revert import reformatting

Since we no longer import platform or selectors, the original import
statement style can be restored, reducing the footprint of PR python#122542.

* imaplib: restore original exception msg formatting

This reduces the footprint of PR python#122542.

* docs: imaplib: idle: versionadded:: next

* imaplib: move import statement to where it's used

This import is only needed if external code tries to use an attribute
that it shouldn't be using. Making it a local import reduces module
loading time in supported cases.

* imaplib test: RuntimeWarning on IMAP4.file access

* imaplib: use stacklevel=2 in warnings.warn()

* imaplib test: simplify IMAP4.file warning test

* imaplib test: pre-idle-continuation response

* imaplib test: post-done untagged response

* imaplib: downgrade idle-denied exception to error

This makes it easier for client code to distinguish a temporary
rejection of the IDLE command from a server responding incorrectly to
IDLE.

* imaplib: simplify check for socket object

* imaplib: narrow the scope of IDLE socket timeouts

If an IDLE duration or burst() was in use, and an unsolicited response
contained a literal string, and crossed a packet boundary, and the
subsequent packet was delayed beyond the IDLE feature's time limit, the
timeout would leave the incoming protocol stream in a bad state (with
the tail of that response appearing where the start of a response is
expected).

This change moves the IDLE socket timeout to cover only the start
of a response, so it can no longer cause that problem.

* imaplib: preserve partial reads on exception

This ensures that short IDLE durations / burst() intervals
won't risk corrupting response lines that span multiple packets.

* imaplib: read/readline: save multipart buffer tail

For resilience if read() or readline() ever complete with more than one
bytes object remaining in the buffer. This is not expected to happen,
but it seems wise to be prepared for a future change making it possible.

* imaplib: use TimeoutError subclass only if needed

* doc: imaplib: elaborate on IDLE response delivery

* doc: imaplib: elaborate in note re: IMAP4.response

* imaplib: comment on benefit of reading in chunks

Our read() implementation designed to support IDLE replaces the one from
PR python#119514, fixing the same problem it was addressing. The tests that it
added are preserved.

* imaplib: readline(): treat ConnectionError as EOF

---------

Co-authored-by: Gregory P. Smith <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
Co-authored-by: Guido van Rossum <[email protected]>
Co-authored-by: Guido van Rossum <[email protected]>
Co-authored-by: Martin Panter <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Feb 18, 2025

GH-130248 is a backport of this pull request to the 3.9 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.9 only security fixes label Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deferred-blocker stdlib Python modules in the Lib dir type-security A security issue
Projects
Development

Successfully merging this pull request may close these issues.

6 participants