Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The loop in utility socket.create_connection() swallows previous errors #73943

Closed
ankostis mannequin opened this issue Mar 8, 2017 · 11 comments
Closed

The loop in utility socket.create_connection() swallows previous errors #73943

ankostis mannequin opened this issue Mar 8, 2017 · 11 comments
Labels
3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@ankostis
Copy link
Mannequin

ankostis mannequin commented Mar 8, 2017

BPO 29757
Nosy @vstinner, @1st1, @ankostis
PRs
  • bpo-29757: don't swallows errors in the socket.create_connection() utility loop #562
  • Superseder
  • bpo-29980: OSError: multiple exceptions should preserve the exception type if it is common
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2017-03-08.14:45:17.845>
    labels = ['type-feature', 'library', '3.9']
    title = 'The loop in utility `socket.create_connection()` swallows previous errors'
    updated_at = <Date 2021-05-22.17:19:34.463>
    user = 'https://github.com/ankostis'

    bugs.python.org fields:

    activity = <Date 2021-05-22.17:19:34.463>
    actor = 'iritkatriel'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2017-03-08.14:45:17.845>
    creator = 'ankostis'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 29757
    keywords = []
    message_count = 10.0
    messages = ['289238', '289312', '289313', '289771', '290009', '292929', '302089', '335930', '336193', '336206']
    nosy_count = 3.0
    nosy_names = ['vstinner', 'yselivanov', 'ankostis']
    pr_nums = ['562']
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'patch review'
    status = 'open'
    superseder = '29980'
    type = 'enhancement'
    url = 'https://bugs.python.org/issue29757'
    versions = ['Python 3.9']

    @ankostis
    Copy link
    Mannequin Author

    ankostis mannequin commented Mar 8, 2017

    Context

    The utility method socket.create_connection() currently works like that:

    1. resolve the destination-address into one or more IP(v4 & v6) addresses;
    2. loop on each IP address and stop to the 1st one to work;
    3. if none works, re-raise the last error.

    The problem

    So currently the loop in socket.create_connection() ignores all intermediate errors and reports only the last connection failure,
    which might be irrelevant. For instance, when both IPv4 & IPv6 networks are supported, usually the last address is a IPv6 address and it frequently fails with an irrelevant error - the actual cause have already been ignored.

    ## Possible solutions & open questions
    To facilitate network debugging, there are at least 3 options:
    a. log each failure as they happen, but that would get the final failure twice: once as a (warning?) message, and once as an exception .
    b. collect all failures and log them only when connection fails to collect the errors,
    but that might miss important infos to the user;
    c. collect and return all failures in list attached to the raised exception.

    A question for cases (a) & (b) is what logging "means" to use: the warnings or logging module?
    And if logging is chosen, log them in 'DEBUG' or 'WARNING' level?

    Case (c) sidesteps the above questions.

    @ankostis ankostis mannequin added 3.7 (EOL) end of life stdlib Python modules in the Lib dir labels Mar 8, 2017
    @1st1
    Copy link
    Member

    1st1 commented Mar 9, 2017

    Case (c) sidesteps the above questions.

    I like the (c) option. I don't think we should use logging/warnings here, as they can produce unwanted noise that few know how to silence.

    When the list of errors is passed as a second argument to the exception, how is it rendered? Would it make sense to concatenate all error messages:

      if errs:
          errors_msg = '  \n'.join(str(e) for e in errs)
          raise error(f"connection failed:\n{errors_msg}", errs)

    Or maybe we should track which addr caused which exception?

      for res in getaddrinfo(host, port, 0, SOCK_STREAM):
          af, socktype, proto, canonname, sa = res
          try:
              ...
          except error as e:
              errs.append((canonname, e))
              ...
    
      if errs:
          error_msg = '  \n'.join(f'{addr}: {e}' for addr, e in errs)
          raise error('connection failed:\n{error_msg}', errs)

    @1st1
    Copy link
    Member

    1st1 commented Mar 9, 2017

    This is a new feature, so we can only push it to 3.7.

    @ankostis
    Copy link
    Mannequin Author

    ankostis mannequin commented Mar 17, 2017

    When the list of errors is passed as a second argument to the exception, how is it rendered?

    This is how my latest ec887c0c3 looks on Linux:

        >>> import socket
        >>> socket.create_connection(('localhost', 12345))
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "/usr/lib/python3.5/socket.py", line 714, in create_connection
            raise error("no connection possible due to %d errors" % nerr, errors)
        OSError: [Errno no connection possible due to 2 errors] [ConnectionRefusedError(111, 'Connection refused'), ConnectionRefusedError(111, 'Connection refused')]

    And this is on Windows:

        >>> socket.create_connection(('localhost', 12345), 1)
        Traceback (most recent call last):
          File "D:\Apps\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
            exec(code_obj, self.user_global_ns, self.user_ns)
          File "<ipython-input-13-758ac642f3d5>", line 1, in <module>
            socket.create_connection(('localhost', 12345), 1)
          File "D:\Apps\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\socket.py", line 714, in create_connection
            raise error("no connection possible due to %d errors" % nerr, errors)
        OSError: [Errno no connection possible due to 2 errors] [timeout('timed out',), timeout('timed out',)]

    Would it make sense to concatenate all error messages:

    But then the user will not receive a list of errors to inspect, but just a big string.
    The biggest problem in my latest ec887c0c3 is that I'm abusing the 1st arg to OSError() constructor, instead of being an errno it is a string.
    But I got that from the existing code.[1]

    And still, this PR is not yer finished because there is no feedback on any intermediate errors in the case of success.
    As suggested on the OP, this may happen (in my order of preference):

    1. with a new argument for the user to provide the list to collect the errors (changes the API backward-compatiblly);
    2. with logging logs;
    3. with warnings;
    4. do nothing.

    I prefer logging over warnings because they are more configurable.

    [1] https://github.com/python/cpython/blob/master/Lib/socket.py#L724

    @ankostis
    Copy link
    Mannequin Author

    ankostis mannequin commented Mar 22, 2017

    This is a new feature, so we can only push it to 3.7.

    As it stands now(b39d4b1) it hardly contains any change - just in the case of multiple intermediate errors AND final failure, the exception raised is a bit different.

    AFAICS it would be a "change" for 3.7 if my one of the 3 options of my last comment gets implemented.

    Any ideas which one to implement?

    @ankostis
    Copy link
    Mannequin Author

    ankostis mannequin commented May 3, 2017

    Just to remind that as it stands now(b39d4b1) it contains no API changes at all, so it should be considered for merge into 3.6.x line.

    @vstinner
    Copy link
    Member

    I'm not excited to keep multiple exception objects alive because the Exception.__traceback__ causes many hairy reference cycle issues.

    I just found an old bug in socket.create_connection():
    #3546

    See https://bugs.python.org/issue29639#msg302087 for my story of the bug.

    @csabella csabella added 3.8 (EOL) end of life type-feature A feature request or enhancement labels Feb 16, 2019
    @vstinner
    Copy link
    Member

    I propose the close the issue as WONTFIX. I don't see how the current behavior is an issue.

    I expect a very large traceback just because a server is down/reject connections. I'm not sure that I want that by default.

    The problem

    So currently the loop in socket.create_connection() ignores all intermediate errors and reports only the last connection failure,
    which might be irrelevant. For instance, when both IPv4 & IPv6 networks are supported, usually the last address is a IPv6 address and it frequently fails with an irrelevant error - the actual cause have already been ignored.

    It's not so hard to reimplement create_connection(): it's a loop on getaddrinfo(). So you can decide how to deal with errors.

    --

    I recall that Yury Selivanov, Nathnaniel Smith and Guido van Rossum discussed the idea of "MultiError": an object which contains multiple exceptions, object which would render properly all exceptions when a traceback is displayed.

    @ankostis
    Copy link
    Mannequin Author

    ankostis mannequin commented Feb 21, 2019

    The problem of not fixing this (and just add a suggestion in the docs saying that the user may re-implement this method) is that frequently this call is hidden at the core of many networking libraries. So monkey-patching is needed instead, which is not always nice or easy.

    I just found an old bug in socket.create_connection():
    #3546

    Regarding bpo-3546, i'm surprised that a function-local variable keeps its contents alive, long after the frame of that function has gone. I thought that local-variables were deterministically (ref-countering) destructed.
    What is happening?

    @vstinner
    Copy link
    Member

    I thought that local-variables were deterministically (ref-countering) destructed. What is happening?

    IT's a reference cycle. Exception => traceback => frame => exception. The 'err' variable kept the frame alive which kept the exception alive which kept everything alive. The GC is supposed to be able to break ref cycles, but GC collections are irregular and sometimes the GC fails to break complex cycles.

    @csabella csabella added 3.9 only security fixes and removed 3.7 (EOL) end of life 3.8 (EOL) end of life labels Jan 26, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel
    Copy link
    Member

    This is a duplicate of #74166.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants