Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type inference for narrowing of unions #1436

Closed
bdarnell opened this issue Apr 24, 2016 · 13 comments
Closed

Type inference for narrowing of unions #1436

bdarnell opened this issue Apr 24, 2016 · 13 comments
Labels

Comments

@bdarnell
Copy link
Contributor

From #1141 (comment)

This pattern is the most common cause of cast() calls in my code, and it would also be awesome if mypy could infer this narrowing on its own so I wouldn't need the explicit cast (although it may be too much to ask).

def f(x: Union[str, bytes]):
    if isinstance(x, bytes):
        x = x.decode('utf8')
    x = cast(str, x)
    # From here on x is a str instead of a union.

The logic would be that in the (implied) else branch of the if, we know that x is not bytes, so it must be of type Union[str, bytes] - bytes = Union[str] = str. After executing the assignment in the if branch, we also know that it is of type str there. The type checker could construct a new union from the types that are possible in each branch of the if, which in this case would simplify to str.

On the other hand, this is currently legal and would become illegal if the union were automatically narrowed in this way:

def f(x: Union[str, bytes]):
    if isinstance(x, bytes):
        x = decode('utf8')
    if something():
        x = b"some bytes"
@JukkaL
Copy link
Collaborator

JukkaL commented Apr 24, 2016

This works for me already without a cast (in Python 3 mode):

from typing import Union

def f(x: Union[str, bytes]):
    if isinstance(x, bytes):
        x = x.decode('utf8')
    x + ''  # no error

Can you give a full example and command line where the cast is required?

@bdarnell
Copy link
Contributor Author

Ah, you're right. I reduced this from a more complicated condition and didn't test the reduced version properly.

@bdarnell
Copy link
Contributor Author

OK, here's a better reduction. The error is in python 2 mode, presumably because the nested if blocks confuse it.

from typing import Union, cast

import sys
PY3 = sys.version_info >= (3,)

if PY3:
    unicode = str
    Basestring = str
else:
    Basestring = Union[unicode, bytes]

def require_str(s: str): pass

def f(name: Basestring):
    if not PY3:
        if isinstance(name, unicode):
            name = name.encode('utf-8')
    #name = cast(str, name)
    require_str(name)

I've gone through a number of iterations on the condition here; I'm not sure what the right way to express it is. It's an edge case in that it wants to deal with character data, but it needs to call into python 2 APIs that deal exclusively in native str objects (the __import__() function), so it must encode any incoming unicode strings in py2.

Here's one that seems to work in a single if statement without an explicit cast in both 2 and 3:

if not isinstance(name, str):
    name = name.encode('utf-8')
require_str(name)

@bdarnell bdarnell reopened this Apr 24, 2016
@gvanrossum
Copy link
Member

gvanrossum commented Apr 25, 2016 via email

@JukkaL
Copy link
Collaborator

JukkaL commented Apr 25, 2016

@gvanrossum You are probably right. Mypy thinks that the else block of the outermost if statement (which is empty) might be taken even in Python 2 mode, even though the if block is always executed.

@JukkaL JukkaL added the bug mypy got something wrong label Apr 25, 2016
@gvanrossum gvanrossum added this to the 0.5 milestone Apr 28, 2016
@gvanrossum gvanrossum removed this from the 0.5 milestone Mar 29, 2017
@ilevkivskyi
Copy link
Member

Both original and new examples work correctly on master and reveal_type(name) shows str in both Python 2 and Python 3 mode.

@Krumpet
Copy link

Krumpet commented Oct 15, 2018

I know this is closed, but right now on Python3 and using conditional expressions, I'm getting wrong type inference on a union type:

dict: Dict[str, int]
def get_int(arg: Union[str, int]) -> int:
    ret_val_1 = arg if isinstance(arg, int) else dict[arg]  # type: int
    ret_val_2 = dict[arg] if isinstance(arg, str) else arg  # type: Union[str, int]
    if isinstance(arg, str):
        ret_val_3 = dict[arg] # type: int
    else:
        ret_val_3 = arg # type: int

    return ret_val_1

I've written comments stating what the type of ret_val_n is inferred to in each case. In the else clause of the conditional expressions, arg remains of Union type, rather than resolving to the remaining type.

@gvanrossum
Copy link
Member

@Krumpet That's a separate issue -- currently we don't do conditional types in branches of ternary expressions (... if ... else ...) -- only when ifstatements or assert are used.

@gvanrossum
Copy link
Member

[me]

That's a separate issue

It's somewhere in the tracker but I can't find the issue because the keywords ("conditional", "termary", "union") are too common to be of much value in searching the tracker. (@ilevkivskyi maybe you recall the issue or how to find it?)

@emmatyping
Copy link
Collaborator

@gvanrossum I think you (and @Krumpet) want #5550. I found it by remembering Benjamin Peterson opened #5671 and following the link, which is definitely not ideal...

@gvanrossum
Copy link
Member

Maybe, though that's about unreachable code (e.g. sys.version or TYPE_CHECKING), which uses a different mechanism than isinstance().

@ilevkivskyi
Copy link
Member

@Krumpet you example just works (I just tried on a recent git commit). I removed all comments and mypy correctly inferred int for all three. My guess is that it "doesn't work" for you exactly because of type comments you added. (In case you don't know, mypy treats # type: ... comments as type annotations for compatibility with Python 2.)

@Krumpet
Copy link

Krumpet commented Oct 15, 2018

@ilevkivskyi I added the type comments to illustrate where I was getting behavior I didn't expect, they were added after the fact.

For me ret_val_2 is still inferred as Union (and arg is also a Union in the else clause for 1 and 2)

If mypy inferrs types correctly then at least we've narrowed down the problem to my IDE (pycharm). As you've mentioned on #5550 , PEP 484 doesn't specify supporting type inference in ternary expressions, and that is probably why I'm seeing this behavior on my end. I'll raise this issue with them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants