gh-106535: Soft deprecate the getopt module #105735

Merged: vstinner merged 1 commit into python:main from the deprecate_getopt branch on Jul 8, 2023


vstinner
Member

@vstinner vstinner commented Jun 13, 2023

The getopt module has existed since the initial revision of the Python
source code (1990). The optparse module was added in Python 2.3. When
Python 2.7 added a third module, argparse, the optparse module was
soft deprecated. Soft deprecate the getopt module as well.


📚 Documentation preview 📚: https://cpython-previews--105735.org.readthedocs.build/

@vstinner
Member Author

I don't propose emitting a DeprecationWarning at runtime yet. I'm surprised, but quite a large number of the top 5,000 PyPI projects still use the getopt module. Maybe it's just simple to use and good enough for their needs.

@vstinner
Member Author

cc @hugovk

@CAM-Gerlach
Member

Just for reference, PEP 594 said this about getopt:

The getopt module mimics C’s getopt() option parser.

Although users are encouraged to use argparse instead, the getopt module is still widely used. The module is small, simple, and handy for C developers to write simple Python scripts.
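As a rough illustration of what "mimics C's getopt()" means in practice, the sketch below parses a made-up argument list with a C-style option string. The option letters and the file name are assumptions chosen for this example; they do not come from PEP 594 or from this PR.

```python
import getopt

# Hypothetical command line: two flags, one option with a value, one file.
argv = ["-a", "-b", "value", "file.txt", "-c"]

# "ab:c" uses the C getopt() syntax: a colon after a letter means
# that the option takes an argument.
opts, args = getopt.getopt(argv, "ab:c")

# Like the C function, parsing stops at the first non-option argument,
# so the trailing "-c" is left in args instead of being parsed.
print(opts)  # [('-a', ''), ('-b', 'value')]
print(args)  # ['file.txt', '-c']

# gnu_getopt() allows options and positional arguments to be intermixed,
# unless the POSIXLY_CORRECT environment variable is set.
gnu_opts, gnu_args = getopt.gnu_getopt(argv, "ab:c")
print(gnu_opts, gnu_args)
```

The return value is a plain list of (option, value) pairs plus the leftover arguments, which is part of why the module is described in this thread as small and simple.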

As a baseline, searching grep.app for (import|from) modulename in public GitHub Python modules/scripts, there are:

@vstinner
Member Author

11 600 files using optparse
5 000 files using getopt

I would say that either we remove optparse deprecation, or we deprecate getopt in its documentation, to have a consistent maintenance guideline.

@CAM-Gerlach
Member

CAM-Gerlach commented Jun 14, 2023

From a current usage perspective, it would seem so, though just playing devil's advocate here, arguably optparse and argparse have a much more similar design and fulfill a similar niche (therefore being more duplicative), whereas getopt has some unique attributes—it is much simpler, is more familiar to developers from/interoperable with the C getopt(), and has a much smaller and easier to maintain implementation.

FWIW, I've always used argparse myself and urged others who do use the other two to migrate, as well as helped them do so. And I agree it would be best to point users toward one recommended stdlib argument parsing module that does a lot of nice things for users, even if it takes a bit more upfront effort and certainly isn't perfect.

Anyway, some additional datapoints for reference—usage in Python's own stdlib:

getopt usage: 20 code hits (plus the module's test suite)

For a point of reference—optparse usage: 6 code hits (plus 2 for the module itself and its test suite):

Member

@CAM-Gerlach CAM-Gerlach left a comment


Also, as it seems like you've cleaned up the outstanding Sphinx warnings, you should remove this file from Doc/tools/.nitignore.

Comment on lines 11 to 12
The :mod:`getopt` module is deprecated and will not be developed further;
development will continue with the :mod:`argparse` module.
Member


We should tailor this message (copied from the optparse deprecation) to be a little more appropriate here:

  • Active development has long since stopped here and continued with argparse, as the last getopt-specific change (that wasn't cross-stdlib cleanup) was bpo-11621 / build error: bootstrap issue with gettext #55830 in 2011.
  • It seems worth mentioning explicitly that there are no plans for removal in the near future, as the lack of a removal version could create uncertainty about whether and how urgently users should worry about their code breaking.
  • Provide clearer guidance to users about what they are recommended to do.
Suggested change
- The :mod:`getopt` module is deprecated and will not be developed further;
- development will continue with the :mod:`argparse` module.
+ The :mod:`getopt` module is deprecated, as the actively-developed
+ :mod:`argparse` module offers much more functionality with less user code.
+ While there are no immediate removal plans, migrating to :mod:`!argparse`
+ is recommended.

Comment on lines +17 to 21
designed to be familiar to users of the C :c:func:`!getopt` function. Users who
are unfamiliar with the C :c:func:`!getopt` function or who would like to write
less code and get better help and error messages should consider using the
:mod:`argparse` module instead.
Member

@CAM-Gerlach CAM-Gerlach Jun 14, 2023


Suggested change
- designed to be familiar to users of the C :c:func:`!getopt` function. Users who
- are unfamiliar with the C :c:func:`!getopt` function or who would like to write
- less code and get better help and error messages should consider using the
- :mod:`argparse` module instead.
+ designed to be familiar to users of the C :c:func:`!getopt` function.
+ Using the :mod:`argparse` module is encouraged instead,
+ as it offers better help and error messages while writing less code.

Tweaked this a bit to nudge users more toward argparse rather than just naming specific categories of users that should "consider using" it instead, as well as to be more concise and less repetitive.

Member


Note that I still think these changes (or perhaps a revised version) would be a good idea if we decide not to actually deprecate it, to make this clearer, more concise and a bit more of a nudge.

@vstinner
Member Author

I created a discussion to specify "Soft Deprecation" in PEP 387: https://discuss.python.org/t/formalize-the-concept-of-soft-deprecation-dont-schedule-removal-in-pep-387-backwards-compatibility-policy/27957

@malemburg
Member

I'm a -1 on deprecating the getopt module, even soft-deprecating it.

I'm an active user (I much prefer getopt over many of the framework style CLI tools out there for simple scripts) and so is CPython itself... it uses getopt() for parsing the command line options, so unless CPython starts using something else and then makes this available in the stdlib as well, we should keep support in Python's stdlib.

We already have PEP 594, which discusses the pros and cons of deprecating stdlib modules. We don't really need the same discussion again every few years.

BTW: Just because a piece of code is older, established and has low maintenance, doesn't mean that we need to remove it. This only causes unnecessary churn, makes simple scripts without 3rd party dependencies more complex and doesn't buy us anything much.

@CAM-Gerlach
Member

BTW: Just because a piece of code is older, established and has low maintenance, doesn't mean that we need to remove it. This only causes unnecessary churn, makes simple scripts without 3rd party dependencies more complex and doesn't buy us anything much.

Just to note, as @vstinner mentions, the idea here is explicitly not to remove it (for the foreseeable future, and quite possibly for the life of Python itself), nor even to add a warning, just to let users know it is no longer actively developed and encourage them to use other, better alternatives unless they have a specific reason to use getopt.

That being said, while I certainly don't use getopt myself, I share a lot of the sentiments in the rest of your message—unlike optparse, the overall design and use cases it serves are much more distinct from argparse, the actual library code much simpler and requiring little to no maintenance, and it maintains compatibility with the arg parsing of many other widely used languages. I could see optparse being eventually removed in 5-10 years (perhaps coinciding with a major version bump) if usage numbers are low enough, but not getopt.

So, maybe deprecation isn't the right framing here at all in this particular case. Maybe we should instead just tweak the language in the existing Note admonition along the lines of my comment, and perhaps move the Seealso up from being buried at the bottom to after the introduction, or perhaps even integrate the two.

@vstinner
Member Author

I created https://discuss.python.org/t/formalize-the-concept-of-soft-deprecation-dont-schedule-removal-in-pep-387-backwards-compatibility-policy/27957 to formalize the concept of "you should not use this thing, but we don't plan to remove it either". The idea here is to advise users who are learning Python on how to choose between the 3 flavors of command line parsing in the Python stdlib. If they read an old Python book or an old Python script and see getopt: should they still use it for new scripts in 2023? Is getopt still relevant in 2023? How are users supposed to figure it out?

My problem is that the regular "deprecation" markup suggests that it will soon be followed by a DeprecationWarning in the code, and eventually by the removal of the function. That's not my intention here.

@malemburg
Member

malemburg commented Jun 19, 2023

Thanks for the added comments, @CAM-Gerlach and @vstinner

I think that getopt and argparse/optparse serve different use cases and as such getopt fills its own niche. It's easy enough to use for beginners and pros wanting to write short scripts. It's also part of the POSIX standard and it has been used as basis for writing more evolved CLI interfaces.

There is duplication between argparse and optparse, since they both want to address more complex CLI parsing using an object oriented approach, so deprecating one or the other makes sense in the long run.

This doesn't apply to getopt and I don't think we should even "mark" it as obsolete, since it is not. The getopt docs already point to argparse as a possible alternative, so I don't think there's anything actionable here.

BTW: This ticket had me actually looking at the source code of the Python getopt module and I was surprised to find that getopt is only emulating the C POSIX function in Python instead of directly exposing the emulation we have for it in Python as _PyOS_GetOpt(). I found this ticket from 2000 which describes how the emulation came to be: #33417 and this is the discussion that led to it: https://mail.python.org/archives/list/[email protected]/thread/7HVUOYNP24WH4N6Y4YYIDRUDD7VJU43F/ ... I guess command line parsing usually is not time critical, so no one ever bothered. I might change this for eGenix PyRun, since I use getopt to emulate the CPython command line parsing and doing this in Python is definitely causing a slowdown in startup time.

@malemburg
Member

To give you an analogy to what I'm trying to explain:

Would you declare bicycles obsolete, just because we now have cars ?

@vstinner
Member Author

Would you declare bicycles obsolete, just because we now have cars ?

Well, the example at the end of the https://docs.python.org/dev/library/getopt.html doc shows that argparse does the same with far fewer lines of code and likely better error messages when the command line is invalid (better than assert False, "unhandled option"). Code using getopt usually uses global variables, whereas argparse provides a convenient "namespace" object (usually stored in an args variable).
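To make the comparison concrete, here is a minimal side-by-side sketch. It is not the example from the getopt documentation: the -v/--verbose and -o/--output options, the "demo" program name, and both helper function names are hypothetical. Both versions include the processing step, so the comparison is like for like.

```python
import argparse
import getopt


def parse_with_getopt(argv):
    # Hypothetical CLI: a -v/--verbose flag and an -o/--output FILE option.
    opts, args = getopt.getopt(argv, "vo:", ["verbose", "output="])
    verbose = False
    output = None
    # The caller has to walk the (option, value) pairs by hand.
    for opt, value in opts:
        if opt in ("-v", "--verbose"):
            verbose = True
        elif opt in ("-o", "--output"):
            output = value
    return verbose, output, args


def parse_with_argparse(argv):
    # Same hypothetical CLI, declared up front; the results come back
    # as attributes of a single namespace object.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("-o", "--output")
    parser.add_argument("files", nargs="*")
    return parser.parse_args(argv)


argv = ["-v", "-o", "out.txt", "input.txt"]

verbose, output, files = parse_with_getopt(argv)
print(verbose, output, files)  # True out.txt ['input.txt']

ns = parse_with_argparse(argv)
print(ns.verbose, ns.output, ns.files)  # True out.txt ['input.txt']
```

On an unknown option such as -x, the getopt version raises getopt.GetoptError and leaves reporting to the caller, while the argparse version prints a usage message and exits with status 2.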

@CAM-Gerlach
Member

CAM-Gerlach commented Jun 20, 2023

@malemburg's comments much more concisely and elegantly articulate some of the potential concerns I shared above, at least with the current "deprecation" language.

At the same time, I also agree that we should be steering most users, especially newer ones, toward argparse (or other options), unless they have a specific reason to use getopt (familiarity, interoperability, or existing code), which the existing docs do make an attempt at, but I do think it could be clearer.

As such, perhaps an agreeable balance here would be to

  • Improve the note language as suggested above to be clearer as to the module's status and nudge users who don't already fit the intended use case to consider alternatives
  • Move the seealso to just below the intro section rather than buried at the bottom of everything to make the links to alternatives more visible
  • And perhaps, depending on the outcome of the Discourse discussion, consider using a "discouraged" directive or whatever it ends up being here...but I can certainly see the argument that this could send the wrong message to some users, so not sure about that.

@malemburg
Member

Well, the example at the end of the https://docs.python.org/dev/library/getopt.html doc shows that argparse does the same with far fewer lines of code

This is not really true, since the argparse snippet at the end only shows the parsing declaration part and not the processing of the parsed output. In reality, the needed code for argparse will be more verbose. See e.g. https://hg.python.org/cpython/rev/6e2e5adc0400 where this exercise was done.

But that's not really the point. getopt() is a very simple tool to use, whereas argparse and optparse are more complex tools, which also require more programming knowledge to be used correctly.

@malemburg
Member

As such, perhaps an agreeable balance here would be to

* Improve the note language as suggested above to be clearer as to the module's status and nudge users who don't already fit the intended use case to consider alternatives

Fair enough, though, such language already exists in the docs: https://docs.python.org/3.13/library/getopt.html

* Move the `seealso` to just below the intro section rather than buried at the bottom of everything to make the links to alternatives more visible

Wouldn't that be duplication of the note at the start of the page ? I wouldn't mind, but this would also look somewhat odd -- much like a road block barrier... which brings us to the next point...

* And perhaps, depending on the outcome of the Discourse discussion, consider using a "discouraged" directive or whatever it ends up being here...but I can certainly see the argument that this could send the wrong message to some users, so not sure about that.

Regardless of the outcome of the Discourse discussions, I'm not in favor of calling getopt() discouraged technology, as already explained.

@CAM-Gerlach
Member

Fair enough, though, such language already exists in the docs: https://docs.python.org/3.13/library/getopt.html

Yup; see my review suggestion above for my suggested tweaks to said language, in the vein described in my quote. Feedback welcome, of course.

Wouldn't that be duplication of the note at the start of the page?

Yeah, I was thinking of perhaps consolidating the two, but not totally sure there's a good way to structure that, so maybe best to just leave things as they are on that aspect.

I wouldn't mind, but this would also look somewhat odd -- much like a road block barrier... which brings us to the next point...

It's not that uncommon to have seealsos in the section intros these days; it doesn't imply the module is deprecated/discouraged. But I do agree it can be duplicative of the note, and I'm not sure there's an easy way to make those two work together, so again maybe best to leave it.

Regardless of the outcome of the Discourse discussions, I'm not in favor of calling getopt() discouraged technology, as already explained.

I'm also somewhat hesitant to call it such at least in general (as opposed to just for new code for users who don't have a specific reason to use it over other more sophisticated options); the dependence on that discussion had more to do with exactly what the framing and wording of such a "soft deprecation"/"discouragement" is. Of the examples cited there, this one certainly seems to me to be on a different tier from the others in terms of the applicability of such, given the others all had direct, often drop-in-ish replacements and/or some form of actual problem with using them.

@vstinner
Member Author

vstinner commented Jul 3, 2023

PEP 387 got updated to define "Soft Deprecation": https://peps.python.org/pep-0387/#soft-deprecation

@vstinner vstinner force-pushed the deprecate_getopt branch from fc545db to cba3d9a July 7, 2023 22:06
@vstinner vstinner changed the title from "Deprecate the getopt module in its documentation" to "gh-106535: Soft deprecate the getopt module" Jul 7, 2023
@vstinner
Member Author

vstinner commented Jul 7, 2023

I updated the PR to only soft deprecate the getopt module. If tomorrow the getopt development becomes active again, we can simply remove this mention in the documentation and continue as if nothing happened :-)

A soft deprecation is only a mention in the documentation: no warning is issued, no removal is scheduled.
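One way to check the "no warning is issued" part locally is to re-import the module while recording warnings. This is a minimal sketch and not part of the PR; the helper name deprecation_warnings_on_import is hypothetical.

```python
import getopt
import importlib
import warnings


def deprecation_warnings_on_import(module):
    # Re-execute the module's top-level code while recording every warning,
    # then keep only the DeprecationWarning entries.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        importlib.reload(module)
    return [w for w in caught if issubclass(w.category, DeprecationWarning)]


found = deprecation_warnings_on_import(getopt)
# An empty list is what a documentation-only deprecation should produce.
print(found)
```

A regular deprecation that emits a DeprecationWarning at import time would show up in the returned list instead.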

Note: I just added "soft deprecated" to the documentation glossary.

@vstinner
Member Author

vstinner commented Jul 7, 2023

I changed optparse deprecation to a soft deprecation, since there is clearly no plan to remove this module. I just want to make it more explicit.

@vstinner vstinner merged commit da98ed0 into python:main Jul 8, 2023
@vstinner vstinner deleted the deprecate_getopt branch July 8, 2023 15:51
Labels
docs Documentation in the Doc dir skip news

4 participants