csv.skipinitialspace only skips spaces, not "whitespace" in general #65496

Closed
DanielAndersson mannequin opened this issue Apr 18, 2014 · 6 comments
Assignees
Labels
3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes docs Documentation in the Doc dir stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@DanielAndersson
Mannequin

DanielAndersson mannequin commented Apr 18, 2014

BPO 21297
Nosy @terryjreedy, @berkerpeksag
Files
  • csv_skipinitialspace_testing.py: test code
  • csv_skipinitialspace_testing.csv: csv file for the test code
  • csv_skipinitialspace_docfix.patch: patch fix
  • skipinitialspace_test.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2014-04-18.11:52:21.568>
    labels = ['type-bug', '3.8', '3.9', '3.10', 'library', 'docs']
    title = 'csv.skipinitialspace only skips spaces, not "whitespace" in general'
    updated_at = <Date 2021-03-25.22:59:21.786>
    user = 'https://bugs.python.org/DanielAndersson'

    bugs.python.org fields:

    activity = <Date 2021-03-25.22:59:21.786>
    actor = 'iritkatriel'
    assignee = 'docs@python'
    closed = False
    closed_date = None
    closer = None
    components = ['Documentation', 'Library (Lib)']
    creation = <Date 2014-04-18.11:52:21.568>
    creator = 'Daniel.Andersson'
    dependencies = []
    files = ['39558', '39559', '39560', '39732']
    hgrepos = []
    issue_num = 21297
    keywords = ['patch']
    message_count = 6.0
    messages = ['216780', '216820', '216907', '244415', '244866', '245484']
    nosy_count = 6.0
    nosy_names = ['terry.reedy', 'docs@python', 'berker.peksag', 'Daniel.Andersson', 'Andy.Almonte', 'jbmilam']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue21297'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @DanielAndersson
    Mannequin Author

    DanielAndersson mannequin commented Apr 18, 2014

    Regarding the skipinitialspace parameter to the different CSV reader dialects in the csv module, the official documentation asserts:

    When True, whitespace immediately following the delimiter is ignored.
    

    and the help(csv) style module documentation says:

    * skipinitialspace - specifies how to interpret whitespace which
      immediately follows a delimiter.  It defaults to False, which
      means that whitespace immediately following a delimiter is part
      of the following field.
    

    "Whitespace" is too general in both cases (and at the very least misleading in the second), since only spaces are skipped, not e.g. tabs.

    The source in [Modules/_csv.c](https://github.com/python/cpython/blob/main/Modules/_csv.c) describes the parameter more accurately. At line 81:

    int skipinitialspace;       /* ignore spaces following delimiter? */
    

    and the actual implementation at line 638:

    else if (c == ' ' && dialect->skipinitialspace)
        /* ignore space at start of field */
        ;
    

    Probably no one will assume that the whole Unicode range of "whitespace" is skipped, but I, at least, initially assumed that the tab character was included.
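
The behavior described above can be reproduced with a short script. This is a minimal sketch, not the test file attached to the issue; the sample data and the helper name `parse` are made up for illustration.

```python
import csv
import io

# One field starts with two spaces, another starts with a tab.
data = "a,  b,\tc\n"


def parse(text, **kwargs):
    """Return the first row of `text` as parsed by csv.reader."""
    return next(csv.reader(io.StringIO(text), **kwargs))


default_row = parse(data)
skipped_row = parse(data, skipinitialspace=True)

print(default_row)  # ['a', '  b', '\tc']
print(skipped_row)  # ['a', 'b', '\tc']
```

With `skipinitialspace=True` the leading spaces before `b` are dropped, while the tab before `c` remains part of the field.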

    @DanielAndersson DanielAndersson mannequin added docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Apr 18, 2014
    @terryjreedy
    Member

    Do I understand correctly that only one space is ignored?

    @terryjreedy terryjreedy changed the title skipinitialspace in the csv module only skips spaces, not "whitespace" in general csv.skipinitialspace only skips spaces, not "whitespace" in general Apr 19, 2014
    @DanielAndersson
    Mannequin Author

    DanielAndersson mannequin commented Apr 20, 2014

    No, multiple spaces are ignored as advertised (according to actual tests; not just reading the code), but only spaces (U+0020) and not e.g. tabs (U+0009), which are also included in the term "whitespace", along with several other characters.
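
A quick check along these lines, assuming nothing beyond the standard `csv` module, shows both points: a run of several spaces is skipped, and other whitespace characters are not. The helper `first_row` and the choice of sample characters are illustrative only.

```python
import csv
import io


def first_row(text):
    """Parse one line of CSV text with skipinitialspace enabled."""
    return next(csv.reader(io.StringIO(text), skipinitialspace=True))


# A run of several spaces after the delimiter is dropped entirely.
many_spaces = first_row("x,     y\n")

# A few characters that are commonly considered whitespace.
candidates = {
    "SPACE": " ",
    "TAB": "\t",
    "NO-BREAK SPACE": "\u00a0",
    "EM SPACE": "\u2003",
}

# True means the character is still present at the start of the field.
survives = {
    name: first_row(f"x,{ch}y\n")[1].startswith(ch)
    for name, ch in candidates.items()
}

print(many_spaces)
for name, kept in survives.items():
    print(f"{name}: {'kept' if kept else 'skipped'}")
```

Only U+0020 is reported as skipped; the tab and the two Unicode space characters are kept as part of the field.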

    In light of your followup question, the internal comment at [Modules/_csv.c](https://github.com/python/cpython/blob/main/Modules/_csv.c), line 639:

    /* ignore space at start of field */
    

    could perhaps be clarified to say "spaces" instead of "space", but the code context makes it quite clear, and the comment is not user-facing anyway. The main point of this issue is the wording in the module docstring and the official docs, which say "whitespace" where "space" is meant.

    @jbmilam
    Mannequin

    jbmilam mannequin commented May 29, 2015

    The attached code demonstrates what Daniel Andersson described. I changed the "whitespace" references in the documentation that Daniel mentioned to say "spaces". I also changed "ignore space at the start of the field" to "ignore spaces at the start of the field" to address Terry's question.

    Let me know of any errors or extra changes that are needed.

    @berkerpeksag
    Member

    The patch looks good to me, thanks! Could you also convert your test script to a test case and add it in Lib/test/test_csv.py?

    @jbmilam
    Mannequin

    jbmilam mannequin commented Jun 18, 2015

    This is my first attempt at working with the test suite, but I believe this is what you were asking for. Because it is my first attempt at writing tests, I have included it as a separate patch file. If any further changes are needed, just let me know.
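
The attached test patch is not reproduced in this thread. For readers who want to see what such a test case might look like, here is a hedged sketch using `unittest`; the class and method names are hypothetical and are not taken from the patch or from Lib/test/test_csv.py.

```python
import csv
import io
import unittest


class SkipInitialSpaceTest(unittest.TestCase):
    """Hypothetical test case; names are illustrative, not from the patch."""

    def _read(self, text, **kwargs):
        # Parse the whole input and return all rows as a list.
        return list(csv.reader(io.StringIO(text), **kwargs))

    def test_spaces_are_skipped(self):
        rows = self._read("a,   b\n", skipinitialspace=True)
        self.assertEqual(rows, [["a", "b"]])

    def test_tab_is_not_skipped(self):
        rows = self._read("a,\tb\n", skipinitialspace=True)
        self.assertEqual(rows, [["a", "\tb"]])

    def test_default_keeps_spaces(self):
        rows = self._read("a,   b\n")
        self.assertEqual(rows, [["a", "   b"]])
```

Saved to a file, the tests can be run with `python -m unittest <module name>`.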

    @iritkatriel iritkatriel added 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes labels Mar 25, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @slateny slateny added the tests Tests in the Lib/test dir label Aug 22, 2022
    @JelleZijlstra JelleZijlstra self-assigned this Oct 7, 2022
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 7, 2022
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 7, 2022
    miss-islington added a commit that referenced this issue Oct 7, 2022
    miss-islington added a commit that referenced this issue Oct 7, 2022
    @slateny slateny closed this as completed Oct 8, 2022
    @slateny slateny moved this to Done in CSV issues Oct 8, 2022
    Projects
    Status: Done

    5 participants