Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: Reestructure tools and scripts #23658

Closed
datapythonista opened this issue Nov 13, 2018 · 6 comments
Closed

CLN: Reestructure tools and scripts #23658

datapythonista opened this issue Nov 13, 2018 · 6 comments
Assignees
Labels
CI Continuous Integration Clean Closing Candidate May be closeable, needs more eyeballs Needs Discussion Requires discussion from core team before further action

Comments

@datapythonista
Copy link
Member

I think it would be good to reestructure a bit our scripts, make sure all them have a header explaining what they do and how to use them, get rid of the ones not in use anymore (if any), and add the inventory of them to the contributing documentation, with what they do.

I'd move all them to ci/ (or if the name is not good rename it; sklearn uses build_tools, numba buildscripts). And inside, I'd have different directories (names can probably be improved):

  • ci/deps/: conda requirements files
  • ci/setup/: scripts to create the environment, like downloading conda, building pandas...
  • ci/testing/: scripts to run the tests
  • ci/checks/: scripts to validate code, like code_checks.sh or validate_docstrings.py
  • ci/release/: scripts used during the release
  • ci/benchmarks/: I'd move the asv files inside ci
  • ci/config: A directory for the yaml files with the CI configuration
  • ci/tools/: scripts like merge-pr.py, find_commits_touching_func.py...

It would probably be good to unify scripts, like script_single.sh and script_multi.sh that share 70% of the code, and have a single script with an argument run_tests.sh single / run_tests.sh multi.

And it could also be useful to have a single script for things like the set up, so we have downloading and running conda in the same script, and it can be called all together, or just a part (like code_checks.sh), so we can do setup_env.sh download, setup_env.sh create, or simply setup_env.sh to do both.

@pandas-dev/pandas-core thoughts?

@datapythonista datapythonista added CI Continuous Integration Clean Needs Discussion Requires discussion from core team before further action labels Nov 13, 2018
@datapythonista datapythonista self-assigned this Nov 13, 2018
@jbrockmendel
Copy link
Member

I'd move all them to ci/

What's the reason for this? "ci/" has a pretty clear scope; it isn't clear why we'd want to mix other things with it.

It would probably be good to unify scripts, like script_single.sh and script_multi.sh that share 70% of the code,

+1

Especially with regard to the CI config and code checks, a ton of effort has gone into these, and any extent to which we can label stuff as "HEY OTHER PROJECTS, GO AHEAD AND COPY/PASTE THIS" may save some other folks headaches.

@jreback
Copy link
Contributor

jreback commented Nov 13, 2018

ci has a pretty specific connotation so -1 on adding this to everything

in combining scripts - if u can do so w/o changing things then ok

@h-vetinari
Copy link
Contributor

I'm +-0 on renaming the folder, but I'd like to have a single script for validating all the lint check before committing. This is not the same as the commit-hooks talked about in #23616, but I'm copying half a comment from there for completeness:

  • with the recent changes to the CI, git diff upstream/master -u -- "*.py" | flake8 --diff doesn't cut it anymore.
  • it should IMO be replaced with a single script (ideally platform-agnostic, at least from a user-POV) that tests all the relevant lint checks (flake8, isort, doctests), which should also have a box in the PR template. (this script could be eventually enhanced with all sorts of things; e.g. check that dependencies are up to date; run tests that were changed by PR, etc.). DOC: flake8-per-pr for windows users #23707 is maybe a small step in that direction.

@datapythonista
Copy link
Member Author

We've got ci/code_checks.sh, probably too slow for a hook, but does what you describe.

@jbrockmendel
Copy link
Member

@datapythonista is this closed by #23924 or anything since then?

@jbrockmendel jbrockmendel added the Closing Candidate May be closeable, needs more eyeballs label Sep 23, 2020
@datapythonista
Copy link
Member Author

I would personally refactor and simplify those directories, but there hasn't been consensus, so closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration Clean Closing Candidate May be closeable, needs more eyeballs Needs Discussion Requires discussion from core team before further action
Projects
None yet
Development

No branches or pull requests

4 participants