Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: try to make faster #996

Merged
merged 19 commits into from
Jul 30, 2024
Merged

Conversation

henryiii
Copy link
Member

Trying this out.

Signed-off-by: Henry Schreiner <[email protected]>
@henryiii henryiii force-pushed the henryiii/ci/faster branch from 21720b0 to 7cec773 Compare May 31, 2024 16:57
@HDembinski
Copy link
Member

And, is it faster?

@henryiii
Copy link
Member Author

henryiii commented Jun 3, 2024

Yes, though not quite as much as I hoped. mold saves about 6 seconds in linking, but takes a couple of seconds to install. uv drops the install time to 0 seconds, but most of the install time is taken up in the coverage job, which is all tangled up in make (make is my pet peeve, I can't stand it, IMO it's the wrong answer to everything it's used for). I'll try to see if I can come up with a way to do the coverage job and get uv to install.

PS: Would you be interested in having a noxfile? I could even set up make to call it, but would you be okay to have nox installed? I use that on most of my repos, and it natively supports enabling uv.

@HDembinski
Copy link
Member

HDembinski commented Jun 3, 2024

make is my pet peeve

I think it is very versatile and one can use it for so much more than building code. It is nice because it can automatically parallelize workloads.

PS: Would you be interested in having a noxfile? I could even set up make to call it, but would you be okay to have nox installed? I use that on most of my repos, and it natively supports enabling uv.

No, I don't think I need it. Wait, I confused that with tox. If you give it a try, why not.

@henryiii
Copy link
Member Author

henryiii commented Jun 3, 2024

one can use it for so much more than building code

But there are better solutions for all those things (IMO). Latex is better handled by latexmk. Python is better handled by anything that understands virtual environments. Compiling is better handled by a CMake and Ninja - make cannot parallelize past the target boundary, which is ridiculous; why should you wait for target "a"'s files to build to start building target "b"'s files? That alone makes ninja much faster.

And the syntax literally requires tabs. 🤮 Even Rake is superior; at least it's using a real programming language's (Ruby) syntax. There's also "invoke", which make-like with Python syntax. Though I haven't really used it. "doit" is the underlying Python library and provides make-style rebuild logic for Python, and that looks really useful. Though again, haven't really used it, but have seen it used in other tools.

And yes, it's installed almost everywhere, but comes in different flavors and actually writing an advanced makefile that's portable is painful. The good (relative) features are tied to things like gmake.

Those are just some of my complaints; I'm fine with people using it if they like it, I just really, really don't like it. The two best uses are IMO as a "common commands" list (like you are using it), and for setting up a this-generates-that workflow (but I think things like snakemake or luigi might be better?). But I think I'm allowed at least one pet peeve and it's make so I avoid it. :)

If you give it a try, why not.

I can try a nox-based solution, then if you don't like it, I can avoid it. I should be able to keep the makefile commands the same. :) (I technically already have a noxfile for iminuit that I use locally for development)

@henryiii
Copy link
Member Author

henryiii commented Jun 3, 2024

Actually, after saying all that, I think I'll just fix .ci/install_deps.py for now, then propose a noxfile separately, probably after the next release or so of nox. I'm planning on improving nox's native support for binary builds; your nice rebuild feature can be done in nox, but it currently has to be manually implemented and I'm hoping to upstream it as a feature first.

@henryiii
Copy link
Member Author

henryiii commented Jun 3, 2024

FYI, with a warm cache, uv pip install -r pyproject.toml --extra test takes 317ms. For comparison, yq '.project.optional-dependencies.test' pyproject.toml -ot | xargs py -m pip install also with a warm cache takes 32.7s.

@henryiii
Copy link
Member Author

henryiii commented Jun 3, 2024

By the way, one of the slowest parts is the ~40 second checkout due to needing to clone all of ROOT.

@henryiii henryiii force-pushed the henryiii/ci/faster branch 2 times, most recently from 66ae0e2 to a450374 Compare June 3, 2024 21:46
@henryiii henryiii force-pushed the henryiii/ci/faster branch from a450374 to 8d7cedf Compare June 3, 2024 21:51
@henryiii
Copy link
Member Author

henryiii commented Jun 3, 2024

This basically removes all the penalty for installing the full test dependency set (uv lumps building into downloading, but it's under a second I believe), so I enabled everything. That exposes one error (and two warnings) on macOS ARM:

=================================== FAILURES ===================================
_________________________ test_Template_with_model_2D __________________________
    def test_Template_with_model_2D():
        truth1 = (1.0, 0.1, 0.2, 0.3, 0.4, 0.5)
        x1, y1 = mvnorm(*truth1[1:]).rvs(size=int(truth1[0] * 1000), random_state=1).T
        truth2 = (1.0, 0.2, 0.1, 0.4, 0.3, 0.0)
        x2, y2 = mvnorm(*truth2[1:]).rvs(size=int(truth2[0] * 1000), random_state=1).T
    
        x = np.append(x1, x2)
        y = np.append(y1, y2)
        w, xe, ye = np.histogram2d(x, y, bins=(3, 5))
    
        def model(xy, n, mux, muy, sx, sy, rho):
            return n * 1000 * mvnorm(mux, muy, sx, sy, rho).cdf(np.transpose(xy))
    
        x3, y3 = mvnorm(*truth2[1:]).rvs(size=int(truth2[0] * 10000), random_state=2).T
        template = np.histogram2d(x3, y3, bins=(xe, ye))[0]
    
        cost = Template(w, (xe, ye), (model, template))
        assert cost.ndata == np.prod(w.shape)
        m = Minuit(cost, *truth1, 1)
        m.limits["x0_n", "x0_sx", "x0_sy"] = (0, None)
        m.limits["x0_rho"] = (-1, 1)
>       m.migrad()
tests/test_cost.py:1867: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/iminuit/minuit.py:791: in migrad
    fm = migrad(ncall, self._tolerance)
            class ndarray is returned.
    
        Raises
        ------
        ValueError
            Raises ValueError if `a` contains NaN (Not a Number) or Inf (Infinity).
    
        See Also
        --------
        asarray : Create and array.
        asanyarray : Similar function which passes through subclasses.
        ascontiguousarray : Convert input to a contiguous array.
        asfarray : Convert input to a floating point ndarray.
        asfortranarray : Convert input to an ndarray with column-major
                         memory order.
        fromiter : Create an array from an iterator.
        fromfunction : Construct an array by executing a function on grid
                       positions.
    
        Examples
        --------
        Convert a list into an array.  If all elements are finite
        ``asarray_chkfinite`` is identical to ``asarray``.
    
        >>> a = [1, 2]
        >>> np.asarray_chkfinite(a, dtype=float)
        array([1., 2.])
    
        Raises ValueError if array_like contains Nans or Infs.
    
        >>> a = [1, 2, np.inf]
        >>> try:
        ...     np.asarray_chkfinite(a)
        ... except ValueError:
        ...     print('ValueError')
        ...
        ValueError
    
        """
        a = asarray(a, dtype=dtype, order=order)
        if a.dtype.char in typecodes['AllFloat'] and not np.isfinite(a).all():
>           raise ValueError(
                "array must not contain infs or NaNs")
E           ValueError: array must not contain infs or NaNs
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/numpy/lib/function_base.py:628: ValueError
=============================== warnings summary ===============================
tests/test_cost.py::test_Template_with_model_2D
  /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/iminuit/cost.py:396: RuntimeWarning: overflow encountered in divide
    beta = (n + k) / (mu + k)
tests/test_cost.py::test_Template_with_model_2D
  /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/iminuit/cost.py:397: RuntimeWarning: invalid value encountered in multiply
    return poisson_chi2(n, mu * beta) + poisson_chi2(k, k * beta)

Might be nice to fix.

I'll come back to this later.

@HDembinski
Copy link
Member

one can use it for so much more than building code

But there are better solutions for all those things (IMO). Latex is better handled by latexmk. Python is better handled by anything that understands virtual environments. Compiling is better handled by a CMake and Ninja - make cannot parallelize past the target boundary, which is ridiculous; why should you wait for target "a"'s files to build to start building target "b"'s files? That alone makes ninja much faster.

Those are all fair points. I admit that I don't use make for these purposes for the same reasons, but I don't hate it. Before snakemake became the build-your-analysis tool of choice at my institute, I used make to generate scientific outputs in a parallel pipeline. For that purpose, make was the right tool. It is a low-level tool, so it is very flexible and one can use it for all kinds of purposes. High-level tools are better for specific jobs, but then you need to learn N tools. Like everything in life, it is a trade-off.

And the syntax literally requires tabs. 🤮 Even Rake is superior; at least it's using a real programming language's (Ruby) syntax. There's also "invoke", which make-like with Python syntax. Though I haven't really used it. "doit" is the underlying Python library and provides make-style rebuild logic for Python, and that looks really useful. Though again, haven't really used it, but have seen it used in other tools.

And yes, it's installed almost everywhere, but comes in different flavors and actually writing an advanced makefile that's portable is painful. The good (relative) features are tied to things like gmake.

Those are just some of my complaints; I'm fine with people using it if they like it, I just really, really don't like it. The two best uses are IMO as a "common commands" list (like you are using it), and for setting up a this-generates-that workflow (but I think things like snakemake or luigi might be better?). But I think I'm allowed at least one pet peeve and it's make so I avoid it. :)

If you give it a try, why not.

I can try a nox-based solution, then if you don't like it, I can avoid it. I should be able to keep the makefile commands the same. :) (I technically already have a noxfile for iminuit that I use locally for development)

Ok now I found the right webpage for nox, and you are one of the maintainers of course. :) I would be happy to have a look at your nox file in another PR, since you made it already. It is probably better readable than the makefile for iminuit.

@HDembinski
Copy link
Member

Might be nice to fix.

Yes, but it is not easy to fix. It is a random numerical thing that happens on the ARM architecture.

@HDembinski HDembinski marked this pull request as ready for review July 9, 2024 13:52
@henryiii
Copy link
Member Author

Let me try to pull out the best bits. Currently recovering from conjunctivitis so might be a few days.

@HDembinski
Copy link
Member

@henryiii I fixed the failing test. I cannot fix the issue on PyPy-3.9, it is related to the meson build of scipy.

@HDembinski
Copy link
Member

I also cannot fix the segfaults on Windows. I thought it is because of numba, but that does not seem to be the case.

@HDembinski HDembinski merged commit db061c8 into scikit-hep:develop Jul 30, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants