Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(pkg): stop validating stuff we don't use #9030

Conversation

rgrinberg
Copy link
Member

We don't use [repo] files for anything. So we should not care about their
existence, format, or version.

Signed-off-by: Rudi Grinberg [email protected]

@Leonidas-from-XIV
Copy link
Collaborator

As I suggested this to be added originally, I argued that it would make sense to make sure that what we add is an opam-repository in the format that opam itself would require, thus despite not using it ourselves. Otherwise one could create opam-repos that only work in dune, but not in opam itself, leading to community split.

In fact, the git codepath is missing the validation logic and I would argue that it should probably also be implemented, for the same reasons.

cc @gridbugs

@rgrinberg
Copy link
Member Author

As I suggested this to be added originally, I argued that it would make sense to make sure that what we add is an opam-repository in the format that opam itself would require, thus despite not using it ourselves. Otherwise one could create opam-repos that only work in dune, but not in opam itself, leading to community split.

That's funny because I'm using the same reasons you are for arguing against this validation. I think the best way to avoid drift between our opam repositories and upstream is to be disciplined with how we interact with opam repos. In particular, we should assume as little as possible about the layout of the repo and let opam's client libraries take care of these details for us. That way, our task of making sure we never differ from upstream is simplified to just keeping our opam libray up to date.

More concretely, here are some possible changes that opam is within its rights to make that would immediately create an incompatibility between our repos and opam's:

  1. They could rename or move the repo file to some other place. Or even just out right get rid of it.
  2. They could bump the minimum version.
  3. They could change the format of the rep file.

Now contrast this with how we handle package files themselves. We don't make any assumptions ourselves, and yet the API allows us to read only valid opam files exactly in the same way as opam would view them. I would like the repository layout to be abstracted away from us in the same way.

@rgrinberg
Copy link
Member Author

rgrinberg commented Oct 30, 2023

Btw, I still left validation for stuff that we do rely on - like the packages directory being present for example. It has the same issue I mentioned, but at least it gives us better error messages in some concrete code paths. Nevertheless, any checks like that still belong to the opam lib.

Copy link
Collaborator

@gridbugs gridbugs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to also remove the mkrepo function from test/blackbox-tests/test-cases/pkg/helpers.sh while you're at it?

@rgrinberg rgrinberg force-pushed the ps/rr/refactor_pkg___stop_validating_stuff_we_don_t_use branch from 0c3f64a to 31d92f0 Compare October 31, 2023 01:10
@rgrinberg
Copy link
Member Author

Would it make sense to also remove the mkrepo function from test/blackbox-tests/test-cases/pkg/helpers.sh while you're at it?

I didn't remove it, but I did simplify it to stop creating the unnecessary repo file.

@Leonidas-from-XIV
Copy link
Collaborator

In particular, we should assume as little as possible about the layout of the repo and let opam's client libraries take care of these details for us.

Maybe the question is to ask OPAM maintainers what they consider an OPAM repo to be? In particular, I assume that any repo change will be marked in a way so that it is detectable in some way that OPAM uses. As such, I would find it potentially dangerous to ignore the repo file if that is the place that would tell us whether we're looking at a 2.0 repo as we support it or a hypothetical 3.0 repo where we probably want to bail out explicitly instead of treating it like a 2.0 repo and potentially missing stuff (say, files became inlined or the files are encoded in a different way).

I am not sure we can rely on the OPAM libs to do the validation - e.g. in the case of the Git-backed Rev_store, it simply isn't supported by the OPAM libs at the moment. The way to go about this would probably require splitting out the storage layer from the repository layer on the OPAM side so we could fit in our own storage layer. We can of course do so, but it locks us quite a bit into interfaces that we have to share with OPAM, whereas at the moment we can load repos from wherever we see fit. In theory that means we could even avoid having a local OPAM respository storage at all and just fetch OPAM files from the network.

@rgrinberg
Copy link
Member Author

Maybe the question is to ask OPAM maintainers what they consider an OPAM repo to be?

We should ask them to go beyond that: encode those constraints in the code for their libraries.

As such, I would find it potentially dangerous to ignore the repo file if that is the place that would tell us whether we're looking at a 2.0 repo as we support it or a hypothetical 3.0 repo where we probably want to bail out explicitly instead of treating it like a 2.0 repo and potentially missing stuff (say, files became inlined or the files are encoded in a different way).

These are important problems, but they should be solved by whoever is maintaining the opam libraries. In fact, I don't see how such a hypothetical upgrade can be done without considering how the client libraries could handle it.

I am not sure we can rely on the OPAM libs to do the validation - e.g. in the case of the Git-backed Rev_store, it simply isn't supported by the OPAM libs at the moment. The way to go about this would probably require splitting out the storage layer from the repository layer on the OPAM side so we could fit in our own storage layer.

Yes, that is exactly what they should do. They're already thinking about introducing a different storage layer for repos. It would be highly productive if they could abstract away all these details from us.

We can of course do so, but it locks us quite a bit into interfaces that we have to share with OPAM, whereas at the moment we can load repos from wherever we see fit. In theory that means we could even avoid having a local OPAM respository storage at all and just fetch OPAM files from the network.

I don't see these interfaces as a bad thing, but rather as critical to guarantee there's never any misunderstanding between dune and opam about the layout of repositories. The loss of flexibility you're pointing is indeed regrettable, but it's completely solvable by a less assuming design of the client library.

To re-iterate my point about validation in general:

  1. We validate what we own. We are free to modify, define constraints, and generally do whatever we want with such things. We just need to think about how our users would stay up to date.
  2. We validate to check our assumptions. In case there are programming or user errors, these make the debugging experience better.

Neither of those two categories is applicable to the checks removed in this PR.

We don't use [repo] files for anything. So we should not care about their
existence, format, or version.

Signed-off-by: Rudi Grinberg <[email protected]>

<!-- ps-id: 5c8b5541-d923-4e98-ad89-5e3e71db0837 -->
@Leonidas-from-XIV Leonidas-from-XIV force-pushed the ps/rr/refactor_pkg___stop_validating_stuff_we_don_t_use branch from 31d92f0 to dff1dfd Compare November 1, 2023 09:07
@Leonidas-from-XIV Leonidas-from-XIV merged commit a8d7a26 into main Nov 1, 2023
@Leonidas-from-XIV Leonidas-from-XIV deleted the ps/rr/refactor_pkg___stop_validating_stuff_we_don_t_use branch November 1, 2023 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants