Skip to content
This repository has been archived by the owner on Jun 18, 2019. It is now read-only.

Avoid case-sensitive workspace names #54

Merged
merged 4 commits into from
Jan 9, 2019
Merged

Avoid case-sensitive workspace names #54

merged 4 commits into from
Jan 9, 2019

Conversation

aherrmann-da
Copy link
Contributor

closes #52

  • This changes modifies the way Hazel generates workspace names such that they become case-insensitive. This is done by appending a hash of the package name as proposed in Colliding workspace names on case-insensitive system #52.
  • This change also unifies the Hazel name mangling into one extension file tools/mangling.bzl providing hazel_workspace, hazel_library, and hazel_binary functions in one location.

- On case-insensitive file-systems, in particular on Windows, Bazel does
    not distinguish workspaces whose names only differ by case. Stackage
    contains packages whose names only differ by case. E.g. `HSet` and
    `hset` in LTS 12.4. This causes errors of the following form on
    Windows.

    ```
    ERROR: C:/users/user/_bazel_user/glk6p2o5/external/ai_formation_hazel/hazel.bzl:172:3: Label '@haskell_HSet//:files' is duplicated in the 'files' attribute of rule 'all_hazel_packages'
    ```
- This changes modifies the way Hazel generates workspace names such
    that they become case-insensitive. This is done by appending a hash
    of the package name. Characters of different case will lead to
    different hash values.
- This change also unifies the Hazel name mangling into one extension
    file, and replaces all instances of manual name mangling by calls to
    the same name mangling functions.
@judah
Copy link
Contributor

judah commented Dec 10, 2018

I think it would be good to add more documentation in the README on how to discover what the hash is for a particular package. (That would help when debugging a package directly with blaze query or blaze build.)

In theory, you should be able to get the list of packages already by running something like blaze query "deps(@all_hazel_packages//:all-package-files,1). However, it might make sense to generate more rules to make these steps easier for users. Some possible ideas: (1) For each package haskell_foo, generate a file in all_hazel_packages named haskell_foo which contains the full repository name like haskell_foo_12345. (2) Create a rule like:

filegroup(
    name = "haskell_foo",
    deps = ["@haskell_foo_12345//:files],
)

So that blaze query "deps(@all_hazel_repositories//:haskell_foo, 1)" prints the repository name.

Use the following query to determine the workspace name to a Hackage
package:

```
$ bazel query 'deps(@all_hazel_packages//:haskell_mtl, 1)'
@all_hazel_packages//:haskell_mtl
@haskell_mtl_108453//:bzl
```

The last line describes the case-insensitive workspace name.

Note, that this also works on case-insensitive file systems, because
Bazel target names remain case-sensitive on case-insensitive file
systems.
@aherrmann-da
Copy link
Contributor Author

@judah Thanks, that's a good idea! I've confirmed that target names are indeed case-sensitive on Windows as well, so this works.

I noticed that the :files target actually doesn't exist on any of the package workspaces generated by Hazel. Am I missing something there?

For now I've implemented (2) using the @haskell_foo_12345//:bzl target.

I've also added some documentation to the README.

@thumphries
Copy link
Contributor

The implementation here looks good. I've tested it with our fairly large internal repo, which involved updating a few direct repository references to include their hashes.

I'm still not quite sure we're doing the right thing, though it certainly solves the acute problem on case-insensitive filesystems. Are there any other rulesets that deal with the same problem? Or is Hackage the only package repository to allow case-equivalent packages 😅

@aherrmann
Copy link
Contributor

aherrmann commented Dec 12, 2018

@thumphries

Are there any other rulesets that deal with the same problem? Or is Hackage the only package repository to allow case-equivalent packages

Very good question! For reference I had a look at Python, Java/Scala, and Golang, see below for details. Of what I could find it seems that this here is the first instance of this issue surfacing. It seems that most packaging ecosystems either don't allow package names to only differ in case, or have longer package coordinates including organization names that make such collisions unlikely.


Looking at rules_python they use lower-cased versions of package names, because apparently

From PEP 508 it appears that you can register packages with any capitalization, and install them with any capitalization, even if it's different. I.e. you can register package 'MyPackage' and install it with mypackage, MYPACKAGE, or even mYpAcKaGe.

In the JVM world, Maven coordinates are case-sensitive, in the sense that if you specify Guava as a dependency then Maven won't find guava. But, it seems that the Maven repository doesn't allow to create packages with names that only differ in case.

The repository.name is case-sensitive, but due to its direct use in elasticsearch (which is not case-sensitive) validation for repository creation will prevent you from creating another repository with a name with difference in only case.

So, this seems to be a non-issue for any of the JVM rules. If, perhaps, only incidentally.

In rules_go I couldn't find any handling of upper case in package names. I'm no Go expert, but my understanding is that third-party packages are always fully qualified, including some sort of organization/author identifier. This makes such upper/lower case collisions very unlikely. I could only find one instance where an organization renamed itself to an all lower case spelling. In that case rules_go decided not to act.

@thumphries
Copy link
Contributor

Apologies for sitting on this, it breaks our build in a subtle way and I haven't had time to properly evaluate why.

The mangled hazel_workspace name was inserted wrongly. Whenever another
package depended on a custom package it would fail due to the missing
external workspace.
@aherrmann-da
Copy link
Contributor Author

@thumphries I noticed that I made a silly mistake in the hazel_custom_package_* rules, sorry about that! I've fixed it in the latest commit. Maybe that was related to the subtle build failures that you encountered?

@thumphries
Copy link
Contributor

Actually, I suspect we were broken by the change in tweag/rules_haskell#550

Let's just merge this (since it is working for you) and I'll revisit once we catch up to rules_haskell master.

Again, sorry for the slow.

@thumphries thumphries merged commit 561366d into FormationAI:master Jan 9, 2019
@aherrmann-da aherrmann-da deleted the case-sensitive branch January 10, 2019 08:56
@mjrussell
Copy link
Contributor

mjrussell commented Jan 24, 2019

One thing to note about this change is that it breaks the current workaround TH embedFile with extra_src. The path updated here is now mangled.

I mentioned it in #57, but Im hoping a similar sym link approach might work

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Colliding workspace names on case-insensitive system
5 participants