
CE 3: development process #644

Closed
alexandru opened this issue Sep 18, 2019 · 2 comments

Comments

@alexandru
Member

alexandru commented Sep 18, 2019

This issue is about your thoughts on how the development process should go, as we'll need a plan.

What follows is my personal opinion ...


The CE 3 proposal is good and tasteful; it tries to fix several pain points that we are having with the current Cats-Effect hierarchy, and I believe it has many good ideas in it.

However once upon a time (in #321 (comment)) I expressed my opinion on how the Cats-Effect 3 development should go:

  1. Scala libraries are known for breaking things and we can't take breakage lightly — when breakage must happen, we need to justify every single decision that we make, because we need really good and defensible reasons for why we'll break people's code
  2. All changes discussed should be piecemeal, such that they can be discussed individually and not in bulk; otherwise small details may slip past review

A good example is the concern I raised in #643 about the ExitCase type: its current shape was chosen because of constraints we bumped into while developing the Bracket type class and all its instances in PR #113, but due to the noise in that PR this is now tribal knowledge. It's a problem I'm certain we'll bump into again, but only after somebody goes through the pain of implementing actual instances for Bracket.

So starting with that proposal is great for the grand picture. However I don't like that we are rebuilding CE 3 from scratch, having a branch for it in which we are pushing changes that are not tested with actual implementations. That branch is great for testing and thinking about the design, don't get me wrong, but I hope you don't plan on that branch being dumped on master.

I mean it's fun and all 🙂 but we've got to do our "due diligence" and ensure that we won't break people's code lightly, given the popularity of the project.


PS — I subscribe to Rich Hickey's views on how and when to break people's code:

So when should we break an API?

The answer should be: never! But for the sake of correctness, and given that Scala itself breaks compatibility with every major version (so people have some tolerance for breakage), we can do it from time to time if it's done for extremely good reasons. Hence my concerns about the process we'll take.

Loved this presentation: Spec-ulation — and I totally subscribe to the idea of also changing the namespace when you change the API, unfortunately I can't convince others ¯_(ツ)_/¯

@alexandru
Member Author

Seeing nobody replied yet 🙂 I should add that I'm not interested in debating things to death — but I'm afraid of a big code dump on master where details could be missed.

And the more people work on an alternative branch built from scratch, the more decisions become irreversible.

Therefore I'm hoping for some middle ground: to keep things fun, but at the same time to leave some trace of the design decisions being made. The ideal would be to separate the work into multiple PRs for master.

For example the first step is to ... split the project in at least 2 sub-projects, to strip IO from the type classes and change Effect in the process, which might mean some work done in vain to fix the laws maybe, but then Effect will be closer to what we want in CE 3, which is progress as well. And I feel that such a task is actionable without much debate around it.

I already started on that split some time ago, but I gave up on it and I think I might have lost the branch I was working on.

@djspiewak
Member

So the ce3 branch is definitely not going to be just dumped on master. Not even close. :-) The idea is to use it as a place to refine things and collaborate on a prototype, then move over to master and implement it for real. I could imagine code being copy/pasted from the branch, but nobody's doing a git-merge on this one. That's part of why I created it as an entirely detached branch.

Further, nothing on that branch is irreversible. I think as we get further and further along and decisions are built on decisions, certainly it takes more justification to remove something fundamental, but we're not there yet and we probably won't be there for quite a long time. Even once we abandon the branch and start doing work on master towards 3.0, things will still be up for grabs.

I feel like this is a really important point, because I sort of dumped a ton of code in ce3 and I don't want people to think it's sacred or anything. We all have things to contribute to this process, and I don't want the volume of code that's already there to make anyone think that things are set in stone or that it's already mostly-baked. It's not, and there is plenty of time to take this in a totally different direction if that turns out to be the right thing to do.

> For example the first step is to ... split the project in at least 2 sub-projects, to strip IO from the type classes and change Effect in the process, which might mean some work done in vain to fix the laws maybe, but then Effect will be closer to what we want in CE 3, which is progress as well. And I feel that such a task is actionable without much debate around it.

I'm a little skeptical about taking an incremental step towards this, partially because a major goal of the 3.0 push was to get Async and Sync flipped to the leaves, which isn't something that can be done incrementally. I do agree that there is some incremental work which can be done, like the multi-module refactoring, but a lot of the hierarchy and laws are being top-to-bottom revamped, which kind of has to be done all at once.

Scala libraries are known for breaking things and we can't take breakage lightly — when breakage must happen, we need to justify every single decision that we make, because we need really good and defensible reasons for why we'll break people's code

I agree, though it's worth noting that once the bit is flipped on "we're incompatible!", further changes beyond that are less relevant, since everyone has to be careful anyway. Put another way, I would rather shove all the possible breaking things into one big release than space them out over a large number of (also breaking) releases which make smaller adjustments.

With that said though, I do take it very seriously, and there are a lot of ways in which we can make this process as painless as possible. Unnecessarily changing names is a really good example of where migration can be made needlessly hard. It's something that I'm quite concerned about with Concurrent, honestly, and to a lesser extent the async function. I'd like to make the migration as smooth as possible for users given that we're trying to significantly overhaul things.

It's worth noting that scalafix can probably cover a huge percentage of the migration cases currently proposed in CE3. Also worth noting that we're trying to do this very much in coordination with the ecosystem maintainers (both middleware and datatypes), so I expect that by the time we have a CE3 which is ready to publish, the rest of the ecosystem will at least have RCs queued up and ready to go which are already compatible. Migrating end-user code takes time, but at least we can do most of the migration work for the ecosystem by coordinating carefully as we go along.

> So when should we break an API?
>
> The answer should be: never!

Rich's views on this are commendable, but ultimately impossible in the context of Scala for several reasons. Remember that his views are colored by his firmly held belief that the correct way to build software is with a dynamically-typed, source-linked language with very minimal nominative runtime typing (since most native Clojure data is passed around in maps). So basically, he's baking an implicit assumption into his statements, which is that the general constraints of your runtime system are no more significant than those of a RESTful HTTP API. That assumption does not hold for very many ecosystems.

Scala is a statically-typed, strongly nominative, binary-linked language with a non-isolate runtime. So we pretty much have the hardest-possible compatibility problem space. The only thing that would make it worse would be if linkage was unchecked at runtime. In theory, Project Jigsaw could have eased the burden quite a bit by giving us controllable version isolates, but… Oracle… is Oracle. So this is what we have.

Does that mean we should be cavalier about breaking compatibility? Absolutely positively not. When a cornerstone project like Cats Effect breaks compatibility, the results are enormously disruptive. It needs to be done very carefully, with a lot of coordination and communication, very very rarely. But sometimes it does need to be done.

We really need to break compatibility in CE3. The calculus of CE1 offers guarantees which are over-broad at times (like all Concurrent types can capture side-effects), while also precluding guarantees which could improve safety and tighten laws (like the type signature of Fiber#join). Fixing this situation requires breaking compatibility by definition, which sucks but I don't see a way around it.
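To make the "over-broad" point concrete, here is a minimal sketch. It does not use the real Cats Effect definitions: the traits are heavily simplified, and the names `Thunk`, `Current`, `Flipped`, and `capture` are made up for illustration. It only shows the shape of the problem: when Concurrent sits below Sync, a Concurrent constraint alone is enough to suspend arbitrary side effects, whereas with Sync and Async at the leaves that ability has to be asked for explicitly.

```scala
import scala.language.higherKinds

object HierarchySketch {
  // Hypothetical toy effect type: a suspended computation.
  final case class Thunk[A](run: () => A)

  // Simplified stand-in for the current shape: Concurrent inherits `delay`.
  object Current {
    trait Sync[F[_]] { def delay[A](thunk: => A): F[A] }
    trait Async[F[_]] extends Sync[F]
    trait Concurrent[F[_]] extends Async[F]

    // Compiles: any Concurrent[F] can capture a side effect.
    def capture[F[_], A](thunk: => A)(implicit F: Concurrent[F]): F[A] =
      F.delay(thunk)

    val thunkConcurrent: Concurrent[Thunk] = new Concurrent[Thunk] {
      def delay[A](thunk: => A): Thunk[A] = Thunk(() => thunk)
    }
  }

  // Simplified stand-in for the flipped shape: Sync and Async are leaves.
  object Flipped {
    trait Concurrent[F[_]] { def unit: F[Unit] }
    trait Sync[F[_]] { def delay[A](thunk: => A): F[A] }
    trait Async[F[_]] extends Sync[F] with Concurrent[F]

    // Requiring only Concurrent[F] here would not compile: it has no `delay`.
    def capture[F[_], A](thunk: => A)(implicit F: Async[F]): F[A] =
      F.delay(thunk)

    val thunkAsync: Async[Thunk] = new Async[Thunk] {
      def unit: Thunk[Unit] = Thunk(() => ())
      def delay[A](thunk: => A): Thunk[A] = Thunk(() => thunk)
    }
  }

  def main(args: Array[String]): Unit = {
    var calls = 0
    // Instances are passed explicitly to keep implicit resolution out of the picture.
    val a = Current.capture[Thunk, Int]({ calls += 1; calls })(Current.thunkConcurrent)
    val b = Flipped.capture[Thunk, Int]({ calls += 1; calls })(Flipped.thunkAsync)
    println(a.run()) // the side effect runs only here
    println(b.run())
  }
}
```

In the `Flipped` object, code that only needs concurrency can no longer smuggle in side effects, which is the kind of tighter guarantee being argued for. The actual CE3 hierarchy may well look different from this sketch.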

With all that said, extraordinary measures require extraordinary justifications. The book is by no means closed on the breakage which has been proposed, or future breakage which may be proposed. Let's carefully justify things every step of the way, honestly examining why we need to change things, and if we don't need to change something, let's just not.
