Monolithic Packaging #6
Based on a whitepaper by @rix0rrr.
The ultimate root cause of the issues we are encountering is a specific feature of the NPM package manager, one that both encourages a package ecosystem that “just works” with minimal maintenance and versioning problems and significantly simplifies the package manager: packages can appear multiple times in the dependency graph, at different versions.
It simplifies the package manager because the package manager does not have to resolve version conflicts to arrive at a single version that will work across the entire dependency tree, or report a usable error if it fails to do so. In the presence of version ranges, this is an open research problem, which NPM conveniently sidesteps.
The Feature
As an example, a package tablify could depend on a package called leftpad at version 2.0, while the application using tablify could be using leftpad at an older and incompatible version 1.0. The following dependency graph is a valid NPM dependency graph:
Loading dependencies is done in NPM by calling require("leftpad") or import { ... } from "leftpad". If this statement is executed in a source file of app, it will load leftpad@1.0, and if it is executed in a source file of tablify, it will load leftpad@2.0.
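The lookup described above can be sketched as a walk up the directory tree. This is a minimal model, not Node's actual resolver: the directory layout, the `installed` table, and the `resolvePackage` helper are all assumptions made for illustration, and the real algorithm also handles core modules, package.json entry points, and file extensions.

```typescript
// Hypothetical on-disk layout for the app / tablify / leftpad example.
// Keys are package directories, values name the package version found there.
const installed: Record<string, string> = {
  "/app/node_modules/leftpad": "leftpad@1.0",
  "/app/node_modules/tablify": "tablify@1.0",
  "/app/node_modules/tablify/node_modules/leftpad": "leftpad@2.0",
};

// Starting from the directory of the requiring file, walk toward the root
// and return the first `<dir>/node_modules/<pkg>` that exists.
function resolvePackage(fromDir: string, pkg: string): string | undefined {
  let dir = fromDir;
  while (true) {
    const candidate = `${dir === "/" ? "" : dir}/node_modules/${pkg}`;
    const hit = installed[candidate];
    if (hit !== undefined) return hit;
    if (dir === "/") return undefined; // reached the root without a match
    // Move to the parent directory.
    dir = dir.slice(0, dir.lastIndexOf("/")) || "/";
  }
}

// A source file of app and a source file of tablify see different copies.
const fromApp = resolvePackage("/app/src", "leftpad");
const fromTablify = resolvePackage("/app/node_modules/tablify/lib", "leftpad");
console.log(fromApp, fromTablify);
```

Because the nearest node_modules directory wins, the same import statement yields a different copy depending on which package's source file executes it.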
The Problem
All is well with this strategy of keeping multiple copies of a package in the dependency tree, as long as no types “escape” from the boundary of the package’s mini-closure. In the typical case of tablify and leftpad, [one can reasonably assume that] all values going in and out of the libraries are types from the standard, shared runtime (in this case strings), and the only thing being depended upon is a specific behavior provided by the library. However, let’s say that leftpad contains a class called Paddable that tablify accepts in its public interface, and that the interface of the class has undergone a breaking change between versions 1 and 2 (in JavaScript):
// --------------- leftpad 2.0 ---------------------------
class Paddable {
constructor(s) { ... }
padLeft(n) { ... }
}
// --------------- tablify 1.0 ---------------------------
// Expected: Array<Array<Paddable>>
function tablify(rows) {
return rows.map(row => row.map(cell => cell.padLeft(10)));
}
// --------------- leftpad 1.0 ---------------------------
class Paddable {
constructor(s) { ... }
// Oops, forgot to camelCase, rectified in version 2.0
padleft(n) { ... }
}
// --------------- app 1.0 ---------------------------
import { Paddable } from 'leftpad';
import { tablify } from 'tablify';
const p = new Paddable('text');
tablify([[p]]);
In this code sample, a Paddable@1 gets constructed and passed to a function which expects to be able to use it as a Paddable@2, and the code explodes.
TypeScript to the rescue
The CDK uses TypeScript, and the TypeScript compiler actually protects against this issue. If we actually typed the tablify function as
import { Paddable } from 'leftpad';
function tablify(rows: Array<Array<Paddable>>) {
...
}
the TypeScript compiler would correctly resolve that type to Array<Array<Paddable@2>>, and refuse to compile the call to tablify with an argument of type Array<Array<Paddable@1>>, because of the detected incompatibility between the types.
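The compile-time protection can be seen in a single file by modeling the two versions as two differently named classes. This is only a sketch: `PaddableV1`, `PaddableV2`, and `padTo` are stand-ins invented here, whereas in a real project both classes would be called Paddable and come from two installed copies of leftpad.

```typescript
// Small helper so the example does not depend on newer string methods.
function padTo(s: string, n: number): string {
  let out = s;
  while (out.length < n) out = " " + out;
  return out;
}

// Stand-in for leftpad@1.0's Paddable (method is spelled `padleft`).
class PaddableV1 {
  readonly s: string;
  constructor(s: string) { this.s = s; }
  padleft(n: number): string { return padTo(this.s, n); }
}

// Stand-in for leftpad@2.0's Paddable (method renamed to `padLeft`).
class PaddableV2 {
  readonly s: string;
  constructor(s: string) { this.s = s; }
  padLeft(n: number): string { return padTo(this.s, n); }
}

// tablify is written against version 2, so it expects PaddableV2 cells.
function tablify(rows: PaddableV2[][]): string[][] {
  return rows.map(row => row.map(cell => cell.padLeft(10)));
}

const p = new PaddableV1("text");

let runtimeError: unknown;
try {
  // The compiler rejects this call because PaddableV1 has no `padLeft`.
  // The directive below suppresses that error on purpose, to show the
  // runtime failure the type check would otherwise have prevented.
  // @ts-expect-error
  tablify([[p]]);
} catch (e) {
  runtimeError = e;
}
console.log(runtimeError instanceof TypeError);

// With the matching version, the call compiles and runs.
const ok = tablify([[new PaddableV2("text")]]);
console.log(ok);
```

Removing the `@ts-expect-error` directive turns the mismatch into a compile error, which is the behavior the paragraph above describes.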
Implications on CDK
This situation arises in the following use cases:
User wants to use an old version
The first one, colloquially called “rollback”, is easy to understand and seemingly has a simple solution, so let’s look at it first. This situation comes up when the user is in one of 2 situations:
"dependencies": {
"@aws-cdk/aws-ecs": "1.12.0",
"@aws-cdk/core": "1.12.0"
}
The lack of a caret makes it a fixed version, indicating they really want 1.12.0 and not “at least 1.12.0” (which would resolve to 1.13.0 in this situation). However, because of the transitive caret dependency the complete dependency graph would look like this:
The dependency tree ends up with 2 versions of core in it, and we end up in a broken state.
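The rollback scenario can be sketched with a toy resolver. Several things here are assumptions rather than facts from the real packages: the list of published versions, the `^1.12.0` range that aws-ecs@1.12.0 is taken to declare on core, and the rule that every range is resolved independently to its highest match, as the text describes. Real package managers may reuse an already installed version when it satisfies the range.

```typescript
type Version = [number, number, number];

function parseVersion(v: string): Version {
  const [major, minor, patch] = v.split(".").map(Number);
  return [major, minor, patch];
}

function compareVersions(a: Version, b: Version): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Caret range for major >= 1: same major, and not lower than the base.
// (0.x versions follow different caret rules and are ignored here.)
function satisfiesCaret(version: string, base: string): boolean {
  const v = parseVersion(version);
  const b = parseVersion(base);
  return v[0] === b[0] && compareVersions(v, b) >= 0;
}

// Highest published version matching an exact ("1.12.0") or caret ("^1.12.0") spec.
function resolveSpec(spec: string, published: string[]): string | undefined {
  const matches = published.filter(v =>
    spec.charAt(0) === "^" ? satisfiesCaret(v, spec.slice(1)) : v === spec);
  matches.sort((x, y) => compareVersions(parseVersion(x), parseVersion(y)));
  return matches[matches.length - 1];
}

// Assumed registry state and manifests (illustrative only).
const publishedCore = ["1.12.0", "1.13.0"];
const appWantsCore = "1.12.0";   // pinned by the user, no caret
const ecsWantsCore = "^1.12.0";  // assumed caret dependency of aws-ecs@1.12.0

const coreForApp = resolveSpec(appWantsCore, publishedCore);
const coreForEcs = resolveSpec(ecsWantsCore, publishedCore);
console.log(coreForApp, coreForEcs);
```

Under these assumptions the app and aws-ecs resolve core to different versions, which is the two-copy situation described above.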
This works fine for the CDK, but breaks in the face of 3rd party construct
User is using 3rd party construct library
The previously discussed solution works fine for first-party libraries, because
|
Just occurred to me that |
I see another problem that I think a package reorganization would address:
Many people are mixing up L1 and L2 APIs
I see a lot of people using L2 classes, and then being disappointed when they can't plug them into L1 classes. This happens a lot in Python (where there's obviously no typechecker to tell you you're doing it wrong), but also happens in other languages where people call integration APIs between L2 classes to plug the result into an L1 (see recent changes to CloudWatch dashboards, multiple people internally and externally were calling Apparently our current communication methods of I would therefore like to propose we drop the L1-and-L2-in-a-single-package style of bundling, and publish 2 different packages, one with all L1s and one with all L2s. Note that this needn't imply a (*) The existence of Originally posted by @rix0rrr in https://github.com/_render_node/MDIzOlB1bGxSZXF1ZXN0UmV2aWV3VGhyZWFkMjMxMjgzNzk0OnYy/pull_request_review_threads/discussion |
@rix0rrr wrote:
Not sure about publishing two separate packages for L1s and L2s but we can totally put all L1s under their separate aws-xxx-cfn namespace within the big module. For example, if we release two packages, where would the core modules go? |
We are experiencing the 3rd-party dependency issue noted above - I'd love to see a resolution to that. It's worth noting that we're encountering it in the Python version of CDK, but the results are the same. |
What are the issues you are encountering in the Python version of the CDK specifically? Doesn't user version override library version always? |
I think it's reasonable to have a subpackage that's |
Further expanding on this, although NPM does not support it natively (yet), Yarn does support a This of course is nowhere near ideal and does not solve all of the ergonomic issues, but critically can be done without a significant breaking change.
|
Proposal to distribute the AWS CDK as a single module instead of 150+ modules in order to allow third-party CDK modules to declare their dependency on the AWS CDK as a peer dependency Related to #6
Below is an example for Python. Here is an excerpt from my library's
setup(
install_requires=[
'aws_cdk.aws_iam>=1.18.0',
'aws_cdk.aws_s3_assets>=1.18.0',
'aws_cdk.core>=1.18.0',
'docker'
]
)
From the following
(.venv) cdk-chalice apulver$ pip install -e .
Obtaining file:///Users/apulver/Code/cdk-chalice
Collecting aws_cdk.aws_iam>=1.18.0 (from cdk-chalice==0.7.0)
...
Collecting jsii~=1.3.0 (from aws_cdk.aws_iam>=1.18.0->cdk-chalice==0.7.0)
...
constructs 3.0.1 has requirement jsii~=1.1.0, but you'll have jsii 1.3.1 which is incompatible.
...
Installing collected packages: typing-extensions, cattrs, jsii, publication, aws-cdk.cloud-assembly-schema, aws-cdk.cx-api, constructs, aws-cdk.core, aws-cdk.region-info, aws-cdk.aws-iam, aws-cdk.assets, aws-cdk.aws-events, aws-cdk.aws-kms, aws-cdk.aws-s3, aws-cdk.aws-s3-assets, websocket-client, docker, cdk-chalice
Running setup.py develop for cdk-chalice
Successfully installed aws-cdk.assets-1.33.0 aws-cdk.aws-events-1.33.0 aws-cdk.aws-iam-1.33.0 aws-cdk.aws-kms-1.33.0 aws-cdk.aws-s3-1.33.0 aws-cdk.aws-s3-assets-1.33.0 aws-cdk.cloud-assembly-schema-1.33.0 aws-cdk.core-1.33.0 aws-cdk.cx-api-1.33.0 aws-cdk.region-info-1.33.0 cattrs-1.0.0 cdk-chalice constructs-3.0.1 docker-4.2.0 jsii-1.3.1 publication-0.0.3 typing-extensions-3.7.4.2 websocket-client-0.57.0 |
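The conflict reported in the pip output above comes from Python's “compatible release” operator: `~=1.1.0` means `>=1.1.0, ==1.1.*`. The sketch below models that rule in TypeScript for plain numeric versions only (no pre-releases, post-releases, or epochs); the helper names are invented for illustration, and the version numbers are the ones in the log.

```typescript
function parseRelease(v: string): number[] {
  return v.split(".").map(Number);
}

// `~=X.Y.Z` is equivalent to `>=X.Y.Z, ==X.Y.*`:
// every component of the base except the last must match exactly,
// and the last component of the base acts as a lower bound.
function satisfiesCompatibleRelease(version: string, base: string): boolean {
  const v = parseRelease(version);
  const b = parseRelease(base);
  const last = b.length - 1;
  for (let i = 0; i < last; i++) {
    if ((v[i] ?? 0) !== b[i]) return false;
  }
  return (v[last] ?? 0) >= b[last];
}

// Values taken from the pip output: jsii 1.3.1 was installed,
// aws_cdk.aws_iam asks for jsii~=1.3.0 and constructs 3.0.1 for jsii~=1.1.0.
const installedJsii = "1.3.1";
const okForAwsIam = satisfiesCompatibleRelease(installedJsii, "1.3.0");
const okForConstructs = satisfiesCompatibleRelease(installedJsii, "1.1.0");
console.log(okForAwsIam, okForConstructs);
```

No single jsii version can satisfy both `~=1.3.0` and `~=1.1.0`, so whichever version gets installed leaves one of the two packages with an unmet requirement.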
This literally bit me live in an AWS Gameday. I was on 1.40.0 and 1.40.1 hit. I did a new cdk init and boom type mismatch. |
Description
The CDK consists of 100+ packages, one for every AWS service and then some, with
complex interdependencies. For example, the aws-ecs package depends on core,
aws-iam, aws-ecr, aws-ec2 and more packages to do its work. In fact, it
depends on 23 other packages.
This means that when a user wishes to use the aws-ecs module, their package
manager needs to fetch all 23 dependencies.
This proposal suggests bundling and releasing the AWS CDK as a single monolithic
module. This means that the way users consume the CDK will change in a breaking
way.
Progress