Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transition DT packages to use package.json files #769

Closed
andrewbranch opened this issue Oct 19, 2023 · 18 comments · Fixed by #788
Closed

Transition DT packages to use package.json files #769

andrewbranch opened this issue Oct 19, 2023 · 18 comments · Fixed by #788

Comments

@andrewbranch
Copy link
Member

andrewbranch commented Oct 19, 2023

This will make it much easier and faster to determine what packages have been changed since last successful publish, since that’s done via a content hash of all published files, but “all published files” is currently dynamically determined by following imports and triple-slash references from index.d.ts, which requires parsing TypeScript SourceFiles and performing module resolution.

There are two ways forward here:

  1. Use files to determine what to publish, but during CI, use the same module resolution and OTHER_FILES.txt logic to ensure that files is consistent with the result there. Critically, we need to ensure that files does not omit anything that gets referenced by index.d.ts or any other package entrypoint.
  2. Mandate that files includes all declaration files, README, LICENSE, nested package.jsons, and nothing else. (We have not yet checked to see if any packages violate this rule as they are now.) This would be simpler and cheaper. We would only have to move the Definitions must use global references to other packages, not parent ("../xxx") references check, as that’s currently issued as we collect referenced files (module-info.ts). We would also lose the unused file check, and I’m not sure how important that is, but if we want to keep it, we would want to move that into dtslint. (Note that all the module resolution already happens as part of the multiple ts.createProgram calls that happen in dtslint. In the future, we should also include all files mentioned by package.json "exports" in these program root files.)

I’m in favor of (2) for simplicity and consistency. Either way, adding files will change the content hash of every package and cause every package to be republished. A few observations about this:

  1. We should take the opportunity to change anything else we want to change about the content hash. Right now, it unnecessarily includes tsconfig.json paths, so we should remove that from the hash inputs. We also discussed stringifying the package.json object instead of including its file contents, so future formatting changes to package.json files wouldn’t cause a repo-wide republish.
  2. We could implement a hybrid approach in DT-tools if we want to transition DT slowly, by allowing, but not requiring files, and using it as the source of truth if present. Then, we could automate the change to DT as a series of PRs that can be merged slowly, as we did with formatting.
@jakebailey
Copy link
Member

I am very much in favor of the simple files array approach that just always includes d.ts files.

One gotcha is that some packages have scripts that they use to generate code, so we will want to be able to ignore those somehow. Maybe like tsconfig mismatches, error check in the mergebot if someone adds something else?

@andrewbranch
Copy link
Member Author

some packages have scripts that they use to generate code, so we will want to be able to ignore those

I think scripts should only be .ts or .js files, which wouldn’t be included in the globs. But maybe you’ll find some edge cases.

Some notes on unused files based on Teams notes.

lodash includes [OTHER_FILES.txt] since it doesn't have individual tests for the files that re-export its contents in various ways

The only real value that OTHER_FILES is serving in this case is making a note that the files in it are not included in test coverage. It’s not clear to me why this wouldn’t have just been encoded in tsconfig’s files, which is currently enforced to be just index.d.ts and the test file.

.d.mts is only [included in OTHER_FILES] because we don't support up-to-date ES modules in DT

Specifically, the thing we’re missing is considering other entrypoints besides index.d.ts. We should be looking at package.json "exports" and performing resolution from each file listed there, if we do any kind of resolution-based checks going forward. These entrypoints should also be allowed/mandated in tsconfig files, or we could just switch to the default include pattern, and add exclude for package version / typesVersions directories. Somehow, we should ensure the tsconfigs include every declaration file the package will include.

@andrewbranch
Copy link
Member Author

Note that declaration file extensions can be:

  • *.d.ts
  • *.d.mts
  • *.d.cts
  • *.d.*.ts (TS 5.0+)

The last one is neglected from DT-tools and some of our Teams conversations, so we should make sure not to forget it.

@jakebailey
Copy link
Member

The main exception to this rule so far is the nested v# directories get matched and so we need to find a way to exclude them. If we do that and use the globs listed above, these are the discrepencies:

Using local DefinitelyTyped at .
Parsing definitions...
Found 8715 packages.
Parsing in 20 child process(es)...
Parsing took 8.074 s
Mismatch in types/node-notifier/v5
dt-tools says [
  '../index.d.ts',
  '../notifiers/balloon.d.ts',
  '../notifiers/growl.d.ts',
  '../notifiers/notificationcenter.d.ts',
  '../notifiers/notifysend.d.ts',
  '../notifiers/toaster.d.ts',
  'index.d.ts'
]
but glob is   [ 'index.d.ts' ]
Mismatch in types/parse/v1
dt-tools says [
  '../index.d.ts',
  '../node.d.ts',
  '../react-native.d.ts',
  'index.d.ts'
]
but glob is   [ 'index.d.ts' ]

Which is to say that two packages are weirdly broken, but all 9000 other packages match this rule!

@jakebailey
Copy link
Member

Awful script:

// @ts-check
import { AllPackages, getDefinitelyTyped, parseDefinitions } from "@definitelytyped/definitions-parser";
import { glob } from "glob";
import assert from "node:assert";
import * as os from "node:os";
import path from "node:path";

const options = { definitelyTypedPath: ".", progress: false, parseInParallel: true };
const dt = await getDefinitelyTyped(options, console);
await parseDefinitions(dt, { nProcesses: os.cpus().length, definitelyTypedPath: "." }, console);
const allPackages = await AllPackages.read(dt);
const typings = allPackages.allTypings();

const patterns = [
    "**/*.d.ts",
    "**/*.d.mts",
    "**/*.d.cts",
    "**/*.d.*.ts",
];

await Promise.all(typings.map(async (data) => {
    const dir = path.join("types", data.subDirectoryPath);
    const resolvedDir = path.resolve(dir);
    const globFiles = (await glob(patterns, {
        cwd: dir,
        nodir: true,
        ignore: {
            childrenIgnored: (p) => {
                if (p.name === "node_modules") return true;
                const fullpath = p.fullpath();
                if (fullpath === resolvedDir) return false;
                if (
                    data.isLatest
                    && /^v[0-9]+(?:\.[0-9]+)?$/.test(p.name)
                    && fullpath === path.join(resolvedDir, p.name)
                ) {
                    return true;
                }
                return false;
            },
        },
    })).sort();
    const expected = data.files.slice().sort();

    try {
        assert.deepStrictEqual(globFiles, expected);
    } catch (e) {
        console.error(`Mismatch in ${dir}`);
        console.error("dt-tools says", expected);
        console.error("but glob is  ", globFiles);
    }
}));

@jakebailey
Copy link
Member

Just to make sure I did it right, I changed the glob to also include *.ts and it did find all of the extra files, so I think the script is working.

@sandersn
Copy link
Member

  • I think we should fix notifier/v5 and parse/v1 first instead of excluding them.
  • The unused-files check has two purposes: (1) don't ship unused .d.ts by mistake (2) don't ship other files (esp .js, maliciously). For (1) we should have a lint rule which can be disabled. It'll need to be an ts-eslint typed rule I think to get the module resolution but I'm not sure. For (2) we just need to add a simple check to validatePackageJson. But we should still check.
  • How much more annoying does this make adding a package or a adding a file? For the former, it's yet another step although dts-gen will generate it. For the latter, it's another step but not that annoying (and adding files is rare).

There's still a chance somebody would want to ship something besides README.md or .d.ts files, but I think it's low given that it hasn't happened in the last five years.

nested package.jsons
why would files include nested package.jsons?

@jakebailey
Copy link
Member

Those two are probably already "working" by nature of the publisher placing files at those paths and then npm publish not including them. But I'm sorta guessing here that this is a bug in dt-tools. Both packages declare module on themselves and I bet that's somehow not augmenting the right thing.

@jakebailey
Copy link
Member

As for the files, right now, it's the case that all files that we include are explicitly just those four globs, then npm implicitly adds README and LICENSE files, so nobody has to list them. I feel like if we just require that package.json contains those globs and nothing else (with maybe also !v*/ or something to not include child packages), that would be all we need.

We can still keep the detection and OTHER_FILES if that makes something easier, but if we are strict about the files array having a particular form, failing CI and refusing to publish, I feel like we don't need anything else.

@andrewbranch
Copy link
Member Author

(1) don't ship unused .d.ts by mistake

I don’t think “unused” has been a particularly well-defined concept here, historically. Any .d.ts file in a package that doesn’t have package.json exports is a de-facto public API entrypoint for the package, importable by "pkg/path/to/file.js". If package.json exports exist, then the public API entrypoints are explicitly defined—we can do module resolution from there to find all reachable files, and any remaining ones are much more likely to be mistakes.

Even without exports, I would probably advocate for considering tsconfig files to be the source of truth for package entrypoints. With exports, we could check that the tsconfig’s root files matches what can be seen in exports.

@sandersn
Copy link
Member

Here are my notes from our discussion that roughly correspond to work items:

  1. files array can be strict, list of globs followed by exclusions for old versions.

  2. adding an v15 subpackage needs to check its parent during incremental tests. Verify that parent's package.json files excludes v15. Just put it in dtslint's checks.ts. (Also verify that subpackages have no excludes)

  3. move ../other-module check from module-info to a lint rule

  4. publisher's getFiles should not read files array, instead build the correct array and run the glob. So you can't check in the wrong thing and include bad files.

  5. remove OTHER_FILES entirely.

  6. .... and later turn it into a lint rule based on tsconfig.json's files list.

@jakebailey
Copy link
Member

In trying to implement this, it was a lot easier to just directly check per package what the intended files array should be, and not any of the parent stuff, so my WIP branch does that instead which is clearer anyhow as it can directly assert equality to a specific expected list.

@jakebailey
Copy link
Member

It turns out that files is not consistent between package managers at all. For react, I would have expected to be able to write:

    "files": [
        "**/*.d.{ts,cts,mts,*.ts}",
        "!v15",
        "!v16",
        "!v17"
    ],

But npm7, npm8, pnpm, and all versions of yarn did not ignore things.

Both !v15/* and !v15/**/* caused npm6 to time out, the rest behaved as above.

The best I've gotten is to write .npmignore like:

*
!**/*.d.{ts,cts,mts,*.ts}
/v15/
/v16/
/v17/

This makes all package managers happy except yarn1, yarn3, and yarn4.

Really, this just affects people randomly running pack on a package dir, which already doesn't work right today. The publisher won't use any of this info at all in favor of directly calculating the expected glob / file list. So I guess what we choose just comes down to what we believe people may need when directly packing, if anyone.

@jakebailey
Copy link
Member

I guess yarn implements this as an include/exclude list, so the order doesn't matter at all. https://github.com/yarnpkg/berry/blob/ab2e84588b1eacb2ec60a751f12b168415224a19/packages/plugin-pack/sources/packUtils.ts#L204

@jakebailey
Copy link
Member

jakebailey commented Oct 25, 2023

Yeah, my script didn't take into account packages which didn't contain any files; yarn ends up with empty packages because the patterns exclude all files and further patterns do not override things.

*
!**/*.d.ts
!**/*.d.cts
!**/*.d.mts
!**/*.d.*.ts
/v15/
/v16/
/v17/

@jakebailey
Copy link
Member

Hm, additionally, patterns like **/*.d.ts don't work in yarn either; **should also match no paths (so a dts at the root), but it only accepts paths nested in dirs.

@jakebailey
Copy link
Member

Filed yarnpkg/berry#5872.

@sandersn
Copy link
Member

new work on DT=new bugs filed on package managers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants