-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: migrate to ESM #12
Conversation
40abc82
to
db6d789
Compare
Not sure if we need to fork this: andywer/threads.js#470 |
Might be able to use:
To bypass it. Now I think this would require a special |
db6d789
to
303b926
Compare
Actually I think I need to do a git submodule. That would be the easiest. |
Ok so trying to use a git submodule can be complicated due to the lack of dependencies being acquired under Going back to attempting with an import path. |
Actually even subpath imports does not work because the The only solution now is to either entirely fork the project or just provide overrides on the types, meaning we type out what The problem is none of the types work anymore.
Because of errors in how threads.js exposes the types. So we have to define all these types. |
Ok after setting the overriding the types to any, we still end up with a problem. Attempting to create a worker from a file written in TS requires
Of course if I change to just using regular But then it's not possible to do
Is necessary to actually get the constructs necessary, but the workers no longer have any corresponding types. I think though one could use annotations. But generally speaking, it's just not a good idea to use typescript based workers atm since we shouldn't be tied to ts-node anyway (even if it is only during development), because after compilation it would all be JS files anyway. For some reason All in all, I don't think as of now I might have to be forced to keep |
Going to try keeping js-workers as CJS, and long term wise look into removing threadsjs and favour of something in our flavour. Can see https://github.com/piscinajs/piscina for inspiration. |
I think we just remove browser support for the moment, and focus on nodejs worker threads, similar to our project in js-ws, and then slowly add back in webworker (browser support) afterwards. This can radically simplify this project and give us ESM support too. This could be assigned to @addievo. |
Going over the https://nodejs.org/api/worker_threads.html shows that the worker threads implementation will be quite complex. Here's a brief overview of things that need to be considered:
Point is, fixing up this worker ecosystem is extremely complicated. The This would be significant undertaking - estimated work would have to be 2 - 4 months to build a robust worker system that abides by the rest of PK's principles (I'm comparing it to how complicated js-quic became, but it should be simpler). Will need to schedule this for later after testnet 7. |
As per #16, we're not going to immediately migrate to ESM. Instead, we need to work on #16 to build out our own thread pool implementation. This is performance sensitive. So I vote for 2 entry points - Rust/C++ and JS level. Rust based entrypoint would be more flexible as we are moving towards all native libraries being written in Rust, and it would be easier to integrate into js-quic and js-db. JS level entrypoint means the threadpool is also usable by the any parallel processing required by JS. This would also mean that our threadpool doesn't abide by Web Workers. However we can follow the spec of Web Workers (in terms of API) and satisfy the interface type-wise, even if implementation wouldn't be using node's own worker threads. This would also mean our worker threads are outside libuv threading (which is traditionally used by the IO system in NodeJS), but that's also ok. There's some limitations in that libuv threadpool anyway and it was designed for IO specifically, whereas ours should work for compute parallelism too. Some testing would be important to understand whether js-quic should use libuv threading or integrate into this rust threadpool. Make use of benchmarks here early in order to get continuous benching. |
@tegefaulkes Take over this PR and update spec to target #16 and MatrixAI/TypeScript-Demo-Lib/issues/32. |
09ee60c
to
91161db
Compare
I've re-based on staging. |
After doing some prototyping with
The node implementaion of workers is pretty simple. You start a worker using As for enforcing types on making calls to the workers. The problem here is pretty similar to the RPC handles things. We have an interface that serialises data. Across this transition we loose the type enforcement so we need to re-apply types to the returned values. I think we can apply a similar solution here by providing a worker manifest which is an object of all the functions that can be called through a worker. We can then use this manifest as the worker code by calling a |
It's possible to inline the script as a string using To properly enforce types I need to construct an object as typescript code and import it as a type when creating the WorkerManager. This still needs to be solved but it should be possible to properly enforce the types this way. It will only work if we can import it but also load it as raw code. This is where the bundler comes in. We can import the types directly for type enforcement, but then we can use a raw loader to import the same code as raw code. Then provide it to the worker as an evaluated string. This should fix all out problems with bundling and import paths since everything will be imported the normal way. For example, the worker script will look something like this. // This is an example worker script
import type { WorkerManifest } from '#types.js';
import { expose } from './expose.js';
const worker = {
test: async (data: void) => {
return 'hello world!';
},
add: async (data: { a: number; b: number }): Promise<number> => {
console.log(data);
return data.a + data.b;
},
sub: async (data: { a: number; b: number }): Promise<number> => {
return data.a - data.b;
},
fac: async (data: number): Promise<number> => {
let acc = 1;
for (let i = 1; i < data; i++) {
acc = acc * i;
}
return acc;
},
} satisfies WorkerManifest;
expose(worker);
export default worker; We can then create a worker factory with the following. Assuming that this file will still be compiled the normal way for us. There may still be some things to solve here. // Import with the rawloader
import script from 'raw-loader!./script.ts'
const workerFactory: WorkerFactory = () => new Worker(script, { eval: true }); |
This is pretty much done, I just need to do final review and clean up. As for in-lining the import of the worker script and creating the worker that way. It should be possible but to do it properly requires the use of a bundler and custom loader. I can send a worker script using the script as a string and specifying The best solution is to have a bundler with a custom loader. Then we can import the worker directly to get it's manifest and types. But also import it as raw so we can create the worker with the raw code.
|
Misclicked and closed the PR. I've re-opened it. |
6c78704
to
df05712
Compare
…on to enable support for ESM
2cf5fd6
to
d5bf19d
Compare
All done, merging. |
Some notes for @shafiqihtsham based on what I had to fix up When you have a large switch (type) {
case 'one': {
/// some code here
break;
};
case 'two': {
/// some code here
break;
};
} Should be broken up like switch (type) {
case 'one': return this.processOne();
case 'two': return this.processTwo();
}
// and the protected functions are defined within the class.
protected processOne(): void { /* do one */};
protected processTwo(): void { /* do two */}; Furthermore when checking if something exist or is undefined always use When getting parameters of an object. It can be useful to name the parameters if we use it too much. For example const thing = object.otherThing.thing;
// do something with think multiple times. However, there isn't much point to it if you use const data = row.data;
const { gestaltId } = data;
// However data is only used once here and immedietly deconstructed. So it's kind of a pointless variable here. You should streamline it.
const { gestaltId } = row.data; One of the advantages of breaking down the processing code into sub functions is that you can streamline some branching logic. Take the following example protected processEventNodeUpdated(row: EventNodeUpdated): void {
const { nodeId, gestaltId } = row.data;
const gestaltMap = this.gestaltMapState.get(gestaltId);
if (gestaltMap != null) {
const node = gestaltMap.nodes.get(nodeId);
if (node != null) {
node.gestaltId = gestaltId;
this.gestaltMap$.next(this.gestaltMapState);
this.nodesService.editNode$.next(node);
}
}
}; Since this is contained within its own function with a very clear goal. We can simplify the branching here and make it easier to read. See how we flatten the logic and make it a lot clearer what's happening? protected processEventNodeUpdated(row: EventNodeUpdated): void {
const { nodeId, gestaltId } = row.data;
const gestaltMap = this.gestaltMapState.get(gestaltId);
if (gestaltMap == null) return;
const node = gestaltMap.nodes.get(nodeId);
if (node == null) return;
node.gestaltId = gestaltId;
this.gestaltMap$.next(this.gestaltMapState);
this.nodesService.editNode$.next(node);
}; Something like this has too much happening on one line. It's best to break something like this up. const edgeExists =
gestalt?.edges.whereRows(['from', 'to'], [nodeIdLow, nodeIdHigh])
.length > 0;
// Should become
const edges =
gestalt?.edges.whereRows(['from', 'to'], [nodeIdLow, nodeIdHigh])
if (edges.length > 0) // do thing; All instances of arrays should be defined as |
Description
Migrating to ESM.
We're switching to using the internal node implementation for worker threads. This will also address #16 .
Issues Fixed
Tasks
threads
and replace it withnode:worker_threads
.WorkerPool
for managing workers and scheduling tasks.Final checklist