Machine readable definitions #21
Comments
@bwalderman was openrpc.json assembled by hand? How about the API Reference, is that generated from openrpc.json? |
All of these API definition formats seem to use JSON Schema for the actual definitions. I'm not convinced that we really care about the value add of the additional layers on top of that; from a skim it looks like the additional features are about service discovery and licensing, which I don't think we particularly care about. In particular I see the following as use cases for machine-readable definitions in the spec:
I see the following as non-goals:
So I don't think we want endpoints that produce schema documents to allow clients to introspect the API or anything; in practice all the WebDriver and CDP clients are providing significant value-add over the mechanical conversion of protocol endpoints into code, and in any case updates to the spec will be accompanied by updates to the published schema, so we don't also need to allow introspection. Given that, I think we should just write json schema directly and not try to adopt any of the higher layer stuff like [Async|Open]API which afaict are mostly addressing needs we don't have. |
@foolip yes, openrpc.json was hand-written and the API reference was generated from it using https://github.com/open-rpc/schema-utils-js and some HTML templates. |
I see. I guess it’s not worth the effort now to put that build step into CI, but if we have a schema file later that’d make sense. |
It seems that the tooling for the OpenRPC spec is quite limited compared to the OpenAPI tools out there. It doesn't seem to be that difficult either to put something together that can:
While I don't think it makes sense to have spec text in the OpenRPC document, it could be valuable to have parts of the bikeshed document be generated based on the OpenRPC doc. |
I've been looking at this some more. Even JSON schema seems like it's focused on something that's not quite perfect for the descriptive part of our needs (although I certainly think we are going to want to be able to generate JSON schema since that's probably the best tooling here). In particular it has quite a low-level focus on matching the on-the-wire representation of types. It seems like CDP is using some custom pdl format that's more like a high-level description of the various commands and types, and using that to generate at least JSON, TypeScript definitions, and Go bindings. That being some bespoke format is obviously troubling, but it at least looks like it solves some of the problems we have. For concreteness, let's assume we have a message format like
And we have some example command like one to enable a set of events for a specific set of browsing contexts
Then we should ideally be able to express the following properties:
This should scale to hundreds of commands, events and types without significantly violating DRY (e.g. by having to keep a list of strings that are valid command names separate from the list of commands themselves). |
It seems to me that OpenRPC can fulfil these requirements. Given the example above, such an OpenRPC representation could look like this:
{
"name": "Network.enable",
"tags": [
{ "$ref": "#/components/tags/Command" },
{ "$ref": "#/components/tags/Network" }
],
"summary": "Enable notifications for an event.",
"paramStructure": "by-name",
"params": [
{
"name": "events",
"summary": "The name of the event to subscribe to. See Events for a full listing.",
"required": true,
"schema": {
"type": "array",
"items": { "$ref": "#/components/schemas/EventName" }
}
},
{
"name": "contexts",
"summary": "A list of context ids to connect the events to",
"required": false,
"schema": {
"type": "array",
"items": { "$ref": "#/components/schemas/ContextId" }
}
}
],
"result": { "$ref": "#/components/contentDescriptors/NullResult" }
}
Note that the method name would always be something like |
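To make the `required` flags in a descriptor like the one above concrete, here is a minimal TypeScript sketch of how a client might check by-name parameters before sending a command. The `MethodDescriptor` and `ParamDescriptor` types and the `missingRequiredParams` helper are hypothetical and only model the fields used in the example; they are not part of any OpenRPC tooling.

```typescript
// Hypothetical, trimmed-down shape of an OpenRPC method descriptor.
// Only the fields needed for this check are modelled.
interface ParamDescriptor {
  name: string;
  required: boolean;
}

interface MethodDescriptor {
  name: string;
  paramStructure: "by-name";
  params: ParamDescriptor[];
}

// The same method as in the JSON example, reduced to names and optionality.
const networkEnable: MethodDescriptor = {
  name: "Network.enable",
  paramStructure: "by-name",
  params: [
    { name: "events", required: true },
    { name: "contexts", required: false },
  ],
};

// Returns the names of required parameters that are absent from `params`.
function missingRequiredParams(
  method: MethodDescriptor,
  params: Record<string, unknown>,
): string[] {
  return method.params
    .filter((p) => p.required && !(p.name in params))
    .map((p) => p.name);
}

// "events" is required, so omitting it is reported; "contexts" may be left out.
const missing = missingRequiredParams(networkEnable, { contexts: ["ctx-1"] });
const complete = missingRequiredParams(networkEnable, { events: ["some.event"] });
```

A real client would also validate each value against the referenced schema; this sketch only covers presence.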
For the purpose of making it easy to define commands/responses/errors/notifications I'd like to make a concrete proposal for a syntax similar to Web IDL, which is already familiar to many spec authors. Illustrated with a bunch of random examples:
domain Page {
// like https://w3c.github.io/webdriver/#navigate-to
command navigate {
// the command parameters:
required string url;
optional string referrer;
}; // no response parameters
// events caused by but not a response to navigate:
event navigationStart {
timestamp startTime;
};
event navigationEnd {
// just making things up...
timestamp startTime;
timestamp endTime;
};
// an event for when modal dialogs are opened
enum ModalDialogType { "alert", "confirm", "prompt" };
event modalDialogOpen {
ModalDialogType type;
string message;
};
// like https://w3c.github.io/webdriver/#accept-alert
command acceptModalDialog {
string promptText; // for "prompt" only
};
// like https://w3c.github.io/webdriver/#dismiss-alert
command dismissModalDialog {};
};
// like https://w3c.github.io/webdriver/#print-page
enum PrintOrientation { "portrait", "landscape" };
command printToPDF {
optional PrintOrientation orientation = "portrait";
// lots more
} => {
// this is a response parameter:
bytes pdfData;
};
};
I write this down not because I think it's urgent that we do something like this, but because #26 brought it to mind. A few observations:
|
CDDL is another IDL variant we could use here. It has the advantage that there's an RFC to point at and some existing tooling. It's also way easier to read/write by hand than JSON Schema. It's definitely not perfect, but might be better than inventing something entirely new. |
In direct response to @foolip's suggestion, I am wary of inventing something entirely new. We don't want to be sidetracked into specifying a schema format rather than specifying an actual protocol :) That said, the more I think about it the more opposed I am to writing JSON schema directly; I think the format is too verbose and ugly, and doesn't really have the primitives we wanted in the sense that it's very focused on on-the-wire values and doesn't provide the formalism for describing things as types. One idea I had today is to define things as TypeScript interfaces. That has the advantage that there are several TypeScript-to-JSON-schema tools available, and it's also familiar to many web devs. The main problem is that afaik there isn't a standard to point at for the syntax, so we might have to handwave a bit. I would certainly expect us to have (generated) JSON Schema as an appendix, since that seems like it's going to be most useful for implementors. |
Taking a look at https://www.typescriptlang.org/docs/handbook/interfaces.html, that seems like a reasonable fit for describing things that are JSON objects on the wire, so the parameters primarily. I suspect what we'll run into very soon is that we need more types and perhaps subsets of existing types, but perhaps that's all supported in TypeScript. @jgraham, with this approach, how do you see the name of the command itself and the domain (if we have those) being represented? Namespaces and functions, perhaps? |
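One possible answer to the question of where command and domain names live, sketched in TypeScript: keep a single map from "Domain.command" strings to parameter and result types, and derive the set of valid command names from it. Everything here is an assumption made for illustration: the `Commands` map, the `id`/`method`/`params` message shape, and representing `pdfData` as a string.

```typescript
// Hypothetical parameter types, loosely following the Page examples above.
interface NavigateParams {
  url: string;
  referrer?: string;
}

interface PrintToPDFParams {
  orientation?: "portrait" | "landscape";
}

// One entry per command. The key carries both the domain and the command name.
interface Commands {
  "Page.navigate": { params: NavigateParams; result: Record<string, never> };
  "Page.printToPDF": { params: PrintToPDFParams; result: { pdfData: string } };
}

// The list of valid command names is derived, not maintained separately.
type CommandName = keyof Commands;

// Assumed message envelope; the actual wire format is not settled here.
interface CommandMessage<N extends CommandName> {
  id: number;
  method: N;
  params: Commands[N]["params"];
}

function makeCommand<N extends CommandName>(
  id: number,
  method: N,
  params: Commands[N]["params"],
): CommandMessage<N> {
  return { id, method, params };
}

// The compiler rejects unknown method names and mismatched params.
const msg = makeCommand(1, "Page.navigate", { url: "https://example.com/" });
const wire = JSON.stringify(msg);
```

Because `CommandName` is computed from the map, adding a command in one place updates every use of the name type, which speaks to the DRY concern raised earlier in the thread.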
I suggested in our meeting just now to first try to pin down the "model" of what our machine readable definitions are expressing. If we agree on that, the rest will "just" be syntax which does matter for spec authoring ergonomics, but many alternatives that aren't too verbose could work. With that said, I think the (nested) model is roughly:
An individual parameter is defined by its name, its type and its optionality. Is this missing anything? Other than the return and error types, is anything else possible to cut? |
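As a rough illustration of the part of the model that is spelled out here (a parameter is its name, its type and its optionality), the following TypeScript sketch encodes a parameter and a command as data. The `Command` shape, including the `returns` and `errors` fields, is an assumption and does not reproduce the full nested model; `renderParameter` is a hypothetical helper that prints a parameter in the Web IDL-like syntax proposed earlier.

```typescript
// A parameter: name, type and optionality.
interface Parameter {
  name: string;
  type: string; // name of a type defined elsewhere, e.g. "string"
  optional: boolean;
}

// Assumed command shape; return and error types are the parts flagged as
// possibly cuttable in the discussion.
interface Command {
  name: string;
  params: Parameter[];
  returns: Parameter[]; // empty when the command returns nothing
  errors: string[]; // hypothetical error type names
}

// Renders one parameter as a line such as "required string url;".
function renderParameter(p: Parameter): string {
  return `${p.optional ? "optional" : "required"} ${p.type} ${p.name};`;
}

const navigate: Command = {
  name: "navigate",
  params: [
    { name: "url", type: "string", optional: false },
    { name: "referrer", type: "string", optional: true },
  ],
  returns: [],
  errors: [],
};

const lines = navigate.params.map(renderParameter);
```

If the model is pinned down as data like this, the choice of surface syntax becomes a matter of writing a different renderer or parser.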
@foolip that model looks good. I would say return types are important for us to be able to write tests. |
I think types, and the ability to define things as types, are important. For example we might have something like
and then prose to define that e.g. |
For comparison what CDP uses looks pretty good, but it is custom and very tied to CDP concepts: https://github.com/ChromeDevTools/devtools-protocol/blob/master/pdl/browser_protocol.pdl |
From #21 (comment) it's clear I forgot one thing, which we discussed in today's meeting. Commands should probably list the targets they can be sent to, and events should list the targets they can be emitted from. |
A non-trivial amount of complexity, I expect, will be in defining types for parameter/return/error types. This is somewhat connected to #16, but I think at the very least we'll need:
@jgraham also mentioned union types. An example of where we might end up using that would be helpful. I can't tell if CDP has that, but I'm not sure what keyword to search for :) |
I'm thinking some more about how the machine readable definitions will be integrated into the spec prose. I'm assuming there will still be ordinary spec text describing each command's behavior, so it would make sense to keep the machine readable type definitions for a command near its spec text. At a minimum each command spec would need a machine readable definition for its parameter type (an object), and its return type (also an object). Events would just need a parameter type. Common types that are used in more than one place (e.g. browsing context or realm IDs) could be defined in a separate section. As a concrete example:
Navigate To
The command causes a
Parameters
interface NavigateParams {
url: string
}
Returns
Remote End Steps
... Remote end steps go here...
The remote end steps assume the existence of a |
@bwalderman something along those lines is precisely what I've been envisioning, where the machine readable bits can be split into many small code blocks, and one only needs to define remote end steps which can use the parameters with the correct types directly. |
I also agree that's how we want the spec to look in the end. |
TypeScript and Web IDL seem like the best options since they are both easy for humans to read/write and have readily available tooling. Also, both have expressive enough type systems to cover our scenario and look more or less the same for the subset of functionality we'll likely be using. I'm leaning towards Web IDL. The standard provides some useful algorithms such as default steps for converting an IDL value to JSON and checking if an object implements an interface. These will come in handy for specifying how commands/events are serialized/deserialized over the wire. |
I don't see how Web IDL as-such would work. There's a big assumption in WebIDL that you're making DOM APIs and a lot of the tooling around the platform assumes that too. We could do something WebIDLish, but it's not going to be exactly the same. Regarding sum types, if I was modelling this in a language supporting that I might start from
It's not the only way to do it of course, but being able to say things like "a target id is either a realm id or a context id, serialized in a way that allows discriminating the two" seems useful. |
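For illustration, "either a realm id or a context id, serialized in a way that allows discriminating the two" could be written as a tagged union in TypeScript. This is only a sketch: the `type` tag, the `id` field and the `parseTargetId` helper are assumptions, not a proposed wire format.

```typescript
// A target id is one of two alternatives, told apart by the `type` tag.
type TargetId =
  | { type: "realm"; id: string }
  | { type: "context"; id: string };

// The switch is exhaustive: adding a third alternative to TargetId makes
// this function fail to compile until the new case is handled.
function describeTarget(target: TargetId): string {
  switch (target.type) {
    case "realm":
      return `realm ${target.id}`;
    case "context":
      return `browsing context ${target.id}`;
  }
}

// Turns an untyped JSON value into a TargetId, or null if it does not match.
function parseTargetId(value: unknown): TargetId | null {
  if (typeof value !== "object" || value === null) return null;
  const { type, id } = value as { type?: unknown; id?: unknown };
  if (typeof id !== "string") return null;
  if (type === "realm" || type === "context") return { type, id };
  return null;
}

const realm = parseTargetId(JSON.parse('{"type":"realm","id":"r1"}'));
const unknownKind = parseTargetId(JSON.parse('{"type":"window","id":"w1"}'));
```

The tag makes the two alternatives distinguishable on the wire even though both ids are plain strings.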
Web IDL is theoretically language-agnostic. In practice, half the spec is dedicated to the ECMAScript binding and there are no other bindings mentioned, so yeah I agree there's a big assumption that this is for DOM APIs today. However, we might be able to add a "WebDriver" binding to that specification and fill in any gaps we need. While TypeScript interfaces are more than suitable for our needs, I'm not sure how we'd make use of it without some "handwaving" as you pointed out. From what I can tell, in the TypeScript specification, there's no straightforward algorithm we can point to that says "steps for checking if an object implements an interface". We also won't have as many (any??) options to change that spec if needed because at the end of the day, it's a programming language and not an interface definition language. We're not their target audience. Having said that, I'm not opposed to using TypeScript as our IDL if we can avoid relying too heavily on the TypeScript spec, and avoid having to (re-)?invent complex algorithms for validating an object against a TypeScript interface. In other words, if we can simply say things like "if params does not implement TypeScript interface X return an error and terminate these steps". Another thing to keep in mind is that TypeScript is in active development so being explicit about which version we'd be using is important. |
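The gap described here shows up in practice: TypeScript interfaces are erased at compile time, so "params implements interface X" has to be checked by hand-written (or generated) code. Below is a minimal sketch of such a check for a hypothetical `NavigateParams`. The `handleNavigate` function and its result shape are invented for illustration, and the "invalid argument" string is borrowed from the existing WebDriver error codes.

```typescript
interface NavigateParams {
  url: string;
  referrer?: string;
}

// Hand-written runtime check; nothing in TypeScript generates this from the
// interface declaration above.
function isNavigateParams(value: unknown): value is NavigateParams {
  if (typeof value !== "object" || value === null) return false;
  const { url, referrer } = value as { url?: unknown; referrer?: unknown };
  return (
    typeof url === "string" &&
    (referrer === undefined || typeof referrer === "string")
  );
}

type StepResult = { ok: true } | { ok: false; error: string };

// "If params does not implement NavigateParams, return an error and
// terminate these steps."
function handleNavigate(params: unknown): StepResult {
  if (!isNavigateParams(params)) {
    return { ok: false, error: "invalid argument" };
  }
  // ...the remaining steps would use params.url with its checked type...
  return { ok: true };
}

const good = handleNavigate({ url: "https://example.com/" });
const bad = handleNavigate({ url: 42 });
```

Whichever IDL is chosen, the spec would need to define the equivalent of `isNavigateParams` once, generically, rather than per type.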
@jgraham, since the example language above doesn't correspond exactly to the wire representation (i.e. JSON), do you expect it would be accompanied by spec text explaining how to (de)serialize it? |
For some context on #44 I chatted with @bwalderman about formats and we came to the conclusion that although there's nothing perfect available, CDDL is probably the best available option. WebIDL doesn't really match the use case of defining a wire protocol. JSON schema is pretty verbose to write, and could be problematic if we ever have a binary form in the future. Doing something custom like PDL or something that looks like TypeScript would probably give the best outcome, but in practice the amount of work required to specify the syntax is itself going to be large. CDDL gives us a fairly compact representation that's already seen usage in W3C specs defining protocols, and some degree of future compatibility if we ever add a CBOR transport. It's not perfect, but it doesn't seem worth blocking for longer on deciding something here. |
Thanks for picking something pragmatic and getting it done! I think we could close this, or keep it open to track a few final bits, namely markup conventions that would make it possible to get the domain and command name and group the parameter and return type definitions. Also, I'm curious if you found good JS or Python libraries for parsing CDDL while working on this? |
For libraries, the RFC mentions a ruby gem. The source code for that is here https://github.com/cabo/cddlc. There's also a rust library which seems to be more actively maintained and documented at https://github.com/anweiss/cddl. I didn't find any native JS or Python libraries. |
I've been using the rust library locally and it seems reasonable (there's a cli to validate that specific json matches the proposed schema, which was useful for debugging). I could imagine writing Python bindings for it if we want to use it from bikeshed or similar (related: I have some changes in the works for how we structure the schema so the spec ends up with something that could be extracted into a complete schema for endpoints to use, but things are being held up right now so no PR yet). |
hey all! I'm new to the WebDriver BiDi effort, but happy to help on the CDDL front. I'm the maintainer of https://github.com/anweiss/cddl, so let me know if there's anything I can do to improve the library and tooling for your use case. |
What if anything is still needed here? |
I think we can close this and open more specific issues for the remaining problems. |
This is the BiDi sibling of issue w3c/webdriver#1510, see that issue's description for the full background.
The solution for REST and BiDi likely won't be the same, and we might do one without the other.
For BiDi specifically, @bwalderman has already put together an openrpc.json proposal.