Machine readable definitions #21

foolip · 2020-05-29T08:47:49Z

This is the BiDi sibling of issue w3c/webdriver#1510, see that issue's description for the full background.

The solution for REST and BiDi likely won't be the same, and we might do one without the other.

For BiDi specifically, @bwalderman has already put together an openrpc.json proposal.


foolip · 2020-05-29T08:50:43Z

@bwalderman was openrpc.json assembled by hand? How about the API Reference, is that generated from openrpc.json?

jgraham · 2020-05-29T12:44:39Z

All of these API definition formats seem to use JSON Schema for the actual definitions. I'm not convinced that we really care about the value add of the additional layers on top of that; from a skim it looks like the additional features are about service discovery and licensing, which I don't think we particularly care about. In particular I see the following as use cases for machine-readable definitions in the spec:

- Reduce the spec-text boilerplate describing (de)serialization of messages
- Make more of the spec constraints machine verifiable (e.g. ability to cross check that all errors are one of the accepted codes)
- Give browser authors and client authors definitions they can use directly to implement the (de)serialization of messages
- Provide better documentation of the expected message format compared to having to reverse engineer the browser steps

I see the following as non-goals:

- Allowing generic RPC clients to connect to WebDriver endpoints and navigate them without specific understanding of the protocol semantics

So I don't think we want endpoints that produce schema documents to allow clients to introspect the API or anything; in practice all the WebDriver and CDP clients are providing significant value-add over the mechanical conversion of protocol endpoints into code, and in any case updates to the spec will be accompanied by updates to the published schema, so we don't also need to allow introspection.

Given that, I think we should just write JSON Schema directly and not try to adopt any of the higher layer stuff like [Async|Open]API, which afaict are mostly addressing needs we don't have.

bwalderman · 2020-05-29T19:36:51Z

@foolip yes, openrpc.json was hand-written and the API reference was generated from it using https://github.com/open-rpc/schema-utils-js and some HTML templates.

foolip · 2020-05-29T19:48:50Z

I see. I guess it’s not worth the effort now to put that build step into CI, but if we have a schema file later that’d make sense.

christian-bromann · 2020-06-02T10:22:47Z

It seems that the tooling for the OpenRPC spec is quite limited compared to the OpenAPI tools out there. It doesn't seem to be that difficult either to put something together that can:

- resolve the spec: create one large openrpc.json file based on many YAML files
- lint the spec: re-using validateOpenRPCDocument
- generate an HTML file

While I don't think it makes sense to have spec text in the OpenRPC document, it could be valuable to have parts of the Bikeshed document be generated based on the OpenRPC doc.

jgraham · 2020-06-02T11:06:18Z

I've been looking at this some more. Even JSON schema seems like it's focused on something that's not quite perfect for the descriptive part of our needs (although I certainly think we are going to want to be able to generate JSON schema since that's probably the best tooling here). In particular it has quite a low-level focus on matching the on-the-wire representation of types.

It seems like CDP is using some custom pdl format that's more like a high-level description of the various commands and types, and using that to generate at least JSON, TypeScript definitions, and Go bindings. That being some bespoke format is obviously troubling, but it at least looks like it solves some of the problems we have.

For concreteness, let's assume we have a message format like

{
  id: <Integer>
  method: <CommandName>
  params: <CommandParams>
}

And we have some example command like one to enable a set of events for a specific set of browsing contexts

{
  id: <Integer>,
  method: "enable",
  params: {
    events: Array<EventName>,
    contexts: Optional<Array<ContextId>>
  }
}

Then we should ideally be able to express the following properties:

- CommandName is a string that corresponds to a known command name
- CommandParams is an object which is represented by a type/schema according to the value of CommandName
- "enable" is a valid value of CommandName, corresponding to a CommandParams type with the events and contexts keys
- EventName is a string corresponding to a known name of an event
- ContextId is a typedef for an integer that represents a browsing context id

This should scale to hundreds of commands, events and types without significantly violating DRY (e.g. by having to keep a list of strings that are valid command names separate from the list of commands themselves).
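
As a rough illustration of those properties, here is a minimal TypeScript sketch in which the set of valid command names is derived from a single map of command name to parameter type. The names CommandParamsMap, CommandMessage and isCommandName, as well as the second "navigate" entry, are assumptions made up for this example and are not part of any proposal in this thread.

```typescript
// Hypothetical aliases for the identifiers used in the example message above.
type EventName = string;
type ContextId = number;

// Single source of truth: each key is a command name, each value its params type.
interface CommandParamsMap {
  enable: { events: EventName[]; contexts?: ContextId[] };
  // A second, made-up command to show how the map grows.
  navigate: { context: ContextId; url: string };
}

// CommandName is derived from the map rather than listed separately.
type CommandName = keyof CommandParamsMap;

// A command message whose params type depends on the value of `method`.
type CommandMessage = {
  [N in CommandName]: { id: number; method: N; params: CommandParamsMap[N] };
}[CommandName];

// Runtime list of names. The Record type makes the compiler reject a missing
// or extra key, so this object cannot silently drift from the map.
const commandNames: Record<CommandName, true> = { enable: true, navigate: true };

function isCommandName(value: string): value is CommandName {
  return Object.prototype.hasOwnProperty.call(commandNames, value);
}

// Type-checks: "enable" selects the params type with `events` and `contexts`.
const message: CommandMessage = {
  id: 1,
  method: "enable",
  params: { events: ["navigationStart"], contexts: [7] },
};
```

With this shape, giving an "enable" message the params of another command is a compile-time error, and adding a command means adding one entry to the map plus one key to the runtime record.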

christian-bromann · 2020-06-02T13:05:06Z

It seems to me that OpenRPC can fulfil these requirements. Given the example above, an OpenRPC representation could look like this:

{
    "name": "Network.enable",
    "tags": [
        { "$ref": "#/components/tags/Command" },
        { "$ref": "#/components/tags/Network" }
    ],
    "summary": "Enable notifications for an event.",
    "paramStructure": "by-name",
    "params": [
        {
            "name": "events",
            "summary": "The name of the event to subscribe to. See Events for a full listing.",
            "required": true,
            "schema": {
                "type": "array",
                "items": { "$ref": "#/components/schemas/EventName" }
            }
        },
        {
            "name": "contexts",
            "summary": "A list of context ids to connect the events to",
            "required": false,
            "schema": {
                "type": "array",
                "items": { "$ref": "#/components/schemas/ContextId" }
            }
        }
    ],
    "result": { "$ref": "#/components/contentDescriptors/NullResult" }
}

Note that the method name would always be something like "<domain>.<method>", tagged with Command and the domain it belongs to, whereas events could be tagged with Event and their domain.
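
To make the naming convention concrete, a small TypeScript sketch of splitting such a name into its two parts might look like the following. The function parseMethodName and the rule that exactly one dot is allowed are assumptions for illustration; the thread does not specify how names would be parsed.

```typescript
interface MethodName {
  domain: string;
  method: string;
}

// Splits "<domain>.<method>" at its single dot.
// Returns null when either part is empty or when more than one dot is present.
function parseMethodName(name: string): MethodName | null {
  const dot = name.indexOf(".");
  if (dot <= 0 || dot === name.length - 1) return null;
  if (name.indexOf(".", dot + 1) !== -1) return null;
  return { domain: name.slice(0, dot), method: name.slice(dot + 1) };
}
```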

foolip · 2020-06-04T20:20:20Z

For the purpose of making it easy to define commands/responses/errors/notifications I'd like to make a concrete proposal for a syntax similar to Web IDL, which is already familiar to many spec authors. Illustrated with a bunch of random examples:

domain Page {
  // like https://w3c.github.io/webdriver/#navigate-to
  command navigate {
    // the command parameters:
    required string url;
    optional string referrer;
  }; // no response parameters

  // events caused by but not a response to navigate:
  event navigationStart {
    timestamp startTime;
  };
  event navigationEnd {
    // just making things up...
    timestamp startTime;
    timestamp endTime;
  };

  // an event for when modal dialogs are opened
  enum ModalDialogType { "alert", "confirm", "prompt" };
  event modalDialogOpen {
    ModalDialogType type;
    string message;
  };

  // like https://w3c.github.io/webdriver/#accept-alert
  command acceptModalDialog {
    string promptText; // for "prompt" only
  };

  // like https://w3c.github.io/webdriver/#dismiss-alert
  command dismissModalDialog {};

  // like https://w3c.github.io/webdriver/#print-page
  enum PrintOrientation { "portrait", "landscape" };
  command printToPDF {
    optional PrintOrientation orientation = "portrait";
    // lots more
  } => {
    // this is a response parameter:
    bytes pdfData;
  };
};

I write this down not because I think it's urgent that we do something like this, but because #26 brought it to mind. A few observations:

- This doesn't give a way to specify the error data/parameters
- This doesn't provide all the information needed to produce an OpenRPC file

jgraham · 2020-06-23T16:00:07Z

CDDL is another IDL variant we could use here. It has the advantage that there's an RFC to point at and some existing tooling. It's also way easier to read/write by hand than JSON Schema. It's definitely not perfect, but might be better than inventing something entirely new.

jgraham · 2020-07-01T14:18:52Z

In direct response to @foolip's suggestion, I am wary of inventing something entirely new. We don't want to be sidetracked into specifying a schema format rather than specifying an actual protocol :) That said, the more I think about it the more opposed I am to writing JSON Schema directly; I think the format is too verbose and ugly, and doesn't really have the primitives we want, in the sense that it's very focused on on-the-wire values and doesn't provide the formalism for describing things as types.

One idea I had today is to define things as TypeScript interfaces. That has the advantage that there are several TypeScript-to-JSON-schema tools available, and it's also familiar to many web devs. The main problem is that afaik there isn't a standard to point at for the syntax, so we might have to handwave a bit. I would certainly expect us to have (generated) JSON Schema as an appendix, since that seems like it's going to be most useful for implementors.

foolip · 2020-07-01T15:33:21Z

Taking a look at https://www.typescriptlang.org/docs/handbook/interfaces.html, that seems like a reasonable fit for describing things that are JSON objects on the wire, so the parameters primarily. I suspect what we'll run into very soon is that we need more types and perhaps subsets of existing types, but perhaps that's all supported in TypeScript.

@jgraham, with this approach, how do you see the name of the command itself and the domain (if we have those) being represented? Namespaces and functions, perhaps?

foolip · 2020-07-01T17:10:10Z

I suggested in our meeting just now to first try to pin down the "model" of what our machine readable definitions are expressing. If we agree on that, the rest will "just" be syntax which does matter for spec authoring ergonomics, but many alternatives that aren't too verbose could work.

With that said, I think the (nested) model is roughly:

Domains, which have:
- A name
- Commands, which have:
  - A name
  - Parameters
  - Return type (not strictly needed for the formalism to be useful for spec authors)
  - Error types (maybe; it's debatable whether it's useful to have this)
- Events, which have:
  - A name
  - Parameters

An individual parameter is defined by its name, its type and its optionality.

Is this missing anything? Other than the return and error types, is anything else possible to cut?
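
For illustration only, the nested model above could be written down as TypeScript interfaces roughly as follows. All interface and field names are assumptions for this sketch, and types are represented as plain strings, which a real format would need to replace with a richer type language.

```typescript
// An individual parameter: its name, its type and its optionality.
interface Parameter {
  name: string;
  type: string; // type name as a plain string (simplification)
  optional: boolean;
}

interface CommandDefinition {
  name: string;
  parameters: Parameter[];
  returnType?: string; // not strictly needed for the formalism to be useful
  errorTypes?: string[]; // debatable whether this is useful
}

interface EventDefinition {
  name: string;
  parameters: Parameter[];
}

interface Domain {
  name: string;
  commands: CommandDefinition[];
  events: EventDefinition[];
}

// Example instance, loosely based on the Page domain sketched earlier in the thread.
const page: Domain = {
  name: "Page",
  commands: [
    {
      name: "navigate",
      parameters: [
        { name: "url", type: "string", optional: false },
        { name: "referrer", type: "string", optional: true },
      ],
    },
  ],
  events: [
    {
      name: "navigationStart",
      parameters: [{ name: "startTime", type: "timestamp", optional: false }],
    },
  ],
};

// Lists "<domain>.<name>" for every command and then every event in a domain.
function qualifiedNames(domain: Domain): string[] {
  const commands = domain.commands.map((c) => domain.name + "." + c.name);
  const events = domain.events.map((e) => domain.name + "." + e.name);
  return commands.concat(events);
}
```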

bwalderman · 2020-07-01T17:22:38Z

@foolip that model looks good. I would say return types are important for us to be able to write tests.

jgraham · 2020-07-01T18:30:32Z

I think types, and the ability to define things as types, are important. For example we might have something like

type EventSelector = String;

enum TargetId {
    BrowsingContextId,
    RealmId
}

Enable extends Command {
    name: "enable",
    targets: Array<TargetId>,
    events: Array<EventSelector>
}

and then prose to define that e.g. EventSelector is a pattern that matches event names or something.

jgraham · 2020-07-01T18:39:39Z

For comparison what CDP uses looks pretty good, but it is custom and very tied to CDP concepts: https://github.com/ChromeDevTools/devtools-protocol/blob/master/pdl/browser_protocol.pdl

foolip · 2020-07-01T18:49:31Z

From #21 (comment) it's clear I forgot one thing, which we discussed in today's meeting. Commands should probably list the targets they can be sent to, and events should list the targets they can be emitted from.

foolip · 2020-07-01T19:13:38Z

A non-trivial amount of complexity, I expect, will be in defining types for parameter/return/error types. This is somewhat connected to #16, but I think at the very least we'll need:

- strings
- numbers (likely both float and int)
- booleans
- enums (likely with string values)
- sequences (likely parameterized like Web IDL's sequence<T>)
- dictionaries or interfaces, something to define objects

@jgraham also mentioned union types. An example of where we might end up using that would be helpful. I can't tell if CDP has that, but I'm not sure what keyword to search for :)
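
As a rough illustration of that type vocabulary, including a union case, here is a TypeScript sketch of type descriptors and a checker for JSON-like values. TypeDesc, matches and the enableParams example are hypothetical names invented for this sketch; they do not correspond to any existing schema format or library.

```typescript
// One descriptor per kind of type listed above, plus unions.
type TypeDesc =
  | { kind: "string" }
  | { kind: "number"; integer: boolean }
  | { kind: "boolean" }
  | { kind: "enum"; values: string[] }
  | { kind: "sequence"; item: TypeDesc }
  | { kind: "dictionary"; fields: { name: string; type: TypeDesc; optional: boolean }[] }
  | { kind: "union"; options: TypeDesc[] };

// Returns true when a JSON-like value conforms to the descriptor.
// Unknown extra keys on dictionaries are tolerated in this sketch.
function matches(type: TypeDesc, value: unknown): boolean {
  switch (type.kind) {
    case "string":
      return typeof value === "string";
    case "number":
      return (
        typeof value === "number" &&
        (!type.integer || (isFinite(value) && Math.floor(value) === value))
      );
    case "boolean":
      return typeof value === "boolean";
    case "enum":
      return typeof value === "string" && type.values.indexOf(value) !== -1;
    case "sequence":
      return Array.isArray(value) && value.every((v) => matches(type.item, v));
    case "dictionary": {
      if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
      const record = value as Record<string, unknown>;
      // A present field must match its type; an absent field must be optional.
      return type.fields.every((f) =>
        f.name in record ? matches(f.type, record[f.name]) : f.optional
      );
    }
    case "union":
      return type.options.some((option) => matches(option, value));
  }
}

// Made-up example: parameters of an "enable" command, where each context
// may be given either as an integer id or as a string.
const enableParams: TypeDesc = {
  kind: "dictionary",
  fields: [
    { name: "events", type: { kind: "sequence", item: { kind: "string" } }, optional: false },
    {
      name: "contexts",
      type: {
        kind: "sequence",
        item: { kind: "union", options: [{ kind: "number", integer: true }, { kind: "string" }] },
      },
      optional: true,
    },
  ],
};
```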

bwalderman · 2020-07-01T19:37:38Z

I'm thinking some more about how the machine readable definitions will be integrated into the spec prose. I'm assuming there will still be ordinary spec text describing each command's behavior, so it would make sense to keep the machine readable type definitions for a command near its spec text. At a minimum each command spec would need a machine readable definition for its parameter type (an object), and its return type (also an object). Events would just need a parameter type. Common types that are used in more than one place (e.g. browsing context or realm IDs) could be defined in a separate section.

As a concrete example:

Navigate To

The command causes a browsing context to navigate to a new location.

Parameters

interface NavigateParams {
    url: string
}

Returns

null

Remote End Steps

... Remote end steps go here ...

The remote end steps assume the existence of a parameters variable that has already been deserialized and validated as a NavigateParams object by the command processing algorithm. The remote end steps return a value and the command processing algorithm is responsible for validating this object matches the command's stated return value type (in this case, null) and serializing that value to send back over the wire.
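
A minimal TypeScript sketch of that division of labour might look like the following. The reply shape, the function names and the error strings are assumptions made for illustration; the actual command processing algorithm would be defined in spec prose.

```typescript
interface NavigateParams {
  url: string;
}

// Validates that an already JSON-parsed value is a NavigateParams object.
function isNavigateParams(value: unknown): value is NavigateParams {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { url?: unknown }).url === "string"
  );
}

// Placeholder for the remote end steps: they receive typed parameters
// and never look at the wire format.
function navigateRemoteEndSteps(params: NavigateParams): null {
  void params.url; // real steps would navigate the browsing context here
  return null;
}

// Hypothetical reply shape; not defined anywhere in this thread.
type Reply = { id: number; result: null } | { id: number; error: string };

// Hypothetical command processing: deserialize, validate the parameters,
// run the remote end steps, validate the return value, serialize the reply.
// Malformed JSON makes JSON.parse throw; that case is not handled here.
function processNavigate(wire: string): string {
  const parsed: unknown = JSON.parse(wire);
  const message = (typeof parsed === "object" && parsed !== null ? parsed : {}) as {
    id?: unknown;
    params?: unknown;
  };
  // Assumption: fall back to -1 when the message carries no numeric id.
  const id = typeof message.id === "number" ? message.id : -1;
  let reply: Reply;
  if (!isNavigateParams(message.params)) {
    reply = { id, error: "invalid argument" };
  } else {
    const result: unknown = navigateRemoteEndSteps(message.params);
    // The stated return type of this command is null.
    reply = result === null ? { id, result: null } : { id, error: "unknown error" };
  }
  return JSON.stringify(reply);
}
```

The point of the split is that only processNavigate deals with strings and untyped values, while the remote end steps work with a NavigateParams object directly.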

foolip · 2020-07-01T20:10:13Z

@bwalderman something along those lines is precisely what I've been envisioning, where the machine readable bits can be split into many small code blocks, and one only needs to define remote end steps which can use the parameters with the correct types directly.

jgraham · 2020-07-01T20:11:49Z

I also agree that's how we want the spec to look in the end.

bwalderman · 2020-07-02T06:55:38Z

TypeScript and Web IDL seem like the best options since they are both easy for humans to read/write and have readily available tooling. Also, both have expressive enough type systems to cover our scenario and look more or less the same for the subset of functionality we'll likely be using.

I'm leaning towards Web IDL. The standard provides some useful algorithms such as default steps for converting an IDL value to JSON and checking if an object implements an interface. These will come in handy for specifying how commands/events are serialized/deserialized over the wire.

jgraham · 2020-07-02T18:56:13Z

I don't see how Web IDL as-such would work. There's a big assumption in WebIDL that you're making DOM APIs and a lot of the tooling around the platform assumes that too. We could do something WebIDLish, but it's not going to be exactly the same.

Regarding sum types, if I was modelling this in a language supporting that I might start from

enum Message {
    CommandMsg(Command),
    ResponseMsg(Response),
    EventMsg(Event),
    ErrorMsg(Error)
}

struct Command {
    id: uint,
    data: CommandData
}

enum CommandData {
    Enable(EnableCommand),
    Navigate(NavigateCommand),
    […]
}

typedef RealmId = String
typedef ContextId = String

enum TargetId {
    Context(ContextId),
    Realm(RealmId)
}

EnableCommand {
    targets: Array<TargetId>,
    commands: Array<String>
}

NavigateCommand {
    context: ContextId,
    url: String
}

It's not the only way to do it of course, but being able to say things like "a target id is either a realm id or a context id, serialized in a way that allows discriminating the two" seems useful.
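
For comparison, the TargetId sum type could be sketched in TypeScript as a discriminated union, with a "type" tag on the wire to tell the two variants apart. The tag name, its values and the function names are assumptions for this example; the thread does not settle on a serialization.

```typescript
type ContextId = string;
type RealmId = string;

// Sum type: the `type` tag discriminates the two variants, which would
// otherwise be indistinguishable because both ids are strings.
type TargetId =
  | { type: "context"; id: ContextId }
  | { type: "realm"; id: RealmId };

// Deserializes a JSON-parsed value into a TargetId, or null if it is invalid.
function parseTargetId(value: unknown): TargetId | null {
  if (typeof value !== "object" || value === null) return null;
  const candidate = value as { type?: unknown; id?: unknown };
  if (typeof candidate.id !== "string") return null;
  if (candidate.type === "context") return { type: "context", id: candidate.id };
  if (candidate.type === "realm") return { type: "realm", id: candidate.id };
  return null;
}

// Consumers switch on the tag; the compiler checks that every variant is handled.
function describe(target: TargetId): string {
  switch (target.type) {
    case "context":
      return "browsing context " + target.id;
    case "realm":
      return "realm " + target.id;
  }
}
```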

bwalderman · 2020-07-02T19:31:36Z

Web IDL is theoretically language-agnostic. In practice, half the spec is dedicated to the ECMAScript binding and there are no other bindings mentioned, so yeah I agree there's a big assumption that this is for DOM APIs today. However, we might be able to add a "WebDriver" binding to that specification and fill in any gaps we need.

While TypeScript interfaces are more than suitable for our needs, I'm not sure how we'd make use of it without some "handwaving" as you pointed out. From what I can tell, in the TypeScript specification, there's no straightforward algorithm we can point to that says "steps for checking if an object implements an interface". We also won't have as many (any??) options to change that spec if needed because at the end of the day, it's a programming language and not an interface definition language. We're not their target audience.

Having said that, I'm not opposed to using TypeScript as our IDL if we can avoid relying too heavily on the TypeScript spec, and avoid having to (re-)?invent complex algorithms for validating an object against a TypeScript interface. In other words, if we can simply say things like "if params does not implement TypeScript interface X return an error and terminate these steps". Another thing to keep in mind is that TypeScript is in active development, so being explicit about which version we'd be using is important.

bwalderman · 2020-07-07T06:44:25Z

@jgraham, since the example language above doesn't correspond exactly to the wire representation (i.e. JSON), do you expect it would be accompanied by spec text explaining how to (de)serialize it?

jgraham · 2020-07-22T09:21:33Z

For some context on #44 I chatted with @bwalderman about formats and we came to the conclusion that although there's nothing perfect available, CDDL is probably the best available option. WebIDL doesn't really match the use case of defining a wire protocol. JSON schema is pretty verbose to write, and could be problematic if we ever have a binary form in the future. Doing something custom like PDL or something that looks like TypeScript would probably give the best outcome, but in practice the amount of work required to specify the syntax is itself going to be large. CDDL gives us a fairly compact representation that's already seen usage in W3C specs defining protocols, and some degree of future compatibility if we ever add a CBOR transport. It's not perfect, but it doesn't seem worth blocking for longer on deciding something here.

foolip · 2020-08-13T15:38:11Z

Thanks for picking something pragmatic and getting it done!

I think we could close this, or keep it open to track a few final bits, namely markup conventions that would make it possible to get the domain and command name and to group the parameter and return type definitions.

Also, I'm curious if you found good JS or Python libraries for parsing CDDL while working on this?

bwalderman · 2020-08-13T17:26:39Z

For libraries, the RFC mentions a Ruby gem. The source code for that is here: https://github.com/cabo/cddlc. There's also a Rust library which seems to be more actively maintained and documented at https://github.com/anweiss/cddl. I didn't find any native JS or Python libraries.

jgraham · 2020-08-13T18:48:53Z

I've been using the Rust library locally and it seems reasonable (there's a CLI to validate that specific JSON matches the proposed schema, which was useful for debugging). I could imagine writing Python bindings for it if we want to use it from Bikeshed or similar (related: I have some changes in the works for how we structure the schema so the spec ends up with something that could be extracted into a complete schema for endpoints to use, but things are being held up right now so no PR yet).

anweiss · 2020-10-19T13:55:46Z

Hey all! I'm new to the WebDriver BiDi effort, but happy to help on the CDDL front. I'm the maintainer of https://github.com/anweiss/cddl, so let me know if there's anything I can do to improve the library and tooling for your use case.

gsnedders · 2021-04-13T18:09:33Z

What if anything is still needed here?

jgraham · 2021-04-13T19:04:03Z

I think we can close this and open more specific issues for the remaining problems.

This was referenced May 29, 2020

Machine readable endpoint definitions w3c/webdriver#1510

Open

Use Bikeshed pre-processor for WebDriver w3c/webdriver#1462

Open

foolip mentioned this issue Jun 4, 2020

Define the basic transport-agnostic terms of the protocol #26

Merged

foolip mentioned this issue Jun 17, 2020

Define WebIDL Global to host WebDriver targeted APIs w3c/webdriver#1534

Closed

jgraham mentioned this issue Jul 22, 2020

Define command dispatch and parameter validation #44

Merged

jgraham closed this as completed Apr 13, 2021
