Babylon: JSON schemas for the new content directory #1398

Closed
Lezek123 opened this issue Sep 17, 2020 · 1 comment
Comments

Lezek123 commented Sep 17, 2020

The new content directory will require new JSON schemas, as mentioned in #1249
Current (old) content directory schemas reside in: https://github.com/Joystream/versioned-store-js
Some tooling related to new schemas (with examples) is already available here: https://github.com/iorveth/joystream/tree/cont_dir_json_schemas/content-directory-schemas
The new schemas must be constructed with Atlas API expectations in mind: https://github.com/Joystream/joystream/issues/824#issuecomment-653150085

It may make sense to include those, at some point, in the joystream-js library which I plan to introduce in #1396

UPDATE 22.09.2020:

Since this is a complex issue and, as mentioned by @bedeho in the comment below, we should focus on what we're trying to achieve for this release, I decided to describe the process of initializing the content directory step by step, in order to clarify the role of json schemas, their relation to the CLI, etc.

The plan for initializing the new content directory

  1. Initializing classes and schemas in the new content directory:

    1. Create json schemas describing CreateClass and AddSchemaToClass operations - this has already been done by @iorveth in https://github.com/iorveth/joystream/tree/cont_dir_json_schemas/content-directory-schemas/schemas
    2. Create json input files describing the classes and schemas to add for classes like Video and Channel - related issue: Babylon: Create JSON input files for creating classes/schemas in the new content directory #1402 (those input files can be validated using the json schemas created in the previous step)
    3. Use the files created in 1.ii. as input for the CLI in order to create the initial content directory classes and schemas on chain, e.g.:
      • contentDirectory:createClass --input=inputs/classes
      • contentDirectory:addSchemaToClass --input=inputs/schemas

    I think it would be a good idea to allow input for the commands mentioned in 1.iii. to be provided either as a json file (or files) or interactively, which would make it easier for the lead to create new content directory schemas/classes later (the json schemas from 1.i. can also be used to validate input provided interactively). I was planning to make use of the drafts system I created earlier (to allow storing working group opening drafts in the CLI) and expand it to cover this use-case. The basic idea is that input provided interactively can be saved as a json file called a "draft", which can be shared or re-used when executing the same command later.

  2. Populating content directory with data:

    1. CLI could use schemas input (json) from 1.ii to prompt for and validate input for higher-level commands like createChannel, createVideo, updateVideo etc.
      The input to those commands can be provided either in an interactive form or as json file (implementing drafts system as described above)
    2. Instead of 2.i. (or complementary to it), the CLI could also provide lower-level commands like createEntity / updateEntity, where the user would have to provide data like class/entity id (the class could perhaps be selected from a list), supported schemas etc. Property values (also provided by the user) could then be validated based on this input and on data fetched from the chain (e.g. classById). This type of validation would be more universal and independent of the files created in 1.ii. Those commands could then cover all use-cases from 2.i. plus a few other cases (like marking content as curated etc.), although in a slightly less user-friendly way. They would also be easier to maintain in case classes and schemas in the content directory change very often.
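
The drafts idea from 1. and 2.i. could be sketched roughly as follows. This is only an illustration: the `Draft` shape, the `DraftStore` class and the `resolveInput` helper are hypothetical names, not part of the existing CLI, and an in-memory map stands in for the json files that would actually be stored on disk.

```typescript
// Hypothetical shape of a saved draft: the command it belongs to plus the raw input.
interface Draft<T> {
  command: string
  name: string
  input: T
}

// In-memory stand-in for the draft json files the CLI would keep on disk.
class DraftStore {
  private drafts = new Map<string, string>()

  // Serialize to JSON, so the stored value is what a shareable file would contain.
  save<T>(draft: Draft<T>): void {
    this.drafts.set(`${draft.command}/${draft.name}`, JSON.stringify(draft, null, 2))
  }

  load<T>(command: string, name: string): Draft<T> | undefined {
    const raw = this.drafts.get(`${command}/${name}`)
    return raw === undefined ? undefined : (JSON.parse(raw) as Draft<T>)
  }
}

// Resolve command input: explicit json input first, then a saved draft,
// and only then fall back to prompting (saving the answers as a new draft).
function resolveInput<T>(
  store: DraftStore,
  command: string,
  draftName: string,
  jsonInput: T | undefined,
  prompt: () => T
): T {
  if (jsonInput !== undefined) return jsonInput
  const draft = store.load<T>(command, draftName)
  if (draft !== undefined) return draft.input
  const input = prompt()
  store.save({ command, name: draftName, input })
  return input
}
```

With this shape, running the same command twice with the same draft name would prompt only once; the second run re-uses the stored draft.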

I think the important part about 2. is that if we create a mechanism in the CLI for prompting for and validating property values based on a given contentDirectory schema (which I assume would be needed either way), it doesn't really matter whether we get the schema definition from the chain or from the json file that was used to generate this schema on chain. The problem with using json files for this kind of validation is that they need to be kept up to date in some repository, while on-chain data can be updated and fetched at any point without any additional changes in a repository.
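
To illustrate why the source of the schema definition doesn't matter to the validator, here is a minimal sketch. All names (`SchemaDef`, `SchemaSource`, `validateEntity`) and the three property types are assumptions made for the example; the real content directory has a richer type system, and `Uint64` is approximated here with a non-negative safe integer.

```typescript
// Hypothetical, simplified property types.
type PropertyType = 'Bool' | 'Uint64' | 'Text'

interface PropertyDef {
  name: string
  type: PropertyType
  required: boolean
}

interface SchemaDef {
  className: string
  properties: PropertyDef[]
}

// A schema source may be backed by a json file or by a chain query;
// the validator only sees the resulting definition.
type SchemaSource = (className: string) => SchemaDef | undefined

function checkValue(type: PropertyType, value: unknown): boolean {
  if (type === 'Bool') return typeof value === 'boolean'
  if (type === 'Text') return typeof value === 'string'
  // Uint64 approximated with a non-negative safe integer in this sketch.
  return typeof value === 'number' && Number.isSafeInteger(value) && value >= 0
}

// Returns a list of human-readable problems; an empty list means the input is valid.
function validateEntity(
  source: SchemaSource,
  className: string,
  values: Record<string, unknown>
): string[] {
  const schema = source(className)
  if (schema === undefined) return [`Unknown class: ${className}`]
  const errors: string[] = []
  const known = new Set(schema.properties.map((p) => p.name))
  for (const prop of schema.properties) {
    const value = values[prop.name]
    if (value === undefined) {
      if (prop.required) errors.push(`Missing required property: ${prop.name}`)
    } else if (!checkValue(prop.type, value)) {
      errors.push(`Invalid value for ${prop.name}: expected ${prop.type}`)
    }
  }
  for (const key of Object.keys(values)) {
    if (!known.has(key)) errors.push(`Unknown property: ${key}`)
  }
  return errors
}

// Example source backed by a static definition, as if loaded from a json file.
const videoSchema: SchemaDef = {
  className: 'Video',
  properties: [
    { name: 'title', type: 'Text', required: true },
    { name: 'duration', type: 'Uint64', required: false },
  ],
}
const fromJsonFile: SchemaSource = (name) =>
  name === videoSchema.className ? videoSchema : undefined
```

A chain-backed source would implement the same `SchemaSource` signature (for instance by wrapping a classById query), so `validateEntity` would not need to change.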

bedeho commented Sep 19, 2020

Background

I reviewed this and all associated issues because I have been a bit uneasy about what exactly we want our JSON schema tool to do for us, and how much work it appears to generate, in particular now that the content directory.

Analysis

There are many moving parts here, so I found it most useful to ask a simple question:

What goal(s) are we trying to achieve where this tooling, or some variation of it, is the appropriate answer for Babylon?

Here is a list of all the goals that are top of mind for how to use the content directory in Babylon.

  1. Publish new channels, videos, languages, etc. from the CLI (curators+members+lead).
  2. Manage permissions & vouchers from CLI (curators).
  3. Curate content using the CLI (curators).
  4. Read content directory in query node, and serve a schema level API to Atlas.
  5. Automatically populate the, initially empty, content directory with new classes, schemas and content right after launch using the CLI (lead).

Now let's go through each in turn.

1. Publish new channels, videos, languages, etc. from the CLI (curators+members+lead)

Subset of what is required in 5?

2. Manage permissions & vouchers from CLI (curators)

Here it appears to be sufficient to just have one or more simple commands in the CLI, which will require a handful of arguments.

No JSON tooling appears to be needed?

3. Curate content using the CLI (curators)

This boils down to updating a property value in one or more entities, most likely properties that are only mutable by maintainers. Here we can provide two kinds of user experience: high-level or low-level. A high-level one would simply allow you to say something like curate -video <id>. A lower-level one would have to expose raw commands for:

  • update_entity_property_values - Update entity property values with provided ones
  • clear_entity_property_vector - Clear property value vector under given entity_id & in_class_schema_property_id
  • remove_at_entity_property_vector - Remove value at given index_in_property_vector from property values vector under in_class_schema_property_id
  • insert_at_entity_property_vector - Insert single input property values at given index in property vector into property values vector under in_class_schema_property_id
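
As a rough sketch of how the high-level experience could sit on top of the low-level one, a curate command might just build the arguments for update_entity_property_values. The argument shape, the `buildCurateVideoCall` helper and the property index of the "curated" flag are all assumptions made for illustration; none of them are taken from the runtime.

```typescript
// Hypothetical argument shape for the update_entity_property_values extrinsic.
interface UpdateEntityPropertyValues {
  method: 'update_entity_property_values'
  entityId: number
  // Keys are in-class schema property ids, values are the new property values.
  newPropertyValues: Record<number, boolean | number | string>
}

// Hypothetical in-class schema property id of a "curated" flag on the Video class.
const IS_CURATED_PROPERTY_ID = 0

// What a high-level "curate -video <id>" could translate to.
function buildCurateVideoCall(videoId: number, curated = true): UpdateEntityPropertyValues {
  if (!Number.isInteger(videoId) || videoId < 0) {
    throw new Error(`Invalid video id: ${videoId}`)
  }
  return {
    method: 'update_entity_property_values',
    entityId: videoId,
    newPropertyValues: { [IS_CURATED_PROPERTY_ID]: curated },
  }
}
```

The high-level command hides the property id from the curator, while the raw command would require them to know it.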

No JSON tooling appears to be needed?

4. Read content directory in query node, and serve a schema level API to Atlas

No JSON tooling appears to be needed?

5. Automatically populate the, initially empty, content directory with new classes, schemas and content right after launch using the CLI (lead)

This appears to boil down to doing a large number of the following:

  • creating classes
  • updating permissions
  • adding schemas
  • creating entities
  • adding schema support to entities
  • possibly wrapping a sequence of the above into transactions.

Having some persistent format for describing these activities is not only useful for human coordination, so people can agree upon exactly what is to happen, but is also required for the automation itself, since the automation needs some data source. If the CLI supports executing an action based on input in this representation, then automating a large number of actions can be achieved by a script that runs the CLI a number of times in sequence on such input.
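
A minimal sketch of that automation, assuming a hypothetical `Operation` format covering only a few of the activities listed above. The field names are illustrative, and the executor is synchronous for brevity; a real one would invoke the CLI or submit an extrinsic and would most likely be asynchronous.

```typescript
// Hypothetical persistent description of the supported operations.
type Operation =
  | { kind: 'createClass'; name: string }
  | { kind: 'addSchemaToClass'; className: string; properties: string[] }
  | { kind: 'createEntity'; className: string }

// Executes a single operation, e.g. by running one CLI command.
type Executor = (op: Operation) => void

interface BatchResult {
  completed: number
  error?: string
}

// Run operations strictly in order and stop at the first failure,
// since later operations typically depend on earlier ones.
function runBatch(ops: Operation[], execute: Executor): BatchResult {
  let completed = 0
  for (const op of ops) {
    try {
      execute(op)
    } catch (e) {
      const reason = e instanceof Error ? e.message : String(e)
      return { completed, error: `${op.kind} failed: ${reason}` }
    }
    completed++
  }
  return { completed }
}

// Example input, as it might be parsed from a json file.
const exampleBatch: Operation[] = [
  { kind: 'createClass', name: 'Video' },
  { kind: 'addSchemaToClass', className: 'Video', properties: ['title'] },
  { kind: 'createEntity', className: 'Video' },
]
```

Reporting how many operations completed lets a script resume from the failing step rather than repeating the whole batch.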

From this it appears that we only need to support a very small subset of extrinsics in the content directory in this standard.

Conclusion

It seems like we can, and thus probably should, be quite conservative in how many of the content directory extrinsics we try to cover with this JSON standard.
