Merge 2.0.0 ballot work #630

Merged
merged 29 commits into from
Feb 10, 2025
29 commits
bc727e5
update versions and rebuild
ahwagner Nov 26, 2024
a0aa56c
Merge branch '2.x' into 2.0.0-ballot.2024-11
ahwagner Nov 26, 2024
f50d9af
Merge branch '2.x' into 2.0.0-ballot.2024-11
ahwagner Nov 26, 2024
7506802
update gks-core
ahwagner Nov 26, 2024
9eb83a3
index 2.0 release notes
ahwagner Nov 26, 2024
58374b2
fix typo
ahwagner Nov 26, 2024
5fc0b6b
fix JSchema syntax error
ahwagner Nov 27, 2024
991047c
update pre-release build
ahwagner Nov 27, 2024
b81a6ea
Summarize new features
ahwagner Dec 11, 2024
73ba998
docs: resolve typos (#608) (#611)
larrybabb Dec 12, 2024
a459968
docs: resolve typos (#608) (#612)
larrybabb Dec 12, 2024
a39ed5f
update class diagram (#614)
larrybabb Dec 12, 2024
ab026e1
removed outdated unreferenced files and refs (#615)
larrybabb Dec 12, 2024
de029e5
update design decision record
larrybabb Dec 16, 2024
5dffe02
correct schema guidance for locations on circular sequences (#617)
ahwagner Dec 16, 2024
e3de2b3
SequenceLocation optional reference documentation (#618)
ahwagner Dec 16, 2024
c15310c
fix(test): update `copyChange` to use `MappableConcept` (#620)
korikuzma Dec 18, 2024
c9b3dd4
ballot .3 release
ahwagner Dec 19, 2024
7a3adb5
fix typos
jsstevenson Dec 20, 2024
f84f91c
cicd: add precommit checks to GitHub actions (#621)
jsstevenson Jan 2, 2025
8292526
fix: use defined targets for Entity references
korikuzma Jan 8, 2025
6b35fe1
fix: remove CopyNumber.rst + split into change/count
korikuzma Jan 8, 2025
adfbe3c
fix: systemic variation description
korikuzma Jan 8, 2025
ccd9b8b
fix: resolve typos
korikuzma Jan 8, 2025
67a9b49
Merge remote-tracking branch 'origin/fix' into merge-2.0.0-ballot-work
larrybabb Feb 10, 2025
55f72ad
re-run with new msp v3.1
larrybabb Feb 10, 2025
897ebce
documentation updates
larrybabb Feb 10, 2025
2215f5f
updated gks-core to 1.0.0-snapshot.2025-02.1
larrybabb Feb 10, 2025
9496cf1
run pre-commit hooks
larrybabb Feb 10, 2025
24 changes: 24 additions & 0 deletions .github/workflows/cqa.yaml
@@ -0,0 +1,24 @@
name: checks
on: [push, pull_request]
jobs:
  precommit_hooks:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        cmd:
          - "check-added-large-files"
          - "trailing-whitespace"
          - "end-of-file-fixer"
          - "mixed-line-ending"
          - "update-json-def-files"
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.12
        uses: actions/setup-python@v5
        with:
          python-version: 3.12

      - uses: pre-commit/[email protected]
        with:
          extra_args: ${{ matrix.cmd }} --all-files
2 changes: 2 additions & 0 deletions .pre-commit-config.yaml
@@ -6,6 +6,8 @@ repos:
- id: detect-private-key
- id: trailing-whitespace
- id: end-of-file-fixer
- id: mixed-line-ending
args: [ --fix=lf ]
- repo: local
hooks:
- id: update-json-def-files
2 changes: 1 addition & 1 deletion .readthedocs.yaml
@@ -13,4 +13,4 @@ sphinx:

python:
install:
- requirements: docs/source/requirements.txt
- requirements: docs/source/requirements.txt
2 changes: 1 addition & 1 deletion .requirements.txt
@@ -3,7 +3,7 @@ jsonschema
referencing
ipython
pyyaml
ga4gh.gks.metaschema==0.3.0
ga4gh.gks.metaschema==0.3.1
sphinx ~= 7.2
sphinx-rtd-theme ~= 1.2
jupyterlab
16 changes: 8 additions & 8 deletions CONTRIBUTING.md
@@ -1,17 +1,17 @@
# Contributing
Contributions to this repository are intended to follow the VRS
[development process](https://vrs.ga4gh.org/en/stable/appendices/development_process.html).
The additional information presented here are guidelines for issues,
branches, commits, and pull requests. Before adding documentation,
The additional information presented here are guidelines for issues,
branches, commits, and pull requests. Before adding documentation,
please also review the [docs style guide](docs/source/style.rst).

## Discussions
[Discussions](https://github.com/ga4gh/vrs/discussions) are for feature
[Discussions](https://github.com/ga4gh/vrs/discussions) are for feature
requests, release candidate discussions, and questions.

## Issues
[Issues](https://github.com/ga4gh/vrs/issues) are for bug
reports, and planned feature descriptions. When creating an issue, use
reports, and planned feature descriptions. When creating an issue, use
sentence case for the issue title and avoid the use of periods at the end
of titles.

@@ -25,12 +25,12 @@ branch for [issue 250](https://github.com/ga4gh/vrs/issues/250) could
be `250-contributing`.

## Pull Requests
[Pull Requests](https://github.com/ga4gh/vrs/pulls) (PRs) for new
features should target the `main` branch. For version
[Pull Requests](https://github.com/ga4gh/vrs/pulls) (PRs) for new
features should target the `main` branch. For version
patches, the PR should target the appropriate minor version branch.
PRs must be approved by at least one project maintainer before they may
be merged. PR titles must reflect the issue associated with the PR. For
example, the associated PR title for
example, the associated PR title for
[issue 250](https://github.com/ga4gh/vrs/issues/250) would be
`#250: Add CONTRIBUTING.md`, as seen in
`#250: Add CONTRIBUTING.md`, as seen in
[PR #253](https://github.com/ga4gh/vrs/pull/253).
2 changes: 1 addition & 1 deletion CONTRIBUTORS.md
@@ -42,7 +42,7 @@
|Brian Walsh | [[10](#10)] |
|Andrew D Yates | [[8](#8)] |

See also
See also
[VRS contributors](https://github.com/ga4gh/vrs/graphs/contributors) and
[VRS Python contributors](https://github.com/ga4gh/vrs-python/graphs/contributors).

2 changes: 1 addition & 1 deletion README.md
@@ -32,7 +32,7 @@ The VRS model is the product of the [GA4GH Variation Representation group](https

## Using the schema

The schema is available in the [schema/](./schema/) directory, in both yaml and json versions.
The schema is available in the [schema/](./schema/) directory, in both yaml and json versions.
It conforms to JSON Schema Draft 2020-12. For a list of
libraries that support JSON schema, see
[JSONSchema>Tools](https://json-schema.org/tools).
4 changes: 2 additions & 2 deletions TODO
@@ -1,5 +1,5 @@
# Docs
see doc-updates branch
* Standardize quoting: '**blah**' → ``blah``
* Investigate
https://pypi.org/project/sphinx-jsonschema/
* Investigate
https://pypi.org/project/sphinx-jsonschema/
90 changes: 90 additions & 0 deletions docs/source/appendices/design_decisions.rst
@@ -0,0 +1,90 @@
.. _design_decisions:

Design Decisions
!!!!!!!!!!!!!!!!

The following design decisions were made in the development of the VRS:

GA4GH Inherent Properties over Value Objects
--------------------------------------------

In VRS 1.0 we operated under the principle that all identifiable objects in VRS (e.g. Allele, SequenceLocation)
would be *value objects*. This meant that they should be immutable and contain only the required fields that are
necessary to uniquely identify the object. This approach simplified digest generation by
allowing the digest to be computed over the entire object. An exception was made for properties with a
leading underscore (namely, the *_id* property), which were removed from the object before a digest was calculated.

In VRS 2.0 we extended the principle of excepting designated attributes by explicitly defining *inherent properties*,
the properties used to compute an object digest. This makes VRS more expressive: implementations can
pass common, descriptive metadata as part of the identifiable objects without sacrificing
the ability to create globally unique, federated identifiers that VRS 1.3 provided.

As a result, we had to introduce a new field in the digest model called *ga4gh.inherent*, which is described in detail
in the section on :ref:`ga4gh-inherent-properties`.
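
The effect of restricting the digest to inherent properties can be sketched in Python. This is a simplified illustration, not the VRS serialization algorithm: the property names in ``INHERENT``, the ``name`` and ``description`` metadata fields, and the placeholder location value are assumptions, and the digest is assumed to be the truncated SHA-512 (``sha512t24u``) described elsewhere in the VRS documentation.

```python
import base64
import hashlib
import json

# Illustrative inherent property names (an assumption, not copied from the schema).
INHERENT = ("location", "state", "type")


def sha512t24u(blob: bytes) -> str:
    """Return the first 24 bytes of the SHA-512 digest, base64url-encoded."""
    digest = hashlib.sha512(blob).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def inherent_digest(obj: dict, inherent=INHERENT) -> str:
    """Digest only the inherent properties; all other fields are ignored."""
    kept = {key: obj[key] for key in inherent if key in obj}
    # Sorted keys and compact separators give a stable byte serialization.
    blob = json.dumps(kept, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return sha512t24u(blob)


# Hypothetical objects: the second carries extra descriptive metadata.
bare = {
    "type": "Allele",
    "location": "SL.placeholder",
    "state": {"type": "LiteralSequenceExpression", "sequence": "T"},
}
annotated = dict(bare, name="example variant", description="free-text note")

# Descriptive metadata does not change the digest.
assert inherent_digest(bare) == inherent_digest(annotated)
```

Because only the listed properties reach the hash, descriptive fields can be added or edited without changing the computed identifier.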

IRIs over CURIEs
----------------

In VRS 2.0 we moved away from the use of CURIEs in favor of :ref:`iriReference`. Several factors played a role in
this decision.

JSON Schema, the default data model for GKS specifications, does not allow for encoding of CURIE namespaces as is done
in other frameworks such as JSON-LD or XML. As a result, namespaces must be captured from custom data structures, API
endpoints, or documentation that may not persist as messages are exchanged between systems. To address this, references
in GKS specs now use IRIs to reference objects explicitly.

IRI-References over IRIs
------------------------
We opted for the general use of IRI-References to provide a more flexible approach to the use of IRIs
in most GKS message structures. IRI-References, which may be relative IRIs, allow for compact representation
of concepts that are accessible within a system (e.g. a directory structure or web API).
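
As a minimal sketch of how a relative reference stays compact within a system yet can still be expanded, the Python standard library can resolve a reference against a base. The base URL and the relative path are hypothetical, and ``urljoin`` implements URI reference resolution, so this assumes ASCII-only references.

```python
from urllib.parse import urljoin

# Hypothetical base for a system that serves VRS objects over a web API.
BASE = "https://example.org/vrs/objects/"


def resolve(reference: str, base: str = BASE) -> str:
    """Resolve a relative or absolute reference against the system's base."""
    return urljoin(base, reference)


# A relative reference is expanded against the base.
print(resolve("alleles/VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT"))
# An absolute reference is returned unchanged.
print(resolve("https://w3id.org/ga4gh/vrs/VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT"))
```

The same message field can therefore carry either form, and a consumer that knows the base can always recover an absolute identifier.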

VRS identifier syntax and versioning
------------------------------------

The :ref:`versioning` section describes the versioning and release naming conventions for the VRS product.
Approved releases are identified by the version number alone, while connect, ballot, and snapshot releases
include the context term and date in addition to the target version number.
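
The naming convention can be illustrated with a small parser. The pattern below is an assumption inferred from release names that appear in this pull request (``2.0.0-ballot.2024-11`` and ``1.0.0-snapshot.2025-02.1``); it is not taken from the versioning section, and treating the trailing number as a build counter is a guess.

```python
import re

# Assumed shape: <major.minor.patch>[-<context>.<YYYY-MM>[.<n>]]
RELEASE = re.compile(
    r"(?P<version>\d+\.\d+\.\d+)"
    r"(?:-(?P<context>connect|ballot|snapshot)"
    r"\.(?P<date>\d{4}-\d{2})"
    r"(?:\.(?P<build>\d+))?)?"
)


def parse_release(name: str) -> dict:
    """Split a release name into target version, context term, and date."""
    match = RELEASE.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized release name: {name!r}")
    return match.groupdict()


print(parse_release("2.0.0"))
print(parse_release("2.0.0-ballot.2024-11"))
print(parse_release("1.0.0-snapshot.2025-02.1"))
```

An approved release parses with ``context`` and ``date`` set to ``None``, which matches the rule that it is identified by the version number alone.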

During the GA4GH Connect April 2023 meeting, the maturity model was discussed at length, and the following
proposal was presented for instance and class GKS identifiers.

.. image:: ../images/2023-connect-gks-identifier-proposal.png
   :alt: GKS Identifiers Proposal from 2023 April Connect Session
   :align: center

As an example, the GitHub JSON Schema URL ($id) for the VRS 2.0.0 Allele is:

.. code-block:: json

   {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "$id": "https://w3id.org/ga4gh/schema/vrs/2.0/json/Allele",
      ...
   }

During the **release and versioning** discussion at the GA4GH Connect April 2023 meeting, participants
considered including the major version number in the VRS identifier itself. Proponents of
this approach cited concern for the change in digests (and their derived identifiers) between major
versions of the same VRS object, which would become clearly visible in the identifier itself if the
major version were included.

Opponents of this approach argued that new identifiers would be required for every type of VRS object
for every major version release: even if a given type of object had no change that would
result in a new digest, a new identifier would still be required for the new major version.

After much discussion, the decision was made to NOT include the major version number in the VRS identifier
itself. Therefore, the :ref:`identifier-construction` does NOT contain the version number, resulting in
the following syntax:

**CURIE namespace resolution**

.. code-block::

   ga4gh:VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT

**URI Syntax**

.. code-block::

   https://w3id.org/ga4gh/vrs/VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT
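
The two syntaxes above differ only in their prefix, so converting between them is a string substitution. The following sketch assumes the ``ga4gh:`` CURIE prefix always maps to ``https://w3id.org/ga4gh/vrs/``, as the two examples suggest; the function names are hypothetical.

```python
CURIE_PREFIX = "ga4gh:"
URI_BASE = "https://w3id.org/ga4gh/vrs/"


def curie_to_uri(curie: str) -> str:
    """Expand a ga4gh CURIE into the corresponding w3id.org URI."""
    if not curie.startswith(CURIE_PREFIX):
        raise ValueError(f"not a ga4gh CURIE: {curie!r}")
    return URI_BASE + curie[len(CURIE_PREFIX):]


def uri_to_curie(uri: str) -> str:
    """Compact a w3id.org VRS URI back into a ga4gh CURIE."""
    if not uri.startswith(URI_BASE):
        raise ValueError(f"not a GA4GH VRS URI: {uri!r}")
    return CURIE_PREFIX + uri[len(URI_BASE):]


print(curie_to_uri("ga4gh:VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT"))
# https://w3id.org/ga4gh/vrs/VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT
```

Neither form carries a version number, so the same conversion applies across major versions of VRS.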
20 changes: 15 additions & 5 deletions docs/source/appendices/ga4gh_identifiers.rst
@@ -66,11 +66,21 @@ reference:


.. _ga4gh-digest-keys:
.. _ga4gh-inherent-properties:

GA4GH Digest Keys
#################
When creating computed identifiers from objects, VRS uses a custom schema attribute,
*ga4gh.inherent*, that contains the property names used for computing digests. For example,
GA4GH Inherent Properties
#########################

.. admonition:: New in v2

In VRS v1, data classes were limited to only inherent properties that contained the minimum
information for describing a variant or other identifiable object. In practice, this resulted
in frequent nesting of VRS objects inside descriptive containers, a complicated pattern for
implementations. VRS 2.0 addresses this limitation with the designation of inherent properties
for use with the computed identifier algorithm.

When creating computed identifiers from objects, VRS uses a custom schema attribute,
*ga4gh.inherent*, that contains the property names used for computing digests. For example,
the Allele JSON Schema:

.. parsed-literal::
@@ -95,7 +105,7 @@ the Allele JSON Schema:

.. note::

The `ga4gh` JSON Schema namespace is aligned with the Sequence Collections effort
The `ga4gh` JSON Schema namespace is aligned with the Sequence Collections effort
(see `SeqCol#84 <https://github.com/ga4gh/refget/issues/84>`_).

GA4GH Type Prefixes
2 changes: 2 additions & 0 deletions docs/source/appendices/index.rst
@@ -7,5 +7,7 @@ Appendices
class_diagram
maturity_model
ga4gh_identifiers
resource_identifiers
truncated_digest_collision_analysis
design_decisions
glossary