feature/mx-1381 prepare database model rework #69

Merged
merged 45 commits into from
Mar 4, 2024
Changes from 40 commits
4f2d5b0
Add entityType stub to MExModel
cutoffthetop Jan 23, 2024
8c48c65
Clean up base classes and add AnyXYZModel types
cutoffthetop Jan 23, 2024
7d738c8
Merge branch 'main' of https://github.com/robert-koch-institut/mex-co…
cutoffthetop Jan 23, 2024
0de26b4
Fix tests
cutoffthetop Jan 23, 2024
c203888
Fix fixture using mark
cutoffthetop Feb 14, 2024
7b51ca0
Add transform utils
cutoffthetop Feb 14, 2024
9eb30ba
Replace ContextVar with ContextStore
cutoffthetop Feb 14, 2024
a24cd37
Merge branch 'main' of https://github.com/robert-koch-institut/mex-co…
cutoffthetop Feb 14, 2024
2619f1a
Update poetry
cutoffthetop Feb 14, 2024
c2767d2
Move stableTargetId typehint to extractedData
cutoffthetop Feb 14, 2024
68df38e
Mark frozen fields
cutoffthetop Feb 14, 2024
cf828ec
Add nested model lookup
cutoffthetop Feb 14, 2024
1f2ba8f
Remove mappings that cause circularity
cutoffthetop Feb 14, 2024
62d5e91
Update versions
cutoffthetop Feb 15, 2024
1b02d1b
Add new linting
cutoffthetop Feb 16, 2024
5567101
Merge branch 'main' of https://github.com/robert-koch-institut/mex-co…
cutoffthetop Feb 16, 2024
e880075
Update docstring
cutoffthetop Feb 16, 2024
cdb70c8
Update changelog
cutoffthetop Feb 16, 2024
75d2ee7
Poetry update
cutoffthetop Feb 19, 2024
4b5ffb1
Move stableTargetId to extracted data classes
cutoffthetop Feb 20, 2024
c308312
Merge branch 'main' of https://github.com/robert-koch-institut/mex-co…
cutoffthetop Feb 20, 2024
d3bd12a
Specify identifiers and remove public-api module
cutoffthetop Feb 20, 2024
5dfcdd5
Update cruft
cutoffthetop Feb 20, 2024
c785f4c
Random order testing
cutoffthetop Feb 20, 2024
4395d9b
Bump urllib3 from 2.2.0 to 2.2.1
dependabot[bot] Feb 21, 2024
452c5c6
Clean up and rename MExModel
cutoffthetop Feb 21, 2024
8160df2
Remove public-api, update cruft, maintain code
cutoffthetop Feb 21, 2024
65ba71a
Merge branch 'feature/mx-1562-code-maintenance' into feature/mx-1381-…
cutoffthetop Feb 21, 2024
9d4ae54
Fix changelog style
cutoffthetop Feb 21, 2024
4bcddfa
Add pytest.mark fix to changelog
cutoffthetop Feb 21, 2024
e6fdfdb
Fix makefile docstr
cutoffthetop Feb 21, 2024
38e44d4
Merge branch 'feature/mx-1562-code-maintenance' into feature/mx-1381-…
cutoffthetop Feb 21, 2024
368c0ce
Prefer annotated field syntax
cutoffthetop Feb 21, 2024
13a4b74
Merge branch 'main' of https://github.com/robert-koch-institut/mex-co…
cutoffthetop Feb 23, 2024
1bcbd64
Lock
cutoffthetop Feb 23, 2024
c35c283
Simplify and add tests
cutoffthetop Feb 23, 2024
9aee2ee
Fix code quote
cutoffthetop Feb 23, 2024
4512f90
Merge branch 'main' of https://github.com/robert-koch-institut/mex-co…
cutoffthetop Feb 27, 2024
9af2142
Update cruft ref
cutoffthetop Feb 27, 2024
7a38f7e
Merge branch 'main' into feature/mx-1381-prep-rule-endpoint
cutoffthetop Mar 4, 2024
245670f
Merge branch 'main' of https://github.com/robert-koch-institut/mex-co…
cutoffthetop Mar 4, 2024
8a2668f
Lock file
cutoffthetop Mar 4, 2024
3b83cca
Update cruft
cutoffthetop Mar 4, 2024
68cbb6d
Merge branch 'feature/mx-1381-prep-rule-endpoint' of https://github.c…
cutoffthetop Mar 4, 2024
aa6192b
Prepare release 0.21
cutoffthetop Mar 4, 2024
2 changes: 1 addition & 1 deletion .cruft.json
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"checkout": null,
"commit": "6067fc53d1335a9bda900c5eff8dbf1c42bfe4ca",
"commit": "8a99c5bda13e8a909b26d195886225f5d278e693",
"context": {
"cookiecutter": {
"project_name": "common",
17 changes: 17 additions & 0 deletions CHANGELOG.md
@@ -9,12 +9,29 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- add `entityType` type hint to `MExModel` (now `BaseEntity`)
- add types for `AnyBaseModel`, `AnyExtractedModel` and `AnyMergedModel`
- create more specific subclasses of `Identifier` (for extracted and merged)
- expose unions, lists and lookups for `Identifier` subclasses in `mex.common.types`

### Changes

- swap `contextvars.ContextVar` for `mex.common.context.ContextStore`
- move `stableTargetId` property from base models to extracted models
- update typing of identifiers to specific subclasses
- use `Annotated[..., Field(...)]` notation for pydantic field configs
- split up `mex.common.models.base` and move out `MExModel` and `JsonSchemaGenerator`
- rename `MExModel` to `BaseEntity` with only type hints and model config
- declare `hadPrimarySource`, `identifier` and `identifierInPrimarySource` as frozen

### Deprecated

### Removed

- absorb unused `BaseExtractedData` into `ExtractedData`
- remove `stableTargetId` property from merged models
- drop support for sinks to accept merged items (now only for extracted data)

### Fixed

### Security
6 changes: 3 additions & 3 deletions mex/common/backend_api/connector.py
@@ -3,7 +3,7 @@

from mex.common.backend_api.models import BulkInsertResponse
from mex.common.connector import HTTPConnector
from mex.common.models import MExModel
from mex.common.models import ExtractedData
from mex.common.settings import BaseSettings
from mex.common.types import Identifier

@@ -27,7 +27,7 @@ def _set_url(self) -> None:
settings = BaseSettings.get()
self.url = urljoin(str(settings.backend_api_url), self.API_VERSION)

def post_models(self, models: list[MExModel]) -> list[Identifier]:
def post_models(self, models: list[ExtractedData]) -> list[Identifier]:
"""Post models to Backend API in a bulk insertion request.

Args:
@@ -37,7 +37,7 @@ def post_models(self, models: list[MExModel]) -> list[Identifier]:
HTTPError: If insert was not accepted, crashes or times out

Returns:
Identifiers of posted models
Identifiers of posted extracted models
"""
response = self.request(
method="POST",
6 changes: 3 additions & 3 deletions mex/common/cli.py
@@ -116,7 +116,7 @@ def _callback(
# ensure connectors are closed on exit.
context.call_on_close(reset_connector_context)

# load settings from parameters and store in ContextVar.
# load settings from parameters and store it globally.
settings = settings_cls.model_validate(
{
key: value
@@ -126,7 +126,7 @@ def _callback(
)
SettingsContext.set(settings)

# otherwise print loaded settings in pretty way and continue
# otherwise print loaded settings in pretty way and continue.
logger.info(click.style(dedent(f" {func.__doc__}"), fg="green"))
logger.info(click.style(f"{settings.text()}\n", fg="bright_cyan"))

@@ -142,7 +142,7 @@ def _callback(
# if we are in debug mode, jump into interactive debugging.
pdb.post_mortem(sys.exc_info()[2])
raise error
# if not in debug mode, exit with code 1
# if not in debug mode, exit with code 1.
echo("exit", fg="red")
context.exit(1)

8 changes: 3 additions & 5 deletions mex/common/connector/base.py
@@ -1,14 +1,12 @@
from abc import ABCMeta, abstractmethod
from contextlib import ExitStack
from contextvars import ContextVar
from types import TracebackType
from typing import Optional, TypeVar, cast, final

from mex.common.context import ContextStore

ConnectorType = TypeVar("ConnectorType", bound="BaseConnector")
ConnectorContextType = dict[type["BaseConnector"], "BaseConnector"]
ConnectorContext = ContextVar(
"ConnectorContext", default=cast(ConnectorContextType, {})
)
ConnectorContext = ContextStore[dict[type["BaseConnector"], "BaseConnector"]]({})


def reset_connector_context() -> None:
19 changes: 19 additions & 0 deletions mex/common/context.py
@@ -0,0 +1,19 @@
from typing import Generic, TypeVar

ContextResource = TypeVar("ContextResource")


class ContextStore(Generic[ContextResource]):
"""Thin wrapper for storing thread-local globals."""

def __init__(self, default: ContextResource) -> None:
"""Create a new context store with a default value."""
self._resource = default

def get(self) -> ContextResource:
"""Retrieve the current value stored in this context."""
return self._resource

def set(self, resource: ContextResource) -> None:
"""Update the current value stored in this context."""
self._resource = resource
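
To illustrate how the new `ContextStore` might be used in place of `ContextVar`, here is a minimal sketch. The class body is copied from the diff above so the snippet runs on its own; `DemoContext` and the stored values are hypothetical names invented for this example. As written in the diff, the value appears to live on a plain instance attribute, so the sketch assumes it is shared across threads rather than isolated per thread or per context.

```python
import threading
from typing import Generic, TypeVar

ContextResource = TypeVar("ContextResource")


class ContextStore(Generic[ContextResource]):
    """Copy of the class from the diff, repeated so the sketch is standalone."""

    def __init__(self, default: ContextResource) -> None:
        self._resource = default

    def get(self) -> ContextResource:
        return self._resource

    def set(self, resource: ContextResource) -> None:
        self._resource = resource


# Hypothetical store, parametrized the same way as ConnectorContext in the diff.
DemoContext = ContextStore[dict[str, str]]({})

# The stored dict can be mutated in place through get().
DemoContext.get()["ldap"] = "connected"
snapshot = dict(DemoContext.get())

# set() swaps the stored value, e.g. to reset the context.
DemoContext.set({})
after_reset = DemoContext.get()


def worker() -> None:
    # A value set in another thread replaces the single shared attribute.
    DemoContext.set({"set_by": "worker"})


thread = threading.Thread(target=worker)
thread.start()
thread.join()
seen_in_main = DemoContext.get()
```

Under these assumptions, a value set in the worker thread is visible from the main thread, which is the main behavioral difference from `contextvars.ContextVar`.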
8 changes: 4 additions & 4 deletions mex/common/identity/base.py
@@ -2,7 +2,7 @@

from mex.common.connector import BaseConnector
from mex.common.identity.models import Identity
from mex.common.types import Identifier, PrimarySourceID
from mex.common.types import AnyMergedIdentifier, MergedPrimarySourceIdentifier


class BaseProvider(BaseConnector):
@@ -11,7 +11,7 @@ class BaseProvider(BaseConnector):
@abstractmethod
def assign(
self,
had_primary_source: PrimarySourceID,
had_primary_source: MergedPrimarySourceIdentifier,
identifier_in_primary_source: str,
) -> Identity: # pragma: no cover
"""Find an Identity in a database or assign a new one."""
@@ -21,9 +21,9 @@ def assign(
def fetch(
self,
*,
had_primary_source: Identifier | None = None,
had_primary_source: MergedPrimarySourceIdentifier | None = None,
identifier_in_primary_source: str | None = None,
stable_target_id: Identifier | None = None,
stable_target_id: AnyMergedIdentifier | None = None,
) -> list[Identity]: # pragma: no cover
"""Find Identity instances matching the given filters."""
...
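
The narrower identifier types in these signatures can be pictured with a small sketch. It assumes the identifiers behave like string subclasses; the classes below are simplified stand-ins written for this example, not the definitions in `mex.common.types`, and `describe_filter` is a hypothetical helper.

```python
from typing import Optional


class Identifier(str):
    """Stand-in for a generic identifier (assumed to be a str subclass)."""


class MergedPrimarySourceIdentifier(Identifier):
    """Stand-in for the identifier of a merged primary source."""


class MergedPersonIdentifier(Identifier):
    """Stand-in for the identifier of a merged person."""


def describe_filter(
    *, had_primary_source: Optional[MergedPrimarySourceIdentifier] = None
) -> str:
    """Hypothetical helper with a narrowed parameter type, like fetch() above."""
    if had_primary_source is None:
        return "no filter"
    return f"hadPrimarySource == {had_primary_source}"


source_id = MergedPrimarySourceIdentifier("abc123")
described = describe_filter(had_primary_source=source_id)

# Re-wrapping a value in a more specific subclass keeps the string content,
# similar to MergedPersonIdentifier(identities[0].stableTargetId) in the diff.
person_id = MergedPersonIdentifier(Identifier("xyz789"))
```

With annotations like these, a static type checker can flag a `MergedPersonIdentifier` passed where a `MergedPrimarySourceIdentifier` is expected, while the runtime values stay ordinary strings.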
14 changes: 10 additions & 4 deletions mex/common/identity/memory.py
@@ -4,7 +4,11 @@
MEX_PRIMARY_SOURCE_IDENTIFIER_IN_PRIMARY_SOURCE,
MEX_PRIMARY_SOURCE_STABLE_TARGET_ID,
)
from mex.common.types import Identifier, PrimarySourceID
from mex.common.types import (
AnyMergedIdentifier,
Identifier,
MergedPrimarySourceIdentifier,
)


class MemoryIdentityProvider(BaseProvider):
@@ -22,7 +26,9 @@ def __init__(self) -> None:
]

def assign(
self, had_primary_source: PrimarySourceID, identifier_in_primary_source: str
self,
had_primary_source: MergedPrimarySourceIdentifier,
identifier_in_primary_source: str,
) -> Identity:
"""Find an Identity in the in-memory database or assign a new one.

@@ -52,9 +58,9 @@ def assign(
def fetch(
self,
*,
had_primary_source: Identifier | None = None,
had_primary_source: MergedPrimarySourceIdentifier | None = None,
identifier_in_primary_source: str | None = None,
stable_target_id: Identifier | None = None,
stable_target_id: AnyMergedIdentifier | None = None,
) -> list[Identity]:
"""Find Identity instances in the in-memory database.

14 changes: 9 additions & 5 deletions mex/common/identity/models.py
@@ -1,11 +1,15 @@
from typing import Annotated

from pydantic import Field

from mex.common.models import BaseModel
from mex.common.types import Identifier, PrimarySourceID
from mex.common.types import Identifier, MergedPrimarySourceIdentifier


class Identity(BaseModel):
"""Model for identifier lookup."""

identifier: Identifier
hadPrimarySource: PrimarySourceID
identifierInPrimarySource: str
stableTargetId: Identifier
identifier: Annotated[Identifier, Field(frozen=True)]
hadPrimarySource: Annotated[MergedPrimarySourceIdentifier, Field(frozen=True)]
identifierInPrimarySource: Annotated[str, Field(frozen=True)]
stableTargetId: Annotated[Identifier, Field(frozen=True)]
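
As a rough illustration of what marking fields as frozen means in practice, the sketch below uses pydantic v2 with the same `Annotated[..., Field(frozen=True)]` notation. `DemoIdentity` and its `note` field are hypothetical and much simpler than the `Identity` model in the diff.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


class DemoIdentity(BaseModel):
    """Hypothetical model using the annotated, frozen field style."""

    identifier: Annotated[str, Field(frozen=True)]
    note: str = ""


identity = DemoIdentity(identifier="abc123")

# Fields without frozen=True can still be reassigned.
identity.note = "updated"

# Reassigning a frozen field is rejected with a validation error.
try:
    identity.identifier = "changed"
    frozen_enforced = False
except ValidationError:
    frozen_enforced = True
```

The frozen flag only blocks reassignment of the attribute; it does not make a mutable value stored in the field immutable.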
2 changes: 1 addition & 1 deletion mex/common/ldap/README.md
@@ -23,7 +23,7 @@ The module `ldap.transform` contains functions for transforming LDAP data into M
models.

The `mex_person.stableTargetId` attribute can be used in any entity that requires a
`PersonID`.
`MergedPersonIdentifier`.

# Convenience Functions

40 changes: 21 additions & 19 deletions mex/common/ldap/extract.py
@@ -4,17 +4,17 @@
from mex.common.identity import get_provider
from mex.common.ldap.models.person import LDAPPerson, LDAPPersonWithQuery
from mex.common.models import ExtractedPrimarySource
from mex.common.types import PersonID
from mex.common.types import MergedPersonIdentifier


def _get_merged_ids_by_attribute(
attribute: str,
persons: Iterable[LDAPPerson],
primary_source: ExtractedPrimarySource,
) -> dict[str, list[PersonID]]:
"""Return a mapping from a dynamic Person attribute to corresponding PersonIDs.
) -> dict[str, list[MergedPersonIdentifier]]:
"""Return mapping from dynamic Person attribute to corresponding merged person ids.

PersonIDs are looked up in the identity provider and will be omitted
MergedPersonIdentifiers are looked up in the identity provider and will be omitted
for any person that has not yet been assigned an `Identity` there.

Args:
@@ -23,7 +23,8 @@ def _get_merged_ids_by_attribute(
primary_source: Primary source for LDAP

Returns:
Mapping from a stringified `LDAPPerson[attribute]` to corresponding PersonIDs
Mapping from a stringified `LDAPPerson[attribute]` to corresponding
MergedPersonIdentifiers
"""
if attribute not in LDAPPerson.model_fields:
raise RuntimeError(f"Not a valid LDAPPerson field: {attribute}")
@@ -35,62 +36,63 @@ def _get_merged_ids_by_attribute(
identifier_in_primary_source=str(person.objectGUID),
):
merged_ids_by_attribute[str(getattr(person, attribute))].append(
PersonID(identities[0].stableTargetId)
MergedPersonIdentifier(identities[0].stableTargetId)
)
return merged_ids_by_attribute


def get_merged_ids_by_employee_ids(
persons: Iterable[LDAPPerson], primary_source: ExtractedPrimarySource
) -> dict[str, list[PersonID]]:
"""Return a mapping from a person's employeeID to their PersonIDs.
) -> dict[str, list[MergedPersonIdentifier]]:
"""Return a mapping from a person's employeeID to their merged person ids.

PersonIDs are looked up in the identity provider and will be omitted
MergedPersonIdentifiers are looked up in the identity provider and will be omitted
for any person that has not yet been assigned an `Identity` there.

Args:
persons: Iterable of LDAP persons
primary_source: Primary source for LDAP

Returns:
Mapping from `LDAPPerson.employeeID` to corresponding PersonIDs
Mapping from `LDAPPerson.employeeID` to corresponding MergedPersonIdentifiers
"""
return _get_merged_ids_by_attribute("employeeID", persons, primary_source)


def get_merged_ids_by_email(
persons: Iterable[LDAPPerson], primary_source: ExtractedPrimarySource
) -> dict[str, list[PersonID]]:
"""Return a mapping from a person's e-mail to their PersonIDs.
) -> dict[str, list[MergedPersonIdentifier]]:
"""Return a mapping from a person's e-mail to their merged person ids.

PersonIDs are looked up in the identity provider and will be omitted
MergedPersonIdentifiers are looked up in the identity provider and will be omitted
for any person that has not yet been assigned an `Identity` there.

Args:
persons: Iterable of LDAP persons
primary_source: Primary source for LDAP

Returns:
Mapping from `LDAPPerson.mail` to corresponding PersonIDs
Mapping from `LDAPPerson.mail` to corresponding MergedPersonIdentifiers
"""
return _get_merged_ids_by_attribute("mail", persons, primary_source)


def get_merged_ids_by_query_string(
persons_with_query: Iterable[LDAPPersonWithQuery],
primary_source: ExtractedPrimarySource,
) -> dict[str, list[PersonID]]:
"""Return a mapping from a person query string to their PersonIDs.
) -> dict[str, list[MergedPersonIdentifier]]:
"""Return a mapping from a person query string to their merged person ids.

PersonIDs are looked up in the identity provider and will be omitted
MergedPersonIdentifiers are looked up in the identity provider and will be omitted
for any person that has not yet been assigned an `Identity` there.

Args:
persons_with_query: Iterable of LDAP persons with query
primary_source: Primary source for LDAP

Returns:
Mapping from `LDAPPersonWithQuery.query` to corresponding PersonIDs
Mapping from `LDAPPersonWithQuery.query` to corresponding
MergedPersonIdentifiers
"""
merged_ids_by_attribute = defaultdict(list)
provider = get_provider()
@@ -100,6 +102,6 @@ def get_merged_ids_by_query_string(
identifier_in_primary_source=str(person_with_query.person.objectGUID),
):
merged_ids_by_attribute[str(person_with_query.query)].append(
PersonID(identities[0].stableTargetId)
MergedPersonIdentifier(identities[0].stableTargetId)
)
return merged_ids_by_attribute
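
The lookup functions above share one pattern: group merged identifiers by some attribute and silently skip persons without an assigned identity. A stripped-down sketch of that pattern follows. It replaces the identity provider with a plain dictionary, and `DemoPerson`, `KNOWN_IDENTITIES` and `merged_ids_by_mail` are hypothetical names, so it only approximates the real functions.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class DemoPerson(NamedTuple):
    """Hypothetical, reduced stand-in for LDAPPerson."""

    object_guid: str
    mail: str


# Assumed lookup table standing in for the identity provider:
# objectGUID -> stableTargetId of an already assigned identity.
KNOWN_IDENTITIES = {"guid-1": "merged-a", "guid-2": "merged-b"}


def merged_ids_by_mail(persons: Iterable[DemoPerson]) -> dict[str, list[str]]:
    """Group merged ids by mail, omitting persons without an identity."""
    merged_ids: dict[str, list[str]] = defaultdict(list)
    for person in persons:
        merged_id = KNOWN_IDENTITIES.get(person.object_guid)
        if merged_id is not None:
            merged_ids[person.mail].append(merged_id)
    return merged_ids


persons = [
    DemoPerson("guid-1", "a@example.com"),
    DemoPerson("guid-2", "a@example.com"),
    DemoPerson("guid-3", "c@example.com"),  # no identity assigned yet
]
mapping = merged_ids_by_mail(persons)
```

Here both persons sharing `a@example.com` end up in one list, and the person with the unknown GUID produces no key at all.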
4 changes: 3 additions & 1 deletion mex/common/ldap/models/person.py
@@ -1,3 +1,5 @@
from typing import Annotated

from pydantic import Field

from mex.common.ldap.models.actor import LDAPActor
@@ -12,7 +14,7 @@ class LDAPPerson(LDAPActor):
departmentNumber: str | None = None
displayName: str | None = None
employeeID: str
givenName: list[str] = Field(min_length=1)
givenName: Annotated[list[str], Field(min_length=1)]
ou: list[str] = []
sn: str
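
The switch to `Annotated[list[str], Field(min_length=1)]` keeps the constraint next to the type. The sketch below, assuming pydantic v2, shows the resulting behavior on a hypothetical `DemoPerson` model that only borrows two field names from `LDAPPerson`.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


class DemoPerson(BaseModel):
    """Hypothetical model with one constrained, required list field."""

    givenName: Annotated[list[str], Field(min_length=1)]
    ou: list[str] = []


person = DemoPerson(givenName=["Ada"])

# An empty list violates min_length=1.
try:
    DemoPerson(givenName=[])
    empty_rejected = False
except ValidationError:
    empty_rejected = True

# No default is declared, so the field is still required.
try:
    DemoPerson()
    missing_rejected = False
except ValidationError:
    missing_rejected = True
```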
