Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

Changes

Deprecated

Removed

Fixed

Security

[0.50.0] - 2025-02-06

Changes

BREAKING: move ItemsContainer and PaginatedItemsContainer to mex.common.models
BREAKING: replace post_extracted_items with ingest and allow AnyRuleSetResponses
allow AnyRuleSetResponses as arguments to sinks
BREAKING: sinks now yield the models they loaded, instead of just their identifiers

[0.49.3] - 2025-01-29

Changes

update mex-model to 3.5.1

Fixed

fix regex pattern for GndIdStr in organization models

[0.49.2] - 2025-01-29

Fixed

do not wrap field types in setValues in mapping rules in another list

[0.49.1] - 2025-01-29

Fixed

reduce Filter classes to a single list field of FilterField items

[0.49.0] - 2025-01-29

Added

new (partially generic) classes for defining Mapping and Filter fields and rules

Changes

BREAKING: replaced dynamic Mapping and Filter classes with static ones

Deprecated

use FILTER_MODEL_CLASSES_BY_NAME instead of FILTER_MODEL_BY_EXTRACTED_CLASS_NAME
use MAPPING_MODEL_CLASSES_BY_NAME instead of MAPPING_MODEL_BY_EXTRACTED_CLASS_NAME

[0.48.0] - 2025-01-28

Added

add a sink registry with register_sink and get_sink functions
add a multi-sink implementation, akin to mex.extractors.load

Changes

BREAKING: convert post_to_backend_api to BackendApiSink
BREAKING: convert write_ndjson to NdjsonSink
backend and ndjson sinks log progress only in batches
increase timeout and decrease chunk size for backend API sink
port backend identity provider implementation from editor/extractors to common
allow backend and graph as identity provider setting to simplify setting subclasses, even though graph is not implemented in mex-common
BREAKING: make backend api connector response models generic, to keep DRY

[0.47.1] - 2025-01-24

Fixed

skip None values when merging extracted and rule items

[0.47.0] - 2025-01-23

Added

merging logic to mex-common

[0.46.0] - 2025-01-09

Added

BREAKING: add nested models (Text, Link) to all lookups in mex.common.fields

Changes

BREAKING: move GenericFieldInfo from models.base.field_info to utils
BREAKING: move get_all_fields from BaseModel to utils to support all base models

[0.45.0] - 2024-12-18

Changes

BREAKING: change type of distribution.title to an array of texts

[0.44.0] - 2024-12-12

Changes

updated ldap search from name and familyname to one single attribute "displayname"

[0.43.0] - 2024-12-10

Added

add preview models for merged items without cardinality validation
BREAKING: preview models are now part of all mex.common.fields lookups
add BackendApiConnector.fetch_preview_items for fetching previews

Deprecated

stop using ExtractedData, use AnyExtractedModel instead
stop using MergedItem, use AnyMergedModel instead
stop using AdditiveRule, use AnyAdditiveRule instead
stop using SubtractiveRule, use AnySubtractiveRule instead
stop using PreventiveRule, use AnyPreventiveRule instead
stop using BaseEntity, use a concrete union instead

Removed

removed deprecated BulkInsertResponse as alias for IdentifiersResponse
removed unused module export of mex.common.models.generate_entity_filter_schema
removed unused module export of mex.common.models.generate_mapping_schema
drop export models.ExtractedPrimarySourceIdentifier, import from types instead
drop export models.MergedPrimarySourceIdentifier, import from types instead

[0.42.0] - 2024-12-02

Added

add vocabulary and temporal unions and lookups to mex.common.types
add mex.common.fields with field type by class name lookups

Changes

wikidata helper now optionally accepts wikidata primary source
set default empty rules to all of the rule-set models
pin pydantic to sub 2.10 (for now) because of breaking changes

Fixed

switch HTTP method for preview endpoint to POST
add optional values to variadic values for distribution models
make endpointDescription optional for variadic access platform models

[0.41.0] - 2024-11-18

Added

organigram extraction checks for duplicate emails/labels in different organigram units

Changes

upgrade mex-model dependency to version 3.2

[0.40.0] - 2024-10-28

Changes

upgrade mex-model dependency to version 3.1

Fixed

fix typo in repositoryURL of bibliographic resources
make identifier and stableTargetId of ExtractedBibliographicResource computed fields

[0.39.0] - 2024-10-28

Added

added new consent and bibliography reference models and vocabs
added doi field to resource models
helper function for primary source look up

Changes

upgrade mex-model dependency to version 3
make ruff linter config opt-out, instead of opt-in
make instances of extracted data hashable
BREAKING: Wikidata convenience function refactored and renamed to 'helper'
wikidata helper function split between mex-common and mex-extractors
code de-duplication: fixture extracted_primary_sources uses function-part of helper
split up YearMonth and Year temporal types and improved patterns
applied all changes to model fields according to model v3
update LOINC pattern

Fixed

fix temporal entity schemas

[0.38.0] - 2024-10-11

Added

add pattern constants for vocabs, emails, urls and ids to types module
add regex pattern to json schema of identifier fields
automatically add examples and useScheme to json schema of enum fields

Changes

BREAKING: use identifier instead of stableTargetId to get merged item from backend
ensure identifier unions are typed to generic Identifier instead of the first match to signal that we don't actually know which of the union types is correct
unify pydantic schema configuration for all types
consistently parse emails, identifiers and temporals in models to their type, not str
consistently serialize emails, ids and temporals in models to str, not their type
make instances of Link type hashable, to harmonize them with Text models

Removed

drop manual examples from enum fields, because they are autogenerated now
BREAKING: remove MEX_ID_PATTERN from types, in favor of IDENTIFIER_PATTERN
BREAKING: make public MEX_ID_ALPHABET constant from identifier module private
BREAKING: remove __str__ methods from Text and Link classes
BREAKING: drop support for parsing UUIDs as Identifiers, this was unused
BREAKING: drop support for parsing Links from markdown syntax, this was unused
BREAKING: remove pydantic1-style validate methods from all type models
BREAKING: BackendApiConnector.post_models in favor of post_extracted_items

[0.37.0] - 2024-10-01

Added

added methods for extracting persons by name or ID from ldap
contains_only_types to check if fields are annotated as desired
group_fields_by_class_name utility to simplify filtered model/field lookups
new parameters to get_inner_types to customize what to unpack

[0.36.1] - 2024-09-16

Fixed

pin pytz to 2024.1, as stopgap for MX-1703

[0.36.0] - 2024-09-09

Added

added BackendApiConnector methods to cover all current (and near future) endpoints: fetch_extracted_items, fetch_merged_items, get_merged_item, preview_merged_item and get_rule_set
complete the list of exported names in models and types modules

Deprecated

deprecated BackendApiConnector.post_models in favor of post_extracted_items

Removed

containerize section from release pipeline

Fixed

added the rki/mex user-agent to all requests of the HTTPConnector

[0.35.0] - 2024-08-20

Changes

update cruft and loosen up pyproject dependencies
harmonize signatures/docs of pydantic core/json schema manipulating methods

Fixed

fix schema tests not starting with diverging model names in common and mex-model
fix serialization for temporal entity instances within pydantic models

[0.34.0] - 2024-08-12

Added

wikidata fixtures to pytest plugin: wikidata_organization_raw, wikidata_organization, mocked_wikidata
convenience function get_merged_organization_id_by_query_with_extract_transform_and_load for getting the stableTargetId of an organization, while transforming and loading the organization using the provided load function
models for rule-set requests and responses along with typing and lookups
add BaseT models to the exported names of mex.common.models
add MEX_ID_PATTERN to the exported names of mex.common.types

Changes

move all base models and pydantic scaffolding into mex.common.models.base for a cleaner structure within the growing models module

[0.33.0] - 2024-07-31

Added

HTTP connector backoff for 10 retries on 403 from server
rki/mex user agent is sent with query requests via wikidata connector

Changes

changed backend api connector payload to "items"
update wikidata search organization request query, with optional language parameter wikidata query search can be enhanced by specifying the language. EN is the default language.

[0.32.0] - 2024-07-23

Changes

move log timestamp and coloring into the formatter

Deprecated

mex.common.logging.echo is deprecated in favor of logging.info

Fixed

add missing listyness-fix support for computed-fields

[0.31.0] - 2024-07-17

Removed

BREAKING: ability to store different settings instances at the same time. Dependent repositories now must bundle all settings in a single class.

[0.30.0] - 2024-07-16

Added

get count of found wikidata organizations

[0.29.1] - 2024-07-15

[0.29.0] - 2024-07-12

Added

add validator to base model that verifies computed fields can be set but not altered
new class hierarchy for identifiers: ExtractedIdentifier and MergedIdentifier

Changes

improve typing for methods using Self
make local type variables private
use json instead of pickle to calculate checksum of models
replace set_identifiers validator with computed fields on each extracted model

Removed

removed custom stringify method on base entities that included the identifier field

Fixed

fix typing for __eq__ arguments

[0.28.0] - 2024-07-08

Added

extract multiple organizations from wikidata

[0.27.1] - 2024-06-14

Changes

update mex-model to version 2.5

[0.27.0] - 2024-06-10

Added

add static class attribute stemType to models, containing an unprefixed entityType
add AnyRuleModel, RULE_MODEL_CLASSES, RULE_MODEL_CLASSES_BY_NAME to models
add type aliases AnyPrimitiveType and LiteralStringType to types
add new utility function ensure_postfix for adding postfixes to strings

Changes

clean-up and unify mapping and filter class generation

[0.26.1] - 2024-05-29

Fixed

fix memory identity provider seeding

[0.26.0] - 2024-05-29

Added

add classes for Additive, Preventive and Subtractive rules for all entity types
add types, lists and lookups for all three rule types to mex.common.models

Changes

move aux-extractor documentation from readme to __init__ to have it in sphinx
move BaseModel specific descriptions from class to model to avoid duplication
BREAKING: move FILTER_MODEL_BY_EXTRACTED_CLASS_NAME to mex.common.models
BREAKING: move MAPPING_MODEL_BY_EXTRACTED_CLASS_NAME to mex.common.models
BREAKING: change MEX_PRIMARY_SOURCE_IDENTIFIER to end in 1, so that it differs from MEX_PRIMARY_SOURCE_STABLE_TARGET_ID

[0.25.1] - 2024-05-21

Fixed

isolate settings context before first test

[0.25.0] - 2024-05-14

Added

add precision keyword to TemporalEntity constructor
add transform function for single wikidata organization to extracted organization

Changes

add tests for ldap.extract

Fixed

fix ldap.extract.get_merged_ids_by_email

[0.24.0] - 2024-04-12

Added

synchronize changes to fields in BaseSettings to all active settings subclasses
added github action for renovatebot

Changes

make memory identity provider deterministic (same input args results in same stableTargetId and Identifier)
rework ContextStore into SingletonStore with more intuitive API
phase out ambiguous "context" naming in favor of more descriptive "singleton store"
rename SettingsContext to SETTINGS_STORE and allow multiple active subclasses
rename ConnectorContext to CONNECTOR_STORE removing its context manager functions
replace reset_connector_context() with more consistent CONNECTOR_STORE.reset()

Removed

removed types IdentifierT, SettingsType, ConnectorType in favor of typing.Self
remove github dependabot configuration

[0.23.0] - 2024-04-08

Changes

return only one org from wikidata, if multiple or no org is found then return None
filter quotation marks (") from requested wikidata label

[0.22.0] - 2024-03-19

Added

port get_inner_types from mex-backend to mex.common.utils

Changes

rename Timestamp class to TemporalEntity
added subclasses with specific resolution YearMonth, YearMonthDay, YearMonthDayTime
modernize typing with syntactic sugar
simplify BaseModel._get_list_field_names using get_inner_types
switch from poetry to pdm
use vocabulary JSON files from mex-model

Removed

remove vocabulary JSON files

Fixed

date and time validation working and harmonized with mex-model

[0.21.0] - 2024-03-04

Added

add entityType type hint to MExModel (now BaseEntity)
add types for AnyBaseModel, AnyExtractedModel and AnyMergedModel
create more specific subclasses of Identifier (for extracted and merged)
expose unions, lists and lookups for Identifier subclasses in mex.common.types

Changes

swap contextvars.ContextVar for mex.common.context.ContextStore
move stableTargetId property from base models to extracted models
update typing of identifiers to specific subclasses
use Annotated[..., Field(...)] notation for pydantic field configs
split up mex.common.models.base and move out MExModel and JsonSchemaGenerator
rename MExModel to BaseEntity with only type hints an model config
declare hadPrimarySource, identifier and identifierInPrimarySource as frozen

Removed

absorb unused BaseExtractedData into ExtractedData
remove stableTargetId property from merged models
drop support for sinks to accept merged items (now only for extracted data)

[0.20.0] - 2024-02-22

Changes

update cruft and dev dependencies
randomize test order by default

Removed

remove mex.common.public_api module and the correlating sinks
remove PathWrapper.resolve and PathWrapper.raw methods

Fixed

remove pytest.mark from fixture in mex.common.testing.plugin

[0.19.4] - 2024-02-15

Changes

update cruft and minor dependencies

Removed

date-time format validation for mapping model generation

[0.19.3] - 2024-02-06

Changes

update cruft to apply new workflow trigger config
update poetry and pre-commit dependencies

Fixed

fix mex mapping model name

[0.19.2] - 2024-02-02

Added

pytest plugins for random order and parallelized test execution
move dynamic mapping model generation from mex-assets

Changes

mex.bat test uses random order and xdist plugins by default

[0.19.1] - 2024-01-19

Added

cruft template link
workflow that syncs main branch to openCoDE
constant for MEX_PRIMARY_SOURCE_IDENTIFIER

Changes

harmonized boilerplate

Fixed

ExtractedData raises proper ValidationError when parsing wrong base type

[0.19.0] - 2024-01-12

Added

add entityType field in all extracted and merged models

Fixed

wikidata test

[0.18.2] - 2024-01-11

Added

CHANGELOG.md documenting notable changes to this project
a template for pull requests
language french in language vocabulary

[0.18.1] - 2024-01-03

Added

tests for mex.common.types.PathWrapper
method is_relative to mex.common.types.PathWrapper to check whether the path is relative

Changes

resolve base paths of work/assets path fields in settings

Fixed

nesting of mex.common.types.PathWrapper on instantiation

[0.18.0] - 2023-12-20

Changes

move Sink and IdentityProvider to mex.common.types

Deprecated

deprecate MExModel.get_entity_type, use cls.__name__ instead
deprecate mex.common.models.MODEL_CLASSES[_BY_ENTITY_TYPE], use the more precise lists or dicts like EXTRACTED_MODEL_CLASSES_BY_NAME instead

[0.17.1] - 2023-12-20

Added

use dmypy for pre-commit type checking

Fixed

fix previously undetected typing issue

Changed

configure CI linting to install poetry
update versions

Files

CHANGELOG.md

Latest commit

History

CHANGELOG.md

File metadata and controls

Changelog

[Unreleased]

Added

Changes

Deprecated

Removed

Fixed

Security

[0.50.0] - 2025-02-06

Changes

[0.49.3] - 2025-01-29

Changes

Fixed

[0.49.2] - 2025-01-29

Fixed

[0.49.1] - 2025-01-29

Fixed

[0.49.0] - 2025-01-29

Added

Changes

Deprecated

[0.48.0] - 2025-01-28

Added

Changes

[0.47.1] - 2025-01-24

Fixed

[0.47.0] - 2025-01-23

Added

[0.46.0] - 2025-01-09

Added

Changes

[0.45.0] - 2024-12-18

Changes

[0.44.0] - 2024-12-12

Changes

[0.43.0] - 2024-12-10

Added

Deprecated

Removed

[0.42.0] - 2024-12-02

Added

Changes

Fixed

[0.41.0] - 2024-11-18

Added

Changes

[0.40.0] - 2024-10-28

Changes

Fixed

[0.39.0] - 2024-10-28

Added

Changes

Fixed

[0.38.0] - 2024-10-11

Added

Changes

Removed

[0.37.0] - 2024-10-01

Added

[0.36.1] - 2024-09-16

Fixed

[0.36.0] - 2024-09-09

Added

Deprecated

Removed

Fixed

[0.35.0] - 2024-08-20

Changes

Fixed

[0.34.0] - 2024-08-12

Added

Changes

[0.33.0] - 2024-07-31

Added