PXP-8825 & BRH 206 #65

Merged
merged 87 commits into from
Oct 26, 2022
4481190
adding support for aggregation/guppy
Oct 8, 2021
810cbfd
add support for default value
Oct 14, 2021
1b90868
schema support and summary/search
Oct 18, 2021
d2c7d43
schema support and summary/search
Oct 18, 2021
d54df2a
add configuration + schema
Oct 20, 2021
7ba3720
add schema suppoort and schema introspection endpoint
Oct 22, 2021
32b0d00
refine schema support
Oct 27, 2021
c437276
update ES configuration
Nov 22, 2021
e3bbf40
update brh_config
Feb 28, 2022
b73f097
add check for drs_cache in config->settings
Feb 28, 2022
721a545
add check for drs_cache in config->settings
Feb 28, 2022
0c4aeaa
merge changes from master
Apr 20, 2022
17ca5d9
add improved ES mapping support
May 12, 2022
719db0f
fix aggMDS failing unit test: all pass
Jul 6, 2022
8bb67f1
add aggMDS unit test to increase coverage
Jul 6, 2022
ba0f4dd
add__manifest filter test, increase converage
Jul 6, 2022
6b86a3f
merge from master
Jul 6, 2022
2547b31
fix harvard adapter unit test
Jul 7, 2022
2272cec
fix failing populate test
Jul 7, 2022
a69583a
fix (again) populate unit test
Jul 7, 2022
3518887
increase test coverage
Jul 7, 2022
c221a42
add array_to_string converter
Jul 7, 2022
8ada987
add array_to_string converter unit test
Jul 7, 2022
ba2c533
add filter option to gen3 adapter
Jul 8, 2022
2b40d9f
fix drs caching error, update unit test
Jul 15, 2022
7c3a2b0
update tests
Jul 15, 2022
e0be6cd
Merge branch 'master' into feat/agg_server_paging
Jul 15, 2022
97cc562
prevent missing study field from throwing exception
Jul 26, 2022
0d185cd
update README with aggMDS development instructions
Jul 29, 2022
71d70ae
change sh to bash to wait for esproxy
Aug 3, 2022
0577bc9
Change aggregate metadata population script. Retain original index in…
tianj7 Aug 9, 2022
03cfded
update Documentation
Aug 10, 2022
b995fab
remove hostname/port arguments, update documentation
Aug 10, 2022
914efb9
Add unit tests
tianj7 Aug 10, 2022
3444b8c
add unit tests
tianj7 Aug 10, 2022
348e39f
add unit tests
tianj7 Aug 11, 2022
bcf0d90
add unit tests
tianj7 Aug 11, 2022
0bc38c5
add unit tests
tianj7 Aug 11, 2022
6f85c51
Merge branch 'feat/agg_server_paging' of github.com:uc-cdis/metadata-…
Aug 15, 2022
962ec70
set default value if available and type conversion is None
Aug 15, 2022
3427395
remove duplicate functions, refactor code and tests
tianj7 Aug 17, 2022
663dd17
Merge branch 'feat/agg_server_paging' into fix/BRH-206
tianj7 Aug 18, 2022
2daf358
Merge pull request #61 from uc-cdis/fix/BRH-206
tianj7 Aug 18, 2022
9f17126
remove /aggregate/metadata_paged merge functionality to /aggregate/m…
Aug 18, 2022
b1d1480
add mapping.ignore_malformed: True, to ES settings
Aug 18, 2022
a4b193c
add Schema and Gen3 Adapter documentation
Aug 19, 2022
3af82b1
merge from master: removed fixed normalization
Aug 19, 2022
75b0648
clean up metadata adapter documentation
Aug 19, 2022
831774c
clean up metadata adapter documentation (update title)
Aug 19, 2022
1bc26a0
address comments from reviewers: update documentation, and tests
Aug 24, 2022
c827eba
add commons_name option, update documentation
Aug 24, 2022
d02d121
add swagger documentation for aggregate api
Aug 29, 2022
e67cbe0
update in-source documentation
Aug 29, 2022
c2aae45
merge changes from master
Aug 29, 2022
bf88319
Apply automatic documentation changes
craigrbarnes Aug 29, 2022
e41d1b5
update swagger
Aug 30, 2022
27a2732
Merge branch 'feat/agg_server_paging' of github.com:uc-cdis/metadata-…
Aug 30, 2022
4762bdf
Apply automatic documentation changes
craigrbarnes Aug 30, 2022
bc6916b
updated poetry lock file
aartivnkt Sep 1, 2022
2bb3898
updated setuptools and poetry lock
aartivnkt Sep 1, 2022
30c4e5c
fix typo in aggMDS documentation
Sep 1, 2022
e3b639e
merge from master
Sep 12, 2022
3e38c6b
revert poetry
Sep 12, 2022
039167b
Apply automatic documentation changes
craigrbarnes Sep 12, 2022
f31b276
add missing additions to swagger doc
Oct 10, 2022
95a68c5
merge changes from master
Oct 11, 2022
fb871ea
Apply automatic documentation changes
craigrbarnes Oct 11, 2022
bc492fc
add error loging in conversion
Oct 12, 2022
ae0e0be
Merge branch 'feat/agg_server_paging' of github.com:uc-cdis/metadata-…
Oct 12, 2022
7476088
add error loggin with conversions
Oct 12, 2022
41d28c0
Apply automatic documentation changes
craigrbarnes Oct 12, 2022
f2475b0
remove :q from docstring
Oct 12, 2022
7e40b0e
Merge branch 'feat/agg_server_paging' of github.com:uc-cdis/metadata-…
Oct 12, 2022
6daaf5d
re-add openapi.yaml
Oct 12, 2022
a1d80fd
Apply automatic documentation changes
craigrbarnes Oct 12, 2022
7e8ef8d
update docs and tests
Oct 12, 2022
f2dc6a2
Merge branch 'feat/agg_server_paging' of github.com:uc-cdis/metadata-…
Oct 12, 2022
fcb736b
Apply automatic documentation changes
craigrbarnes Oct 12, 2022
41c80a7
update Tags
Oct 12, 2022
f88d4b3
more updates
Oct 12, 2022
e6e9a04
update FastAPI documentation
Oct 12, 2022
3522b4f
fix agg mds docs not generated
paulineribeyre Oct 13, 2022
a67826d
set USE_AGG_MDS in workflow
paulineribeyre Oct 13, 2022
35f9137
Apply automatic documentation changes
paulineribeyre Oct 13, 2022
992ddb3
extend httpx and rety timeout for Gen3Adapter
Oct 20, 2022
750ebc0
add warning for field not in schema and no value
Oct 26, 2022
72a5914
Merge branch 'master' into feat/agg_server_paging
craigrbarnes Oct 26, 2022
2 changes: 2 additions & 0 deletions .github/workflows/docs.yaml
@@ -31,6 +31,8 @@ jobs:
poetry install -vv --no-interaction
poetry show -vv
- name: Build docs
env:
USE_AGG_MDS: true # so the aggregate MDS docs are added
run: poetry run python run.py openapi

- uses: stefanzweifel/[email protected]
14 changes: 3 additions & 11 deletions .secrets.baseline
@@ -3,7 +3,7 @@
"files": null,
"lines": null
},
"generated_at": "2022-10-03T18:10:28Z",
"generated_at": "2022-10-12T21:15:02Z",
"plugins_used": [
{
"name": "AWSKeyDetector"
@@ -70,15 +70,15 @@
{
"hashed_secret": "6eae3a5b062c6d0d79f070c26e6d62486b40cb46",
"is_verified": false,
"line_number": 36,
"line_number": 62,
"type": "Secret Keyword"
}
],
"docs/metadata_adapters.md": [
{
"hashed_secret": "bf7e894868fd96c11edf05ef7d23122cbfa22e7e",
"is_verified": false,
"line_number": 60,
"line_number": 204,
"type": "Hex High Entropy String"
}
],
@@ -106,14 +106,6 @@
"type": "Hex High Entropy String"
}
],
"tests/test_agg_mds_adapters.py": [
{
"hashed_secret": "143e9f2aca10dbd2711cb96047f4016f095e5709",
"is_verified": false,
"line_number": 3898,
"type": "Hex High Entropy String"
}
],
"tests/test_migrations.py": [
{
"hashed_secret": "4dcba4ad1d671981e2d211ebe56da8a5b40f14ef",
119 changes: 116 additions & 3 deletions README.md
@@ -17,6 +17,119 @@ The server is built with [FastAPI](https://fastapi.tiangolo.com/) and packaged w

The documentation can be browsed in the [docs](docs) folder, and key documents are linked below.

* [Detailed API Documentation](http://petstore.swagger.io/?url=https://raw.githubusercontent.com/uc-cdis/metadata-service/master/docs/openapi.yaml)
* [Development and deployment](docs/dev.md)
* [Aggregate Metadata Service](docs/agg_mds.md)
The aggregated MDS APIs and scripts copy metadata from one or many metadata services into a single data store. This enables a metadata service to act as a central API for browsing metadata using clients such as the Ecosystem browser.

The aggregate metadata APIs and migrations are disabled by default; set `USE_AGG_MDS=true` to enable them. In shared Elasticsearch environments, also define `AGG_MDS_NAMESPACE` so that each instance uses a unique index.

The aggregate cache is built using Elasticsearch. See the `docker-compose.yaml` file (specifically the `aggregate_migration` service) for details regarding how aggregate data is populated.
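
One way to picture the per-instance index requirement is to derive the Elasticsearch index name from the namespace. The sketch below is illustrative only: the helper `namespaced_index` and the base name `commons-index` are assumptions and do not come from this PR; `default_namespace` is the default documented for `AGG_MDS_NAMESPACE`.

```python
import os


def namespaced_index(base: str, namespace: str = "default_namespace") -> str:
    """Prefix a base index name with the instance namespace (hypothetical helper)."""
    return f"{namespace}-{base}"


# Fall back to the documented default when AGG_MDS_NAMESPACE is unset.
namespace = os.environ.get("AGG_MDS_NAMESPACE", "default_namespace")
print(namespaced_index("commons-index", namespace))
```

Two instances sharing one Elasticsearch cluster would then write to different indices as long as their namespaces differ.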

## Installation

Install required software:

* [PostgreSQL](https://www.postgresql.org/) 9.6 or above
* [Python](https://www.python.org/downloads/) 3.9 or above
* [Poetry](https://poetry.eustace.io/docs/#installation)

Then run `poetry install` to install the dependencies. Using a
[virtualenv](https://virtualenv.pypa.io/) is recommended; set it up before installing.
If you don't manage your own, Poetry creates one for you
during `poetry install`, and you must activate it with:

```bash
poetry shell
```

## Development

Create a `.env` file in the root directory of the checkout
(uncomment a line to override its default):

```python
# DB_HOST = "..." # default: localhost
# DB_PORT = ... # default: 5432
# DB_USER = "..." # default: current user
# DB_PASSWORD = "..." # default: empty
# DB_DATABASE = "..." # default: current user
# USE_AGG_MDS = "..." # default: false
# AGG_MDS_NAMESPACE = "..." # default: default_namespace
# GEN3_ES_ENDPOINT = "..." # default: empty
```
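
The defaults listed above can be read as a simple lookup with fallbacks. The following is a minimal sketch, not the service's actual configuration code: the function `load_settings`, the helper `_current_user`, and the rule that only the literal `true` enables `USE_AGG_MDS` are all assumptions made for illustration.

```python
import getpass
import os
from typing import Mapping, Optional


def _current_user() -> str:
    """Best-effort lookup of the current OS user (empty string if unavailable)."""
    try:
        return getpass.getuser()
    except Exception:
        return ""


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve settings from an environment mapping, applying the documented defaults."""
    env = os.environ if env is None else env
    user = _current_user()
    return {
        "DB_HOST": env.get("DB_HOST", "localhost"),
        "DB_PORT": int(env.get("DB_PORT", "5432")),
        "DB_USER": env.get("DB_USER", user),
        "DB_PASSWORD": env.get("DB_PASSWORD", ""),
        "DB_DATABASE": env.get("DB_DATABASE", user),
        # Assumption: only "true" (case-insensitive) turns the flag on.
        "USE_AGG_MDS": env.get("USE_AGG_MDS", "false").lower() == "true",
        "AGG_MDS_NAMESPACE": env.get("AGG_MDS_NAMESPACE", "default_namespace"),
        "GEN3_ES_ENDPOINT": env.get("GEN3_ES_ENDPOINT", ""),
    }
```

Calling `load_settings({})` yields the defaults, while `load_settings({"USE_AGG_MDS": "true"})` enables the aggregate MDS flag and leaves everything else untouched.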

Run database schema migration:

```bash
alembic upgrade head
```

Run the server with auto-reloading:

```bash
python run.py
```

Try out the API at: <http://localhost:8000/docs>.

## Run tests

Note that the name of the test database is the configured database name
prefixed with "test_". You need to create that database first:

```bash
psql
CREATE DATABASE test_metadata;
```

```bash
pytest --cov=src --cov=migrations/versions tests
```

## Develop with Docker

Use Docker compose:

```bash
docker-compose up
```

Run database schema migration as well:

```bash
docker-compose exec app alembic upgrade head
```

Run tests:

```bash
docker-compose exec app pytest --cov=src --cov=migrations/versions tests
```

### Aggregate MDS
Test the populate script:
```bash
python src/mds/populate.py --config <config file> --hostname localhost --port 9200
```
View the loaded data:
```bash
curl "http://localhost:8000/aggregate/metadata?limit=1000"
```
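
For scripted checks, the same request can be issued from Python. This is a rough sketch using only the standard library: the helper names are hypothetical, only the `limit` parameter shown above is used, and nothing is assumed about the shape of the JSON that comes back.

```python
import json
import urllib.request
from urllib.parse import urlencode, urljoin


def aggregate_metadata_url(base_url: str, limit: int = 1000) -> str:
    """Build the /aggregate/metadata URL with a limit query parameter."""
    query = urlencode({"limit": limit})
    return f"{urljoin(base_url, '/aggregate/metadata')}?{query}"


def fetch_aggregate_metadata(base_url: str, limit: int = 1000):
    """Fetch and decode the response; requires a running server."""
    with urllib.request.urlopen(aggregate_metadata_url(base_url, limit)) as response:
        return json.load(response)


print(aggregate_metadata_url("http://localhost:8000"))
```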

## Deployment

For production, use [gunicorn](https://gunicorn.org/):

```bash
gunicorn mds.asgi:app -k uvicorn.workers.UvicornWorker -c gunicorn.conf.py
```

Or use the Docker image built from the `Dockerfile`, using environment variables
with the same name to configure the server.

Other than database configuration, please also set:

```bash
DEBUG=0
ADMIN_LOGINS=alice:123,bob:456
```

Do not use `123` or `456` as real passwords; they are placeholders.
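
The `ADMIN_LOGINS` value above is a comma-separated list of `user:password` pairs. As an illustration of that format only, the sketch below splits such a string into a dictionary. The function `parse_admin_logins` is hypothetical, and how the service itself parses the variable is not shown in this PR.

```python
def parse_admin_logins(value: str) -> dict:
    """Split 'user:password,user:password' into a {user: password} mapping."""
    logins = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty input and trailing commas
        user, sep, password = pair.partition(":")
        if not sep or not user:
            raise ValueError(f"expected 'user:password', got {pair!r}")
        logins[user] = password
    return logins


print(sorted(parse_admin_logins("alice:123,bob:456")))
```
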
130 changes: 0 additions & 130 deletions configs/brh_config.json

This file was deleted.

2 changes: 1 addition & 1 deletion docker-compose.yml
@@ -36,7 +36,7 @@ services:
environment:
- USE_AGG_MDS=true
- GEN3_ES_ENDPOINT=http://esproxy-service:9200
command: sh -c 'while [[ "$$(curl --connect-timeout 2 -s -o /dev/null -w ''%{http_code}'' $$GEN3_ES_ENDPOINT)" != "200" ]]; do echo "wait for " $$GEN3_ES_ENDPOINT; sleep 5; done; echo es backend is available;/env/bin/python /src/src/mds/populate.py --config /src/tests/config.json'
command: bash -c 'while [[ "$$(curl --connect-timeout 2 -s -o /dev/null -w ''%{http_code}'' $$GEN3_ES_ENDPOINT)" != "200" ]]; do echo "wait for " $$GEN3_ES_ENDPOINT; sleep 5; done; echo es backend is available;/env/bin/python /src/src/mds/populate.py --config /src/tests/config.json'
db:
image: postgres
environment:
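
The only change in the `command` above is `sh -c` to `bash -c`. The `[[ ... ]]` test is a bash extension that POSIX `sh` does not define, which is presumably why the loop needs bash. For readers who prefer to follow the logic outside the shell, here is a hedged Python sketch of the same wait loop: poll the endpoint until it answers with HTTP 200, sleeping 5 seconds between attempts. The function names and the injectable `probe` and `sleep` parameters are assumptions added for testability and are not part of this PR.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for url, or 0 if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # not reachable yet


def wait_for_backend(
    url: str,
    probe: Callable[[str], int] = http_status,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until probe(url) returns 200; return the number of probes made."""
    attempts = 1
    while probe(url) != 200:
        print("wait for", url)
        sleep(interval)
        attempts += 1
    print("es backend is available")
    return attempts
```

In the compose file, the populate script runs only after this loop exits, so the equivalent Python call would be `wait_for_backend(os.environ["GEN3_ES_ENDPOINT"])` followed by the populate step.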