Fix merge conflicts.
guzman-raphael committed May 26, 2023
2 parents 7bf0c83 + 40a6794 commit 7a41e8b
Showing 68 changed files with 1,381 additions and 197 deletions.
11 changes: 9 additions & 2 deletions CHANGELOG.md
@@ -1,6 +1,13 @@
## Release notes

### 0.14.1 -- TBD
### Upcoming
- Fixed - Fix altering a part table that uses the "master" keyword - PR [#991](https://github.com/datajoint/datajoint-python/pull/991)
- Fixed - `.ipynb` output in tutorials is not visible in dark mode ([#1078](https://github.com/datajoint/datajoint-python/issues/1078)) PR [#1080](https://github.com/datajoint/datajoint-python/pull/1080)
- Changed - Readme to update links and include example pipeline image
- Changed - Docs to add landing page and update navigation
- Changed - `.data` method to `.stream` in the `get()` method for S3 (external) objects PR [#1085](https://github.com/datajoint/datajoint-python/pull/1085)
- Fixed - Docs to rename `create_virtual_module` to `VirtualModule`
- Added - Skeleton from `datajoint-company/datajoint-docs` repository for docs migration
- Added - Initial `pytest` for `test_connection`

### 0.14.0 -- Feb 13, 2023
@@ -16,7 +23,7 @@
- Deprecated - `table._update()` PR [#1073](https://github.com/datajoint/datajoint-python/pull/1073)
- Deprecated - old-style foreign key syntax PR [#1073](https://github.com/datajoint/datajoint-python/pull/1073)
- Deprecated - `dj.migrate_dj011_external_blob_storage_to_dj012()` PR [#1073](https://github.com/datajoint/datajoint-python/pull/1073)
* Added - Method to set job keys to "ignore" status - PR [#1068](https://github.com/datajoint/datajoint-python/pull/1068)
- Added - Method to set job keys to "ignore" status - PR [#1068](https://github.com/datajoint/datajoint-python/pull/1068)

### 0.13.8 -- Sep 21, 2022
- Added - New documentation structure based on markdown PR [#1052](https://github.com/datajoint/datajoint-python/pull/1052)
2 changes: 1 addition & 1 deletion LNX-docker-compose.yml
@@ -44,7 +44,7 @@ services:
interval: 15s
fakeservices.datajoint.io:
<<: *net
image: datajoint/nginx:v0.2.4
image: datajoint/nginx:v0.2.5
environment:
- ADD_db_TYPE=DATABASE
- ADD_db_ENDPOINT=db:3306
36 changes: 22 additions & 14 deletions README.md
@@ -1,8 +1,6 @@
[![DOI](https://zenodo.org/badge/16774/datajoint/datajoint-python.svg)](https://zenodo.org/badge/latestdoi/16774/datajoint/datajoint-python)
[![Build Status](https://travis-ci.org/datajoint/datajoint-python.svg?branch=master)](https://travis-ci.org/datajoint/datajoint-python)
[![Coverage Status](https://coveralls.io/repos/datajoint/datajoint-python/badge.svg?branch=master&service=github)](https://coveralls.io/github/datajoint/datajoint-python?branch=master)
[![PyPI version](https://badge.fury.io/py/datajoint.svg)](http://badge.fury.io/py/datajoint)
[![Requirements Status](https://requires.io/github/datajoint/datajoint-python/requirements.svg?branch=master)](https://requires.io/github/datajoint/datajoint-python/requirements/?branch=master)
[![Slack](https://img.shields.io/badge/slack-chat-green.svg)](https://datajoint.slack.com/)

# Welcome to DataJoint for Python!
@@ -12,22 +10,32 @@ DataJoint for Python is a framework for scientific workflow management based on
DataJoint was initially developed in 2009 by Dimitri Yatsenko in Andreas Tolias' Lab at Baylor College of Medicine for the distributed processing and management of large volumes of data streaming from regular experiments. Starting in 2011, DataJoint has been available as an open-source project adopted by other labs and improved through contributions from several developers.
Presently, the primary developer of DataJoint open-source software is the company DataJoint (https://datajoint.com).

- [Getting Started](https://datajoint.com/docs/core/datajoint-python/latest/getting-started/)
- [DataJoint Elements](https://datajoint.com/docs/elements/) - Catalog of example pipelines
- [DataJoint CodeBook](https://codebook.datajoint.io) - Interactive online tutorials
- Contribute
## Data Pipeline Example

![pipeline](https://raw.githubusercontent.com/datajoint/datajoint-python/master/images/pipeline.png)

[Yatsenko et al., bioRxiv 2021](https://doi.org/10.1101/2021.03.30.437358)

## Getting Started

- Install from PyPI

```bash
pip install datajoint
```

- [Documentation & Tutorials](https://datajoint.com/docs/core/datajoint-python/)

- [Interactive Tutorials](https://github.com/datajoint/datajoint-tutorials) on GitHub Codespaces

- [DataJoint Elements](https://datajoint.com/docs/elements/) - Catalog of example pipelines for neuroscience experiments

- Contribute
- [Development Environment](https://datajoint.com/docs/core/datajoint-python/latest/develop/)

- [Guidelines](https://datajoint.com/docs/community/contribute/)

- Legacy Resources (To be replaced by above)
- [Documentation](https://docs.datajoint.org)
- [Tutorials](https://tutorials.datajoint.org)

## Citation

- If your work uses DataJoint for Python, please cite the following Research Resource Identifier (RRID) and manuscript.

- DataJoint ([RRID:SCR_014543](https://scicrunch.org/resolver/SCR_014543)) - DataJoint for Python (version `<Enter version number>`)

- Yatsenko D, Reimer J, Ecker AS, Walker EY, Sinz F, Berens P, Hoenselaar A, Cotton RJ, Siapas AS, Tolias AS. DataJoint: managing big scientific data using MATLAB or Python. bioRxiv. 2015 Jan 1:031658. doi: https://doi.org/10.1101/031658
- [Tutorials](https://tutorials.datajoint.org)
4 changes: 3 additions & 1 deletion datajoint/s3.py
@@ -68,7 +68,9 @@ def fput(self, local_file, name, metadata=None):
     def get(self, name):
         logger.debug("get: {}:{}".format(self.bucket, name))
         try:
-            return self.client.get_object(self.bucket, str(name)).data
+            with self.client.get_object(self.bucket, str(name)) as result:
+                data = [d for d in result.stream()]
+            return b"".join(data)
         except minio.error.S3Error as e:
             if e.code == "NoSuchKey":
                 raise errors.MissingExternalFile("Missing s3 key %s" % name)
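
The new `get` reads the object in chunks inside a `with` block and joins them, instead of reading the response's `.data` attribute. A minimal sketch of that pattern follows. `FakeResponse` and `read_all` are hypothetical names introduced here so the example runs without an S3 server; they are not part of `minio` or DataJoint.

```python
class FakeResponse:
    """Hypothetical stand-in for the response object returned by get_object."""

    def __init__(self, payload, chunk_size=4):
        self._payload = payload
        self._chunk_size = chunk_size
        self.closed = False

    def stream(self):
        # Yield the payload in fixed-size chunks, like a streamed HTTP body.
        for start in range(0, len(self._payload), self._chunk_size):
            yield self._payload[start:start + self._chunk_size]

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Mark the response closed on exit, even if reading raised an error.
        self.closed = True
        return False


def read_all(response):
    """Collect every streamed chunk and join them into one bytes object."""
    with response as result:
        chunks = list(result.stream())
    return b"".join(chunks)


response = FakeResponse(b"external blob contents")
payload = read_all(response)
```

Leaving the `with` block closes the response whether or not reading succeeded, which is presumably the motivation for the context-manager form.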
4 changes: 4 additions & 0 deletions datajoint/user_tables.py
@@ -238,3 +238,7 @@ def drop(self, force=False):
             raise DataJointError(
                 "Cannot drop a Part directly. Delete from master instead"
             )
+
+    def alter(self, prompt=True, context=None):
+        # without context, use declaration context which maps master keyword to master table
+        super().alter(prompt=prompt, context=context or self.declaration_context)
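
The fallback in `alter` can be illustrated in isolation. This is only a sketch: the `Table` and `Part` classes below are simplified stand-ins, and the contents of `declaration_context` are an assumed shape, not DataJoint's actual data.

```python
class Table:
    """Hypothetical stand-in for the base table class."""

    def alter(self, prompt=True, context=None):
        # Return the context that would be used to resolve names in the definition.
        return context


class Part(Table):
    # Assumed shape: maps the "master" keyword to the master table.
    declaration_context = {"master": "Session"}

    def alter(self, prompt=True, context=None):
        # Fall back to the declaration context when the caller passes none.
        return super().alter(
            prompt=prompt, context=context or self.declaration_context
        )


default_context = Part().alter()
explicit_context = Part().alter(context={"master": "Experiment"})
```

Because the fallback uses `or`, an empty dictionary passed as `context` is also replaced by the declaration context.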
4 changes: 2 additions & 2 deletions docs/docker-compose.yaml
@@ -7,7 +7,7 @@ services:
context: ../
args:
- PACKAGE
image: ${PACKAGE}-docs
image: ${PACKAGE}_python-docs
environment:
- PACKAGE
- UPSTREAM_REPO
@@ -16,7 +16,7 @@ services:
- ..:/main
user: ${HOST_UID}:anaconda
ports:
- 8080:80
- 80:80
command:
- sh
- -c
80 changes: 65 additions & 15 deletions docs/mkdocs.yaml
@@ -1,24 +1,77 @@
# ---------------------- PROJECT SPECIFIC ---------------------------

site_name: DataJoint Python
site_name: DataJoint Documentation
repo_url: https://github.com/datajoint/datajoint-python
repo_name: datajoint/datajoint-python
nav:
- DataJoint Python: getting-started/index.md
- DataJoint Python: index.md
- Getting Started: getting-started/index.md
- Existing Pipelines: concepts/existing-pipelines.md
- Query Language:
- Common Commands: query-lang/common-commands.md
- Operators: query-lang/operators.md
- Iteration: query-lang/iteration.md
- Query Caching: query-lang/query-caching.md
- Concepts:
- Principles: concepts/principles.md
- Glossary: concepts/glossary.md
- System Administration:
- Database Administration: sysadmin/dba.md
- File Storage: sysadmin/filestore.md
- Client Configuration:
- Install: client/install.md
- Credentials: client/creds.md
- Settings: client/settings.md
- File Stores: client/stores.md
- Schema Design:
- Schema Creation: design/schema.md
- Table Definition:
- Table Tiers: design/tables/tiers.md
- Declaration Syntax: design/tables/declare.md
- Primary Key: design/tables/primary.md
- Attributes: design/tables/attributes.md
- Lookup Tables: design/tables/lookup.md
- Blobs: design/tables/blobs.md
- Attachments: design/tables/attach.md
- Filepaths: design/tables/filepath.md
- Custom Datatypes: design/tables/customtype.md
- Dependencies: design/tables/dependencies.md
- Indexes: design/tables/indexes.md
- Master-Part Relationships: design/tables/master-part.md
- Schema Diagrams: design/diagrams.md
- Entity Normalization: design/normalization.md
- Data Integrity: design/integrity.md
- Schema Recall: design/recall.md
- Schema Drop: design/drop.md
- Schema Modification: design/alter.md
- Data Manipulations:
- Insert: manipulation/insert.md
- Delete: manipulation/delete.md
- Update: manipulation/update.md
- Transactions: manipulation/transactions.md
- Data Queries:
- Common Commands: query/common-commands.md
- Fetch: query/fetch.md
- Iteration: query/iteration.md
- Operators: query/operators.md
- Restrict: query/restrict.md
- Projection: query/project.md
- Join: query/join.md
- Aggregation: query/aggregation.md
- Union: query/union.md
- Universal Sets: query/universals.md
- Query Caching: query/query-caching.md
- Computations:
- Make Method: compute/make.md
- Populate: compute/populate.md
- Key Source: compute/key-source.md
- Distributed Computing: compute/distributed.md
- Internals:
- SQL Transpilation: internal/transpilation.md
- Reproducibility:
- Table Tiers: reproduce/table-tiers.md
- Make Method: reproduce/make-method.md
- Existing Pipelines: existing-pipelines.md
- Tutorials:
- tutorials/json.ipynb
- FAQ: faq.md
- Develop: develop.md
- Changelog: about/changelog.md
- Citation: citation.md
- Changelog: changelog.md
- API: api/ # defer to gen-files + literate-nav

# ---------------------------- STANDARD -----------------------------
@@ -50,9 +103,6 @@ theme:
name: Switch to light mode
plugins:
- search
- redirects:
redirect_maps:
"index.md": "getting-started/index.md"
- mkdocstrings:
default_handler: python
handlers:
@@ -95,11 +145,11 @@ markdown_extensions:
- name: mermaid
class: mermaid
format: !!python/name:pymdownx.superfences.fence_code_format
- pymdownx.magiclink # Displays bare URLs as links
- pymdownx.tasklist: # Renders check boxes in tasks lists
custom_checkbox: true
extra:
generator: false # Disable watermark
analytics:
provider: google
property: !ENV GOOGLE_ANALYTICS_KEY
version:
provider: mike
social:
12 changes: 12 additions & 0 deletions docs/src/.overrides/assets/stylesheets/extra.css
@@ -15,27 +15,35 @@
html a[title="DataJoint"].md-social__link svg {
color: var(--dj-primary);
}

html a[title="Slack"].md-social__link svg {
color: var(--dj-primary);
}

html a[title="LinkedIn"].md-social__link svg {
color: var(--dj-primary);
}

html a[title="Twitter"].md-social__link svg {
color: var(--dj-primary);
}

html a[title="GitHub"].md-social__link svg {
color: var(--dj-primary);
}

html a[title="DockerHub"].md-social__link svg {
color: var(--dj-primary);
}

html a[title="PyPI"].md-social__link svg {
color: var(--dj-primary);
}

html a[title="StackOverflow"].md-social__link svg {
color: var(--dj-primary);
}

html a[title="YouTube"].md-social__link svg {
color: var(--dj-primary);
}
@@ -91,3 +99,7 @@ html a[title="YouTube"].md-social__link svg {
/* previous/next text */
/* --md-footer-fg-color: var(--dj-white); */
}

[data-md-color-scheme="slate"] .jupyter-wrapper .Table Td {
color: var(--dj-black)
}
2 changes: 1 addition & 1 deletion docs/src/.overrides/partials/nav.html
@@ -14,7 +14,7 @@
{#-
Add DataJoint home link to navigation header, otherwise unchanged
-#}
<a href="https://datajoint.com/docs/" title="DataJoint">
<a href="https://datajoint.com/docs/core/" title="DataJoint">
⬅ Home
</a>
</label>
File renamed without changes.
7 changes: 7 additions & 0 deletions docs/src/citation.md
@@ -0,0 +1,7 @@
# Citation

If your work uses the DataJoint API for Python, please cite the following manuscript and Research Resource Identifier (RRID):

- Yatsenko D, Reimer J, Ecker AS, Walker EY, Sinz F, Berens P, Hoenselaar A, Cotton RJ, Siapas AS, Tolias AS. DataJoint: managing big scientific data using MATLAB or Python. bioRxiv. 2015 Jan 1:031658. doi: https://doi.org/10.1101/031658

- DataJoint API for Python - [RRID:SCR_014543](https://scicrunch.org/resolver/SCR_014543) - Version `Enter version here`
3 changes: 3 additions & 0 deletions docs/src/client/creds.md
@@ -0,0 +1,3 @@
## Work in progress
You may ask questions in the chat window below or
refer to [legacy documentation](https://docs.datajoint.org/)
3 changes: 3 additions & 0 deletions docs/src/client/install.md
@@ -0,0 +1,3 @@
## Work in progress
You may ask questions in the chat window below or
refer to [legacy documentation](https://docs.datajoint.org/)
3 changes: 3 additions & 0 deletions docs/src/client/settings.md
@@ -0,0 +1,3 @@
## Work in progress
You may ask questions in the chat window below or
refer to [legacy documentation](https://docs.datajoint.org/)
3 changes: 3 additions & 0 deletions docs/src/client/stores.md
@@ -0,0 +1,3 @@
## Work in progress
You may ask questions in the chat window below or
refer to [legacy documentation](https://docs.datajoint.org/)
3 changes: 3 additions & 0 deletions docs/src/compute/distributed.md
@@ -0,0 +1,3 @@
## Work in progress
You may ask questions in the chat window below or
refer to [legacy documentation](https://docs.datajoint.org/)
3 changes: 3 additions & 0 deletions docs/src/compute/key-source.md
@@ -0,0 +1,3 @@
## Work in progress
You may ask questions in the chat window below or
refer to [legacy documentation](https://docs.datajoint.org/)
34 changes: 34 additions & 0 deletions docs/src/compute/make.md
@@ -0,0 +1,34 @@
# Make Method

For auto-populated *Imported* and *Computed* tables[^1], a `make` method gives exact
instructions for generating the content. By making these steps explicit, we keep a
careful record of data provenance and ensure reproducibility. Data should never be
entered using the `insert` method directly.

[^1]: For information on differentiating these data tiers, see the Table Tier section on
[Automation](../design/tables/tiers#automation-imported-and-computed).

The `make` method receives one argument: the *key*, which represents the upstream table
entries that need populating. The `key` is a `dict` in Python.

A `make` function should do three things:

1. [Fetch](../query/common-commands#fetch) data from tables upstream in the
pipeline using the key for restriction.

2. Compute and add any missing attributes to the fields already in the key.

3. [Insert](../query/common-commands#insert) the entire entity into the
triggering table.
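
The three steps can be sketched as follows. This is a minimal illustration in plain Python, not DataJoint code: the dictionaries `scan_table` and `segmentation_table`, the `scan_id` attribute, and the thresholding computation are hypothetical stand-ins for an upstream table, the triggering table, and a real analysis.

```python
# Hypothetical upstream table: primary key (scan_id,) -> stored attributes.
scan_table = {
    (1,): {"pixels": [0, 3, 9, 12]},
    (2,): {"pixels": [5, 5, 5, 5]},
}
# Hypothetical triggering table that make() fills in.
segmentation_table = {}


def make(key):
    """Populate one entity of segmentation_table for the given key (a dict)."""
    # 1. Fetch upstream data, restricting by the key.
    pixels = scan_table[(key["scan_id"],)]["pixels"]
    # 2. Compute the missing attribute and add it to the fields in the key.
    threshold = sum(pixels) / len(pixels)
    entity = dict(key, n_above=sum(p > threshold for p in pixels))
    # 3. Insert the entire entity into the triggering table.
    segmentation_table[(entity["scan_id"],)] = entity


make({"scan_id": 1})
```

Keeping all three steps inside `make` means the stored entity can always be traced back to the upstream data and the computation that produced it.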

## Populate

The `make` method is sometimes referred to as the `populate` function because
`populate` is the class method that runs the `make` method on all relevant keys[^2].

[^2]: For information on reprocessing keys that resulted in an error, see information
on the [Jobs table](./distributed).

``` python
Segmentation.populate()
```
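
Conceptually, populating means calling `make` once for each upstream key that has no entry yet in the target table. A rough sketch in plain Python, assuming hypothetical `key_source` and `target` containers and an `item_id` attribute; it does not reflect DataJoint's actual implementation:

```python
def populate(key_source, target, make):
    """Call make(key) for every key in key_source not yet present in target."""
    processed = []
    for key in key_source:
        if tuple(sorted(key.items())) in target:
            continue  # already populated, so skip this key
        make(key)
        processed.append(key)
    return processed


target = {}


def make(key):
    # Hypothetical computation: store the square of the id.
    target[tuple(sorted(key.items()))] = dict(key, squared=key["item_id"] ** 2)


key_source = [{"item_id": 1}, {"item_id": 2}]
first_run = populate(key_source, target, make)
second_run = populate(key_source, target, make)
```

Running the sketch twice shows why repeated calls are harmless: the second pass finds every key already present and does no work.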
3 changes: 3 additions & 0 deletions docs/src/compute/populate.md
@@ -0,0 +1,3 @@
## Work in progress
You may ask questions in the chat window below or
refer to [legacy documentation](https://docs.datajoint.org/)
20 changes: 20 additions & 0 deletions docs/src/concepts/glossary.md
@@ -0,0 +1,20 @@
<!-- markdownlint-disable MD013 -->

# Glossary

We have taken care to use consistent terminology.

<!-- Contributors: Please keep this table in alphabetical order -->

| Term | Definition |
| --- | --- |
| <span id="DAG">DAG</span> | A directed acyclic graph (DAG) is a set of nodes connected by a set of directed edges that form no cycles. This means that there is never a path back to a node after passing through it by following the directed edges. Formal workflow management systems represent workflows in the form of DAGs. |
| <span id="data-pipeline">data pipeline</span> | A sequence of data transformation steps from data sources through multiple intermediate structures. More generally, a data pipeline is a directed acyclic graph. In DataJoint, each step is represented by a table in a relational database. |
| <span id="datajoint">DataJoint</span> | a software framework for database programming directly from MATLAB and Python. Thanks to its support of automated computational dependencies, DataJoint serves as a workflow management system. |
| <span id="datajoint-elements">DataJoint Elements</span> | software modules implementing portions of experiment workflows designed for ease of integration into diverse custom workflows. |
| <span id="datajoint-pipeline">DataJoint pipeline</span> | the data schemas and transformations underlying a DataJoint workflow. DataJoint allows defining code that specifies both the workflow and the data pipeline, and we have used the words "pipeline" and "workflow" almost interchangeably. |
| <span id="datajoint-schema">DataJoint schema</span> | a software module implementing a portion of an experiment workflow. Includes database table definitions, dependencies, and associated computations. |
| <span id="foreign-key">foreign key</span> | a field that is linked to another table's primary key. |
| <span id="primary-key">primary key</span> | the subset of table attributes that uniquely identify each entity in the table. |
| <span id="secondary-attribute">secondary attribute</span> | any field in a table not in the primary key. |
| <span id="workflow">workflow</span> | a formal representation of the steps for executing an experiment from data collection to analysis. Also the software configured for performing these steps. A typical workflow is composed of tables with inter-dependencies and processes to compute and insert data into the tables. |
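
The DAG definition can be checked mechanically: a dependency graph is acyclic exactly when its nodes can be placed in an order where every table comes after the tables it depends on. A small sketch using Python's standard `graphlib` module (Python 3.9+); the table names and the `is_dag` helper are hypothetical, chosen only for illustration:

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each table maps to the set of tables it depends on.
dependencies = {
    "Session": set(),
    "Scan": {"Session"},
    "Segmentation": {"Scan"},
    "Trace": {"Segmentation", "Scan"},
}

# A valid processing order exists only if the dependencies form no cycle.
order = list(TopologicalSorter(dependencies).static_order())


def is_dag(graph):
    """Return True if the dependency graph has no cycles."""
    try:
        list(TopologicalSorter(graph).static_order())
    except CycleError:
        return False
    return True
```

Here `order` lists upstream tables before the tables that depend on them, which is the order in which a pipeline would have to be computed.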