Releases · AuReMe/esmecata

07 Mar 17:03

ArnaudBelcour

0.6.4

7dfdf1c

esmecata 0.6.4 Latest

Add

Function get_domain_or_superkingdom_from_ncbi_tax_database in esmecata.utils to check whether the rank is named domain or superkingdom in the NCBI Taxonomy database.
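
A minimal sketch of this kind of check, assuming the rank names found in the local taxonomy database are already available as a collection of strings. The helper name pick_top_rank_name is hypothetical and is not the actual implementation in esmecata.utils:

```python
def pick_top_rank_name(available_ranks):
    """Return the name used for the top rank in a taxonomy database.

    Hypothetical helper: 'available_ranks' is any iterable of rank names
    (for example collected from the lineage of a few taxa).
    """
    ranks = set(available_ranks)
    # Recent NCBI Taxonomy versions use 'domain'.
    if "domain" in ranks:
        return "domain"
    # Older versions use 'superkingdom'.
    if "superkingdom" in ranks:
        return "superkingdom"
    raise ValueError("Neither 'domain' nor 'superkingdom' found in ranks.")
```

Code that needs the top rank can then call the helper once and reuse the returned name instead of hard-coding superkingdom.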

Fix

Taxonomic rank superkingdom has been renamed to domain in recent versions of the NCBI Taxonomy database. Fix several related issues in different parts of EsMeCaTa.
esmecata_gseapy was not working with results from esmecata precomputed because proteome_tax_id.tsv was missing from the 2_annotation folder.

07 Mar 14:59

ArnaudBelcour

0.6.3

5ba9130

esmecata 0.6.3

Fix

KeyError when using the precomputed databases from the EsMeCaTa article (these databases have no KEGG_reaction header).
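
A minimal sketch of how an optional header can be tolerated when reading a tab-separated annotation file. Only the KEGG_reaction header comes from these notes. The function name and the other column names are hypothetical, and this is not the actual fix applied in EsMeCaTa:

```python
import csv
import io


def read_annotation_rows(handle, optional_columns=("KEGG_reaction",)):
    """Read TSV rows and give optional columns an empty default value."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        for column in optional_columns:
            # Older databases lack this header: fill it instead of failing later.
            row.setdefault(column, "")
        rows.append(row)
    return rows


# Example with a file that has no KEGG_reaction header (hypothetical columns).
old_style = io.StringIO("protein\tGOs\nP1\tGO:0008150\n")
rows = read_annotation_rows(old_style)
```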

07 Mar 14:23

ArnaudBelcour

0.6.2

ca6bb4c

esmecata 0.6.2

Modify

Replace datapane by arakawa for HTML report creation in esmecata_report (issue #20). Make HTML reports standalone (they are much bigger but do not require internet access to display).
Replace ete3 by ete4 as ete3 is no longer maintained (issue #18).

03 Mar 14:35

ArnaudBelcour

0.6.1

76642bf

esmecata 0.6.1

Fix

Issue with tax synonyms on the same tax rank when filtering rank in proteomes.

Modify

Use taxon ID to search for taxon presence in precomputed database.
Use Trusted Publishing to publish on PyPI.

27 Jan 15:11

ArnaudBelcour

0.6.0

acf38e8

esmecata 0.6.0

Add

New command esmecata_create_db to create a database from different output folders of esmecata (from_runs).
Full release of esmecata precomputed, associated with the first version of the esmecata precomputed database.
Option threshold (-t) for precomputed.
Option --gseapyCutOff for gseapy_enrichr.
A check after database creation to detect taxa with few predicted proteins compared to their higher-rank affiliated taxon.
A check that the gzip file is correctly formatted.
Header KEGG_reaction in annotation_reference from annotation_uniprot to avoid issues with esmecata_create_db.
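
As an illustration of the gzip format check listed above, a minimal sketch using only the Python standard library. The function name is hypothetical and this is not necessarily how EsMeCaTa performs the check:

```python
import gzip
import zlib


def is_readable_gzip(path, chunk_size=1024 * 1024):
    """Return True if the file at 'path' can be fully decompressed as gzip."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read the whole stream in chunks so truncation or corruption
            # anywhere in the file is detected.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers gzip.BadGzipFile (wrong magic bytes) and missing files,
        # EOFError covers truncated files, zlib.error covers corrupted data.
        return False
    return True
```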

Fix

Issue with protein IDs from UniParc during annotation (incorrect split on '|').
Issue in get_taxon_obs_name function.
Issues in tests.
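
To illustrate the UniParc split issue above, a minimal sketch of an ID extraction that handles both header styles. It assumes UniProtKB FASTA headers look like `sp|P12345|NAME_ORG ...` while UniParc headers start with a bare `UPI...` identifier without '|'. The function name is hypothetical and this is not the actual EsMeCaTa code:

```python
def extract_protein_id(header):
    """Extract a protein ID from a FASTA header line (assumed formats)."""
    tokens = header.lstrip(">").split()
    if not tokens:
        raise ValueError("Empty FASTA header.")
    identifier = tokens[0]
    fields = identifier.split("|")
    # UniProtKB style: database|accession|entry_name, keep the accession.
    if len(fields) >= 2:
        return fields[1]
    # UniParc style: no '|' in the identifier, keep it unchanged.
    return identifier
```

Always taking the second field after a split on '|' raises an IndexError on the UniParc style, which is the kind of failure the fallback branch avoids.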

Modify

Add database version in log.
Rename test_workflow.py to test_workflow_uniprot.py, to better reflect what is tested.
Update workflow figure.
Update readme.
Update article_data folder and the associated readme.

06 Nov 14:32

ArnaudBelcour

0.5.4

0748933

esmecata 0.5.4

Fix

Issue with proteomes from UniParc during clustering (incorrect split on '|').
Issue in test with updated taxonomic group.

31 Oct 14:42

ArnaudBelcour

0.5.3

4436951

esmecata 0.5.3

Fix

Handle an issue with requests.exceptions.ChunkedEncodingError.
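
One common way to handle such transient download errors is to retry the call. A minimal sketch, assuming a generic retry helper; the helper name and its parameters are hypothetical and this is not necessarily how EsMeCaTa handles the error:

```python
import time


def call_with_retries(func, retry_on=(ConnectionError,), attempts=3, wait_seconds=1.0):
    """Call func() and retry it when one of the 'retry_on' exceptions is raised."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the caller see the original error.
                raise
            # Wait a little longer after each failed attempt.
            time.sleep(wait_seconds * attempt)
```

With requests installed, the exception to retry on would be passed explicitly, for example `retry_on=(requests.exceptions.ChunkedEncodingError,)`, since it does not derive from the built-in ConnectionError.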

Modify

Remove unused header in output file of gseapy.

21 Oct 10:38

ArnaudBelcour

0.5.2

0d1c191

esmecata 0.5.2

Add

New way to search proteomes by using UniParc. Some proteomes are empty when downloaded directly from UniProt. A solution is to search for them in UniParc and retrieve the associated protein sequences.
New plot in report showing proteomes according to tax_rank.
Database version number when creating precomputed database.
The possibility to give a file containing manually selected groups of observation names to esmecata_gseapy gseapy_enrichr.
Tests for esmecata_gseapy gseapy_enrichr.

Fix

Issue in creating heatmap of proteomes (missing taxon rank) in report creation.
Issue when creating database: there was a possibility that a taxon without consensus proteomes and associated annotations was kept.

Modify

Update parameter description for SPARQL option to indicate the value to query SPARQL UniProt Endpoint.
Rename esmecata_gseapy gseapy_taxon to esmecata_gseapy gseapy_enrichr to reflect the changes in the command.
Modify how esmecata_gseapy gseapy_enrichr works by adding a grouping parameter that selects groups either according to taxon_rank or from a user-created file containing manually selected groups of observation names.
Update readme according to the different changes made in this release.

TODO

Investigate and solve memory leak when mapping UniParc IDs to UniProt with bioservices.
Add handling of UniParc IDs with SPARQL queries.

04 Oct 10:10

ArnaudBelcour

0.5.1

24b5d27

esmecata 0.5.1

Add

Metadata file for esmecata_gseapy gseapy_taxon.

Fix

Several issues in esmecata_gseapy gseapy_taxon.
Issues in tests due to UniProt updates.

Modify

Update metadata files for annotation and eggnog by adding missing dependencies.

02 Oct 10:03

ArnaudBelcour

0.5.0

d3ccee2

esmecata 0.5.0

WARNING: Changes in the structure of the Python package of EsMeCaTa.
If you have been importing the package in Python, you will need to modify your import.

Add

New command esmecata_report to create a report from the output folder of EsMeCaTa. The esmecata_report scripts create HTML, PDF and TSV reports from EsMeCaTa outputs (work of @alimatai and @PaulineGHG). This command has several subcommands:
- (1) create_report to create a report from the output folder of the esmecata workflow subcommand (only way to have the complete HTML report).
- (2) create_report_proteomes to create report files from output of esmecata proteomes subcommand.
- (3) create_report_clustering to create report files from output of esmecata clustering subcommand.
- (4) create_report_annotation to create report files from output of esmecata annotation subcommand.
New command esmecata_gseapy to create enrichment analysis of functions predicted by EsMeCaTa according to taxon rank.
New optional dependencies required for esmecata_report: datapane, plotly, kaleido, ontosunburst. As datapane is no longer maintained, an alternative based on panel is under development.
New optional dependencies required for esmecata_gseapy: gseapy and orsum.
New file indicating the EC numbers and GO Terms for the different observation names of the dataset (file function_table.tsv).
New subcommand esmecata precomputed. This subcommand uses a precomputed database to make predictions from the input file (using EsMeCaTa default parameters). It was added to avoid recomputing the same predictions at every run and to provide a fast way to make predictions with EsMeCaTa. The precomputed database must be downloaded before use. At the time of this release, the database is not yet available; these scripts are present for testing purposes.
Prototype for precomputed database creation: several scripts are added in esmecata/precomputed folder to create the input and the precomputed database.
Check that the proteome files are not completely empty, which could cause problems with mmseqs2.
Tests for precomputed database, report creation, database creation and eggnog annotation. Add mocks on several functions to perform the tests. Requires pytest-mock.
Add readme in test folder.
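
As an illustration of the empty proteome check listed above, a minimal sketch that counts FASTA records in a plain or gzip-compressed file. The function names are hypothetical and this is not the actual EsMeCaTa implementation:

```python
import gzip


def count_fasta_records(path):
    """Count sequence records (lines starting with '>') in a FASTA file."""
    # Assumption: compressed proteomes are recognised by a '.gz' suffix.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def is_empty_proteome(path):
    """Return True when the proteome file contains no sequence record."""
    return count_fasta_records(path) == 0
```

Files flagged by such a check can be skipped or searched again before being passed to mmseqs2.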

Fix

Issue in proteomes SPARQL query (missing PREFIX).

Modify

Modification of the structure of the EsMeCaTa package, now divided into 4 main folders: (1) esmecata/core for the scripts previously contained in the EsMeCaTa folder and used for the workflow, (2) esmecata/report to generate a report from the esmecata output folder, (3) esmecata/gseapy to perform enrichment analysis on the esmecata output, and (4) esmecata/precomputed to create the precomputed database (in development).
Change the name of intermediary files in clustering and annotation to avoid issues with ambiguous taxon names.
Modify test according to changes of packaging structure.
Modify the behaviour of annotation by eggnog-mapper. Now it merges protein sequences from clustering into bigger files (associated with superkingdom). This increases the performance of eggnog-mapper. Modification made with @megyl. Use --tax_scope with eggnog-mapper.
Update article_data folder.
Update CI tests of the GitHub workflow according to the new tests and the new dependencies.

Remove

Remove esmecata analysis subcommand as it was not used and not very useful.

Contributors

alimatai, PaulineGHG, and megyl
