FASTGenomics #540

theathorn · 2019-09-24T20:30:50Z

@mvonpapen commented on Tue Sep 24 2019

Thank you for submitting a portal to the HCA DCP Methods Registry!

To expedite your portal's addition to https://data.humancellatlas.org/analyze,
please provide the following package metadata. You can easily edit this information later by clicking "Improve this page" at the bottom of your portal's detail page (example).

Required:

Tool title: FASTGenomics
Contact name: Michael von Papen
Contact email: [email protected]
Who to attribute: CC LifeScience @ Comma Soft AG
Portal URL: www.fastgenomics.org
Short description: FASTGenomics is a platform to share scRNA-seq data and analyses. Users can either choose from best practices or create individual workflows for the exploration of gene expression data.

Optional:

Long description:

FASTGenomics - a platform to share single-cell RNA sequencing data and analyses using reproducible workflows

Recent technological advances enable genomics of individual cells, the building blocks of all living organisms. Single cell data characteristics differ from those of bulk data, which led to a plethora of new analytical strategies. However, solutions are only useful for experts and currently, there are no widely accepted gold standards for single cell data analysis. To meet the requirements of analytical flexibility, reproducibility, ease of use and data security, we developed FASTGenomics as a powerful, efficient, versatile, robust, safe and intuitive analytical ecosystem for single-cell transcriptomics [Scholz et al., 2018]. This development has been carried out at Comma Soft, Bonn, in collaboration with the Schultze lab at LIMES Institute in Bonn, Germany.

FASTGenomics is designed as a platform for single cell RNA-seq data open to the scientific community. A major feature is to provide the highest reproducibility and transparency for single cell data analysis to the whole community. The platform provides publicly available datasets and analyses for exploration and visualization of gene expression data. Using Docker containers provides full reproducibility and helps avoid "works only on my machine" problems. Register now at https://www.fastgenomics.org and have a look at our collection of data sets and example analyses.

FASTGenomics serves as a platform where you can share your data with the community and test novel algorithms on public data sets with known results. FASTGenomics already routinely scales to more than 300k cells per project, and prototype apps suggest that scaling to 1M cells is also possible [Scholz et al., 2018]. Moreover, its hybrid design also allows custom solutions such as FASTGenomics on-premises for clinical and pharmaceutical research facilities.

For more information, you can find us on Twitter, LinkedIn or come talk to us on Slack. To register for the platform visit us at www.fastgenomics.org.

Note: We will soon offer interactive analyses based on Jupyter notebooks. Stay tuned for the upcoming beta-test!

REFERENCES
Scholz et al. (2018) FASTGenomics: An analytical ecosystem for single-cell RNA sequencing data. BioRxiv, 272476.

Logo or screenshot:

matthewspeir · 2019-09-26T21:06:21Z

Hello, @mvonpapen.

Thank you for submitting an analysis portal to the DCP. I will review your portal against our registry standards and let you know if we have any questions.

matthewspeir · 2019-10-02T17:46:28Z

Hello, @mvonpapen.

Thank you again for submitting an analysis portal to the registry. I'm hoping you can answer a couple of questions about how your portal adheres to our standards:

1. For the "Use Containers and Modules" standard, it looks like you have Docker containers for your tools at https://hub.docker.com/u/fastgenomics. I noticed that you have GitHub repos for your Python client and R client for your portal, but I didn't see a Docker image for either of these on Docker Hub. Am I just not seeing them or are they not present? If they're not present, are there plans to add them?
2. For the "Register Upstream" standard, I can't seem to find your platform, clients, or the related utilities registered in bioconductor or bioconda. Are there plans to register all of your stuff in one of these platforms? (It's also possible they're already there and I just missed them.)

Thanks!

Matthew

mvonpapen · 2019-10-02T22:24:28Z

Hi Matthew,

1. The Python and R clients you listed are actually outdated and will be deposited soon. They were used to access our database from outside the platform. Right now, we are in a closed beta phase and will soon relaunch. For that, we set up two new clients, fgread-r <https://github.com/FASTGenomics/fgread-r> and fgread-py <https://github.com/FASTGenomics/fgread-py>, which are used to load data from our internal database within the platform (https://prod.fastgenomics.org). As such, the new clients are part of the Jupyter images that we provide for the analyses; see, e.g., https://hub.docker.com/r/fastgenomics/jupyter-scanpy.

2. In the future, we may again develop clients for external access to our database. These clients could then be added to bioconda and/or bioconductor. As our current clients only work within the platform, we did not think about adding them to these repositories. We have also moved away from providing analysis apps (which might be added to bioconda/bioconductor) to providing complete Jupyter notebooks for the analyses. The platform itself belongs to Comma Soft AG <https://www.comma-soft.com> in Bonn, Germany, and will not be open-sourced.

I hope this answers your questions. Please let me know if you need additional information.

Best regards,
Mitch

On Wed, Oct 2, 2019 at 19:46, Matt Speir < [email protected]> wrote:

Hello, @mvonpapen <https://github.com/mvonpapen>.

Thank you again for submitting an analysis portal to the registry. I'm hoping you can answer a couple of questions about how your portal adheres to our standards:

1. For the "Use Containers and Modules" standard, it looks like you have Docker containers for your tools at https://hub.docker.com/u/fastgenomics. I noticed that you have github repos for your python client <https://github.com/FASTGenomics/py_client> and r client <https://github.com/FASTGenomics/r_client> for your portal, but I didn't see a docker instance for either of these in Docker hub. Am I just not seeing them or are they not present? If they're not present, are there plans to add them?
2. For the "Register Upstream", I can't seem to find your platform, clients, or the related utilities registered in bioconductor or bioconda. Are there plans to register all of your stuff in one of these platforms? (It's also possible they're already there and I just missed them.)

Thanks!

Matthew

mvonpapen · 2019-10-15T16:28:56Z

Hi @matthewspeir,

Did you receive my response or do you need any additional information? Just to let you know, we will soon switch our platform to beta.fastgenomics.org. prod.fastgenomics.org will be closed at the end of the month.

Best, Mitch

mvonpapen · 2019-10-23T09:06:31Z

Dear @theathorn ,

A new version of FASTGenomics has been launched, so I am slightly updating the original description. The updated submission is as follows:

Tool title: FASTGenomics
Contact name: FASTGenomics Team
Contact email: [email protected]
Who to attribute: Comma Soft AG, Bonn
Portal URL: www.fastgenomics.org
Short description: FASTGenomics is a platform to explore, analyze and share scRNA-seq data with interactive Jupyter notebooks. Users can create individual workflows or choose from best practices notebooks in Python and R to explore gene expression data in various data formats.

Optional:

Long description:

FASTGenomics - a platform to analyze and share single-cell RNA sequencing data with reproducible Jupyter notebooks

Recent technological advances enable genomics of individual cells, the building blocks of all living organisms. Single cell data characteristics differ from those of bulk data, which led to a plethora of new analytical strategies. However, solutions are only useful for experts and currently, there are no widely accepted gold standards for single cell data analysis. To meet the requirements of analytical flexibility, reproducibility, ease of use and data security, we developed FASTGenomics as a powerful, efficient, versatile, robust, safe and intuitive analytical ecosystem for single-cell transcriptomics [Scholz et al., 2018]. This development has been carried out at Comma Soft, Bonn, in collaboration with the Schultze lab at LIMES Institute in Bonn, Germany.

FASTGenomics is designed as a platform for single cell RNA-seq data open to the scientific community. A major feature is to provide the highest reproducibility and transparency for single cell data analysis to the whole community. The platform provides publicly available datasets and analyses in the form of Jupyter notebooks for the exploration and visualization of gene expression data. Using Docker containers provides full reproducibility and helps avoid "works only on my machine" problems. Register now at https://www.fastgenomics.org and have a look at our collection of data sets and example analyses.

FASTGenomics serves as a platform where you can share your data with the community and test novel algorithms on public data sets with known results. FASTGenomics already routinely scales to more than 300k cells per project, and prototype apps suggest that scaling to 1M cells is also possible [Scholz et al., 2018]. Moreover, its hybrid design also allows custom solutions such as FASTGenomics on-premises for clinical and pharmaceutical research facilities.

For more information, you can find us on Twitter, LinkedIn or come talk to us on Slack. To register for the platform visit us at www.fastgenomics.org.

REFERENCES
Scholz et al. (2018) FASTGenomics: An analytical ecosystem for single-cell RNA sequencing data. BioRxiv, 272476.

Logo or screenshot:

matthewspeir · 2019-10-28T21:47:35Z

Hello, Michael.

I apologize for not responding sooner; I just returned from 3 weeks of traveling last Friday.

Based on your response, it doesn't sound like your tool meets the registry standards (specifically 'Be Free and Open Source' and 'Register Upstream'): https://data.humancellatlas.org/contribute/analysis-tools-registry/registry-standards. Can you please review all of the standards and describe how your tool meets the six 'Required' ones listed on that page?

mvonpapen · 2019-11-07T15:16:10Z

Hi @matthewspeir ,

I hope you had a good trip :-)

FASTGenomics is indeed a little different from the other analysis tools on the HCA site in the sense that we are an online platform and not a tool or (yet) a tertiary portal. Also, our "method packages" are Jupyter notebooks and not real packages or modules. However, all our analyses and datasets are open and accessible, so I think that we meet the registry standards.

So let's go through the registry standards one by one:

Be Free and Open Source
Our readers (fgread_r and fgread_py) as well as our "method packages" (the Jupyter notebooks, analysis_*) are freely available on our GitHub page. Licenses are contained in the repositories. The images we use are freely available on our Docker Hub page.
Use Containers and Modules
The Jupyter notebooks are realized within docker containers that are freely available on our dockerhub page. What our platform does is the following: it loads the image from dockerhub and copies the data from our database together with the Jupyter notebooks from github into the container.
Register Upstream
Our readers are registered on PyPI (here) and can be installed via pip. As our "method packages" are plain Jupyter notebooks, we think it would not make sense to register them in an upstream registry. The images for FASTGenomics are publicly available in the Docker registry.
Support Standard Data Formats
Our readers currently support the most common standard data formats: rds, hdf5, h5ad, loom, csv, tsv, and soon mtx.
Document Installation and Usage
We offer an instructive git repository, analysis-test-environment, which can be used to run all analyses on a local machine. It comes with documentation and instructions on how to set it up, how to load the data and analyses, and how to run everything with Docker.
Provide Testing Data
Test data in all supported formats can be found in the git repository test-data. The test data include 11 versions of all supported data formats including the 3k PBMC dataset from 10x.
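As an illustration of the format list under "Support Standard Data Formats" above, a reader could choose how to load a file from its suffix. The following is a minimal sketch only: it does not reproduce the fgread-py or fgread-r API, and the names `SUPPORTED_FORMATS` and `detect_format` are hypothetical. It covers the formats named as currently supported and treats mtx as not yet available.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to the format names listed above.
# Illustration only; this is not taken from fgread-py or fgread-r.
SUPPORTED_FORMATS = {
    ".rds": "rds",
    ".hdf5": "hdf5",
    ".h5ad": "h5ad",
    ".loom": "loom",
    ".csv": "csv",
    ".tsv": "tsv",
}


def detect_format(filename: str) -> str:
    """Return the format name for a file, based on its suffix alone."""
    suffix = Path(filename).suffix.lower()
    try:
        return SUPPORTED_FORMATS[suffix]
    except KeyError:
        # Anything not in the table (including mtx, listed as "soon") is rejected.
        raise ValueError(f"unsupported data format: {filename!r}") from None
```

A real reader would then call a format-specific loader; the sketch stops at format detection because the loaders themselves are not described in this thread.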

Please let me know if I'm missing something or if you need further clarification.

matthewspeir · 2019-11-15T23:02:48Z

Hi, @mvonpapen.

Thanks for providing those details. Just to be sure I'm understanding correctly, you're providing open-source tools to read data into and out of your closed platform? At least that's what I gather from looking at the Github readmes for the two fgreader tools. I'm not sure that this fits with the spirit of the DCP registry standards.

mvonpapen · 2019-11-18T09:26:23Z

Dear @matthewspeir ,

Yes, you are correct. We provide open-source tools to read data into and out of our closed platform, and additionally to analyze this data. However, using the provided source code from git, people can actually do this without using our free(!) platform.

As I said earlier, we are very interested in getting listed in the Analysis tools registry and we have already taken several steps to meet your standards. In contrast, we noticed that the Analysis tools registry even lists proprietary software such as the Bioturing Browser, which is not only closed-source but also not free. Could you please explain what exactly you mean by us not fitting the spirit of the DCP registry standards?

Best,
Mitch

matthewspeir · 2019-11-18T21:59:26Z

Hi, @mvonpapen.

Admittedly, I did not write the standards and have only been involved in reviewing a few of the most recent submissions to the Analysis registry. I have been applying the standards as I understand them (and with some guidance from those who wrote them) to these submissions. Since I was not involved in adding the Bioturing Browser to the registry, I can't comment on what was involved there or how the decision was made.

I believe Jean Chang (@jlchang) and Tim Tickle (@TimothyTickle) were both involved in the creation of these standards. Perhaps they could comment on how your portal meets the DCP standards.

mvonpapen · 2019-11-19T07:36:20Z

Hi @matthewspeir ,

Thank you for the reference to @TimothyTickle and @jlchang. We are eager to meet your standards, and some advice on how to do that would help a lot at the moment.

Best,
Mitch

mvonpapen · 2020-03-05T08:43:06Z

Dear @theathorn ,

We are still willing to provide all necessary information to be listed in the analysis portal registry. As I outlined above, all relevant information is publicly available in our github. In addition to the analysis code we also provide test-data as well as a test environment that is easy to set up.

I noticed that there are two commercial providers (BioTuring and DNAStack) in the analysis registry at the moment that offer services similar to ours. Therefore, I hope that there is a way that FASTGenomics can also be included in the registry.

I would be very happy to hear from you.

Best regards,
Mitch

theathorn · 2020-03-05T22:36:48Z

Hello @mvonpapen,

The DCP is currently in maintenance mode and not accepting new portal registrations, but we expect to resume the registration process in the coming months. I will update this ticket when that happens.

Trevor

NoopDog · 2021-03-11T20:47:34Z

Hi @mvonpapen The HCA data portal has once again begun accepting submissions to the Methods Registry. Can you take a look at this submission and make any required updates? We will then restart the approval process.

mvonpapen · 2021-03-15T10:13:21Z

Hi @NoopDog ,
I'm happy to hear that you re-opened for submissions. I've updated our description and look forward to proceeding with the process.

Required:

Tool title: FASTGenomics
Contact name: Team FASTGenomics
Contact email: [email protected]
Who to attribute: Comma Soft AG
Portal URL: https://beta.fastgenomics.org
Short description: FASTGenomics is a cloud-based collaboration platform providing data management and reproducible analyses for scRNA-seq and omics data. FASTGenomics allows you to collaborate in groups, share data, create individual analyses and publish interactive results.

Optional: Long description

FASTGenomics - a cloud-based collaboration platform for data management and reproducible analyses of scRNA-seq and omics data

Collaboration and data sharing are key in biomedical research. They involve experts from several fields of study such as Molecular Biology, Immunology, Data Science and Computer Science as well as storage and re-use of data in a reproducible environment.
Our Life and Data Science experts at Comma Soft have therefore developed the open platform FASTGenomics, which provides a common infrastructure, smart data management, is easy to use and allows direct access to data and results. It thus acts as a single point of truth and brings together all collaborators of your project.

The aim of FASTGenomics is to provide the highest reproducibility and transparency for single cell and omics data analysis to the whole community. The platform offers publicly available datasets, reproducible analyses and interactive projects for exploration and visualization of gene expression data. Docker containers provide full reproducibility and help avoid "works only on my machine" problems.

FASTGenomics is an open-access platform and is used as the central data and analytics platform in various European research projects such as the Human Cell Atlas project discovAIR and the EU H2020 project SYSCID.

We are an experienced partner with a tight network of leading experts from Bioinformatics, Immunology and Pharma. Also, we are an active member of several academic networks such as the Human Cell Atlas Lung Biological Network, Sparse2Big, and Single Cell Omics Germany. Together, we can help you get started with your research project, assist in data management, and leverage the power of state-of-the-art AI-based techniques. Our hybrid design also allows custom solutions such as FASTGenomics on-premises for clinical and pharmaceutical research facilities.

Where to find us:

Twitter: @FASTGenomics
Youtube: FASTGenomics channel
Slack: Slack support channel
Github: https://github.com/FASTGenomics
Docker: https://hub.docker.com/u/fastgenomics

Logo or screenshot:

mvonpapen · 2021-04-19T06:39:06Z

Hi @NoopDog ,
Do you have any updates regarding the submission process?

NoopDog · 2021-04-19T16:37:14Z

Hi, @mvonpapen we hope to have the latest round of applicants processed this week. Will be back to you in the next couple of days with an update. Thank you for your patience!
Cheers,
D

NoopDog · 2021-04-20T22:14:22Z

Hi, @mvonpapen I have reached out internally to get clarification on the policy for listing closed source commercial portals. I will update you when we hear back. It may take a week or so for my question to be processed. Thank you for your application to the registry and your patience!

Cheers,
Dave Rogers

NoopDog · 2021-04-23T05:22:38Z

Hi, @mvonpapen I am happy to report that FASTGenomics has been approved for posting. Your page will be on our staging server shortly. I will let you know when it's up on our staging server so you can review it.

Cheers,
Dave

NoopDog · 2021-04-23T06:01:28Z

Hi @mvonpapen your portal page is up here:

https://dev.singlecell.gi.ucsc.edu/analyze/portals/fastgenomics

Can you review and let us know if you would like any updates before we push this live?

Cheers,
Dave Rogers

mvonpapen · 2021-04-23T13:17:19Z

Hi @NoopDog ,

That is great news! Thank you.

I do have some updates/issues:

The title figure has changed a little. Please use the following title figure:
The team name has changed from "CC LifeScience @ Comma Soft AG" to simply "Comma Soft AG".
Please change the URL for the "View" button to "https://fastgenomics.org/login".
There are currently two titles, one above and one below the first figure. The title above the first figure is outdated (from the first submission). The new title should be: "FASTGenomics - a cloud-based collaboration platform for data management and reproducible analyses of scRNA-seq and omics data"
Please add a link to the platform under Where to find us: "Homepage: https://fastgenomics.org/login"
Contact: please change "Michael von Papen ([email protected])" to "Team FASTGenomics ([email protected])"
The FASTGenomics figure at the very bottom (the one that just spells FASTGENOMICS) can be removed.

Thanks again, we are thrilled to be part of the HCA registry!

NoopDog · 2021-04-23T17:59:34Z

@mvonpapen we have made the requested changes and your page is now live.
https://data.humancellatlas.org/analyze/portals/fastgenomics

If you would like any other updates please feel free to submit a PR. You can get started with a PR by selecting the "Improve this Page" link at the bottom of the page and editing the markdown source directly in GitHub.

Cheers,
D

github-actions bot added the canary Done by the Clever Canary label Sep 24, 2019

theathorn assigned matthewspeir Sep 24, 2019

theathorn added this to the Q4 2019 Milestone 1 milestone Sep 24, 2019

theathorn modified the milestones: Q4 2019 Milestone 1, Q4 2019 Milestone 2 Oct 23, 2019

theathorn modified the milestones: Q4 2019 Milestone 2, Q4 2019 Milestone 3 Nov 20, 2019

theathorn removed this from the Q4 2019 Milestone 3 milestone Jan 28, 2020

theathorn unassigned matthewspeir Jan 30, 2020

MillenniumFalconMechanic assigned NoopDog Feb 24, 2021

NoopDog assigned frano-m Apr 23, 2021

frano-m pushed a commit that referenced this issue Apr 23, 2021

FASTGenomics. OmniBrowser. First pass. #540. #913.

085592b

frano-m pushed a commit that referenced this issue Apr 23, 2021

FASTGenomics. OmniBrowser. First pass. #540. #913.

074552c

NoopDog pushed a commit that referenced this issue Apr 23, 2021

FASTGenomics. OmniBrowser. First pass. #540. #913.

3d21e20

NoopDog closed this as completed Apr 23, 2021
