FASTGenomics #540

Closed
theathorn opened this issue Sep 24, 2019 · 22 comments

@theathorn

@mvonpapen commented on Tue Sep 24 2019

Thank you for submitting a portal to the HCA DCP Methods Registry!

To expedite your portal's addition to https://data.humancellatlas.org/analyze,
please provide the following package metadata. You can easily edit this information later by clicking "Improve this page" at the bottom of your portal's detail page (example).

Required:

  • Tool title: FASTGenomics
  • Contact name: Michael von Papen
  • Contact email: [email protected]
  • Who to attribute: CC LifeScience @ Comma Soft AG
  • Portal URL: www.fastgenomics.org
  • Short description: FASTGenomics is a platform to share scRNA-seq data and analyses. Users can either choose from best practices or create individual workflows for the exploration of gene expression data.

Optional:

  • Long description:

FASTGenomics - a platform to share single-cell RNA sequencing data and analyses using reproducible workflows

image

Recent technological advances enable genomics of individual cells, the building blocks of all living organisms. Single-cell data characteristics differ from those of bulk data, which has led to a plethora of new analytical strategies. However, existing solutions are only useful for experts, and there are currently no widely accepted gold standards for single-cell data analysis. To meet the requirements of analytical flexibility, reproducibility, ease of use and data security, we developed FASTGenomics as a powerful, efficient, versatile, robust, safe and intuitive analytical ecosystem for single-cell transcriptomics [Scholz et al., 2018]. This development has been carried out at Comma Soft, Bonn, in collaboration with the Schultze lab at LIMES Institute in Bonn, Germany.

image

FASTGenomics is designed as a platform for single-cell RNA-seq data open to the scientific community. A major feature is to provide the highest reproducibility and transparency for single-cell data analysis to the whole community. The platform provides publicly available datasets and analyses for exploration and visualization of gene expression data. Using Docker containers provides full reproducibility and helps avoid "works only on my machine" problems. Register now at https://www.fastgenomics.org and have a look at our collection of data sets and example analyses.

FASTGenomics serves as a platform where you can share your data with the community and test novel algorithms on public data sets with known results. FASTGenomics already routinely scales to more than 300k cells per project, and prototype apps suggest that scaling to 1M cells is also possible [Scholz et al., 2018]. Moreover, its hybrid design also allows custom solutions such as FASTGenomics on premise for clinical and pharmaceutical research facilities.

For more information, you can find us on Twitter or LinkedIn, or come talk to us on Slack. To register for the platform, visit us at www.fastgenomics.org.

Note: We will soon offer interactive analyses based on Jupyter notebooks. Stay tuned for the upcoming beta-test!

REFERENCES 
Scholz et al. (2018) FASTGenomics: An analytical ecosystem for single-cell RNA sequencing data. BioRxiv, 272476.

  • Logo or screenshot: FG_Logo_Schriftmarke_CMYK
@github-actions github-actions bot added the canary Done by the Clever Canary label Sep 24, 2019
@theathorn theathorn added this to the Q4 2019 Milestone 1 milestone Sep 24, 2019
@matthewspeir
Contributor

Hello, @mvonpapen.

Thank you for submitting an analysis portal to the DCP. I will review your portal against our registry standards and let you know if we have any questions.

@matthewspeir
Contributor

Hello, @mvonpapen.

Thank you again for submitting an analysis portal to the registry. I'm hoping you can answer a couple of questions about how your portal adheres to our standards:

  1. For the "Use Containers and Modules" standard, it looks like you have Docker containers for your tools at https://hub.docker.com/u/fastgenomics. I noticed that you have github repos for your python client and r client for your portal, but I didn't see a docker instance for either of these in Docker hub. Am I just not seeing them or are they not present? If they're not present, are there plans to add them?

  2. For the "Register Upstream" standard, I can't seem to find your platform, clients, or the related utilities registered in Bioconductor or Bioconda. Are there plans to register all of your stuff in one of these platforms? (It's also possible they're already there and I just missed them.)

Thanks!

Matthew

@mvonpapen

mvonpapen commented Oct 2, 2019 via email

@mvonpapen

Hi @matthewspeir,

Did you receive my response or do you need any additional information? Just to let you know, we will soon switch our platform to beta.fastgenomics.org. prod.fastgenomics.org will be closed at the end of the month.

Best, Mitch

@theathorn theathorn modified the milestones: Q4 2019 Milestone 1, Q4 2019 Milestone 2 Oct 23, 2019
@mvonpapen

mvonpapen commented Oct 23, 2019

Dear @theathorn ,

A new version of FASTGenomics has been launched, so I am slightly updating the original description. The updated submission is as follows:

  • Tool title: FASTGenomics
  • Contact name: FASTGenomics Team
  • Contact email: [email protected]
  • Who to attribute: Comma Soft AG, Bonn
  • Portal URL: www.fastgenomics.org
  • Short description: FASTGenomics is a platform to explore, analyze and share scRNA-seq data with interactive Jupyter notebooks. Users can create individual workflows or choose from best practices notebooks in Python and R to explore gene expression data in various data formats.

Optional:

  • Long description:

FASTGenomics - a platform to analyze and share single-cell RNA sequencing data with reproducible Jupyter notebooks

webpage_alt

Recent technological advances enable genomics of individual cells, the building blocks of all living organisms. Single-cell data characteristics differ from those of bulk data, which has led to a plethora of new analytical strategies. However, existing solutions are only useful for experts, and there are currently no widely accepted gold standards for single-cell data analysis. To meet the requirements of analytical flexibility, reproducibility, ease of use and data security, we developed FASTGenomics as a powerful, efficient, versatile, robust, safe and intuitive analytical ecosystem for single-cell transcriptomics [Scholz et al., 2018]. This development has been carried out at Comma Soft, Bonn, in collaboration with the Schultze lab at LIMES Institute in Bonn, Germany.

fg_jupyter

FASTGenomics is designed as a platform for single-cell RNA-seq data open to the scientific community. A major feature is to provide the highest reproducibility and transparency for single-cell data analysis to the whole community. The platform provides publicly available datasets and analyses in the form of Jupyter notebooks for the exploration and visualization of gene expression data. Using Docker containers provides full reproducibility and helps avoid "works only on my machine" problems. Register now at https://www.fastgenomics.org and have a look at our collection of data sets and example analyses.

FASTGenomics serves as a platform where you can share your data with the community and test novel algorithms on public data sets with known results. FASTGenomics already routinely scales to more than 300k cells per project, and prototype apps suggest that scaling to 1M cells is also possible [Scholz et al., 2018]. Moreover, its hybrid design also allows custom solutions such as FASTGenomics on premise for clinical and pharmaceutical research facilities.

For more information, you can find us on Twitter or LinkedIn, or come talk to us on Slack. To register for the platform, visit us at www.fastgenomics.org.

REFERENCES 
Scholz et al. (2018) FASTGenomics: An analytical ecosystem for single-cell RNA sequencing data. BioRxiv, 272476.

  • Logo or screenshot: FG_Logo_Schriftmarke_CMYK

@matthewspeir
Contributor

Hello, Michael.

I apologize for not responding sooner; I just returned from 3 weeks of traveling last Friday.

Based on your response, it doesn't sound like your tool meets the registry standards (specifically 'Be Free and Open Source' and 'Register Upstream'): https://data.humancellatlas.org/contribute/analysis-tools-registry/registry-standards. Can you please review all of the standards and describe how your tool meets the six 'Required' ones listed on that page?

@mvonpapen

Hi @matthewspeir ,

I hope you had a good trip :-)

FASTGenomics is indeed a little different from the other analysis tools on the HCA site in the sense that we are an online platform and not a tool or (yet) a tertiary portal. Also, our "method packages" are Jupyter notebooks and not real packages or modules. However, all our analyses and datasets are open and accessible, so I think that we meet the registry standards.

So let's go through the registry standards one by one:

  • Be Free and Open Source
    Our readers (fgread_r and fgread_py) as well as our "method packages" (the Jupyter notebooks, analysis_*) are freely available on our GitHub page. Licenses are contained in the repositories. The images we use are freely available on our Docker Hub page.

  • Use Containers and Modules
    The Jupyter notebooks run inside Docker containers whose images are freely available on our Docker Hub page. Our platform does the following: it loads the image from Docker Hub and copies the data from our database, together with the Jupyter notebooks from GitHub, into the container.

  • Register Upstream
    Our readers are registered on PyPI (here) and can be installed via pip. As our "method packages" are plain Jupyter notebooks, we think it would not make sense to register them in an upstream registry. The images for FASTGenomics are publicly available in the Docker registry.

  • Support Standard Data Formats
    Our readers currently support the most common standard data formats: rds, hdf5, h5ad, loom, csv, tsv, and soon mtx.

  • Document Installation and Usage
    We offer an instructive git repository, analysis-test-environment, which can be used to run all analyses on a local machine. It comes with documentation and instructions on how to set it up, how to load the data and analyses, and how to run everything with Docker.

  • Provide Testing Data
    Test data in all supported formats can be found in the git repository test-data. The test data include 11 versions of all supported data formats including the 3k PBMC dataset from 10x.
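
As an illustration of the "Support Standard Data Formats" item, the following is a minimal sketch of how a reader might decide whether a file is in one of the formats listed above. It is not the fgread implementation: the function name `detect_format` and the two sets are hypothetical, and only the extension names come from the comment itself (mtx is listed there as "soon", so it is treated as planned rather than supported).

```python
from pathlib import Path

# Extensions copied from the list in the comment above.
SUPPORTED = {"rds", "hdf5", "h5ad", "loom", "csv", "tsv"}
# Announced as coming "soon" in the comment, so not treated as supported.
PLANNED = {"mtx"}


def detect_format(filename: str) -> str:
    """Return the lowercase extension if it names a supported format.

    Hypothetical helper for illustration only; it inspects the file
    name and never opens the file.
    """
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in SUPPORTED:
        return ext
    if ext in PLANNED:
        raise NotImplementedError(f"{ext} support is announced but not available yet")
    raise ValueError(f"unsupported format: {filename!r}")
```

A call such as `detect_format("pbmc3k.h5ad")` returns `"h5ad"`, while a file ending in `.mtx` raises `NotImplementedError` and any other extension raises `ValueError`.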

Please let me know if I'm missing something or if you need further clarification.
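
The container workflow described under "Use Containers and Modules" (load an image, then copy data and notebooks into the container) could look roughly like the sketch below. This is an assumption about the general shape of such a workflow, not the platform's actual code: the function names, the image name `example/notebook-image:latest`, and the target paths `/data` and `/analysis` are all hypothetical. The sketch only builds standard `docker` CLI command lines (`pull`, `create`, `cp`, `start`) and does not execute them.

```python
import shlex


def container_setup_commands(image, container, data_dir, notebook_dir):
    """Build docker CLI calls for the steps described in the comment:
    pull the image, create a container, copy data and notebooks in,
    then start it. Target paths inside the container are hypothetical.
    """
    return [
        ["docker", "pull", image],
        ["docker", "create", "--name", container, image],
        ["docker", "cp", data_dir, f"{container}:/data"],
        ["docker", "cp", notebook_dir, f"{container}:/analysis"],
        ["docker", "start", container],
    ]


def as_shell(commands):
    """Render each command list as a shell-quoted string for inspection."""
    return [shlex.join(cmd) for cmd in commands]


# Hypothetical example values; nothing is executed here.
commands = container_setup_commands(
    image="example/notebook-image:latest",
    container="analysis-run",
    data_dir="./data",
    notebook_dir="./notebooks",
)
```

Printing `as_shell(commands)` shows the five commands in order; passing each list to `subprocess.run` would execute them on a machine with Docker installed.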

@matthewspeir
Contributor

Hi, @mvonpapen.

Thanks for providing those details. Just to be sure I'm understanding correctly, you're providing open-source tools to read data into and out of your closed platform? At least that's what I gather from looking at the Github readmes for the two fgreader tools. I'm not sure that this fits with the spirit of the DCP registry standards.

@mvonpapen

Dear @matthewspeir ,

Yes, you are correct. We provide open-source tools to read data into and out of our closed platform, and additionally to analyze this data. However, using the provided source code from git, people can actually do this without using our free(!) platform.

As I said earlier, we are very interested in getting listed in the Analysis tools registry and we have already taken several steps to meet your standards. In contrast, we noticed that the Analysis tools registry even lists proprietary software such as the Bioturing Browser, which is not only closed-source but also not free. Could you please explain what exactly you mean by us not fitting the spirit of the DCP registry standards?

Best,
Mitch

@matthewspeir
Contributor

Hi, @mvonpapen.

Admittedly, I did not write the standards and have only been involved in reviewing a few of the most recent submissions to the Analysis registry. I have been applying the standards as I understand them (and with some guidance from those who wrote them) to these submissions. Since I was not involved in adding the Bioturing Browser to the registry, I can't comment on what was involved there or how the decision was made.

I believe Jean Chang (@jlchang) and Tim Tickle (@TimothyTickle) were both involved in the creation of these standards. Perhaps they could comment on how your portal meets the DCP standards.

@mvonpapen

Hi @matthewspeir ,

Thank you for the reference to @TimothyTickle and @jlchang. We are eager to meet your standards, and some advice on how to do that would help a lot at the moment.

Best,
Mitch

@theathorn theathorn modified the milestones: Q4 2019 Milestone 2, Q4 2019 Milestone 3 Nov 20, 2019
@theathorn theathorn removed this from the Q4 2019 Milestone 3 milestone Jan 28, 2020
@mvonpapen

Dear @theathorn ,

We are still willing to provide all necessary information to be listed in the analysis portal registry. As I outlined above, all relevant information is publicly available in our github. In addition to the analysis code we also provide test-data as well as a test environment that is easy to set up.

I noticed that there are two commercial providers (BioTuring and DNAStack) in the analysis registry at the moment that offer services similar to ours. Therefore, I hope that there is a way that FASTGenomics can also be included in the registry.

I would be very happy to hear from you.

Best regards,
Mitch

@theathorn
Author

Hello @mvonpapen,

The DCP is currently in maintenance mode and not accepting new portal registrations, but we expect to resume the registration process in the coming months. I will update this ticket when that happens.

Trevor

@NoopDog
Collaborator

NoopDog commented Mar 11, 2021

Hi @mvonpapen The HCA data portal has once again begun accepting submissions to the Methods Registry. Can you take a look at this submission and make any required updates? We will then restart the approval process.

@mvonpapen

Hi @NoopDog ,
I'm happy to hear that you have re-opened for submissions. I've updated our description and look forward to proceeding with the process.

Required:

Tool title: FASTGenomics
Contact name: Team FASTGenomics
Contact email: [email protected]
Who to attribute: Comma Soft AG
Portal URL: https://beta.fastgenomics.org
Short description: FASTGenomics is a cloud-based collaboration platform providing data management and reproducible analyses for scRNA-seq and omics data. FASTGenomics allows you to collaborate in groups, share data, create individual analyses and publish interactive results.

Optional: Long description

FASTGenomics - a cloud-based collaboration platform for data management and reproducible analyses of scRNA-seq and omics data

Collaboration and data sharing are key in biomedical research. They involve experts from several fields of study such as Molecular Biology, Immunology, Data Science and Computer Science, as well as storage and re-use of data in a reproducible environment.
Our Life and Data Science experts at Comma Soft have therefore developed the open platform FASTGenomics, which provides a common infrastructure, smart data management, is easy to use and allows direct access to data and results. It thus acts as a single point of truth and brings together all collaborators of your project.

The aim of FASTGenomics is to provide the highest reproducibility and transparency for single-cell and omics data analysis to the whole community. The platform offers publicly available datasets, reproducible analyses and interactive projects for exploration and visualization of gene expression data. Docker containers provide full reproducibility and help avoid "works only on my machine" problems.

FASTGenomics is an open-access platform and is used as the central data and analytics platform in various European research projects such as the Human Cell Atlas project discovAIR and the EU H2020 project SYSCID.

We are an experienced partner with a tight network of leading experts from Bioinformatics, Immunology and Pharma. Also, we are an active member of several academic networks such as the Human Cell Atlas Lung Biological Network, Sparse2Big, and Single Cell Omics Germany. Together, we can help you get started with your research project, assist in data management, and leverage the power of state-of-the-art AI-based techniques. Our hybrid design also allows custom solutions such as FASTGenomics on-premises for clinical and pharmaceutical research facilities.

Where to find us:

Twitter: @FASTGenomics
Youtube: FASTGenomics channel
Slack: Slack support channel
Github: https://github.com/FASTGenomics
Docker: https://hub.docker.com/u/fastgenomics

Logo or screenshot:

FASTGenomics_oneplace

@mvonpapen

Hi @NoopDog ,
Do you have any updates regarding the submission process?

@NoopDog
Collaborator

NoopDog commented Apr 19, 2021

Hi, @mvonpapen we hope to have the latest round of applicants processed this week. Will be back to you in the next couple of days with an update. Thank you for your patience!
Cheers,
D

@NoopDog
Collaborator

NoopDog commented Apr 20, 2021

Hi, @mvonpapen I have reached out internally to get clarification on the policy for listing closed source commercial portals. I will update you when we hear back. It may take a week or so for my question to be processed. Thank you for your application to the registry and your patience! Cheers Dave Rogers

@NoopDog
Collaborator

NoopDog commented Apr 23, 2021

Hi, @mvonpapen I am happy to report that FASTGenomics has been approved for posting. Your page will be on our staging server shortly. I will let you know when it's up on our staging server so you can review it.

Cheers,
Dave

@NoopDog
Collaborator

NoopDog commented Apr 23, 2021

Hi @mvonpapen your portal page is up here:

https://dev.singlecell.gi.ucsc.edu/analyze/portals/fastgenomics

Can you review and let us know if you would like any updates before we push this live?

Cheers,
Dave Rogers

@mvonpapen

Hi @NoopDog ,

That is great news! Thank you.

I do have some updates/issues:

  • the title figure has changed a little. please use the following title figure:
    image
  • The team name has changed from "CC LifeScience @ Comma Soft AG" to simply "Comma Soft AG"
  • please change the URL for the "View" button to "https://fastgenomics.org/login"
  • There are currently two titles, one above and one below the first figure. The title above the first figure is outdated (from the first submission). The new title should be: "FASTGenomics - a cloud-based collaboration platform for data management and reproducible analyses of scRNA-seq and omics data"
  • Please add a link to the platform under Where to find us: "Homepage: https://fastgenomics.org/login"
  • Contact: please change "Michael von Papen ([email protected])" to "Team FASTGenomics ([email protected])"
  • the FASTGenomics figure at the very bottom (the one that just spells FASTGENOMICS) can be removed

Thanks again, we are thrilled to be part of the HCA registry!

@NoopDog
Collaborator

NoopDog commented Apr 23, 2021

@mvonpapen we have made the requested changes and your page is now live.
https://data.humancellatlas.org/analyze/portals/fastgenomics

If you would like any other updates please feel free to submit a PR. You can get started with a PR by selecting the "Improve this Page" link at the bottom of the page and editing the markdown source directly in GitHub.

Cheers,
D

@NoopDog NoopDog closed this as completed Apr 23, 2021