Using s3 bucket for resources #2
@kopardev I was also thinking we should create a custom set of references for the CI workflow. Here is what I am thinking:
Do you have some time to look into this more?
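Just to illustrate one possible shape for such a CI reference set (the chromosome choice, paths, and tools below are my assumptions, not the actual proposal): subset the genome and annotation to a single small chromosome so CI runs in minutes instead of hours.

```python
# Hypothetical sketch of a tiny CI reference set: restrict hg38 and its
# GTF to chr21. All paths and the chromosome choice are illustrative.
rule ci_reference:
    input:
        fa="resources/hg38/hg38.fa",
        gtf="resources/hg38/gencode.v30.annotation.gtf"
    output:
        fa="ci_resources/hg38_chr21.fa",
        gtf="ci_resources/gencode.v30.chr21.gtf"
    shell:
        """
        # extract just chr21 from the indexed genome
        samtools faidx {input.fa} chr21 > {output.fa}
        # keep only chr21 annotation records
        awk '$1 == "chr21"' {input.gtf} > {output.gtf}
        """
```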
This is a great idea for creating a small dataset for workflow CI. As this is a completely different issue, I am moving it as such.
@kopardev I just finished creating a new Docker image for kraken2+krona and re-writing/testing the new rule. I was also able to get it to integrate with the latest version of MultiQC. I will look into integrating the s3 resources tomorrow. I was reading through Snakemake's documentation, and it looks pretty straightforward.
I was reading the Snakemake documentation on s3, and it appears that you need to log in to the s3 bucket with AWS credentials. I don't know the best way to authenticate from a pipeline (maybe a service account?). Another option is to convert the s3 bucket into a static website; for example, the fastqc adapters are now available on the following URL
Now we can possibly use this to access the files/objects in the s3 bucket in read-only mode via HTTP. What do you think?
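To make the two options concrete, here is a minimal sketch of both access modes using Snakemake's built-in remote providers (available in Snakemake 7 and earlier). The bucket name and object keys are placeholders, not our actual bucket:

```python
from snakemake.remote.HTTP import RemoteProvider as HTTPRemoteProvider
from snakemake.remote.S3 import RemoteProvider as S3RemoteProvider

# Option 1: authenticated s3 access -- requires AWS credentials to be
# available to the pipeline (e.g. a service account / ~/.aws/credentials).
S3 = S3RemoteProvider()  # boto3 picks up credentials from the environment

# Option 2: read-only HTTP access via the bucket's static-website
# endpoint -- no credentials needed.
HTTP = HTTPRemoteProvider()

rule adapters_via_s3:
    input:
        S3.remote("my-bucket/resources/fastqc.adapters")  # placeholder key
    output:
        "resources/fastqc.adapters"
    shell:
        "cp {input} {output}"

rule adapters_via_http:
    input:
        HTTP.remote("my-bucket.s3.amazonaws.com/resources/fastqc.adapters",
                    keep_local=True)  # placeholder URL
    output:
        "resources/fastqc.adapters.http"
    shell:
        "cp {input} {output}"
```

The HTTP route avoids the credentials question entirely, at the cost of making the bucket publicly readable.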
Regarding kraken2: since the visualization is now completely handled by the newer version of MultiQC, do you still need krona?
Can the resources be hosted in s3 buckets? E.g., for hg38 and gencode version 38, can we use:
I have already uploaded all resources for hg38 (gencode release 30) except the STAR indices. We only need the noGTF version of the STAR index if we are providing the GTF on the fly, and it will be independent of the release version. All files on the s3 bucket are gzipped, and folders are tar.gz (e.g., rsemref.tar.gz).
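As a hedged sketch of how a tarred folder like rsemref.tar.gz could be pulled and unpacked at run time from the bucket's static-website endpoint (the URL below is a placeholder, not the actual bucket):

```python
# Fetch and unpack a tarred reference folder over HTTP.
# The bucket URL is a placeholder; rsemref.tar.gz is the example above.
rule get_rsem_ref:
    output:
        directory("resources/hg38/rsemref")
    params:
        url="https://my-bucket.s3.amazonaws.com/hg38/rsemref.tar.gz"  # placeholder
    shell:
        """
        mkdir -p {output}
        # stream the archive and extract it directly into the output dir
        wget -qO- {params.url} | tar -xz -C {output} --strip-components=1
        """
```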