Running flair collapse with longshot input #410

Open
laulambr opened this issue Jan 22, 2025 · 3 comments

@laulambr

Copy and paste the exact command you tried to run
I am trying to run the flair collapse module with Longshot BAM and VCF files as input.

flair collapse --output $output/flair_collapse/longer-longshot-env -g $fasta -q $output/flair_all_corrected.bed -r $output/flair.input.fastq --gtf $ann --threads 20 --stringent --check_splice --generate_map --longshot_vcf $output/longshot.vcf --longshot_bam $output/longshot.bam --annotation_reliant generate

How did you install Flair?
Installed FLAIR v2.0.0 with both conda and docker. Both installations gave the identical error when given the exact same command.

What happened?

After the "Filtering isoforms by read coverage" step, this error message appears.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/flair/bed_to_sequence.py", line 56, in <module>
    vcf.fetch(chrom)
  File "pysam/libcbcf.pyx", line 4464, in pysam.libcbcf.VariantFile.fetch
ValueError: fetch requires an index

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/flair/bed_to_sequence.py", line 59, in <module>
    subprocess.check_call(['bgzip', '-c', args.vcf], stdout=open(args.vcf+'.gz', 'w'))
  File "/usr/lib/python3.10/subprocess.py", line 364, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/usr/lib/python3.10/subprocess.py", line 345, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/lib/python3.10/subprocess.py", line 969, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.10/subprocess.py", line 1845, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'bgzip'
Traceback (most recent call last):
  File "/usr/local/bin/flair", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/flair/flair.py", line 1035, in main
    status = collapse()
  File "/usr/local/lib/python3.10/dist-packages/flair/flair.py", line 661, in collapse
    subprocess.check_call(to_sequence_cmd)
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '/usr/local/lib/python3.10/dist-packages/flair/bed_to_sequence.py', '/staging/leuven/stg_00096/home/laulambr/projects/2024_directRNA/20241120_directRNA_CCS15M/CCS15M/Analysis/flair-output-TEST/flair_collapse/longer-longshot.isoforms.bed', '/staging/leuven/stg_00096/references/chm13_v2.0_maskedY.rCRS/fasta/chm13v2.0_maskedY_rCRS.fa', '/staging/leuven/stg_00096/home/laulambr/projects/2024_directRNA/20241120_directRNA_CCS15M/CCS15M/Analysis/flair-output-TEST/flair_collapse/longer-longshot.isoforms.fa', '--vcf', '/staging/leuven/stg_00096/home/laulambr/projects/2024_directRNA/20241120_directRNA_CCS15M/CCS15M/Analysis/flair-output-TEST/longshot.vcf', '--isoform_haplotypes', '/staging/leuven/stg_00096/home/laulambr/projects/2024_directRNA/20241120_directRNA_CCS15M/CCS15M/Analysis/flair-output-TEST/flair_collapse/longer-longshot.phase_sets.txt', '--vcf_out', '/staging/leuven/stg_00096/home/laulambr/projects/2024_directRNA/20241120_directRNA_CCS15M/CCS15M/Analysis/flair-output-TEST/flair_collapse/longer-longshot.flair.vcf']' returned non-zero exit status 1.

Please let me know if you need any additional information. Thank you!

@cafelton
Collaborator

The error seems to come from the VCF file not being indexed. I'm going to follow up on this and add code to check for an index and create one if it doesn't exist. For now, try indexing the VCF file with tabix (https://www.htslib.org/doc/tabix.html) and rerunning the command. If it still doesn't work, can you please share your files, or subsets of them, so that we can replicate this error?
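
The first traceback shows two separate failures: pysam's fetch needs an index, and the fallback call to bgzip fails because that binary is not on PATH. A minimal pre-flight sketch that checks both before running flair collapse could look like this. The helper name `vcf_preflight` is hypothetical and not part of FLAIR, and the sketch assumes the index sits next to the VCF with a `.tbi` or `.csi` suffix.

```python
import shutil
from pathlib import Path


def vcf_preflight(vcf_path, tools=("bgzip", "tabix")):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper for illustration only, not FLAIR's own code.
    """
    vcf = Path(vcf_path)
    problems = []

    # The first traceback ends in FileNotFoundError for 'bgzip',
    # so confirm the external tools can be found on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")

    # Heuristic only: a .gz suffix suggests, but does not prove, bgzip compression.
    if vcf.suffix != ".gz":
        problems.append(f"{vcf.name} is not bgzip-compressed (expected a .gz file)")
    else:
        # Assumption: the index is stored as <vcf>.tbi or <vcf>.csi.
        has_index = any(Path(str(vcf) + ext).exists() for ext in (".tbi", ".csi"))
        if not has_index:
            problems.append(f"no .tbi or .csi index found for {vcf.name}")

    return problems
```

Printing the returned list for `longshot.vcf` before launching the pipeline would surface a missing bgzip or a missing index up front, rather than after the "Filtering isoforms by read coverage" step.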

@diekhans
Collaborator

diekhans commented Jan 22, 2025 via email

@laulambr
Author

laulambr commented Jan 23, 2025

So I created an index for the vcf file as follows but still ran into a (different) issue (similar to #263):

bgzip $output/longshot.vcf
tabix -p vcf $output/longshot.vcf.gz
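
The same two steps can be sketched in Python, under some assumptions: `compress_and_index` is a hypothetical name, bgzip and tabix are assumed to be installed, and `bgzip -c` is used so that the uncompressed VCF is kept rather than replaced. The `run` parameter exists only so the command runner can be swapped out.

```python
import subprocess


def compress_and_index(vcf_path, run=subprocess.check_call):
    """Write <vcf_path>.gz with bgzip, index it with tabix, return the .gz path.

    Hypothetical helper for illustration; it mirrors the two shell commands above
    but keeps the original uncompressed file.
    """
    gz_path = vcf_path + ".gz"

    # bgzip -c writes compressed data to stdout, so redirect it into the .gz file.
    with open(gz_path, "wb") as out:
        run(["bgzip", "-c", vcf_path], stdout=out)

    # tabix -p vcf applies the VCF preset and writes the index next to the .gz file.
    run(["tabix", "-p", "vcf", gz_path])
    return gz_path
```

The returned path is what would then be passed to `--longshot_vcf`.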

Then ran the following FLAIR command:

flair collapse --output $output/flair_collapse/longer-longshot-env -g $fasta -q $output/flair_all_corrected.bed -r $output/flair.input.fastq --gtf $ann --threads 20 --stringent --check_splice --generate_map --longshot_vcf $output/longshot.vcf.gz --longshot_bam $output/longshot.bam --annotation_reliant generate

This gave me the following error message:

Traceback (most recent call last):
  File "/vsc-hard-mounts/leuven-data/366/vsc36671/micromamba/envs/flair/lib/python3.9/site-packages/flair/bed_to_sequence.py", line 260, in <module>
    write_sequences(chrom)
  File "/vsc-hard-mounts/leuven-data/366/vsc36671/micromamba/envs/flair/lib/python3.9/site-packages/flair/bed_to_sequence.py", line 218, in write_sequences
    pulled_seqs = get_sequence_with_variants(entry, seq, name)
TypeError: get_sequence_with_variants() takes 2 positional arguments but 3 were given
Traceback (most recent call last):
  File "/vsc-hard-mounts/leuven-data/366/vsc36671/micromamba/envs/flair/bin/flair", line 10, in <module>
    sys.exit(main())
  File "/vsc-hard-mounts/leuven-data/366/vsc36671/micromamba/envs/flair/lib/python3.9/site-packages/flair/flair.py", line 1035, in main
    status = collapse()
  File "/vsc-hard-mounts/leuven-data/366/vsc36671/micromamba/envs/flair/lib/python3.9/site-packages/flair/flair.py", line 661, in collapse
    subprocess.check_call(to_sequence_cmd)
  File "/vsc-hard-mounts/leuven-data/366/vsc36671/micromamba/envs/flair/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/vsc-hard-mounts/leuven-data/366/vsc36671/micromamba/envs/flair/bin/python', '/vsc-hard-mounts/leuven-data/366/vsc36671/micromamba/envs/flair/lib/python3.9/site-packages/flair/bed_to_sequence.py', '/staging/leuven/stg_00096/home/laulambr/projects/2024_directRNA/20241120_directRNA_CCS15M/CCS15M/Analysis/flair-output-TEST/flair_collapse/longer-longshot-env.isoforms.bed', '/staging/leuven/stg_00096/references/chm13_v2.0_maskedY.rCRS/fasta/chm13v2.0_maskedY_rCRS.fa', '/staging/leuven/stg_00096/home/laulambr/projects/2024_directRNA/20241120_directRNA_CCS15M/CCS15M/Analysis/flair-output-TEST/flair_collapse/longer-longshot-env.isoforms.fa', '--vcf', '/staging/leuven/stg_00096/home/laulambr/projects/2024_directRNA/20241120_directRNA_CCS15M/CCS15M/Analysis/flair-output-TEST/longshot.vcf.gz', '--isoform_haplotypes', '/staging/leuven/stg_00096/home/laulambr/projects/2024_directRNA/20241120_directRNA_CCS15M/CCS15M/Analysis/flair-output-TEST/flair_collapse/longer-longshot-env.phase_sets.txt', '--vcf_out', '/staging/leuven/stg_00096/home/laulambr/projects/2024_directRNA/20241120_directRNA_CCS15M/CCS15M/Analysis/flair-output-TEST/flair_collapse/longer-longshot-env.flair.vcf']' returned non-zero exit status 1.
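
The TypeError above means the call passes three arguments to a function that is defined with two parameters. A small sketch of how that mismatch arises and how it can be detected without triggering the error follows. The two-parameter function below is a stand-in written for illustration, not FLAIR's actual implementation, and `call_matches_signature` is a hypothetical helper.

```python
import inspect


def get_sequence_with_variants(entry, seq):
    """Stand-in with two parameters; NOT the real FLAIR function."""
    return [seq]


def call_matches_signature(func, *args, **kwargs):
    """Return True if func could be called with these arguments."""
    try:
        # Signature.bind raises TypeError when the arguments do not fit.
        inspect.signature(func).bind(*args, **kwargs)
        return True
    except TypeError:
        return False


def describe_mismatch(func, *args):
    """Call func and return the TypeError message, or None if the call succeeds."""
    try:
        func(*args)
    except TypeError as err:
        return str(err)
    return None
```

With the stand-in, `describe_mismatch(get_sequence_with_variants, "entry", "ACGT", "name")` returns the same kind of message seen in the traceback, which points to the caller and the function definition in `bed_to_sequence.py` being out of sync rather than to a problem with the input files.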
