Issue with inserting known variants with VCF file. #134

Open
mattbird567 opened this issue Dec 11, 2024 · 6 comments
@mattbird567

mattbird567 commented Dec 11, 2024

Describe the bug
I can't introduce known variants into the synthetic FASTQ files using a VCF file. I have also tried the VCF and other test data in the data folder and get the same error. Read sampling starts and gets to maybe 2-3% before the error message appears. I'm unsure whether it's to do with my VCF file or my install of NEAT, as I am able to produce synthetic reads without a VCF file input.

To Reproduce
The VCF file variants_bug.txt
The Config file: config_bug.txt
The log file: 1733915460.8927002_NEAT_bug.txt
The ref file: Mycobacterium_tuberculosis_H37Rv_bug.txt

Expected behavior
Two paired-end fastq files with 17 known variants inserted.

Error message

NEAT run log: /neat4/1733915460.8927002_NEAT.log
2024-12-11 11:11:01,309:INFO:neat.common.logging:writing log to: neat4/1733915460.8927002_NEAT.log
2024-12-11 11:11:01,309:INFO:neat.read_simulator.runner:Using configuration file config.yml
2024-12-11 11:11:01,311:INFO:neat.read_simulator.runner:Saving output files to .
2024-12-11 11:11:01,315:INFO:neat.read_simulator.utils.options:Run Configuration...
2024-12-11 11:11:01,315:INFO:neat.read_simulator.utils.options:Input fasta: refs/Mycobacterium_tuberculosis_H37Rv.fasta
2024-12-11 11:11:01,316:INFO:neat.read_simulator.utils.options:Producing the following files:
        - neat4/test_vcf_r1.fastq.gz
        - neat4/test_vcf_r2.fastq.gz

2024-12-11 11:11:01,316:INFO:neat.read_simulator.utils.options:Single threading - 1 thread.
2024-12-11 11:11:01,316:INFO:neat.read_simulator.utils.options:Using a read length of 150
2024-12-11 11:11:01,316:INFO:neat.read_simulator.utils.options:Generating fragments based on mean=300, stand. dev=30
2024-12-11 11:11:01,316:INFO:neat.read_simulator.utils.options:Running in paired-ended mode.
2024-12-11 11:11:01,316:INFO:neat.read_simulator.utils.options:Average coverage: 5
2024-12-11 11:11:01,317:INFO:neat.read_simulator.utils.options:Using default error model.
2024-12-11 11:11:01,317:INFO:neat.read_simulator.utils.options:Ploidy value: 2
2024-12-11 11:11:01,317:INFO:neat.read_simulator.utils.options:Vcf of variants to include: test_dir_new/variants.vcf
2024-12-11 11:11:01,317:INFO:neat.read_simulator.utils.options:RNG seed value for run: 8732147021153953
2024-12-11 11:11:01,317:INFO:neat.read_simulator.runner:Reading Models...
2024-12-11 11:11:01,318:INFO:neat.read_simulator.runner:Reading refs/Mycobacterium_tuberculosis_H37Rv.fasta.
2024-12-11 11:11:01,425:INFO:neat.read_simulator.runner:Reading input VCF: test_dir_new/variants.vcf.
2024-12-11 11:11:01,425:INFO:neat.read_simulator.utils.vcf_func:Parsing input vcf test_dir_new/variants.vcf
2024-12-11 11:11:03,319:INFO:neat.read_simulator.utils.vcf_func:Found 17 variants in input VCF.
2024-12-11 11:11:03,320:INFO:neat.read_simulator.utils.vcf_func:Skipped 0 variants because of multiples at the same location
2024-12-11 11:11:03,320:INFO:neat.read_simulator.utils.vcf_func:Skipped 0 variants because of a mismatch between Ref and reference.
2024-12-11 11:11:03,672:INFO:neat.read_simulator.runner:Beginning simulation.
2024-12-11 11:11:03,788:INFO:neat.read_simulator.runner:Generating variants for ChrI
2024-12-11 11:11:03,902:INFO:neat.read_simulator.utils.generate_variants:Finished generating random mutations in 0.00 minutes
2024-12-11 11:11:03,903:INFO:neat.read_simulator.utils.generate_variants:Added 0 mutations to ChrI
2024-12-11 11:11:03,903:INFO:neat.read_simulator.utils.generate_reads:Sampling reads...
2024-12-11 11:11:05,695:ERROR:neat:read-simulator failed, see the traceback below
Traceback (most recent call last):
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/cli/cli.py", line 131, in main
    cmd(args)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/cli/commands/read_simulator.py", line 47, in execute
    read_simulator_runner(arguments.config, arguments.output)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/runner.py", line 313, in read_simulator_runner
    read1_fastq_paired, read1_fastq_single, read2_fastq_paired, read2_fastq_single = generate_reads(
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/generate_reads.py", line 345, in generate_reads
    read_1.finalize_read_and_write(
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 344, in finalize_read_and_write
    self.read_quality_string = "".join([chr(x + quality_offset) for x in self.quality_array])
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 344, in <listcomp>
    self.read_quality_string = "".join([chr(x + quality_offset) for x in self.quality_array])
TypeError: can only concatenate str (not "int") to str
ERROR: read-simulator failed, showing the last error
Traceback (most recent call last):
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/cli/cli.py", line 131, in main
    cmd(args)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/cli/commands/read_simulator.py", line 47, in execute
    read_simulator_runner(arguments.config, arguments.output)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/runner.py", line 313, in read_simulator_runner
    read1_fastq_paired, read1_fastq_single, read2_fastq_paired, read2_fastq_single = generate_reads(
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/generate_reads.py", line 345, in generate_reads
    read_1.finalize_read_and_write(
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 344, in finalize_read_and_write
    self.read_quality_string = "".join([chr(x + quality_offset) for x in self.quality_array])
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 344, in <listcomp>
    self.read_quality_string = "".join([chr(x + quality_offset) for x in self.quality_array])
TypeError: can only concatenate str (not "int") to str

Conda environment:


Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
alsa-lib                  1.2.12               h4ab18f5_0    conda-forge
attrs                     24.2.0                   pypi_0    pypi
biopython                 1.79            py310h5764c6d_3    conda-forge
brotli                    1.1.0                hd590300_1    conda-forge
brotli-bin                1.1.0                hd590300_1    conda-forge
bzip2                     1.0.8                h4bc722e_7    conda-forge
c-ares                    1.33.1               heb4867d_0    conda-forge
ca-certificates           2024.8.30            hbcca054_0    conda-forge
cachecontrol              0.12.14                  pypi_0    pypi
cairo                     1.18.0               hebfffa5_3    conda-forge
certifi                   2024.8.30          pyhd8ed1ab_0    conda-forge
cffi                      1.17.1                   pypi_0    pypi
charset-normalizer        3.4.0                    pypi_0    pypi
cleo                      2.1.0                    pypi_0    pypi
contourpy                 1.2.1           py310hd41b1e2_0    conda-forge
crashtest                 0.4.1                    pypi_0    pypi
cryptography              44.0.0                   pypi_0    pypi
cycler                    0.12.1             pyhd8ed1ab_1    conda-forge
dbus                      1.13.6               h5008d03_3    conda-forge
distlib                   0.3.9                    pypi_0    pypi
double-conversion         3.3.0                h59595ed_0    conda-forge
dulwich                   0.20.50                  pypi_0    pypi
expat                     2.6.2                h59595ed_0    conda-forge
filelock                  3.16.1                   pypi_0    pypi
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 h77eed37_3    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.53.1          py310h5b4e0ec_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
frozendict                2.4.6                    pypi_0    pypi
graphite2                 1.3.13            h59595ed_1003    conda-forge
harfbuzz                  9.0.0                hda332d3_1    conda-forge
html5lib                  1.1                      pypi_0    pypi
htslib                    1.20                 h5efdd21_2    bioconda
icu                       75.1                 he02047a_0    conda-forge
idna                      3.10                     pypi_0    pypi
importlib-metadata        8.5.0                    pypi_0    pypi
jaraco-classes            3.4.0                    pypi_0    pypi
jeepney                   0.8.0                    pypi_0    pypi
jsonschema                4.23.0                   pypi_0    pypi
jsonschema-specifications 2024.10.1                pypi_0    pypi
keyring                   23.13.1                  pypi_0    pypi
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.5           py310hd41b1e2_1    conda-forge
krb5                      1.21.3               h659f571_0    conda-forge
lcms2                     2.16                 hb7c19ff_0    conda-forge
ld_impl_linux-64          2.43                 h712a8e2_2    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libblas                   3.9.0           25_linux64_openblas    conda-forge
libbrotlicommon           1.1.0                hd590300_1    conda-forge
libbrotlidec              1.1.0                hd590300_1    conda-forge
libbrotlienc              1.1.0                hd590300_1    conda-forge
libcblas                  3.9.0           25_linux64_openblas    conda-forge
libclang-cpp18.1          18.1.8          default_hf981a13_2    conda-forge
libclang13                18.1.8          default_h9def88c_2    conda-forge
libcups                   2.3.3                h4637d8d_4    conda-forge
libcurl                   8.9.1                hdb1bdb2_0    conda-forge
libdeflate                1.21                 h4bc722e_0    conda-forge
libdrm                    2.4.123              hb9d3cd8_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libegl                    1.7.0                ha4b6fd6_0    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libexpat                  2.6.2                h59595ed_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc                    5.2.0                         0    conda-forge
libgcc-ng                 14.1.0               h77fa898_0    conda-forge
libgfortran-ng            14.1.0               h69a702a_0    conda-forge
libgfortran5              14.1.0               hc5f4f2c_0    conda-forge
libgl                     1.7.0                ha4b6fd6_0    conda-forge
libglib                   2.80.3               h315aac3_2    conda-forge
libglvnd                  1.7.0                ha4b6fd6_0    conda-forge
libglx                    1.7.0                ha4b6fd6_0    conda-forge
libgomp                   14.1.0               h77fa898_0    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libjpeg-turbo             3.0.0                hd590300_1    conda-forge
liblapack                 3.9.0           25_linux64_openblas    conda-forge
libllvm18                 18.1.8               h8b73ec9_2    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libopenblas               0.3.28          pthreads_h94d23a6_0    conda-forge
libpciaccess              0.18                 hd590300_0    conda-forge
libpng                    1.6.43               h2797004_0    conda-forge
libpq                     16.4                 h482b261_0    conda-forge
libsqlite                 3.46.0               hde9e2c9_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-ng              14.1.0               hc0a3c3a_0    conda-forge
libtiff                   4.6.0                h46a8edc_4    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libwebp-base              1.4.0                hd590300_0    conda-forge
libxcb                    1.16                 hb9d3cd8_1    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxkbcommon              1.7.0                h2c5496b_1    conda-forge
libxml2                   2.12.7               he7c6b58_4    conda-forge
libxslt                   1.1.39               h76b75d6_0    conda-forge
libzlib                   1.3.1                h4ab18f5_1    conda-forge
lockfile                  0.12.2                   pypi_0    pypi
matplotlib                3.9.2           py310hff52083_2    conda-forge
matplotlib-base           3.9.2           py310hf02ac8c_0    conda-forge
more-itertools            10.5.0                   pypi_0    pypi
msgpack                   1.1.0                    pypi_0    pypi
munkres                   1.0.7                      py_1    bioconda
mysql-common              9.0.1                h70512c7_0    conda-forge
mysql-libs                9.0.1                ha479ceb_0    conda-forge
ncurses                   6.5                  he02047a_1    conda-forge
neat                      4.0                      pypi_0    pypi
numpy                     1.26.4                   pypi_0    pypi
openjpeg                  2.5.2                h488ebb8_0    conda-forge
openssl                   3.3.1                hb9d3cd8_3    conda-forge
packaging                 24.2               pyhd8ed1ab_2    conda-forge
pcre2                     10.44                hba22ea6_2    conda-forge
pexpect                   4.9.0                    pypi_0    pypi
pillow                    10.4.0          py310hebfe307_0    conda-forge
pip                       22.3.1                   pypi_0    pypi
pixman                    0.43.2               h59595ed_0    conda-forge
pkginfo                   1.12.0             pyhd8ed1ab_1    conda-forge
platformdirs              2.6.2                    pypi_0    pypi
poetry                    1.3.2                    pypi_0    pypi
poetry-core               1.4.0                    pypi_0    pypi
poetry-plugin-export      1.3.1                    pypi_0    pypi
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0                    pypi_0    pypi
pycparser                 2.22                     pypi_0    pypi
pyparsing                 3.2.0              pyhd8ed1ab_2    conda-forge
pysam                     0.19.1                   pypi_0    pypi
pyside6                   6.7.2           py310heb5a38e_2    conda-forge
python                    3.10.14         hd12c33a_0_cpython    conda-forge
python-dateutil           2.9.0.post0        pyhff2d567_1    conda-forge
python_abi                3.10                    5_cp310    conda-forge
pyyaml                    6.0.2           py310h5b4e0ec_0    conda-forge
qhull                     2020.2               h434a139_5    conda-forge
qt6-main                  6.7.2                hb12f9c5_5    conda-forge
rapidfuzz                 3.10.1                   pypi_0    pypi
readline                  8.2                  h8228510_1    conda-forge
referencing               0.35.1                   pypi_0    pypi
requests                  2.32.3                   pypi_0    pypi
requests-toolbelt         0.10.1                   pypi_0    pypi
rpds-py                   0.22.3                   pypi_0    pypi
scipy                     1.14.1          py310ha3fb0e1_0    conda-forge
secretstorage             3.3.3                    pypi_0    pypi
setuptools                75.6.0             pyhff2d567_1    conda-forge
shellingham               1.5.4                    pypi_0    pypi
six                       1.17.0             pyhd8ed1ab_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
tomli                     2.2.1                    pypi_0    pypi
tomlkit                   0.13.2                   pypi_0    pypi
tornado                   6.4.1           py310hc51659f_0    conda-forge
trove-classifiers         2024.10.21.16            pypi_0    pypi
tzdata                    2024b                hc8b5060_0    conda-forge
unicodedata2              15.1.0          py310h2372a71_0    conda-forge
urllib3                   1.26.20                  pypi_0    pypi
virtualenv                20.21.1                  pypi_0    pypi
wayland                   1.23.1               h3e06ad9_0    conda-forge
webencodings              0.5.1                    pypi_0    pypi
wheel                     0.45.1             pyhd8ed1ab_1    conda-forge
xcb-util                  0.4.1                hb711507_2    conda-forge
xcb-util-cursor           0.1.4                h4ab18f5_2    conda-forge
xcb-util-image            0.4.0                hb711507_2    conda-forge
xcb-util-keysyms          0.4.1                hb711507_0    conda-forge
xcb-util-renderutil       0.3.10               hb711507_0    conda-forge
xcb-util-wm               0.4.2                hb711507_0    conda-forge
xkeyboard-config          2.42                 h4ab18f5_0    conda-forge
xorg-fixesproto           5.0               h7f98852_1002    conda-forge
xorg-inputproto           2.3.2             h7f98852_1002    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.1.1                hd590300_0    conda-forge
xorg-libsm                1.2.4                h7391055_0    conda-forge
xorg-libx11               1.8.9                hb711507_1    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxfixes            5.0.3             h7f98852_1004    conda-forge
xorg-libxi                1.7.10               h4bc722e_1    conda-forge
xorg-libxrender           0.9.11               hd590300_0    conda-forge
xorg-libxtst              1.2.5                h4bc722e_0    conda-forge
xorg-libxxf86vm           1.1.5                h4bc722e_1    conda-forge
xorg-recordproto          1.14.2            h7f98852_1002    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zipp                      3.21.0                   pypi_0    pypi
zlib                      1.3.1                h4ab18f5_1    conda-forge
zstd                      1.5.6                ha6fb4c9_0    conda-forge
@joshfactorial
Collaborator

I will have time around the holidays to dig into the code and see what is happening!

@joshfactorial joshfactorial self-assigned this Dec 17, 2024
@joshfactorial joshfactorial added the bug Something isn't working label Dec 17, 2024
@zlian1758

I had the same problem. I didn't look into it too deeply, but the superficial issue was that in paired-end reads, for some reason, read_2's self.quality_array is an array of strings such as ['28', '31', ...] rather than an array of ints. My temporary solution was to use ... chr(int(x) + ...) on line 344 of read.py.
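
For anyone hitting the same traceback, here is a minimal sketch of the failure and the workaround described above, run outside NEAT. The offset of 33 (the usual Phred+33 encoding) and the sample scores are assumptions for illustration and are not taken from NEAT's source.

```python
# Stand-alone reproduction of the TypeError; this does not import NEAT.
# Assumption: quality_offset is 33 (Phred+33); the scores are made-up examples.
quality_offset = 33
quality_array = ['28', '31', '40']  # strings instead of ints, as reported for read_2

try:
    # Same expression shape as line 344 of read.py in the traceback.
    "".join([chr(x + quality_offset) for x in quality_array])
except TypeError as err:
    error_message = str(err)

# Workaround: cast each score to int before adding the offset.
quality_string = "".join([chr(int(x) + quality_offset) for x in quality_array])
```

The cast only hides the symptom. It does not explain why the array holds strings in the first place, so it is a stopgap rather than a fix.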

@mattbird567
Author

Thanks zlian! I will give these changes a go and see if they resolve my issue.

@joshfactorial
Collaborator

joshfactorial commented Jan 8, 2025 via email

@mattbird567
Author

I can confirm that the fix suggested by zlian worked. However, it exposed another problem: if the VCF file includes any insertions or deletions, NEAT errors out instead of implementing them, with the error message below:

(neat4) matt:/mnt/c/Users/Matt/Desktop/UKHSA/Projects/Current/UKHSA_TB/amr_syn/neat4$ neat read-simulator -c neat_config.yml -o test_3
NEAT run log: /mnt/c/Users/Matt/Desktop/UKHSA/Projects/Current/UKHSA_TB/amr_syn/neat4/1736505231.2002442_NEAT.log
2025-01-10 10:33:51,620:INFO:neat.common.logging:writing log to: /mnt/c/Users/Matt/Desktop/UKHSA/Projects/Current/UKHSA_TB/amr_syn/neat4/1736505231.2002442_NEAT.log
2025-01-10 10:33:51,620:INFO:neat.read_simulator.runner:Using configuration file neat_config.yml
2025-01-10 10:33:51,622:INFO:neat.read_simulator.runner:Saving output files to .
2025-01-10 10:33:51,628:INFO:neat.read_simulator.utils.options:Run Configuration...
2025-01-10 10:33:51,628:INFO:neat.read_simulator.utils.options:Input fasta: refs/Mycobacterium_tuberculosis_H37Rv.fasta
2025-01-10 10:33:51,628:INFO:neat.read_simulator.utils.options:Producing the following files:
        - /mnt/c/Users/Matt/Desktop/UKHSA/Projects/Current/UKHSA_TB/amr_syn/neat4/test_3_r1.fastq.gz
        - /mnt/c/Users/Matt/Desktop/UKHSA/Projects/Current/UKHSA_TB/amr_syn/neat4/test_3_r2.fastq.gz

2025-01-10 10:33:51,628:INFO:neat.read_simulator.utils.options:Single threading - 1 thread.
2025-01-10 10:33:51,628:INFO:neat.read_simulator.utils.options:Using a read length of 150
2025-01-10 10:33:51,629:INFO:neat.read_simulator.utils.options:Generating fragments based on mean=300, stand. dev=30
2025-01-10 10:33:51,629:INFO:neat.read_simulator.utils.options:Running in paired-ended mode.
2025-01-10 10:33:51,629:INFO:neat.read_simulator.utils.options:Average coverage: 20
2025-01-10 10:33:51,629:INFO:neat.read_simulator.utils.options:Using default error model.
2025-01-10 10:33:51,629:INFO:neat.read_simulator.utils.options:User defined average sequencing error rate: 0.1.
2025-01-10 10:33:51,629:INFO:neat.read_simulator.utils.options:Ploidy value: 2
2025-01-10 10:33:51,629:INFO:neat.read_simulator.utils.options:Vcf of variants to include: test_dir/variants.vcf
2025-01-10 10:33:51,630:INFO:neat.read_simulator.utils.options:RNG seed value for run: 4652726890925496
2025-01-10 10:33:51,630:INFO:neat.read_simulator.runner:Reading Models...
2025-01-10 10:33:51,630:INFO:neat.read_simulator.runner:Reading refs/Mycobacterium_tuberculosis_H37Rv.fasta.
2025-01-10 10:33:51,735:INFO:neat.read_simulator.runner:Reading input VCF: test_dir/variants.vcf.
2025-01-10 10:33:51,736:INFO:neat.read_simulator.utils.vcf_func:Parsing input vcf test_dir/variants.vcf
2025-01-10 10:33:53,351:INFO:neat.read_simulator.utils.vcf_func:Found 15 variants in input VCF.
2025-01-10 10:33:53,352:INFO:neat.read_simulator.utils.vcf_func:Skipped 0 variants because of multiples at the same location
2025-01-10 10:33:53,352:INFO:neat.read_simulator.utils.vcf_func:Skipped 0 variants because of a mismatch between Ref and reference.
2025-01-10 10:33:53,677:INFO:neat.read_simulator.runner:Beginning simulation.
2025-01-10 10:33:53,788:INFO:neat.read_simulator.runner:Generating variants for ChrI
2025-01-10 10:33:53,902:INFO:neat.read_simulator.utils.generate_variants:Finished generating random mutations in 0.00 minutes
2025-01-10 10:33:53,903:INFO:neat.read_simulator.utils.generate_variants:Added 0 mutations to ChrI
2025-01-10 10:33:53,903:INFO:neat.read_simulator.utils.generate_reads:Sampling reads...
2025-01-10 10:34:38,842:ERROR:neat:read-simulator failed, see the traceback below
Traceback (most recent call last):
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/cli/cli.py", line 131, in main
    cmd(args)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/cli/commands/read_simulator.py", line 47, in execute
    read_simulator_runner(arguments.config, arguments.output)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/runner.py", line 313, in read_simulator_runner
    read1_fastq_paired, read1_fastq_single, read2_fastq_paired, read2_fastq_single = generate_reads(
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/generate_reads.py", line 345, in generate_reads
    read_1.finalize_read_and_write(
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 342, in finalize_read_and_write
    self.apply_variants_for_final_output(qual_model, rng)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 261, in apply_variants_for_final_output
    self.apply_mutations(list(quality_model.quality_scores), rng)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 224, in apply_mutations
    reference_length = variant_to_apply.get_ref_len()
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/variants/unknown_variant.py", line 62, in get_ref_len
    return len(self.metadata['REF'])
KeyError: 'REF'
ERROR: read-simulator failed, showing the last error
Traceback (most recent call last):
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/cli/cli.py", line 131, in main
    cmd(args)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/cli/commands/read_simulator.py", line 47, in execute
    read_simulator_runner(arguments.config, arguments.output)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/runner.py", line 313, in read_simulator_runner
    read1_fastq_paired, read1_fastq_single, read2_fastq_paired, read2_fastq_single = generate_reads(
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/generate_reads.py", line 345, in generate_reads
    read_1.finalize_read_and_write(
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 342, in finalize_read_and_write
    self.apply_variants_for_final_output(qual_model, rng)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 261, in apply_variants_for_final_output
    self.apply_mutations(list(quality_model.quality_scores), rng)
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 224, in apply_mutations
    reference_length = variant_to_apply.get_ref_len()
  File "/home/matt/anaconda3/envs/neat4/lib/python3.10/site-packages/neat/variants/unknown_variant.py", line 62, in get_ref_len
    return len(self.metadata['REF'])
KeyError: 'REF'

I have tested this same VCF file with NEAT v3.4, where it works and both insertions and deletions are inserted. With NEAT 4.0, the VCF file works only if the insertions/deletions are removed and all that is left are specific SNPs. I have attached both VCF files below so you can replicate the issue (please use the same ref file from the original issue).

Problem vcf: variants_ins.txt
Working vcf: variants_works.txt
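
As a stopgap until indels are handled, the "working" file can be derived from the "problem" file by dropping every non-SNP record. Below is a minimal sketch in plain Python. The function name `keep_snps_only` and the example records are hypothetical, and it assumes a tab-delimited VCF with REF in column 4 and ALT in column 5.

```python
def keep_snps_only(vcf_lines):
    """Return header lines plus records whose REF and ALT alleles are all one base long.

    Hypothetical helper for illustration; symbolic alleles such as '*' are not handled.
    """
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)  # keep meta-information and the column header
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        ref, alts = fields[3], fields[4].split(",")
        if len(ref) == 1 and all(len(alt) == 1 for alt in alts):
            kept.append(line)
    return kept


# Made-up records: one SNP, one insertion, one deletion.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "ChrI\t100\t.\tA\tG\t.\tPASS\t.\n",
    "ChrI\t200\t.\tA\tATT\t.\tPASS\t.\n",
    "ChrI\t300\t.\tGCC\tG\t.\tPASS\t.\n",
]
snps = keep_snps_only(example)
```

This only sidesteps the `KeyError: 'REF'`; the insertions and deletions are still missing from the simulated reads.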

@joshfactorial
Collaborator

joshfactorial commented Jan 10, 2025 via email
