GetOrganelle v1.7.7.0
get_organelle_from_reads.py assembles organelle genomes from genome skimming data.
Find updates in https://github.com/Kinggerm/GetOrganelle and see README.md for more information.
Python 3.8.18 | packaged by conda-forge | (default, Dec 23 2023, 17:21:28) [GCC 12.3.0]
PLATFORM: Linux onottac669386dl.cfia-acia.inspection.gc.ca 5.15.0-119-generic #129~20.04.1-Ubuntu SMP Wed Aug 7 13:07:13 UTC 2024 x86_64 x86_64
PYTHON LIBS: GetOrganelleLib 1.7.7.0; numpy 1.24.3; sympy 1.11.1; scipy 1.10.1
DEPENDENCIES: Bowtie2 2.4.1; SPAdes 3.13.1; Blast 2.5.0
GETORG_PATH=/home/galindogonzl/.GetOrganelle
LABEL DB: embplant_pt 0.0.1; embplant_mt 0.0.1
WORKING DIR: /home/galindogonzl/Documents/analyses/random_genome_skimming
/home/galindogonzl/anaconda3/envs/GetOrganelle/bin/get_organelle_from_reads.py -1 22002D-02-10_S37_L003_R1_001.fastq.gz -2 22002D-02-10_S37_L003_R2_001.fastq.gz -t 4 -s A.tuberculatus_Internal_CFIA_sample1.fasta -o 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed -F embplant_pt -R 15
2024-09-17 09:29:44,985 - INFO: Pre-reading fastq ...
2024-09-17 09:29:44,985 - INFO: Estimating reads to use ... (to use all reads, set '--reduce-reads-for-coverage inf --max-reads inf')
2024-09-17 09:29:45,063 - INFO: Tasting 100000+100000 reads ...
2024-09-17 09:29:49,594 - INFO: Estimating reads to use finished.
2024-09-17 09:29:49,594 - INFO: Unzipping reads file: 22002D-02-10_S37_L003_R1_001.fastq.gz (2860549838 bytes)
2024-09-17 09:30:14,652 - INFO: Unzipping reads file: 22002D-02-10_S37_L003_R2_001.fastq.gz (3061078773 bytes)
2024-09-17 09:30:45,333 - INFO: Counting read qualities ...
2024-09-17 09:30:45,458 - INFO: Identified quality encoding format = Sanger
2024-09-17 09:30:45,458 - INFO: Phred offset = 33
2024-09-17 09:30:45,459 - INFO: Trimming bases with qualities (0.00%): 33..33 !
2024-09-17 09:30:45,490 - INFO: Mean error rate = 0.0029
2024-09-17 09:30:45,491 - INFO: Counting read lengths ...
2024-09-17 09:31:15,823 - INFO: Mean = 151.0 bp, maximum = 151 bp.
2024-09-17 09:31:15,823 - INFO: Reads used = 15000000+15000000
2024-09-17 09:31:15,823 - INFO: Pre-reading fastq finished.
2024-09-17 09:31:15,823 - INFO: Making seed reads ...
2024-09-17 09:31:15,828 - INFO: Making seed - bowtie2 index ...
2024-09-17 09:31:16,040 - INFO: Making seed - bowtie2 index finished.
2024-09-17 09:31:16,040 - INFO: Mapping reads to seed bowtie2 index ...
2024-09-17 09:32:49,233 - INFO: Mapping finished.
2024-09-17 09:32:49,235 - INFO: Seed reads made: 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/seed/embplant_pt.initial.fq (347794625 bytes)
2024-09-17 09:32:49,235 - INFO: Making seed reads finished.
2024-09-17 09:32:49,235 - INFO: Checking seed reads and parameters ...
2024-09-17 09:32:49,235 - INFO: The automatically-estimated parameter(s) do not ensure the best choice(s).
2024-09-17 09:32:49,235 - INFO: If the result graph is not a circular organelle genome,
2024-09-17 09:32:49,235 - INFO: you could adjust the value(s) of '-w'/'-R' for another new run.
2024-09-17 09:33:14,707 - INFO: Pre-assembling mapped reads ...
2024-09-17 09:33:45,375 - INFO: Pre-assembling mapped reads finished.
2024-09-17 09:33:45,375 - INFO: Estimated embplant_pt-hitting base-coverage = 909.12
2024-09-17 09:33:45,674 - INFO: Reads reduced to = 8249739+8249739
2024-09-17 09:33:45,674 - INFO: Adjusting expected embplant_pt base coverage to 500.00
2024-09-17 09:33:45,675 - INFO: Estimated word size(s): 113
2024-09-17 09:33:45,675 - INFO: Setting '-w 113'
2024-09-17 09:33:45,675 - INFO: Setting '--max-extending-len inf'
2024-09-17 09:33:46,422 - INFO: Checking seed reads and parameters finished.
2024-09-17 09:33:46,422 - INFO: Making read index ...
2024-09-17 09:34:18,019 - INFO: For 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/1-22002D-02-10_S37_L003_R1_001.fastq.gz.fastq, only top 8249739 reads are used in downstream analysis.
2024-09-17 09:34:50,453 - INFO: For 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/2-22002D-02-10_S37_L003_R2_001.fastq.gz.fastq, only top 8249739 reads are used in downstream analysis.
2024-09-17 09:34:57,128 - INFO: 13768564 candidates in all 16499478 reads
2024-09-17 09:34:57,128 - INFO: Pre-grouping reads ...
2024-09-17 09:34:57,128 - INFO: Setting '--pre-w 113'
2024-09-17 09:34:58,066 - INFO: 200000/2033214 used/duplicated
2024-09-17 09:35:08,282 - INFO: 4815 groups made.
2024-09-17 09:35:09,378 - INFO: Making read index finished.
2024-09-17 09:35:09,378 - INFO: Extending ...
2024-09-17 09:35:09,378 - INFO: Adding initial words ...
2024-09-17 09:35:28,471 - INFO: AW 8009850
2024-09-17 09:36:31,270 - INFO: Round 1: 13768564/13768564 AI 260655 AW 8055534
2024-09-17 09:37:28,198 - INFO: Round 2: 13768564/13768564 AI 269493 AW 8099810
2024-09-17 09:38:25,133 - INFO: Round 3: 13768564/13768564 AI 279470 AW 8145708
2024-09-17 09:39:23,050 - INFO: Round 4: 13768564/13768564 AI 288352 AW 8186658
2024-09-17 09:40:20,301 - INFO: Round 5: 13768564/13768564 AI 297387 AW 8228578
2024-09-17 09:41:18,029 - INFO: Round 6: 13768564/13768564 AI 305590 AW 8266386
2024-09-17 09:42:15,824 - INFO: Round 7: 13768564/13768564 AI 313206 AW 8302294
2024-09-17 09:43:12,607 - INFO: Round 8: 13768564/13768564 AI 320969 AW 8340530
2024-09-17 09:44:10,508 - INFO: Round 9: 13768564/13768564 AI 328950 AW 8376170
2024-09-17 09:45:07,119 - INFO: Round 10: 13768564/13768564 AI 336750 AW 8414484
2024-09-17 09:46:04,127 - INFO: Round 11: 13768564/13768564 AI 344774 AW 8451004
2024-09-17 09:47:00,653 - INFO: Round 12: 13768564/13768564 AI 351398 AW 8481474
2024-09-17 09:47:57,195 - INFO: Round 13: 13768564/13768564 AI 357058 AW 8506850
2024-09-17 09:48:53,673 - INFO: Round 14: 13768564/13768564 AI 362301 AW 8531312
2024-09-17 09:49:50,110 - INFO: Round 15: 13768564/13768564 AI 366772 AW 8552092
2024-09-17 09:49:50,110 - INFO: Hit the round limit 15 and terminated ...
2024-09-17 09:50:05,584 - INFO: Extending finished.
2024-09-17 09:50:06,089 - INFO: Separating extended fastq file ...
2024-09-17 09:50:07,406 - INFO: Setting '-k 21,55,85,115'
2024-09-17 09:50:07,406 - INFO: Assembling using SPAdes ...
2024-09-17 09:50:07,457 - INFO: spades.py -t 4 --phred-offset 33 -1 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/extended_1_paired.fq -2 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/extended_2_paired.fq --s1 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/extended_1_unpaired.fq --s2 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/extended_2_unpaired.fq -k 21,55,85,115 -o 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/extended_spades
2024-09-17 09:52:13,784 - INFO: Insert size = 383.11, deviation = 148.761, left quantile = 216, right quantile = 590
2024-09-17 09:52:13,784 - INFO: Assembling finished.
2024-09-17 09:52:15,035 - INFO: Slimming 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/extended_spades/K115/assembly_graph.fastg finished!
2024-09-17 09:52:15,035 - INFO: Slimming assembly graphs finished.
2024-09-17 09:52:15,035 - INFO: Extracting embplant_pt from the assemblies ...
2024-09-17 09:52:15,035 - INFO: Disentangling 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/extended_spades/K115/assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg as a circular genome ...
2024-09-17 09:52:15,094 - INFO: Disentangling failed: 'Multiple isolated embplant_pt components detected! Broken or contamination?'
2024-09-17 09:52:15,094 - INFO: Scaffolding disconnected contigs using SPAdes scaffolds ...
2024-09-17 09:52:15,094 - WARNING: Assembly based on scaffolding may not be as accurate as the ones directly exported from the assembly graph.
2024-09-17 09:52:15,094 - INFO: Disentangling 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/extended_spades/K115/assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg as a circular genome ...
2024-09-17 09:52:15,105 - INFO: Disentangling failed: 'No new connections.'
2024-09-17 09:52:15,105 - INFO: Disentangling 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/extended_spades/K115/assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg as a/an embplant_pt-insufficient graph ...
2024-09-17 09:52:15,188 - INFO: Vertex_174138_185588_187888_188086_188026_185882_169244_184774_184590 #copy = 1
2024-09-17 09:52:15,188 - INFO: Vertex_179564 #copy = 1
2024-09-17 09:52:15,188 - INFO: Vertex_187184_185306_187236_187916_186784_184808 #copy = 2
2024-09-17 09:52:15,188 - INFO: Vertex_188244_187922_185776_188256_186044_188074_187712_187948_188250_180502_187864_188098_188160_187688_185672 #copy = 1
2024-09-17 09:52:15,188 - INFO: Vertex_188266 #copy = 1
2024-09-17 09:52:15,188 - INFO: Average embplant_pt kmer-coverage = 136.8
2024-09-17 09:52:15,188 - INFO: Average embplant_pt base-coverage = 558.2
2024-09-17 09:52:15,188 - INFO: Writing output ...
2024-09-17 09:52:15,230 - WARNING: More than one structure (gene order) produced ...
2024-09-17 09:52:15,230 - WARNING: Please check the final result to confirm whether they are simply different in SSC direction (two flip-flop configurations)!
2024-09-17 09:52:15,232 - INFO: Writing PATH1 of embplant_pt scaffold(s) to 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/embplant_pt.K115.scaffolds.graph1.1.path_sequence.fasta
2024-09-17 09:52:15,233 - INFO: Writing PATH2 of embplant_pt scaffold(s) to 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/embplant_pt.K115.scaffolds.graph1.2.path_sequence.fasta
2024-09-17 09:52:15,233 - INFO: Writing GRAPH to 22002-02-10_S37_MIRL22-Aare-01_GetOrganelle_with_seed/embplant_pt.K115.contigs.graph1.selected_graph.gfa
2024-09-17 09:52:15,234 - INFO: Result status of embplant_pt: 3 scaffold(s)
2024-09-17 09:52:15,245 - INFO: Writing output finished.
2024-09-17 09:52:15,246 - INFO: Please ...
2024-09-17 09:52:15,246 - INFO: load the graph file 'assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg' in K115
2024-09-17 09:52:15,246 - INFO: load the CSV file 'assembly_graph.fastg.extend-embplant_pt-embplant_mt.csv' in K115
2024-09-17 09:52:15,246 - INFO: visualize and confirm the incomplete result in Bandage.
2024-09-17 09:52:15,246 - INFO: If the result is nearly complete,
2024-09-17 09:52:15,246 - INFO: you can also adjust the arguments according to https://github.com/Kinggerm/GetOrganelle/wiki/FAQ#what-should-i-do-with-incomplete-resultbroken-assembly-graph
2024-09-17 09:52:15,246 - INFO: If you have questions for us, please provide us with the get_org.log.txt file and the post-slimming graph in the format you like!
2024-09-17 09:52:15,246 - INFO: Extracting embplant_pt from the assemblies finished.
Total cost 1351.38 s
Thank you!