Example 3
In this part, we use GetOrganelle to assemble the Mitragyna speciosa chloroplast genome from a real dataset (Genbank SRA SRR5602600).
Let's download the reads (ca. 400M bases) from GenBank. You have to install sra-tools (https://github.com/ncbi/sra-tools) before you can use fastq-dump.
fastq-dump --origfmt --split-files --gzip SRR5602600
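Before starting the download, it can help to confirm that fastq-dump is actually on the PATH. The following is a minimal sketch built on the standard `command -v`; the helper name `require_cmd` is hypothetical and is not part of sra-tools or GetOrganelle.

```shell
#!/bin/sh
# require_cmd is a hypothetical helper, not part of sra-tools or GetOrganelle.
# It returns 0 if the named executable is found on the PATH, 1 otherwise.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing executable: $1 (install it and add it to your PATH)" >&2
    return 1
}

# Only announce the download when fastq-dump is available.
if require_cmd fastq-dump; then
    echo "fastq-dump found; ready to run: fastq-dump --origfmt --split-files --gzip SRR5602600"
fi
```

The same helper can be reused for other executables used later on this page, such as wget or md5sum.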
Alternatively, you can download a reduced dataset from GetOrganelleGallery using wget:
# equivalent to "fastq-dump --origfmt --split-files --gzip -X 500000 SRR5602600"
wget https://github.com/Kinggerm/GetOrganelleGallery/raw/master/Test/reads/SRR5602600_1.fastq.gz
wget https://github.com/Kinggerm/GetOrganelleGallery/raw/master/Test/reads/SRR5602600_2.fastq.gz
# use GitLab if the above links fail, e.g. https://gitlab.com/Kinggerm/GetOrganelleGallery/-/raw/master/Test/reads/SRR5602600_1.fastq.gz
# then check the integrity of the downloaded files using md5sum:
md5sum SRR5602600*.fastq.gz
# 97085d0268344591780429b4c98f74e8 SRR5602600_1.fastq.gz
# 2f03f57795237fe6d9d93f662ed52f2e SRR5602600_2.fastq.gz
# Please re-download the reads if the md5 values do not match
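The comparison against the listed digests can also be scripted instead of being checked by eye. The sketch below assumes GNU md5sum is available; `verify_md5` is a hypothetical helper name, and the digests are the ones listed above for the reduced dataset.

```shell
#!/bin/sh
# verify_md5 is a hypothetical helper: it compares the md5 digest of a file
# with an expected value and returns 0 on a match, 1 otherwise.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <file>"; keep only the digest.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual); please re-download" >&2
        return 1
    fi
}

# Expected digests of the reduced dataset, as "file:digest" pairs.
status=0
for pair in \
    "SRR5602600_1.fastq.gz:97085d0268344591780429b4c98f74e8" \
    "SRR5602600_2.fastq.gz:2f03f57795237fe6d9d93f662ed52f2e"
do
    f=${pair%%:*}
    sum=${pair##*:}
    # Skip files that have not been downloaded yet.
    if [ -f "$f" ]; then
        verify_md5 "$f" "$sum" || status=1
    fi
done
[ "$status" -eq 0 ] || echo "At least one file failed the check." >&2
```

These digests apply only to the reduced dataset from GetOrganelleGallery; the full fastq-dump download will produce different values.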
Run the assembly (memory: ~1.8 GB; duration: ~1000 s):
get_organelle_from_reads.py -1 SRR5602600_1.fastq.gz -2 SRR5602600_2.fastq.gz -o SRR5602600-plastome -R 15 -F embplant_pt
It should finish with the main output files in the output directory SRR5602600-plastome, which are described below. In this case, two isomeric plastome sequences are generated, differing in the orientation of the small single-copy (SSC) region. Both isomers exist in the plant (Palmer 1983; JF Walker et al. 2015), and both are usable. In practice, people usually pick, somewhat arbitrarily, the one with the more commonly used order.
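A quick way to confirm that the run produced complete paths is to count the path_sequence.fasta files in the output directory. This is only a sketch: `count_plastome_paths` is a hypothetical helper, and the file name pattern is taken from the output file names shown on this page.

```shell
#!/bin/sh
# count_plastome_paths is a hypothetical helper: it counts the
# embplant_pt.*.path_sequence.fasta files in a GetOrganelle output directory.
count_plastome_paths() {
    dir=$1
    n=0
    for f in "$dir"/embplant_pt.*.path_sequence.fasta; do
        # When nothing matches, the unexpanded pattern is left as-is; skip it.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

out=SRR5602600-plastome
if [ -d "$out" ]; then
    echo "$(count_plastome_paths "$out") complete path(s) found in $out"
fi
```

For this example the count should be 2, one file per isomer.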
- get_org.log.txt the log file
GetOrganelle v1.7.0-beta5 # file names changed according to 1.7.4-pre
get_organelle_from_reads.py assembles organelle genomes from genome skimming data.
Find updates in https://github.com/Kinggerm/GetOrganelle and see README.md for more information.
Python 3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 23:51:54) [GCC 7.3.0]
PYTHON LIBS: numpy 1.18.1; sympy 1.5.1; scipy 1.4.1; psutil 5.7.0
DEPENDENCIES: Bowtie2 2.3.5.1; SPAdes 3.12.0; Blast 2.9.0; Bandage 0.8.1
SEED DB: embplant_pt 0.0.0; embplant_mt 0.0.0
LABEL DB: embplant_pt 0.0.0; embplant_mt 0.0.0
WORKING DIR: /home/data1
/root/.pyenv/versions/miniconda3-4.3.30/bin/get_organelle_from_reads.py -1 SRR5602600_1.fastq.gz -2 SRR5602600_2.fastq.gz -o SRR5602600-plastome -R 15 -F embplant_pt
2020-06-17 06:13:07,648 - INFO: Pre-reading fastq ...
2020-06-17 06:13:07,649 - INFO: Estimating reads to use ... (to use all reads, set '--reduce-reads-for-coverage inf')
2020-06-17 06:13:08,468 - INFO: Estimating reads to use finished.
2020-06-17 06:13:08,468 - INFO: Unzipping reads file: SRR5602600_1.fastq.gz (236465222 bytes)
2020-06-17 06:13:12,912 - INFO: Unzipping reads file: SRR5602600_2.fastq.gz (268804576 bytes)
2020-06-17 06:13:17,830 - INFO: Counting read qualities ...
2020-06-17 06:13:18,028 - INFO: Identified quality encoding format = Sanger
2020-06-17 06:13:18,030 - INFO: Trimming bases with qualities (0.00%): 33..33 !
2020-06-17 06:13:18,152 - INFO: Mean error rate = 0.0068
2020-06-17 06:13:18,153 - INFO: Counting read lengths ...
2020-06-17 06:13:21,930 - INFO: Mean = 248.1 bp, maximum = 250 bp.
2020-06-17 06:13:21,930 - INFO: Reads used = 1327534+1327534
2020-06-17 06:13:21,931 - INFO: Pre-reading fastq finished.
2020-06-17 06:13:21,931 - INFO: Making seed reads ...
2020-06-17 06:13:21,931 - INFO: Seed bowtie2 index existed!
2020-06-17 06:13:21,931 - INFO: Mapping reads to seed bowtie2 index ...
2020-06-17 06:15:04,166 - INFO: Mapping finished.
2020-06-17 06:15:04,167 - INFO: Seed reads made: SRR5602600-plastome/seed/embplant_pt.initial.fq (29736272 bytes)
2020-06-17 06:15:04,167 - INFO: Making seed reads finished.
2020-06-17 06:15:04,167 - INFO: Checking seed reads and parameters ...
2020-06-17 06:15:04,170 - INFO: The automatically-estimated parameter(s) do not ensure the best choice(s).
2020-06-17 06:15:04,170 - INFO: If the result graph is not a circular organelle genome,
2020-06-17 06:15:04,171 - INFO: you could adjust the value(s) of '-w'/'-R' for another new run.
2020-06-17 06:15:08,163 - INFO: Pre-assembling mapped reads ...
2020-06-17 06:15:24,730 - INFO: Pre-assembling mapped reads finished.
2020-06-17 06:15:24,730 - INFO: Estimated embplant_pt-hitting base-coverage = 134.70
2020-06-17 06:15:24,731 - INFO: Estimated word size(s): 121
2020-06-17 06:15:24,731 - INFO: Setting '-w 121'
2020-06-17 06:15:24,732 - INFO: Setting '--max-extending-len inf'
2020-06-17 06:15:24,836 - INFO: Checking seed reads and parameters finished.
2020-06-17 06:15:24,836 - INFO: Making read index ...
2020-06-17 06:15:51,587 - INFO: Mem 1.556 G, 2619415 candidates in all 2655068 reads
2020-06-17 06:15:51,591 - INFO: Pre-grouping reads ...
2020-06-17 06:15:51,591 - INFO: Setting '--pre-w 121'
2020-06-17 06:15:51,801 - INFO: Mem 1.478 G, 29780/29780 used/duplicated
2020-06-17 06:15:57,680 - INFO: Mem 1.678 G, 354 groups made.
2020-06-17 06:15:57,961 - INFO: Making read index finished.
2020-06-17 06:15:57,961 - INFO: Extending ...
2020-06-17 06:15:57,962 - INFO: Adding initial words ...
2020-06-17 06:16:04,388 - INFO: AW 3931474
2020-06-17 06:16:54,022 - INFO: Round 1: 2619415/2619415 AI 82219 AW 4690882 Mem 1.184
2020-06-17 06:17:38,763 - INFO: Round 2: 2619415/2619415 AI 85040 AW 4814926 Mem 1.205
2020-06-17 06:18:21,302 - INFO: Round 3: 2619415/2619415 AI 87044 AW 4892046 Mem 1.218
2020-06-17 06:19:03,439 - INFO: Round 4: 2619415/2619415 AI 89194 AW 4981136 Mem 1.232
2020-06-17 06:19:46,678 - INFO: Round 5: 2619415/2619415 AI 91359 AW 5077330 Mem 1.249
2020-06-17 06:20:29,524 - INFO: Round 6: 2619415/2619415 AI 93753 AW 5179316 Mem 1.266
2020-06-17 06:21:12,596 - INFO: Round 7: 2619415/2619415 AI 95986 AW 5271878 Mem 1.281
2020-06-17 06:21:55,316 - INFO: Round 8: 2619415/2619415 AI 98070 AW 5360434 Mem 1.296
2020-06-17 06:22:41,349 - INFO: Round 9: 2619415/2619415 AI 99926 AW 5439230 Mem 1.309
2020-06-17 06:23:24,497 - INFO: Round 10: 2619415/2619415 AI 101880 AW 5520172 Mem 1.322
2020-06-17 06:24:06,469 - INFO: Round 11: 2619415/2619415 AI 103884 AW 5606174 Mem 1.462
2020-06-17 06:24:46,017 - INFO: Round 12: 2619415/2619415 AI 105702 AW 5680384 Mem 1.474
2020-06-17 06:25:25,446 - INFO: Round 13: 2619415/2619415 AI 107132 AW 5742880 Mem 1.484
2020-06-17 06:26:05,743 - INFO: Round 14: 2619415/2619415 AI 108591 AW 5803426 Mem 1.494
2020-06-17 06:26:47,253 - INFO: Round 15: 2619415/2619415 AI 110013 AW 5861168 Mem 1.504
2020-06-17 06:26:47,253 - INFO: Hit the round limit 15 and terminated ...
2020-06-17 06:26:52,467 - INFO: Extending finished.
2020-06-17 06:26:52,582 - INFO: Separating extended fastq file ...
2020-06-17 06:26:52,914 - INFO: Setting '-k 21,55,85,115'
2020-06-17 06:26:52,914 - INFO: Assembling using SPAdes ...
2020-06-17 06:26:52,922 - INFO: spades.py -t 1 -1 SRR5602600-plastome/extended_1_paired.fq -2 SRR5602600-plastome/extended_2_paired.fq --s1 SRR5602600-plastome/extended_1_unpaired.fq --s2 SRR5602600-plastome/extended_2_unpaired.fq -k 21,55,85,115 -o SRR5602600-plastome/extended_spades
2020-06-17 06:29:29,338 - INFO: Insert size = 593.543, deviation = 155.031, left quantile = 393, right quantile = 782
2020-06-17 06:29:29,338 - INFO: Assembling finished.
2020-06-17 06:29:30,722 - INFO: Slimming SRR5602600-plastome/extended_spades/K115/assembly_graph.fastg finished!
2020-06-17 06:29:31,948 - INFO: Slimming SRR5602600-plastome/extended_spades/K85/assembly_graph.fastg finished!
2020-06-17 06:29:33,192 - INFO: Slimming SRR5602600-plastome/extended_spades/K55/assembly_graph.fastg finished!
2020-06-17 06:29:33,193 - INFO: Slimming assembly graphs finished.
2020-06-17 06:29:33,193 - INFO: Extracting embplant_pt from the assemblies ...
2020-06-17 06:29:33,194 - INFO: Disentangling SRR5602600-plastome/extended_spades/K115/assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg as a circular genome ...
2020-06-17 06:29:33,831 - INFO: Vertex_90758 #copy = 1
2020-06-17 06:29:33,832 - INFO: Vertex_90914_90832_88780_87750_89202_91046_90882_90874_91020 #copy = 2
2020-06-17 06:29:33,832 - INFO: Vertex_91022_90794_90740_16908_90820_88008_90654_2542_91008 #copy = 1
2020-06-17 06:29:33,833 - INFO: Average embplant_pt kmer-coverage = 75.2
2020-06-17 06:29:33,833 - INFO: Average embplant_pt base-coverage = 139.2
2020-06-17 06:29:33,833 - INFO: Writing output ...
2020-06-17 06:29:33,878 - WARNING: More than one circular genome structure produced ...
2020-06-17 06:29:33,879 - WARNING: Please check the final result to confirm whether they are simply flip-flop configurations!
2020-06-17 06:29:34,306 - INFO: Detecting large repeats (>1000 bp) in PATH1 with IRs detected, Total:LSC:SSC:Repeat(bp) = 155617:86151:18114:25676
2020-06-17 06:29:34,306 - INFO: Writing PATH1 of complete embplant_pt to SRR5602600-plastome/embplant_pt.K115.complete.graph1.1.path_sequence.fasta
2020-06-17 06:29:34,309 - INFO: Writing PATH2 of complete embplant_pt to SRR5602600-plastome/embplant_pt.K115.complete.graph1.2.path_sequence.fasta
2020-06-17 06:29:34,309 - INFO: Writing GRAPH to SRR5602600-plastome/embplant_pt.K115.complete.graph1.selected_graph.gfa
2020-06-17 06:29:34,511 - INFO: Writing GRAPH image to SRR5602600-plastome/embplant_pt.K115.complete.graph1.selected_graph.png
2020-06-17 06:29:34,512 - INFO: Result status of embplant_pt: circular genome
2020-06-17 06:29:34,537 - INFO: Please check the produced assembly image or manually visualize SRR5602600-plastome/extended_K115.assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg using Bandage to confirm the final result.
2020-06-17 06:29:34,537 - INFO: Writing output finished.
2020-06-17 06:29:34,538 - INFO: Extracting embplant_pt from the assemblies finished.
Total cost 987.39 s
Thank you!
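The "Total:LSC:SSC:Repeat(bp)" line in the log can be sanity-checked with a little arithmetic: in a plastome with two inverted-repeat copies, the total length should equal LSC + SSC + 2 × Repeat. The sketch below extracts those four numbers and tests that relation; `check_quadripartite` is a hypothetical helper, not a GetOrganelle command.

```shell
#!/bin/sh
# check_quadripartite is a hypothetical helper. It reads text on stdin, finds
# the first "Total:LSC:SSC:Repeat(bp) = T:L:S:R" entry, and checks that
# T = L + S + 2 * R (the repeat is present in two copies).
check_quadripartite() {
    sed -n 's/.*Total:LSC:SSC:Repeat(bp) = \([0-9:]*\).*/\1/p' | head -n 1 |
    awk -F: '
        NF == 4 {
            sum = $2 + $3 + 2 * $4
            verdict = (sum == $1) ? "consistent" : "inconsistent"
            printf "total=%d lsc=%d ssc=%d ir=%d sum=%d %s\n", $1, $2, $3, $4, sum, verdict
        }'
}

# Values copied from the log above: 86151 + 18114 + 2 * 25676 = 155617.
echo "Total:LSC:SSC:Repeat(bp) = 155617:86151:18114:25676" | check_quadripartite
```

To run the check on a real log, redirect the file instead: `check_quadripartite < SRR5602600-plastome/get_org.log.txt`.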
- extended_K115.assembly_graph.fastg
- extended_K115.assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg
Coverages and lengths of contigs are shown. GetOrganelle uses integrated information (blast hits, coverage, contig connections) to identify the chloroplast contigs (green) and to navigate paths that generate the final plastome files (embplant_pt.K115.complete.graph1.*.path_sequence.fasta).
- extended_K115.assembly_graph.fastg.extend-embplant_pt-embplant_mt.csv

| EDGE | database | loci | loci_gene_sequential | loci_sequential | details |
| --- | --- | --- | --- | --- | --- |
| 16908 | embplant_pt | petA | petA>>90820 | petA>>90820 | petA(1-126,embplant_pt)>>90820 |
| 2542 | embplant_pt | psaA | psaA>>90654 | psaA>>90654 | psaA(1-128,embplant_pt)>>90654 |
| 83412 | embplant_pt | psaA | psaA | psaA | psaA(1-411,embplant_pt) |
| 85620 | embplant_pt | petA | petA>>14686 | petA>>14686 | petA(1-150,embplant_pt)>>14686 |
| 86056 | embplant_mt;embplant_pt | ccmFC,ccmFc;psbD | psbD>>ccmFc>>ccmFC | psbD>>ccmFc>>ccmFC | psbD(1153-1239,embplant_pt)>>ccmFc(4673-5246,embplant_mt)>>ccmFC(4673-5203,embplant_mt) |
| 86582 | embplant_mt | rrnL | rrnL | rrnL | rrnL(179-530,embplant_mt) |
| 88008 | embplant_pt | rbcL | rbcL>>90654 | rbcL>>90654 | rbcL(1-125,embplant_pt)>>90654 |
| 88408 | embplant_mt | matR | matR>>89824 | matR>>89824 | matR(41-445,embplant_mt)>>89824 |
| 89230 | embplant_mt | rps19,rps3 | rps19>>rps3>>86202 | rps19>>rps3>>86202 | rps19(2297-2429,embplant_mt)>>rps3(2392-2465,embplant_mt)>>86202 |
| 89350 | embplant_mt;embplant_pt | rps1;psaA | psaA>>rps1>>82234 | psaA>>rps1>>82234 | psaA(1-292,embplant_pt)>>rps1(10090-10695,embplant_mt)>>82234 |
| 89488 | embplant_mt | cox2,rpl5,rps14 | rps14>>rpl5>>cox2>>80822 | rps14>>rpl5>>cox2>>80822 | rps14(130-392,embplant_mt)>>rpl5(394-948,embplant_mt)>>cox2(1067-1151,embplant_mt)>>80822 |
| 89552 | embplant_pt | rpoC2 | rpoC2>>11862 | rpoC2>>11862 | rpoC2(7442-7593,embplant_pt)>>11862 |
| 89792 | embplant_mt;embplant_pt | ccmC,rrn5,rrnS;rrn16 | rrnS>>rrn16>>rrn5>>ccmC>>90874 | rrnS>>rrn16>>rrn5>>ccmC>>90874 | rrnS(8904-10745,embplant_mt)>>rrn16(9165-10577,embplant_pt)>>rrn5(10897-11012,embplant_mt)>>ccmC(11186-11904,embplant_mt)>>90874 |
| 89824 | embplant_mt;embplant_pt | nad1,rps13;petA | petA>>rps13>>nad1>>90030 | petA>>rps13>>nad1>>90030 | petA(1-346,embplant_pt)>>rps13(3465-3815,embplant_mt)>>nad1(4737-6358,embplant_mt)>>90030 |
| 89856 | embplant_mt | atp8,cox3,rps4,sdh4 | rps4>>sdh4>>cox3>>atp8>>85138 | rps4>>sdh4>>cox3>>atp8>>85138 | rps4(184-1251,embplant_mt)>>sdh4(2703-3089,embplant_mt)>>cox3(3017-3814,embplant_mt)>>atp8(4925-5407,embplant_mt)>>85138 |
| 89898 | embplant_mt | ND5,nad5 | nad5>>ND5>>nad5>>18246 | nad5>>ND5>>nad5>>18246 | nad5(4418-5635,embplant_mt)>>ND5(4418-6714,embplant_mt)>>nad5(6485-6714,embplant_mt)>>18246 |
| 89906 | embplant_mt;embplant_pt | mttB,tatC;psaA | mttB>>tatC>>psaA>>2542 | mttB>>tatC>>psaA>>2542 | mttB(6468-7270,embplant_mt)>>tatC(6519-7216,embplant_mt)>>psaA(7437-7695,embplant_pt)>>2542 |
| 89920 | embplant_mt | ccmB | ccmB>>82234 | ccmB>>82234 | ccmB(8492-9117,embplant_mt)>>82234 |
| 89928 | embplant_mt | atp6,nad6 | atp6>>nad6>>21936 | atp6>>nad6>>21936 | atp6(2448-3221,embplant_mt)>>nad6(4718-5335,embplant_mt)>>21936 |
| 90010 | embplant_pt | rrn23 | rrn23>>90832 | rrn23>>90832 | rrn23(6057-6175,embplant_pt)>>90832 |
| 90018 | embplant_mt | ND5,atp9,nad3,nad5,rpl2,rps12,rps3 | atp9>>nad5>>nad3>>rps12>>rpl2>>rps3>>nad5>>ND5>>nad5>>ND5>>18246 | atp9>>nad5>>nad3>>rps12>>rpl2>>rps3>>nad5>>ND5>>nad5>>ND5>>18246 | atp9(103-327,embplant_mt)>>nad5(8244-8340,embplant_mt)>>nad3(9420-9776,embplant_mt)>>rps12(9825-10202,embplant_mt)>>rpl2(14637-17205,embplant_mt)>>rps3(17883-17956,embplant_mt)>>nad5(18688-19082,embplant_mt)>>ND5(18688-19082,embplant_mt)>>nad5(20185-20315,embplant_mt)>>ND5(20185-20306,embplant_mt)>>18246 |
| 90030 | embplant_mt;embplant_pt | cox2;petG | cox2>>petG>>90794 | cox2>>petG>>90794 | cox2(420-2721,embplant_mt)>>petG(4942-5061,embplant_pt)>>90794 |
| 90110 | embplant_mt | nad7 | nad7 | nad7 | nad7(5244-8330,embplant_mt) |
| 90162 | embplant_mt;embplant_pt | ccmFN,ccmFn,nad9;ndhK | nad9>>ndhK>>ccmFn>>ccmFN>>16442 | nad9>>ndhK>>ccmFn>>ccmFN>>16442 | nad9(4628-5200,embplant_mt)>>ndhK(11193-11650,embplant_pt)>>ccmFn(16668-18294,embplant_mt)>>ccmFN(16696-18294,embplant_mt)>>16442 |
| 90166 | embplant_mt | ND2,nad2 | nad2>>ND2>>85138 | nad2>>ND2>>85138 | nad2(91-279,embplant_mt)>>ND2(91-275,embplant_mt)>>85138 |
| 90174 | embplant_mt;embplant_pt | nad1;psbC,rpl2 | rpl2>>nad1>>psbC>>21936 | rpl2>>nad1>>psbC>>21936 | rpl2(1-141,embplant_pt)>>nad1(2323-2708,embplant_mt)>>psbC(3567-3714,embplant_pt)>>21936 |
| 90178 | embplant_pt | ndhA,rbcL | ndhA>>rbcL>>88008 | ndhA>>rbcL>>88008 | ndhA(3891-4140,embplant_pt)>>rbcL(4130-4448,embplant_pt)>>88008 |
| 90190 | embplant_mt | rps7 | rps7 | rps7 | rps7(6025-6496,embplant_mt) |
| 90256 | embplant_mt;embplant_pt | atp1;atpA | atp1>>atpA>>14686 | atp1>>atpA>>14686 | atp1(7030-8543,embplant_mt)>>atpA(7305-7735,embplant_pt)>>14686 |
| 90300 | embplant_pt | rrn16 | rrn16>>91046 | rrn16>>91046 | rrn16(103-185,embplant_pt)>>91046 |
| 90546 | embplant_mt;embplant_pt | cox1,rps10;psaA | rps10>>cox1>>psaA>>14686 | rps10>>cox1>>psaA>>14686 | rps10(3110-4285,embplant_mt)>>cox1(4490-6071,embplant_mt)>>psaA(6932-7128,embplant_pt)>>14686 |
| 90654 | embplant_pt | atpB,atpE,ndhC,ndhJ,ndhK,psaA,rps4,ycf3 | psaA>>ycf3>>rps4>>ndhJ>>ndhK>>ndhC>>atpE>>atpB>>88008 | psaA>>ycf3>>rps4>>ndhJ>>ndhK>>ndhC>>atpE>>atpB>>88008 | psaA(1-1330,embplant_pt)>>ycf3(2088-4052,embplant_pt)>>rps4(4850-5455,embplant_pt)>>ndhJ(8327-8805,embplant_pt)>>ndhK(8910-9668,embplant_pt)>>ndhC(9638-10000,embplant_pt)>>atpE(12328-12719,embplant_pt)>>atpB(12716-14212,embplant_pt)>>88008 |
| 90740 | embplant_pt | petA,petG,psbE,psbF,psbJ,psbL | petA>>psbJ>>psbL>>psbF>>psbE>>petG>>90794 | petA>>psbJ>>psbL>>psbF>>psbE>>petG>>90794 | petA(1-154,embplant_pt)>>psbJ(1144-1266,embplant_pt)>>psbL(1401-1517,embplant_pt)>>psbF(1541-1660,embplant_pt)>>psbE(1670-1921,embplant_pt)>>petG(3270-3383,embplant_pt)>>90794 |
| 90758 | embplant_pt | ccsA,ndhA,ndhD,ndhE,ndhF,ndhG,ndhH,ndhI,psaC,rpl32,rps15,ycf1 | ycf1>>rps15>>ndhH>>ndhA>>ndhI>>ndhG>>ndhE>>psaC>>ndhD>>ccsA>>rpl32>>ndhF>>90914 | ycf1>>rps15>>ndhH>>ndhA>>ndhI>>ndhG>>ndhE>>psaC>>ndhD>>ccsA>>rpl32>>ndhF>>90914 | ycf1(72-4588,embplant_pt)>>rps15(5013-5250,embplant_pt)>>ndhH(5356-6537,embplant_pt)>>ndhA(6539-8710,embplant_pt)>>ndhI(8798-9331,embplant_pt)>>ndhG(9676-10201,embplant_pt)>>ndhE(10314-10616,embplant_pt)>>psaC(10865-11110,embplant_pt)>>ndhD(11221-12713,embplant_pt)>>ccsA(12993-13950,embplant_pt)>>rpl32(15021-15174,embplant_pt)>>ndhF(15986-18181,embplant_pt)>>90914 |
| 90820 | embplant_pt | accD,cemA,petA,psaI,rbcL,ycf4 | petA>>cemA>>ycf4>>psaI>>accD>>rbcL>>88008 | petA>>cemA>>ycf4>>psaI>>accD>>rbcL>>88008 | petA(1-910,embplant_pt)>>cemA(1150-1886,embplant_pt)>>ycf4(2705-3256,embplant_pt)>>psaI(3686-3790,embplant_pt)>>accD(4519-5302,embplant_pt)>>rbcL(6785-8170,embplant_pt)>>88008 |
| 90832 | embplant_pt | rrn23 | rrn23>>90914 | rrn23>>90914 | rrn23(586-2092,embplant_pt)>>90914 |
| 90874 | embplant_pt | rpl2,rpl23 | rpl2>>rpl23>>90882 | rpl2>>rpl23>>90882 | rpl2(1-1163,embplant_pt)>>rpl23(1184-1465,embplant_pt)>>90882 |
| 90882 | embplant_mt;embplant_pt | rrnS;ndhB,rps12_2,rps7,rrn16,ycf15,ycf2 | ycf2>>ycf15>>ndhB>>rps7>>rps12_2>>rrn16>>rrnS>>91046 | ycf2>>ycf15>>ndhB>>rps7>>rps12_2>>rrn16>>rrnS>>91046 | ycf2(248-7116,embplant_pt)>>ycf15(7232-7334,embplant_pt)>>ndhB(8325-10487,embplant_pt)>>rps7(10831-11298,embplant_pt)>>rps12_2(11351-12134,embplant_pt)>>rrn16(14043-15533,embplant_pt)>>rrnS(14256-15409,embplant_mt)>>91046 |
| 90914 | embplant_mt;embplant_pt | rrnL;rrn23,rrn4.5,rrn5,ycf1 | ycf1>>rrn5>>rrn4.5>>rrn23>>rrnL>>90832 | ycf1>>rrn5>>rrn4.5>>rrn23>>rrnL>>90832 | ycf1(183-1136,embplant_pt)>>rrn5(2419-2539,embplant_pt)>>rrn4.5(2760-2861,embplant_pt)>>rrn23(2962-4379,embplant_pt)>>rrnL(2993-4137,embplant_mt)>>90832 |
| 91008 | embplant_pt | atpA,atpF,atpH,atpI,matK,petN,psaA,psaB,psbA,psbC,psbD,psbI,psbK,psbM,psbZ,rpoB,rpoC1,rpoC2,rps14,rps16,rps2 | psbA>>matK>>rps16>>psbK>>psbI>>atpA>>atpF>>atpH>>atpI>>rps2>>rpoC2>>rpoC1>>rpoB>>petN>>psbM>>psbD>>psbC>>psbZ>>rps14>>psaB>>psaA>>2542 | psbA>>matK>>rps16>>psbK>>psbI>>atpA>>atpF>>atpH>>atpI>>rps2>>rpoC2>>rpoC1>>rpoB>>petN>>psbM>>psbD>>psbC>>psbZ>>rps14>>psaB>>psaA>>2542 | psbA(471-1532,embplant_pt)>>matK(2173-3585,embplant_pt)>>rps16(5189-5473,embplant_pt)>>psbK(8115-8300,embplant_pt)>>psbI(8719-8829,embplant_pt)>>atpA(10776-12293,embplant_pt)>>atpF(12354-13619,embplant_pt)>>atpH(13972-14217,embplant_pt)>>atpI(15244-15974,embplant_pt)>>rps2(16201-16911,embplant_pt)>>rpoC2(17169-21330,embplant_pt)>>rpoC1(21527-24284,embplant_pt)>>rpoB(24313-27523,embplant_pt)>>petN(29500-29589,embplant_pt)>>psbM(30760-33018,embplant_pt)>>psbD(34304-35365,embplant_pt)>>psbC(35313-36734,embplant_pt)>>psbZ(37379-37567,embplant_pt)>>rps14(38356-38658,embplant_pt)>>psaB(38793-40997,embplant_pt)>>psaA(40090-42047,embplant_pt)>>2542 |
| 91020 | embplant_pt | rpl2 | rpl2>>91008 | rpl2>>91008 | rpl2(1-422,embplant_pt)>>91008 |
| 91022 | embplant_pt | clpP,infA,petB,petD,psaJ,psbB,psbH,psbN,psbT,rpl14,rpl16,rpl20,rpl22,rpl33,rpl36,rpoA,rps11,rps12_1,rps18,rps19,rps3,rps8 | rps19>>rpl22>>rps3>>rpl16>>rpl14>>rps8>>infA>>rpl36>>rps11>>rpoA>>petD>>petB>>psbH>>psbN>>psbT>>psbB>>clpP>>rps12_1>>rpl20>>rps18>>rpl33>>psaJ>>90794 | rps19>>rpl22>>rps3>>rpl16>>rpl14>>rps8>>infA>>rpl36>>rps11>>rpoA>>petD>>petB>>psbH>>psbN>>psbT>>psbB>>clpP>>rps12_1>>rpl20>>rps18>>rpl33>>psaJ>>90794 | rps19(73-350,embplant_pt)>>rpl22(426-751,embplant_pt)>>rps3(860-1516,embplant_pt)>>rpl16(2571-3043,embplant_pt)>>rpl14(3182-3550,embplant_pt)>>rps8(3718-4122,embplant_pt)>>infA(4250-4483,embplant_pt)>>rpl36(4604-4717,embplant_pt)>>rps11(4826-5237,embplant_pt)>>rpoA(5312-6282,embplant_pt)>>petD(6504-7689,embplant_pt)>>petB(7869-9298,embplant_pt)>>psbH(9423-9644,embplant_pt)>>psbN(9746-9876,embplant_pt)>>psbT(9938-10045,embplant_pt)>>psbB(10241-11767,embplant_pt)>>clpP(12226-14292,embplant_pt)>>rps12_1(14433-14546,embplant_pt)>>rpl20(15326-15679,embplant_pt)>>rps18(15936-16241,embplant_pt)>>rpl33(16406-16606,embplant_pt)>>psaJ(17061-17187,embplant_pt)>>90794 |
| 91046 | embplant_pt | rrn16 | rrn16>>90882 | rrn16>>90882 | rrn16(1314-1396,embplant_pt)>>90882 |
Note that the loci information is based on blast hits rather than on a thorough annotation. This file can be loaded into Bandage to visualize the blast hits of the contigs.
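To get a quick overview of how many contigs were labelled with each database, the second column of this table can be tallied. The sketch below makes an assumption that should be checked against your own file: that the .csv is tab-delimited, with a header row and the database label in column 2. `count_by_database` is a hypothetical helper name.

```shell
#!/bin/sh
# count_by_database is a hypothetical helper. ASSUMPTION: the input is
# tab-delimited, has one header row, and holds the database label in column 2.
count_by_database() {
    awk -F'\t' 'NR > 1 && NF >= 2 { n[$2]++ }
                END { for (d in n) printf "%s\t%d\n", d, n[d] }' | LC_ALL=C sort
}

# Three rows taken from the table above (first three columns only).
{
    printf 'EDGE\tdatabase\tloci\n'
    printf '16908\tembplant_pt\tpetA\n'
    printf '86582\tembplant_mt\trrnL\n'
    printf '86056\tembplant_mt;embplant_pt\tccmFC,ccmFc;psbD\n'
} | count_by_database
```

Contigs labelled with both databases (embplant_mt;embplant_pt) are counted as their own group here, which keeps ambiguous hits visible instead of assigning them to one organelle.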
- embplant_pt.K115.complete.graph1.1.path_sequence.fasta assembled plastome
>91022_90794_90740_16908_90820_88008_90654_2542_91008-,90914_90832_88780_87750_89202_91046_90882_90874_91020-,90758-,90914_90832_88780_87750_89202_91046_90882_90874_91020+(circular)
GTGGGCGAACGACGGGAATTGAACCCGCGCATGGTGGAT.. (155617 bp)
- embplant_pt.K115.complete.graph1.2.path_sequence.fasta assembled plastome
>91022_90794_90740_16908_90820_88008_90654_2542_91008-,90914_90832_88780_87750_89202_91046_90882_90874_91020-,90758+,90914_90832_88780_87750_89202_91046_90882_90874_91020+(circular)
GTGGGCGAACGACGGGAATTGAACCCGCGCATGGTGGAT.. (155617 bp)
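The two FASTA headers list the same segments in the same order, and differ only in the +/- orientation of one of them. A small sketch for spotting that segment automatically follows; `flipped_segments` is a hypothetical helper, and it assumes that each header is a comma-separated list of segment names that end in + or -, optionally followed by "(circular)", as in the two headers above.

```shell
#!/bin/sh
# flipped_segments is a hypothetical helper. It takes two path headers, splits
# them on commas, and prints the name of every segment that appears at the same
# position in both paths but with a different trailing +/- sign.
flipped_segments() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { sub(/^>/, ""); sub(/\(circular\)$/, "") }
        NR == 1 { n = split($0, a, ",") }
        NR == 2 {
            m = split($0, b, ",")
            for (i = 1; i <= n && i <= m; i++) {
                name_a = substr(a[i], 1, length(a[i]) - 1)
                name_b = substr(b[i], 1, length(b[i]) - 1)
                if (name_a == name_b && a[i] != b[i]) print name_a
            }
        }'
}

# Headers of the two isomers, copied from the FASTA files above.
ir=90914_90832_88780_87750_89202_91046_90882_90874_91020
lsc=91022_90794_90740_16908_90820_88008_90654_2542_91008
flipped_segments ">${lsc}-,${ir}-,90758-,${ir}+(circular)" \
                 ">${lsc}-,${ir}-,90758+,${ir}+(circular)"
```

For these two headers the only segment reported is 90758, the single-copy segment that sits between the two copies of the repeated segment in the path.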
- embplant_pt.K115.complete.graph1.selected_graph.gfa assembly graph
S 90758 ATCTGAATATGAATGGGAATCAAGAAAATTCGAAATTAAAAGATAAAAAAGCTACTGAAACAAAGGAACTCTTCCGCTTTGAAAAACCTCTTG...
S 90914_90832_88780_87750_89202_91046_90882_90874_91020 TTATAATCAAAAAGAGTAGTTACAAGAGGTTTTTCAAAGCGGAAG...
S 91022_90794_90740_16908_90820_88008_90654_2542_91008 AGTAAATAGGAGAAAAATACAATTTTTTTCTTCGTCTTTACAAAA...
L 90758 - 90914_90832_88780_87750_89202_91046_90882_90874_91020 + 115M
L 90758 + 90914_90832_88780_87750_89202_91046_90882_90874_91020 + 115M
L 90914_90832_88780_87750_89202_91046_90882_90874_91020 + 91022_90794_90740_16908_90820_88008_90654_2542_91008 - 115M
L 90914_90832_88780_87750_89202_91046_90882_90874_91020 + 91022_90794_90740_16908_90820_88008_90654_2542_91008 + 115M
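The GFA file is plain text, so its segments (S lines) and links (L lines) can be summarized without opening Bandage. The sketch below relies on the GFA convention that fields are tab-separated; `gfa_summary` is a hypothetical helper, not part of GetOrganelle or Bandage.

```shell
#!/bin/sh
# gfa_summary is a hypothetical helper. It reads a GFA file on stdin and prints
# the name and sequence length of every segment (S line), followed by the
# number of links (L lines). Fields are assumed to be tab-separated.
gfa_summary() {
    awk -F'\t' '
        $1 == "S" { printf "segment\t%s\t%d\n", $2, length($3) }
        $1 == "L" { links++ }
        END { printf "links\t%d\n", links }'
}

gfa=SRR5602600-plastome/embplant_pt.K115.complete.graph1.selected_graph.gfa
# Only summarize the graph when the assembly output is present.
if [ -f "$gfa" ]; then
    gfa_summary < "$gfa"
fi
```

For the graph shown above, this should list three segments and four links.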
- embplant_pt.K115.complete.graph1.selected_graph.png
This image of the selected graph, as visualized by Bandage, is generated only if Bandage has been added to the $PATH (optional).
In addition, you will find many temporary files, which you can delete after a successful assembly.
- seed subfolder of seed reads and parameter estimation temp files
  - embplant_pt.initial.sam mapping alignment
  - embplant_pt.initial.fq seed reads used for read extending
  - embplant_pt.initial.fq.spades used for parameter estimation, draft assembly
- extended_1_paired.fq target organelle associated reads - paired forward
- extended_1_unpaired.fq target organelle associated reads - unpaired forward
- extended_2_paired.fq target organelle associated reads - paired reverse
- extended_2_unpaired.fq target organelle associated reads - unpaired reverse
- extended_spades subfolder of SPAdes assemblies
  - scaffolds.paths
  - assembly_graph_with_scaffolds.gfa
  - assembly_graph.fastg
  - spades.log
  - K115
    - assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg
    - assembly_graph.fastg.extend-embplant_pt-embplant_mt.csv
    - (other files)
  - K85
    - assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg
    - assembly_graph.fastg.extend-embplant_pt-embplant_mt.csv
    - (other files)
  - K55
    - assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg
    - assembly_graph.fastg.extend-embplant_pt-embplant_mt.csv
    - (other files)
  - (other files) K21, input_dataset.yaml, contigs.fasta, scaffolds.fasta, etc.
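Before deleting anything, it is safer to list the temporary items first and review them. The sketch below only prints paths and removes nothing; `list_temp_files` is a hypothetical helper, and the names it looks for (seed, extended_spades, extended_*.fq) are the ones listed above.

```shell
#!/bin/sh
# list_temp_files is a hypothetical helper. It prints the temporary items
# described above that exist in a GetOrganelle output directory.
# It is a dry run: nothing is deleted.
list_temp_files() {
    dir=$1
    for p in "$dir"/seed "$dir"/extended_spades "$dir"/extended_*.fq; do
        # Unmatched patterns stay unexpanded; print only what really exists.
        if [ -e "$p" ]; then
            echo "$p"
        fi
    done
}

out=SRR5602600-plastome
if [ -d "$out" ]; then
    list_temp_files "$out"
fi
```

Once the assembly has been checked and the listed paths look right, they can be removed by hand. The final FASTA, GFA, and log files are not matched by these patterns.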