The complete genome sequence of Eimeria tenella (Tyzzer 1929), a common gut parasite of chickens

We present a genome assembly from a clonal population of Eimeria tenella Houghton parasites (Apicomplexa; Conoidasida; Eucoccidiorida; Eimeriidae). The genome sequence is 53.25 megabases in span. The entire assembly is scaffolded into 15 chromosomal pseudomolecules, with complete mitochondrion and apicoplast organellar genomes also present.


Introduction
The genome of Eimeria tenella (Houghton strain) was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all of the named eukaryotic species in Britain and Ireland. Here we present a chromosomally complete genome sequence based on a clonal specimen maintained initially at the Houghton Poultry Research Station (HPRS) and more recently at the Royal Veterinary College, Hertfordshire, UK, where it was collected from experimentally infected Gallus gallus domesticus. This apicomplexan parasite is a major cause of coccidiosis in farmed chickens in the UK.

Genome sequence report
The genome was sequenced from a clonal specimen of E. tenella collected from experimentally infected G. gallus domesticus at the Royal Veterinary College, UK. A total of 41-fold coverage in Pacific Biosciences single-molecule long reads (N50 8 kb) and 107-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 200 missing joins or misjoins, reducing the scaffold number by 77.9%, increasing the scaffold N50 by 0.1% and decreasing the assembly length by 1.85%. The final assembly has a total length of 53.25 Mb in 15 chromosomal scaffolds, one mitochondrial scaffold and one apicoplast scaffold. The total scaffold N50 was 4.01 Mb (Table 1). The chromosomal scaffolds are numbered by sequence length, 1 being the smallest and 15 the largest, as is typical for Apicomplexa (Figure 1-Figure 3; Table 2). The organellar mitochondrial and apicoplast genome sequences were each assembled into single contigs and circularized to remove redundancy. The assembly has a BUSCO v5.1.2 (Simão et al., 2015) completeness of 98.8% and duplication rate of 0.2% using the coccidia_odb10 reference set.
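For readers less familiar with the scaffold N50 statistic quoted above, the following is a minimal Python sketch of how it is commonly computed. The function name and the toy lengths are illustrative assumptions; they are not the E. tenella scaffold lengths, and this is not the software used by the authors.

```python
def scaffold_n50(lengths):
    """Return the N50: the largest length L such that scaffolds of
    length >= L together cover at least half of the total span."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the span is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy lengths in arbitrary units (hypothetical, not taken from the assembly).
toy_lengths = [10, 8, 6, 4, 2]
print(scaffold_n50(toy_lengths))
```

Because the statistic depends only on the sorted lengths and the total span, merging scaffolds during curation can change the scaffold count substantially while leaving the N50 almost unchanged, as reported here.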
Of particular note is that 15 chromosomal scaffolds were identified, each with telomeres attached to both ends. This calls into question previous reports which suggested a haploid chromosome number of 14 for this species (del Cacho et al., 2005). The Hi-C map (Figure 4) shows that each of the 15 chromosomal scaffolds has a single contact region with each of the others. It has been shown in the coccidian relative Toxoplasma gondii that centromeres are sequestered together within the nucleus throughout the cell cycle (Brooks et al., 2011). The Hi-C map suggests that this also occurs in E. tenella and, if true, further supports the existence of 15 chromosomes. We examined the putative centromeric regions as identified by Hi-C in the Artemis genome browser (Carver et al., 2012) and found almost all to be in intergenic regions of, on average, 35 kb (min=15 kb, max=74 kb). The exception was chromosome 1, where it was adjacent to a repeat near the end of the chromosome. The data suggest that E. tenella chromosomes have single, well-localised centromeres which occupy acrocentric and sub-metacentric positions (Table 2).
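The telomere-to-telomere criterion described above can be illustrated with a short Python sketch. It only shows the general idea of looking for tandem telomeric repeats near both scaffold ends; it is not the authors' procedure. The function names, the window size, the minimum repeat count and the motif used in the toy example (TTTAGGG) are all assumptions made for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_telomeres_at_both_ends(sequence, motif, window=500, min_repeats=3):
    """True if at least `min_repeats` tandem copies of `motif` (on either
    strand) occur within `window` bases of each end of the sequence."""
    seq = sequence.upper()
    motif = motif.upper()
    # Search for the tandem repeat on both strands.
    probes = (motif * min_repeats, reverse_complement(motif) * min_repeats)
    head, tail = seq[:window], seq[-window:]
    return any(p in head for p in probes) and any(p in tail for p in probes)


# Toy scaffold (hypothetical): repeats at both ends around a filler body.
toy = "CCCTAAA" * 5 + "ACGT" * 100 + "TTTAGGG" * 5
print(has_telomeres_at_both_ends(toy, "TTTAGGG", window=50))
```

The window should be much shorter than the scaffold; otherwise the two end windows overlap and a repeat at one end could be counted for both.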
The GC content of the genome was 58.6%.

Genome annotation report
We identified 7268 protein-coding genes. Around 2000 gene models were manually corrected. The average exon length was 350.1 bp and the average intron length 298.1 bp, with an average of 6.34 exons per gene. We annotated 44 pseudogenes, 32 degraded LTR retrotransposons (currently not included in the GFF annotation), 140 rRNAs, 31 repeat regions, 28 ncRNAs and 345 tRNAs.
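As an illustration of how summary statistics of this kind can be derived from gene models, here is a minimal Python sketch. It assumes one transcript per gene and exon coordinates that are 1-based and inclusive, as in GFF. The function name and the toy gene models are hypothetical; how the authors computed their figures is not stated.

```python
def exon_intron_summary(genes):
    """`genes` maps a gene ID to a list of (start, end) exon coordinates
    (1-based, inclusive). Returns a tuple of mean exon length,
    mean intron length and mean number of exons per gene."""
    if not genes:
        raise ValueError("no gene models given")
    exon_lengths, intron_lengths = [], []
    for exons in genes.values():
        ordered = sorted(exons)
        exon_lengths += [end - start + 1 for start, end in ordered]
        # An intron is the gap between two consecutive exons.
        intron_lengths += [
            nxt[0] - prev[1] - 1 for prev, nxt in zip(ordered, ordered[1:])
        ]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(exon_lengths), mean(intron_lengths), len(exon_lengths) / len(genes)


# Toy gene models (hypothetical): one two-exon gene and one single-exon gene.
toy_genes = {"geneA": [(1, 100), (201, 300)], "geneB": [(1, 50)]}
print(exon_intron_summary(toy_genes))
```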

Methods
A clonal specimen of E. tenella was collected from experimentally infected G. gallus domesticus at the Royal Veterinary College, Hertfordshire, UK. Four-week-old Lohmann Valo chickens reared under specific pathogen-free conditions were used to propagate oocysts of the E. tenella Houghton strain as

In this Data Note, Eerik Aunin and colleagues present a complete genome assembly and a partially manually curated annotation of a clonal population of the apicomplexan parasite Eimeria tenella (Houghton strain), produced as part of the Darwin Tree of Life Project. E. tenella is a major parasite of chickens in the UK and worldwide and causes massive economic losses to poultry farming.
Using a combination of single-molecule long reads (Pacific Biosciences) and 10X Genomics read clouds, they generated a set of primary assembly contigs, which were subsequently scaffolded using chromosome conformation Hi-C data. This resulted in a 53.25-megabase assembly of the E. tenella (Houghton) genome, represented by 15 chromosomal pseudomolecules, along with complete organellar genomes (mitochondrion and apicoplast). Remarkably, the authors have provided sufficient evidence that E. tenella has 15 chromosomes (as opposed to the 14 reported previously) and that all of them have telomeres attached to both ends, thus confirming the truly 'complete' nature of the assembly. The authors also re-annotated the genome with appropriate bioinformatics tools and with bulk RNA-seq datasets generated as part of the original pan-Eimeria genome analysis study (Reid et al., 2014). The revised assembly has 7,268 protein-coding genes (of which 2,000 were annotated manually) and an impressive BUSCO completeness of 98.8%. This is a truly remarkable achievement, and I wish to congratulate the authors on it. Access to this high-quality genome assembly and annotation will surely help researchers interested not only in coccidiosis in chickens but also in the comparative genomics of apicomplexan parasites in general. I encourage the authors to produce similar high-quality reference assemblies (if supported by sufficient funding resources) for the other Eimeria species previously reported in the pan-Eimeria genomics study (Reid et al.). I also encourage the authors to make this genome publicly available via VEuPathDB for wider accessibility.
The manuscript is clearly written, with all the relevant details of the tools used to generate the datasets and analyze the results. I have absolutely no criticism of this manuscript. However, I have the following suggestions for additions to the existing manuscript:

1. A comparative overview of how this assembly outperforms the previously assembled, annotated and published genome of E. tenella Houghton.
2. Since this represents a complete end-to-end assembly of all 15 chromosomes, a revised list of gene family annotations (such as the SAGs) would be very useful for the scientific community.