The genome sequence of the bramble shoot moth, Notocelia uddmanniana (Linnaeus, 1758)

We present a genome assembly from an individual male Notocelia uddmanniana (the bramble shoot moth; Arthropoda; Insecta; Lepidoptera; Tortricidae). The genome sequence is 794 megabases in span. The majority of the assembly, 99.96%, is scaffolded into 28 chromosomal pseudomolecules, with the Z sex chromosome assembled.


Background
Notocelia uddmanniana (bramble shoot moth) is widely distributed across Western Europe and North Africa, with records further east from Kazakhstan to China. The larvae feed on brambles (Rubus sp.), occurring commonly where these species exist, and occasionally cause damage to cultivated varieties (Gordon et al., 1997). Eggs are laid singly on the foodplant, where larvae feed within a folded leaf and later within the tips of growing shoots; larvae overwinter in a silken web on the foodplant stem before recommencing feeding in spring (Dicker, 1939). Notocelia uddmanniana also occupies woodland, and is distributed widely throughout the UK, occurring more commonly in the south. The genome of N. uddmanniana was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all of the named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for N. uddmanniana, based on one male specimen from Wytham Woods, Oxfordshire, UK.

Genome sequence report
The genome was sequenced from a single male N. uddmanniana (Figure 1) collected from Wytham Woods, Oxfordshire, UK (latitude 51.772, longitude -1.338). A total of 18-fold coverage in Pacific Biosciences single-molecule long reads (N50 16 kb) and 49-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 165 missing/misjoins and removed 72 haplotypic duplications, reducing the assembly length by 2.16% and the scaffold number by 64.71%, and increasing the scaffold N50 by 9.53%.
The final assembly has a total length of 794 Mb in 49 sequence scaffolds with a scaffold N50 of 29 Mb (Table 1). Of the assembly sequence, 99.96% was assigned to 28 chromosomal-level scaffolds, representing 27 autosomes (numbered by sequence length), and the Z sex chromosome (Figure 2-Figure 5; Table 2). The assembly has a BUSCO (Simão et al., 2015) completeness of 98.9% using the lepidoptera_odb10 reference set. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
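For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how an Nx value can be computed from a list of scaffold lengths. It is not the tooling used for this assembly; the function name `nx` and the toy lengths are hypothetical and chosen only for illustration.

```python
def nx(lengths, percent=50):
    """Return the Nx length for a set of scaffold lengths.

    Nx is the length of the shortest scaffold in the smallest set of
    longest scaffolds that together cover at least `percent` % of the
    total assembly span (percent=50 gives N50, percent=90 gives N90).
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]


# Hypothetical toy scaffold lengths (total span 300), not the real assembly.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print("N50:", nx(toy_lengths, 50))
print("N90:", nx(toy_lengths, 90))
```

Applied to the real scaffold lengths of an assembly (for example, read from a FASTA index), the same function would yield the N50 and N90 values of the kind reported in Table 1 and Figure 2.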

Sample acquisition, DNA extraction and sequencing
A single male N. uddmanniana (ilNotUddm1) was collected from Wytham Woods, Oxfordshire, UK (latitude 51.772, longitude -1.338) by Douglas Boyes, UKCEH, using a light trap. The specimen was identified by the same individual and preserved on dry ice.
DNA was extracted from whole organism tissue at the Wellcome Sanger Institute (WSI) Scientific Operations core using the Qiagen MagAttract HMW DNA kit, according to the manufacturer's instructions. Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud sequencing libraries were constructed according to the manufacturers' instructions. Sequencing was performed by the Scientific Operations core at the Wellcome Sanger Institute on Pacific Biosciences SEQUEL II and Illumina HiSeq X instruments. Hi-C data were generated from remaining whole organism tissue using the Arima v1.0 kit and sequenced on HiSeq X.

Hidemasa Bono
Hiroshima University, Kagamiyama, Japan
The authors describe genome sequencing of the bramble shoot moth, Notocelia uddmanniana. There are no other assemblies in the INSDC records besides the one submitted by the authors. There are no problems in sample acquisition and DNA extraction. Furthermore, the data analysis method described in the manuscript is modern and appropriate for the genome assembly of insect species.
There is a typo in the Data availability section; PRJEB42137 should be PRJEB42037. It should be fixed before the final indexing because this information is critical for a 'Data Note' manuscript.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Genome biology
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Jeffrey Marcus
University of Manitoba, Winnipeg, Manitoba, Canada
In this manuscript, the authors describe the sequencing and assembly of the Notocelia uddmanniana genome using DNA from a single male specimen collected in the UK. The primary genome sequence assembly includes proposed chromosomal pseudomolecule sequences for 27 autosomes, the Z sex chromosome, and a complete mitochondrial genome. No females were collected, so the W sex chromosome was not sequenced. Overall, this is a solid contribution to the scientific literature.
Some suggestions to the authors:
1. Methods: The date of collection for the sequenced Notocelia uddmanniana specimen is not included in the manuscript. It is best practice to include collection dates for all wild-caught insect specimens used for genome sequencing because it is not unusual for cryptic species pairs to be distinguishable by differences in phenology of adult emergence. If it should come to pass that genus Notocelia includes cryptic species, knowing the specimen collection date can help future researchers determine to which taxon this sequenced genome should be assigned, especially since it appears that the entire specimen was used for DNA extraction and is no longer available for morphological examination.
2. For future work, I suggest that the researchers preferentially sequence the heterogametic sex when assembling genomes for previously unstudied species (in the case of Lepidoptera, the heterogametic sex is female), so that draft assemblies can be prepared for both sex chromosomes.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? No
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Evolutionary biology of insects, phylogenomics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Figure 1. Image of the ilNotUddm1 specimen taken during preservation and processing.

Figure 2. Genome assembly of Notocelia uddmanniana, ilNotUddm1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 794,123,667 bp assembly. The distribution of chromosome lengths is shown in dark grey with the plot radius scaled to the longest chromosome present in the assembly (75,621,453 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 chromosome lengths (28,990,537 and 17,668,102 bp), respectively. The pale grey spiral shows the cumulative chromosome count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilNotUddm1.1/dataset/CAJHZS01/snail.
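To illustrate the binned GC, AT and N percentages described in the Figure 2 legend, the sketch below splits a sequence into equal-sized windows and reports the base composition of each. It is an illustrative approximation only and is not BlobToolKit's implementation; the function name `base_composition` and the toy sequence are hypothetical. For the real assembly, 1,000 bins over 794,123,667 bp would give windows of roughly 794 kb each.

```python
from collections import Counter


def base_composition(seq, n_bins):
    """Split `seq` into `n_bins` near-equal windows and return a list of
    (gc_percent, at_percent, n_percent) tuples, one per window.

    Any character other than A, C, G or T (including N and other
    ambiguity codes) is counted in the N fraction.
    """
    seq = seq.upper()
    size = len(seq)
    result = []
    for i in range(n_bins):
        # Integer arithmetic keeps the windows contiguous and non-overlapping.
        start = i * size // n_bins
        end = (i + 1) * size // n_bins
        window = seq[start:end]
        counts = Counter(window)
        gc = counts["G"] + counts["C"]
        at = counts["A"] + counts["T"]
        other = len(window) - gc - at
        total = len(window) or 1  # guard against empty windows
        result.append((100 * gc / total, 100 * at / total, 100 * other / total))
    return result


# Hypothetical toy sequence of 16 bases, split into 4 windows of 4 bases.
toy = "GGCC" + "AATT" + "GCAT" + "NNAT"
for gc, at, n in base_composition(toy, 4):
    print(f"GC {gc:5.1f}%  AT {at:5.1f}%  N {n:5.1f}%")
```

In the snail plot itself the scaffolds are ordered by size before binning, so a real analysis would concatenate or iterate over size-sorted scaffolds rather than a single string.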
Assembly was carried out with Hifiasm (Cheng et al., 2021); haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020), without the -e flag. One round of polishing was performed by aligning 10X Genomics read data to the assembly with longranger align, calling variants with freebayes (Garrison & Marth, 2012). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using SALSA2 (Ghurye et al., 2019). The assembly was checked for contamination and corrected using the gEVAL system (Chow et al., 2016) as described previously (Howe et al., 2021). Manual curation

Figure 5. Genome assembly of Notocelia uddmanniana, ilNotUddm1.1: Hi-C contact map. Hi-C contact map of the ilNotUddm1.1 assembly, visualised in HiGlass. Chromosomes are given in order of size from left to right and top to bottom.

Table 3. Software tools used.
legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project. Each transfer of samples is further undertaken according to a Research Collaboration Agreement or Material Transfer Agreement entered into by the Darwin Tree of Life Partner, Genome Research Limited (operating as the Wellcome Sanger Institute), and in some circumstances other Darwin Tree of Life collaborators.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests:
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The article accurately describes the chromosome-level assembly of the genome of a male Notocelia uddmanniana sampled in Oxfordshire. These new data open new research avenues in different scientific fields, from ecology to evolution in Lepidoptera. A single male was sequenced, and the assembly shows that the genome is 794 Mb long and is arranged in 28 different chromosomes. At the moment, this genome remains to be annotated. I have only a minor comment on this genome release: it is unclear to me whether chromosome 1, which is the largest chromosome, corresponds to the Z chromosome.
No competing interests were disclosed.

have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.