The genome sequence of the black arches, Lymantria monacha (Linnaeus, 1758)

We present a genome assembly from an individual male Lymantria monacha (the black arches; Arthropoda; Insecta; Lepidoptera; Erebidae). The genome sequence is 916 megabases in span. The majority of the assembly (99.99%) is scaffolded into 28 chromosomal pseudomolecules, with the Z sex chromosome assembled. The mitochondrial genome was also assembled, and is 15.6 kilobases in length.


Background
Lymantria monacha (black arches) is a nocturnal, univoltine species typically identified by the numerous black markings across its white forewing. Melanic forms also occur; both forms can be identified by the pink bands on the abdomen that they share. The larvae of L. monacha mainly feed on species in the genus Quercus, but have been recorded on a wide range of host plant species. Though local to the United Kingdom, L. monacha also occurs throughout the Palearctic region and east to India and Japan. Notably, L. monacha is a pest species of pine and spruce as well as being an invasive timber pest in North America (Keena, 2003; Prestemon et al., 2008). Climate change has been associated with an increase in the distribution of L. monacha in Northern Europe and, because of the destruction it causes to affected species, this can lead to an increase in secondary pests and infections (Fält-Nardmann et al., 2018; Økland et al., 2019). As such, providing a full genome of this species may contribute to the varied methods being developed to control it.
The genome of L. monacha was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all of the named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for L. monacha, based on one specimen from Wytham Woods, Oxfordshire, UK.

Genome sequence report
The genome was sequenced from one male L. monacha (Figure 1) collected from Wytham Woods, Oxfordshire (biological vice-county: Berkshire), UK (latitude 51.764, longitude -1.327). A total of 39-fold coverage in Pacific Biosciences single-molecule long reads and 52-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 33 missing joins/misjoins and removed 12 haplotypic duplications, reducing the assembly size by 0.20% and the scaffold number by 28.57%, and increasing the scaffold N50 by 1.14%.
While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.

Sample acquisition and nucleic acid extraction
A single male L. monacha (ilLymMona1) was collected from Wytham Woods, Oxfordshire (biological vice-county: Berkshire), UK (latitude 51.772, longitude -1.338) by Douglas Boyes, UKCEH, using a light trap in woodland. The sample was identified by the same individual, and preserved on dry ice.
DNA was extracted from thorax/abdomen tissue of ilLymMona1 at the Wellcome Sanger Institute (WSI) Scientific Operations core from the whole organism using the Qiagen MagAttract HMW DNA kit, according to the manufacturer's instructions. RNA was extracted from remaining thorax/abdomen tissue of ilLymMona1 in the Tree of Life Laboratory at the WSI using TRIzol, according to the manufacturer's instructions. RNA was

Themistoklis Giannoulis
Laboratory of Genetics, Faculty of Animal Science, University of Thessaly, Larissa, Greece
The manuscript is very targeted and is part of a large effort, the Tree of Life, a fascinating project that will provide reference genomes for a variety of species. To this end, the manuscript fulfils its purpose and provides a genome for the black arches, which could be used for the development of methods to control the expansion of the species and limit its destructive actions. The only thing I want to mention: what was the purpose of extracting RNA, since it was not used in the experiment described downstream?
Is the rationale for creating the dataset(s) clearly described?

Mathilde Cordellier
Institute for Zoology, Universität Hamburg, Hamburg, Germany
This study presents the fully assembled chromosome-level genome of the black arches, a moth with a wide Palearctic distribution.
I find this genome report to be well written and clear, with an adequate amount of information provided. The sequencing effort resulted in an amazing assembly that I am sure will be helpful in the future, and I only have minor comments. I was very shocked and sad to discover the first author has passed away, and hope these questions can be addressed nevertheless.
"Increase in distribution": do the authors mean an increase in frequency/occurrence?
Is there a reason why a male was sequenced and not a female? The assembly is then missing the W chromosome.
From my experience, it is difficult to obtain assemblies without contamination. Although thorax/abdomen tissue was used, there is virtually no assembled sequence from other organisms. I am wondering how this was possible. Were measures taken to avoid contamination through gut content, ectoparasites, and surface and internal bacteria? I do trust the assembly; my question is merely aimed at adding valuable information for others, whose work would greatly benefit from a protocol reducing contamination.
Would it be possible to number the chromosomes on the Hi-C contact map, to establish a correspondence between the figure and Table 2?
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: population genomics, freshwater ecology, arthropod genome evolution, evolutionary biology
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 1. Image of the Lymantria monacha (ilLymMona1) specimen taken prior to preservation and processing. Specimen shown next to FluidX storage tube, 43.9 mm in length.

Figure 2. Genome assembly of Lymantria monacha, ilLymMona1.2: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 915,668,749 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (62,811,790 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (33,870,686 and 24,894,941 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilLymMona1.2/dataset/CAJHZT02/snail.
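
The N50 and N90 scaffold lengths quoted in the caption of Figure 2 can in principle be derived from a list of scaffold lengths. The following is a minimal Python sketch of that calculation, not the BlobToolKit implementation; the function name nx_length and the toy scaffold lengths are hypothetical and are not taken from the ilLymMona1.2 assembly.

```python
def nx_length(lengths, fraction):
    """Return the Nx length for a set of scaffold lengths.

    Scaffolds are sorted from longest to shortest and their lengths are
    accumulated; the Nx value is the length of the scaffold at which the
    running total first reaches `fraction` of the total assembly span
    (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("no scaffold lengths supplied")


# Hypothetical toy scaffold lengths (arbitrary units), for illustration only.
toy_lengths = [62, 40, 34, 30, 25, 20, 10, 5]

n50 = nx_length(toy_lengths, 0.5)  # half of the 226-unit span is 113
n90 = nx_length(toy_lengths, 0.9)  # 90% of the 226-unit span is 203.4
print(n50, n90)
```

For the toy data, the three longest scaffolds (62 + 40 + 34 = 136) are the first to cover half of the total span, so the N50 is 34; the N90 is reached only after the sixth scaffold, giving 20.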

Figure 3. Genome assembly of Lymantria monacha, ilLymMona1.2: GC coverage. BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilLymMona1.2/dataset/CAJHZT02/blob.

Figure 4. Genome assembly of Lymantria monacha, ilLymMona1.2: cumulative sequence. BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilLymMona1.2/dataset/CAJHZT02/cumulative.

Figure 5. Genome assembly of Lymantria monacha, ilLymMona1.2: Hi-C contact map. Hi-C contact map of the ilLymMona1.2 assembly, visualised in HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this map is available here.

Table 3 contains a list of all software tool versions used, where appropriate.

Ethics/compliance issues
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the Darwin Tree of Life Project Sampling Code of Practice. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project. Each transfer of samples is further undertaken according to a Research Collaboration Agreement or Material Transfer Agreement entered into by the Darwin Tree of Life Partner, Genome Research Limited (operating as the Wellcome Sanger Institute), and in some circumstances other Darwin Tree of Life collaborators.

Table 3. Software tools used.

Peer Review Current Peer Review Status: Version 1
Reviewer Report 02 November 2023
https://doi.org/10.21956/wellcomeopenres.19726.r68797
© 2023 Giannoulis T. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.