The genome sequence of the bisetose emerald-bottle, Bellardia pandia (Walker, 1849) [version 1; peer review: awaiting peer review]

We present a genome assembly from an individual female Bellardia pandia (the bisetose emerald-bottle; Arthropoda; Insecta; Diptera; Calliphoridae). The genome sequence is 617 megabases in span. The majority of the assembly (97.82%) is scaffolded into six chromosomal pseudomolecules, with the X sex chromosome assembled.


Background
The presence of two long posteroventral bristles on the front tibiae of Bellardia pandia (previously known as Onesia biseta) distinguishes this medium-sized blowfly from the similar-looking B. viarum and B. vulgaris. All three have green, turquoise or bronze reflections on the tergites, tempered by dusting that produces a tessellated pattern. The thorax is grey-dusted with several darker stripes and has faint metallic-green reflections in some lights. The male genitalia are distinctive and are the most reliable character for separating the three species.
This species occurs locally throughout Great Britain, usually in the vicinity of wetlands. Females are viviparous and larvae are thought to be predators/parasitoids of earthworms.

Genome sequence report
The genome was sequenced from a single female B. pandia (Figure 1) collected from Wytham Woods, Oxfordshire, UK (latitude 51.769, longitude -1.339). A total of 54-fold coverage in Pacific Biosciences single-molecule long reads and 76-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 24 missing joins or misjoins, reducing the scaffold number by 14.94% and increasing the scaffold N50 by 8.33%.
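The curation percentages can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it back-calculates the pre-curation scaffold count and scaffold N50 implied by the reported percentages together with the final assembly statistics (74 scaffolds, scaffold N50 of 111.8 Mb). The pre-curation values are inferred here, not reported, and the function and variable names are hypothetical.

```python
def percent_change(before, after):
    """Signed percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# Reported values from the final assembly and the curation summary.
final_scaffolds = 74
scaffold_reduction_pct = 14.94
final_n50_mb = 111.8
n50_increase_pct = 8.33

# Implied (not reported) pre-curation values, assuming the percentages
# are expressed relative to the pre-curation assembly.
implied_scaffolds_before = round(final_scaffolds / (1 - scaffold_reduction_pct / 100))
implied_n50_before_mb = round(final_n50_mb / (1 + n50_increase_pct / 100), 1)

# Recompute the percentage from the implied count as a consistency check.
check_reduction = round(percent_change(implied_scaffolds_before, final_scaffolds), 2)
```

Under that assumption, the reported 14.94% reduction is consistent with 13 scaffolds being removed from a pre-curation total of 87.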
The final assembly has a total length of 617 Mb in 74 sequence scaffolds with a scaffold N50 of 111.8 Mb (Table 1). The majority, 97.82%, of the assembly sequence was assigned to 6 chromosomal-level scaffolds, representing 5 autosomes (numbered by sequence length) and the X sex chromosome (Figure 2–Figure 5; Table 2). Several repetitive scaffolds remain unassigned to chromosomes. These are likely to belong to the X chromosome, but a definitive placement could not be made with current methods. The assembly has a BUSCO (Manni et al., 2021) completeness of 99.1% (single 97.9%, duplicated 1.2%) using the diptera_odb10 reference set. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
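For readers unfamiliar with the scaffold N50 statistic quoted above, the following minimal sketch shows the standard definition: the length of the scaffold at which the cumulative sum of scaffold lengths, sorted from longest to shortest, first reaches half of the total assembly length. The function name and the toy lengths are invented for illustration; they are not the scaffold lengths of this assembly, and this is not the tooling used to produce the reported statistics.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example with made-up lengths (arbitrary units).
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = scaffold_n50(toy_lengths)
```

In the toy example the total length is 30, and the two longest scaffolds (10 and 8) together exceed half of that, so the N50 is the length of the second scaffold.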

Methods
Sample acquisition and nucleic acid extraction
A female B. pandia (idBelPand1) was collected from Wytham Woods, Oxfordshire, UK (latitude 51.77, longitude -1.331) by Steven Falk, independent researcher, who also identified the specimen. The specimen was collected using a net and snap-frozen on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute. The idBelPand1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Tissue from the whole organism was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. Fragment size analysis of 0.01-0.5 ng of DNA was then performed using an Agilent FemtoPulse. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 200-ng aliquot of extracted DNA using the 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared to an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics Chromium read cloud sequencing libraries were constructed according to the manufacturers' instructions. Sequencing was performed by the Scientific Operations core at the Wellcome Sanger Institute on Pacific Biosciences SEQUEL II and Illumina NovaSeq 6000 instruments. Hi-C data were generated from remaining whole organism tissue using the Arima Hi-C+ kit and sequenced on a NovaSeq 6000 instrument.