The genome sequence of the common malachite beetle, Malachius bipustulatus (Linnaeus, 1758)

We present a genome assembly from an individual female Malachius bipustulatus (the common malachite beetle; Arthropoda; Insecta; Coleoptera; Melyridae). The genome sequence is 544 megabases in span. The majority (99.70%) of the assembly is scaffolded into 10 chromosomal pseudomolecules, with the X sex chromosome assembled.


Background
The common malachite beetle, Malachius bipustulatus, is a soft-winged flower beetle that is widespread and common throughout Europe and western Asia. In the UK, it is common in lowland grassland and agricultural land across England and Wales, becoming scarcer further north into Scotland. It is metallic green with red tips to the elytra and red anterior pronotal angles. It is larger (5.5-6 mm) than the similar Cordylepherus viridis, which has a less vivid green colouration and lacks reddened anterior pronotal angles. Adults possess bright red eversible sacs called cocardes along the thorax and abdomen; these emit defensive odours, which are readily produced when the beetle is alarmed. The species is univoltine, with adults emerging from April/May, and adults may be commonly encountered on flowers throughout the summer, where they feed on pollen and nectar as well as small invertebrates. The strong association of M. bipustulatus with flowers may lead to the species acting as an incidental minor pollinator (Gutowski, 1990). As generalist predators, they may also confer some degree of biological pest control (Malschi & Others, 2000). Males produce pheromone secretions from glands on the head between the antennal insertions. These pheromones are transferred to lobes on the inner margin of antennomeres 2-4, and the antennae are held forward to attract females. Females feed on these secretions during courtship. Eggs are laid in bark crevices and low vegetation. The larvae are predatory, relatively long-legged, active hunters that develop throughout the summer (Gradwell, 1957). Adults may persist until late summer.

Genome sequence report
The genome was sequenced from one female M. bipustulatus (Figure 1) collected from Wytham Woods, Oxfordshire (biological vice-county: Berkshire), UK (latitude 51.768, longitude -0.339). A total of 32-fold coverage in Pacific Biosciences single-molecule long reads and 65-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 244 missing joins or misjoins and removed 69 haplotypic duplications, reducing the assembly length by 1.62% and the scaffold number by 66.23%, and increasing the scaffold N50 by 45.84%.
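The curation statistics above are relative changes in assembly length, scaffold count and scaffold N50. As a minimal sketch of how such figures are typically derived, the Python below computes an N50 and a percentage change. The function names and the scaffold lengths are hypothetical illustrations, not values or tooling from the icMalBipu1 assembly.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    The N50 is the length L such that scaffolds of length >= L
    together cover at least half of the total assembly span.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the span is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def percent_change(before, after):
    """Signed percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before


# Hypothetical scaffold lengths in Mb (NOT the real assembly values).
before_curation = [40, 30, 30, 20, 10, 10, 5, 5]
after_curation = [60, 50, 30, 5]

length_change = percent_change(sum(before_curation), sum(after_curation))
count_change = percent_change(len(before_curation), len(after_curation))
n50_change = percent_change(n50(before_curation), n50(after_curation))

print(f"assembly length: {length_change:+.2f}%")
print(f"scaffold number: {count_change:+.2f}%")
print(f"scaffold N50:    {n50_change:+.2f}%")
```

In this toy case, joining scaffolds and removing duplicated sequence shortens the assembly slightly, halves the scaffold count and raises the N50, which is the same direction of change reported for the curated assembly.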
The final assembly has a total length of 544 Mb in 51 sequence scaffolds with a scaffold N50 of 56.3 Mb (Table 1). The majority, 99.70%, of the assembly sequence was assigned to 10 chromosomal-level scaffolds, representing 9 autosomes (numbered by sequence length), and the X sex chromosome (Figure 2-Figure 5; Table 2). Due to ambiguous Hi-C signal and lack of additional evidence to support accurate placement,
Figure 1. Image of the Malachius bipustulatus specimen (icMalBipu1) used for genome sequencing. Image taken during preservation and processing, with 43.9 mm FluidX sample tube shown above.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute. The icMalBipu1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Tissue from the whole organism was cryogenically disrupted to a fine powder using a Covaris cryoPREP Automated Dry Pulveriser, receiving multiple impacts. Fragment size analysis of 0.01-0.5 ng of DNA was then performed using an Agilent FemtoPulse. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 200-ng aliquot of extracted DNA using 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared to an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud DNA sequencing libraries were constructed according to the manufacturers' instructions. Sequencing was performed by the Scientific Operations core at the Wellcome Sanger Institute on Pacific Biosciences SEQUEL II and Illumina HiSeq X instruments. Hi-C data were generated using the Arima v2 Hi-C kit and sequenced on a HiSeq X instrument.
When assembling the mitochondrial genome, what sequence was used as a reference?
Table 3 does not contain direct reference to the BUSCO software; however, they mention that they have used it.

Genome assembly
How was the X chromosome scaffold identified?
Using the provided link, we can access the raw fastq files but the links to the genome assemblies are not functional or at least not apparent.

Is the rationale for creating the dataset(s) clearly described? Partly
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Partly
Are the datasets clearly presented in a useable and accessible format? Partly