The genome sequence of the black clock beetle, Pterostichus madidus (Fabricius, 1775)

We present a genome assembly from an individual female Pterostichus madidus (the black clock beetle; Arthropoda; Insecta; Coleoptera; Carabidae). The genome sequence is 705 megabases in span. The majority (99.96%) of the assembly is scaffolded into 19 chromosomal pseudomolecules, with the X sex chromosome assembled.


Background
The black clock beetle, Pterostichus madidus, is a large, common species of ground beetle. It occurs across western and northern Europe, and in the UK it is the most frequently recorded beetle in the family Carabidae. It can be found in a wide range of habitats, where it is active during both the night and day. It is a relatively large (13-18 mm), black carabid with smoothly rounded pronotal hind angles. There are two subspecies: Pterostichus madidus validus Dejean, 1828, which has black femora, and Pterostichus madidus concinnus (Sturm, 1818), which has distinctive 'wine red' femora. Pterostichus madidus is omnivorous, being a predator and scavenger but also feeding on plant material (Luff, 1974). It is predominantly an annual species, laying eggs in late summer/autumn, with larvae developing over the winter (Luff & Others, 1973). Overwintered adults are active from spring/early summer, and some adults, particularly at higher altitudes, are biennial (Butterfield, 1996).

Genome sequence report
The genome was sequenced from one female P. madidus collected from Wytham Woods, Oxfordshire (biological vice-county: Berkshire), UK (latitude 51.775, longitude -1.326) (Figure 1). A total of 34-fold coverage in Pacific Biosciences single-molecule long reads and 53-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 142 missing joins or misjoins and removed 6 haplotypic duplications, reducing the assembly length by 0.18% and the scaffold number by 80.00%, and increasing the scaffold N50 by 58.29%.
The final assembly has a total length of 705 Mb in 27 sequence scaffolds with a scaffold N50 of 37.9 Mb (Table 1). The majority, 99.96%, of the assembly sequence was assigned to 19 chromosomal-level scaffolds, representing 18 autosomes (numbered by sequence length) and the X sex chromosome (Figure 2–Figure 5; Table 2). Some regions of the genome have large repeats with less certain structure than the rest of the assembly, most notably chromosomes 14, 15 and 18. Chromosome 14 from 23.8 Mb onwards has strong Hi-C association with chromosome 18. The assembly has a BUSCO v5.1.2 (Manni et al., 2021) completeness of 98.9% (single 98.4%, duplicated 0.5%) using the endopterygota_odb10 reference set. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
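For readers unfamiliar with the statistic, scaffold N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The following is a minimal Python sketch of that definition. The function name `scaffold_n50` and the toy lengths are illustrative assumptions; they are not the actual scaffold lengths of this assembly, nor the tool used to produce the reported value.

```python
def scaffold_n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L
    together account for at least half of the total length.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (arbitrary units), NOT taken from this assembly.
toy_lengths = [50, 40, 30, 20, 10]
print(scaffold_n50(toy_lengths))
```

Because the statistic depends only on the sorted lengths, joining scaffolds during curation can raise N50 even while the total assembly length stays almost unchanged, which is consistent with the pattern reported above.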

Lukas Zangl
Institute of Biology, University of Graz, Graz, Austria
In this study, the authors provide a full-length, chromosome-level genome of the ground beetle Pterostichus madidus. The genome was assembled from PacBio long reads as well as 10X Genomics Illumina short-read data. High BUSCO scores suggest a high level of coding-gene completeness, while homogeneous GC values and low numbers of non-target scaffolds indicate low levels of contamination.
All raw data as well as the assembly and annotation are publicly available, and the methods and tools described will allow for good reproducibility. My only remark in this context concerns the manual edits, which are mentioned but not described in further detail. Some comments on their rationale and procedure may be appropriate or needed to ensure full reproducibility.
Furthermore, information about the two subspecies is given; however, the authors did not state which subspecies they actually sequenced.
Finally, despite the apparent good quality of the assembled and annotated genome, I had a hard time interpreting the snail plot (specifically the grey and orange parts). I have to admit, however, that I am not too familiar with this kind of representation. I would therefore suggest adding a few comments in the main text about what can be seen, in order to improve clarity for all readers.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes