The genome sequence of the furry-claspered furrow bee, Lasioglossum lativentre (Schenck, 1853)

We present a genome assembly from an individual male Lasioglossum lativentre (the furry-claspered furrow bee; Arthropoda; Insecta; Hymenoptera; Halictidae). The genome sequence is 479 megabases in span. The majority of the assembly (75.22%) is scaffolded into 14 chromosomal pseudomolecules. The mitochondrial genome was also assembled, and is 15.3 kilobases in length.


Background
Lasioglossum (Lasioglossum) lativentre (furry-claspered furrow bee) is a solitary, ground-nesting bee found throughout the western Palearctic from the UK to Iran. In the UK, the genus Lasioglossum Curtis is represented by 32 species, but worldwide there are at least 1,700 species. L. lativentre is common in lowland England up to Yorkshire and in southern Wales. The species shows a particular association with plants of the family Asteraceae (Falk, 2015). It is found along woodland edges but can also occur in gardens and other grassland habitats. Females emerge first, in March, with males appearing later, in June. The cleptoparasites Sphecodes ephippius (Linnaeus) and S. puncticeps Thomson use L. lativentre as their host.

Genome sequence report
The genome was sequenced from a single male L. lativentre (Figure 1) collected from Wytham Woods, Oxfordshire (biological vice-county: Berkshire), UK (latitude 51.769, longitude -1.339). A total of 34-fold coverage in Pacific Biosciences single-molecule long reads and 54-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 50 missing joins or misjoins and removed 14 haplotypic duplications, reducing the scaffold number by 2.97% and increasing the scaffold N50 by 70.83%.
The final assembly has a total length of 479 Mb in 1,143 sequence scaffolds with a scaffold N50 of 27.7 Mb (Table 1). Of the assembly sequence, 75.22% was assigned to 14 chromosomal-level scaffolds (numbered by sequence length) (Figure 2–Figure 5; Table 2). The assembly has a BUSCO v5.1.2 (Manni et al., 2021) completeness of 96.2% (single 95.6%, duplicated 0.5%) using the hymenoptera_odb10 reference set (n=5991). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
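For readers unfamiliar with the summary statistics quoted above, the following is a minimal sketch of how a scaffold N50 and a relative change between two values can be computed. It is not the pipeline used for this assembly; the function names and the toy scaffold lengths are hypothetical and chosen only for illustration.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards until half the span is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Hypothetical toy values, NOT taken from the L. lativentre assembly.
toy_lengths = [40, 30, 20, 10]
print(scaffold_n50(toy_lengths))   # N50 of the toy scaffold set
print(percent_change(100, 97))     # e.g. a drop from 100 to 97 scaffolds
```

Applied to the scaffold lengths of an assembly before and after curation, the same two functions would give the kind of N50 and percentage-change figures reported in this section.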

Methods
Sample acquisition and DNA extraction
A male (iyLasLatv2) and a female (iyLasLatv1) L. lativentre were collected from Wytham Woods, Oxfordshire (biological vice-county: Berkshire), UK (latitude 51.769, longitude -1.339) by Steven Falk, Independent Researcher, using a net. The DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute. The iyLasLatv2 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing.
Whole organism tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. Fragment size analysis of 0.01-0.5 ng of DNA was then performed using an Agilent FemtoPulse. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 200-ng aliquot of extracted DNA using a 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and Qubit Fluorometer and
The genome sequence is released openly for reuse. The L. lativentre genome sequencing initiative is part of the Darwin Tree of Life (DToL) project. All raw sequence data and the assembly have been deposited in INSDC databases. The genome will be annotated and presented through the Ensembl pipeline at the European Bioinformatics Institute. Raw data and assembly accession identifiers are reported in Table 1.

Hidemasa Bono
Laboratory of Genome Informatics, Graduate School of Integrated Sciences for Life, Hiroshima University, Higashihiroshima, Japan
The authors present the genome sequencing of the furry-claspered furrow bee, Lasioglossum lativentre. The methods were concisely described, but the software used in the analysis was sufficiently summarized in Table 3 with versions and sources. The data described in the manuscript are fully open in the public repository.
Only the genome sequence was reported in this manuscript. This reviewer cannot see why both 'iyLasLatv1' (female) and 'iyLasLatv2' (male) were used.
The authors used iyLasLatv1 in genome sequencing, but the assembly identifier is 'iyLasLatv2'.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Partly
Are the datasets clearly presented in a useable and accessible format? Yes

© 2023 Paschoal A. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Alexandre R. Paschoal
Department of Computer Science, Federal University of Technology-Paraná (UTFPR), Cornélio Procópio, Brazil
The authors provide a brief genome assembly report and BUSCO annotation for the Lasioglossum lativentre genome. The manuscript is well described as a summary report. My comments are intended to help add more detail to this work. I hope they contribute to improving it.
1) Could the authors clarify where the genome assembly data are?
The authors mention the accession number PRJEB46299 at the ENA. Only FASTQ files are available for the ID PRJEB46259. Next, I found https://projects.ensembl.org/darwin-tree-of-life/ with the Lasioglossum lativentre data, but only the soft-masked genome is available. Why are unmasked and hard-masked versions not provided?
3) I feel there is a lack of basic annotation statistics and discussion, for example of repeat elements, particularly transposable elements.