The genome sequence of the grey wolf, Canis lupus Linnaeus 1758

We present a genome assembly from an individual male Canis lupus orion (the grey wolf, subspecies: Greenland wolf; Chordata; Mammalia; Carnivora; Canidae). The genome sequence is 2,447 megabases in span. The majority of the assembly (98.91%) is scaffolded into 40 chromosomal pseudomolecules, with the X and Y sex chromosomes assembled.


Background
The grey wolf, Canis lupus, is the largest species among the wolf-like canids (Subtribe: Canina) and the member with the largest geographic distribution. Originally, wolves were found throughout Eurasia, with the exception of tropical Southeast Asia, and across all of North America. This vast range spans numerous habitats and includes wolf ecotypes adapted to its diverse environments. The wolf is locally extinct in several places, such as the UK, Ireland and Brittany, yet it still occupies much of its original range; the global population is estimated to be on the order of 200,000-250,000 individuals (Jhala et al., 2018).
Once numerous, wolves were eradicated from the island of Great Britain in the 15th century and from Ireland in the 18th century. There have been proposals to reintroduce wolves to the Scottish Highlands to manage populations of red deer, which have a negative effect on biodiversity through overgrazing (Nilsen et al., 2007). The Scottish Highlands are considered to be the only location in Great Britain that could support a healthy population of wolves; however, objections from livestock owners are likely to prevent their reintroduction in the near future (Wilson, 2004). The reintroduction of wolves elsewhere has led not only to the reestablishment of this apex predator, but also to marked improvements in biodiversity in the ecosystem as a whole (Ripple et al., 2014). Wolves reintroduced into Yellowstone National Park, Wyoming, USA, in 1995 preyed on grazing animals such as wapiti (Cervus canadensis), which had maintained grasslands. The subsequent changes in prey behaviour led to trophic cascades that resulted in the reestablishment of tree species and an associated increase in populations of species that rely directly and indirectly on this habitat (Ripple & Beschta, 2012).
Wolves have historically been found in Northwest, Northeast and East Greenland (Dawes et al., 1986). Wolves were extirpated from East Greenland through hunting by 1939 and were absent from this area for the next 40 years (Marquard-Petersen, 2012). In around 1979, a pair of wolves travelled from the north of the island and began a recolonisation of East Greenland, establishing a population of around 23 wolves (Marquard-Petersen, 2011). A recent assessment found no trace of wolves for a number of years in East Greenland, while a population of up to 32 animals is still found in the northernmost parts of Greenland. Since the population in East Greenland was located entirely within the Northeast Greenland National Park, affording the wolves legal protection, it is unlikely that this extinction event was driven by hunting (Marquard-Petersen, 2021).
Domestic dogs and Eurasian wolves shared a common ancestor around 33,000 years ago (Skoglund et al., 2015; Wang et al., 2016). In this regard, the Greenland wolf or Polar wolf reference genome described herein is highly relevant for dog and Eurasian wolf genomics. The Polar wolf is a North American wolf, an outgroup to dogs and Eurasian wolves (Gopalakrishnan et al., 2019; Sinding et al., 2018), and its genome will therefore help to produce a minimally reference-biased representation of diversity in re-sequenced genomes (Gopalakrishnan et al., 2017). The Polar wolf is also the North American wolf type with the least coyote-like ancestry (Sinding et al., 2018); thus, it is probably the closest possible outgroup to dogs and Eurasian wolves, carrying the least of the exotic admixture found in other North American wolves. Finally, this reference genome provides a precise reference for detailed genomic investigations of Polar wolves themselves, including the identification of rare genomic variation. The genome is therefore a useful resource for research not only on the Polar wolf, a small, isolated and understudied population, but also on canids, wolves and dogs more broadly.

Genome sequence report
The genome was sequenced from a single male C. lupus subspecies orion collected from Siorapaluk, Greenland (latitude 77.785278, longitude -70.631389) in 2016. A total of 28-fold coverage in Pacific Biosciences single-molecule long reads and 74-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 135 missing joins or misjoins and removed 9 haplotypic duplications, reducing the assembly length by 0.2% and the scaffold number by 42.1%, and increasing the scaffold N50 by 15.9%.
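Fold coverage is conventionally the total number of sequenced bases divided by the size of the genome. The following Python sketch illustrates that relationship only. It assumes, purely for illustration, that the 28-fold figure is expressed relative to the 2,447 Mb assembly span; the report does not state the denominator used or the raw read yield, so the resulting value is a rough estimate, not a reported figure. The function names are hypothetical.

```python
def fold_coverage(total_read_bases, genome_span_bases):
    """Return fold coverage: sequenced bases divided by genome span."""
    if genome_span_bases <= 0:
        raise ValueError("genome span must be positive")
    return total_read_bases / genome_span_bases


def bases_for_coverage(coverage, genome_span_bases):
    """Return the number of sequenced bases implied by a given fold coverage."""
    return coverage * genome_span_bases


# Assumption: coverage is measured against the final assembly span (2,447 Mb).
ASSEMBLY_SPAN = 2_447_000_000

# Rough read yield implied by 28-fold long-read coverage under that assumption.
approx_long_read_bases = bases_for_coverage(28, ASSEMBLY_SPAN)
print(f"{approx_long_read_bases / 1e9:.1f} Gb")  # prints "68.5 Gb"
```

Under this assumption, 28-fold coverage corresponds to roughly 68.5 Gb of long-read sequence; the true yield would differ if coverage was computed against an estimated genome size instead.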
The final assembly has a total length of 2,447 Mb in 82 sequence scaffolds with a scaffold N50 of 66 Mb (Table 1). Of the assembly sequence, 98.91% was assigned to 40 chromosomal-level scaffolds (named by synteny to an assembly for C. lupus familiaris, breed Labrador: GCF_014441545.1), comprising 38 autosomes and the X and Y chromosomes (Figures 1-4; Table 2). The assembly has a BUSCO (Simão et al., 2015) [...] HiGlass (Kerpedjiev et al., 2018) and Pretext. Regions of concern were identified and resolved using 10X longranger and genetic mapping data. The genome was analysed within the BlobToolKit environment (Challis et al., 2020). Table 3 contains a list of all software tool versions used, where appropriate.
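For readers unfamiliar with the statistic, scaffold N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The sketch below shows the standard calculation in Python. The function name and the toy scaffold lengths are hypothetical and are not taken from this assembly.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are sorted from longest to shortest and summed until the
    running total reaches half of the total assembly length; the length
    of the scaffold that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy scaffold lengths in Mb (not from the wolf assembly).
toy_lengths = [100, 80, 60, 40, 20]
print(scaffold_n50(toy_lengths))  # prints 80
```

In the toy example the total is 300 Mb, and the two longest scaffolds (100 + 80 = 180 Mb) are the first to reach the 150 Mb halfway point, so the N50 is 80 Mb.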
The genome sequence is released openly for reuse. The C. lupus genome sequencing initiative is part of the Darwin Tree of Life (DToL) project and the Vertebrate Genomes Project. All raw sequence data and the assembly have been deposited in INSDC databases. The genome will be annotated using the RNA-Seq data and presented through the Ensembl pipeline at the European Bioinformatics Institute. Raw data and assembly accession identifiers are reported in Table 1.