The genome sequence of the orange-striped anemone, Diadumene lineata (Verrill, 1869)

We present a genome assembly from an individual Diadumene lineata (the orange-striped anemone; Cnidaria; Anthozoa; Actiniaria; Diadumenidae). The genome sequence is 313 megabases in span. The majority of the assembly (96.03%) is scaffolded into 16 chromosomal pseudomolecules. The complete mitochondrial genome was also assembled and is 17.6 kilobases in length.


Background
The orange-striped anemone, Diadumene lineata (Verrill, 1869), is believed to be the world's most widely distributed sea anemone. Native to the Northwest Pacific, it is now established on almost every temperate and tropical coast worldwide, and is a remarkable colonising species that serves as a model by which to address invasion hypotheses (Flenniken, 2017). In the UK, it has been recorded all along the south coast of England, around the Welsh coast, and from a few sites in Northern Ireland and Scotland. In these areas, it is typically found in sheltered estuaries attached to artificial structures in marinas and harbours, often in association with oysters and mussels, but also on sheltered natural shores, on stones, shells and seaweeds.
Diadumene lineata is a small, delicate anemone, with a smooth column up to 20 mm in diameter (in the UK, but larger in its native range). Generally, it is olive green or brown with contrasting orange vertical stripes. It has 25-100 slender, smooth tentacles, which are all of one type and usually colourless, but can be reddish. Thread-like defensive organs (acontia) can extend through pores in the column. It preys mainly on small crustaceans but may also consume larvae of commercially important species such as oysters and mussels. Under suitable conditions, it can quickly form large clonal aggregations.
In its native range, D. lineata reproduces both asexually by fission and sexually (Ryan & Miller, 2019). Outside its native range, however, reproduction is presumed to be exclusively asexual, as no populations containing both males and females have been reported, apart from a recently discovered population in Coos Bay, Oregon, USA (Newcomer et al., 2019).

Genome sequence report
The genome was sequenced from a single D. lineata of unknown sex collected from Queen Anne's Battery Marina visitors' pontoon, Plymouth, UK (Figure 1). A total of 27-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 82-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 113 missing joins or misjoins and removed 43 haplotypic duplications, reducing the assembly size by 1.34% and the scaffold number by 41.95%, and increasing the scaffold N50 by 101.80%.
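As a rough check on the curation figures above, the reported percentage changes can be inverted to estimate the pre-curation values. The sketch below is illustrative arithmetic only: the function name `value_before` is hypothetical, the post-curation inputs (137 scaffolds, 17.7 Mb scaffold N50, 313 Mb span) are the rounded values reported for the final assembly in this section, and the back-calculated numbers are estimates, not reported results.

```python
def value_before(after: float, percent_change: float) -> float:
    """Invert a percentage change.

    Assumes after = before * (1 + percent_change / 100), where a
    reduction is given as a negative percent_change.
    """
    factor = 1 + percent_change / 100
    if factor <= 0:
        raise ValueError("percent_change must be greater than -100")
    return after / factor


# Rounded post-curation values taken from the genome sequence report.
scaffolds_before = value_before(137, -41.95)   # scaffold count fell by 41.95%
n50_before_mb = value_before(17.7, 101.80)     # scaffold N50 rose by 101.80%
span_before_mb = value_before(313, -1.34)      # assembly size fell by 1.34%

print(f"Estimated scaffolds before curation: {scaffolds_before:.0f}")
print(f"Estimated scaffold N50 before curation: {n50_before_mb:.1f} Mb")
print(f"Estimated assembly span before curation: {span_before_mb:.1f} Mb")
```

On these rounded inputs, the estimates come to about 236 scaffolds, a scaffold N50 near 8.8 Mb and a span near 317 Mb before curation. In other words, curation roughly doubled the scaffold N50 while changing the total span only slightly.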
The final assembly has a total length of 313 Mb in 137 sequence scaffolds with a scaffold N50 of 17.7 Mb (Table 1). The majority, 96.03%, of the assembly sequence was assigned to 11 chromosomal-level scaffolds, representing 16 autosomes (numbered by sequence length) (Figure 2-Figure 5; Table 2). Two 3-Mbp sub-chromosome sized scaffolds were added as S17 and S18 to the unlocalised sequences. S17 and S18 are part of the host, as evidenced by SSU markers and coverage. Parts of the centromere could not be uniquely assigned to chromosomes and are part of the unlocalised sequence.
The assembly has a BUSCO v5.1.2 (Manni et al., 2021) completeness of 96.1% (single 95.6%, duplicated 0.5%) using the metazoa_odb10 reference set (n=954). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
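For readers unfamiliar with the scaffold N50 quoted in this section, the following is a minimal sketch of how such a statistic is conventionally computed from a list of scaffold lengths: sort the lengths from longest to shortest and report the length at which the running total first reaches the chosen fraction of the assembly span. The function name `nx` and the toy lengths are illustrative assumptions, not the tooling or data used for this assembly.

```python
from typing import Iterable


def nx(lengths: Iterable[int], fraction: float = 0.5) -> int:
    """Return the Nx statistic (N50 when fraction is 0.5).

    Scaffolds are sorted from longest to shortest; the result is the
    length of the scaffold at which the cumulative length first
    reaches `fraction` of the total assembly span.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    threshold = sum(ordered) * fraction
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    # Not reached in practice: the last scaffold always meets the threshold.
    return ordered[-1]


# Toy example with made-up scaffold lengths (not from this assembly).
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = nx(toy_lengths)        # half of the span (30) is 15
toy_n90 = nx(toy_lengths, 0.9)   # 90% of the span is 27
print(toy_n50, toy_n90)
```

Because the statistic depends only on the sorted lengths, merging scaffolds during curation raises it even when the total span barely changes, which is consistent with the pattern described for this assembly.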

Sample acquisition and DNA extraction
Two D. lineata specimens (jaDiaLine6 and jaDiaLine7) were collected by hand from Queen Anne's Battery Marina visitors' pontoon, Plymouth, UK (latitude 50.3644, longitude -4.1320) by John Bishop, Joanna Harley (both Marine Biological Association) and Rob Mrowicki (Natural History Museum). The specimens were identified by Chris Wood (Marine Biological Association) and John Bishop and snap-frozen in liquid nitrogen.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute. The jaDiaLine6 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Whole organism tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. Fragment size analysis of 0.01-0.5 ng of DNA was then performed using an Agilent FemtoPulse. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 200-ng aliquot of extracted DNA using 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared to an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.
RNA was extracted from jaDiaLine7 in the Tree of Life Laboratory at the WSI using TRIzol, according to the manufacturer's instructions. RNA was then eluted in 50 μl RNAse-free water and its concentration assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit RNA Broad-Range (BR) Assay kit. The integrity of the RNA was analysed using the Agilent RNA 6000 Pico Kit and Eukaryotic Total RNA assay.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics Chromium read cloud sequencing libraries were constructed according to the manufacturers' instructions. Sequencing was performed by the Scientific Operations core at the Wellcome Sanger Institute on Pacific Biosciences SEQUEL II (HiFi), Illumina NovaSeq 6000 (10X) and Illumina HiSeq 4000 (RNA-Seq) instruments. Hi-C data were generated in the Tree of Life laboratory from remaining tissue of jaDiaLine7 using the Arima v2 kit and sequenced on a NovaSeq 6000 instrument.

4. The range of DNA amount (0.01-0.5 ng) is unusually low for fragment size analysis. This should be explained.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Evolutionary genetics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
We provide some feedback below on the background, and to a certain extent the justification for sequencing this specific taxon, and on the methodology.
First, we provide some feedback that is geared toward improving the background information so that interested readers have a roadmap to the current literature on this species. Both Flenniken (2017) and Glon et al. (2021) provided excellent background information on the invasion history and invasion-facilitating characteristics of D. lineata.

1. The link connected with the statement "Under suitable conditions, it can quickly form large clonal aggregations" leads to a non-functioning page. However, this statement could be supported with primary citations, such as:
Shick (1976), which documented the occurrence of a high-density population, presumably of a single clone, in Blue Hills, Maine.
Shick and Lamb (1977), which discussed the general tendency of D. lineata (under the former name Haliplanella luciae) to form clonal aggregations and provided allozyme evidence of clonal aggregations from a collection of sites.
Ryan et al. (2021), which provided evidence with microsatellite markers of highly clonal population genetic structures from eight sites across the Atlantic Coast of the US.

2. Gamete production and mixed sex populations are now well established in the non-native range, though fertilization and larval settlement have not been confirmed in the field: Ryan and Miller (2019) demonstrated that gamete production is common across the range of D. lineata on the Atlantic Coast of the United States, including many populations with fertile males and females present. Gamete production in D. lineata has been documented in Japan (within the presumed native range) by Fukui (1991), Fukui (1995), and Ryan and Kubota (2016), but not by Ryan and Miller (2019).

5. It might also be useful to acknowledge that literature on this species has been published under many species names over time. Hancock et al. (2017) give a particularly nice overview of the taxonomic history and documented occurrences.

Second, we provide some feedback on the methodology to generate the genomic data.
In Figure 1, perhaps circle the specific specimen to which you are referring as there are multiple individual anemones in the photograph.Perhaps even cropping the image would be useful.

1. More information on the manual curation would be nice to determine exactly what was done. For example, was there any bacterial removal from the assembly?
2. Was RNA extracted from the whole organism?
3. Were both anemones (jaDiaLine6 and jaDiaLine7) sterile? Also, when were they collected (date?). The authors mention 'unknown sex' in the Genome sequence report section, but a little bit more information would be useful. Also, this is presumably from the sample used for Hi-C rather than RNA? Two anemones were used for all sequencing, so clarifying this is also useful.
4. Continuing from the point above, more information on what specimens were sequenced would be useful.

Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Evolutionary ecology, population genetics, sea anemone biology
We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Diadumene lineata. As there is currently a dearth of genome assemblies from the Cnidaria, and particularly of chromosome-level assemblies (perhaps only the recent assemblies of Nematostella vectensis and Scolanthus callimorphus (Zimmermann et al. 2022) [1]), the addition of this assembly from Diadumene lineata is welcome. The genome size (313 Mb) and chromosome number (N=16) are consistent with those of N. vectensis (244 Mb, 15 chromosomes) and S. callimorphus (414 Mb, 15 chromosomes) (Zimmermann et al. 2022) [1]. It would be interesting to have further details of the manual curation that led to the welcome improvement in scaffold number and scaffold N50, and also useful to know what tissues were sampled for the RNA-Seq dataset. Presumably, this is the whole organism? However, the resources detailed within will be extremely useful for the study of cnidarian genomics and beyond.

Reviewer Expertise: Genomics (invertebrates)
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 1. Image of the Diadumene lineata specimen taken prior to preservation and processing. The sample is shown in focus, slightly to the left of centre.

Figure 2. Genome assembly of Diadumene lineata, jaDiaLine6.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 313,006,248 bp assembly. The distribution of chromosome lengths is shown in dark grey with the plot radius scaled to the longest chromosome present in the assembly (42,014,270 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 chromosome lengths (17,670,263 and 13,127,821 bp), respectively. The pale grey spiral shows the cumulative chromosome count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the metazoa_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/jaDiaLine6.1/dataset/CAKKNV01/snail.

Figure 3. Genome assembly of Diadumene lineata, jaDiaLine6.1: GC coverage. BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/jaDiaLine6.1/dataset/CAKKNV01/blob.

Figure 4. Genome assembly of Diadumene lineata, jaDiaLine6.1: cumulative sequence. BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/jaDiaLine6.1/dataset/CAKKNV01/cumulative.

Figure 5. Genome assembly of Diadumene lineata, jaDiaLine6.1: Hi-C contact map. Hi-C contact map of the jaDiaLine6.1 assembly, visualised in HiGlass. Chromosomes are arranged in size order from left to right and top to bottom. The interactive Hi-C map can be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=bZPq5k_oTFuM-wZyPAZiXg under track jaDiaLine6.

3. Newcomer et al. (2019) also found a population with fertile males and females coexisting on the Pacific Coast of the United States.

References
1. Zimmermann B, Robb S, Genikhovich G, Fropf W, et al.: Sea anemone genomes reveal ancestral metazoan chromosomal macrosynteny. bioRxiv. 2020.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Partly
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

Table 1. Genome data for Diadumene lineata, jaDiaLine6.1.
(Allio et al., 2020)jiev et al., 2018) and Pretext. The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2021), which performs annotation using MitoFinder (Allio et al., 2020). The genome was analysed and BUSCO scores generated within the BlobToolKit environment (Challis et al., 2020). Table 3 contains a list of all software tool versions used, where appropriate.

Ethics/compliance issues
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the Darwin Tree of Life Project Sampling Code of Practice. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project. Each transfer of samples is further undertaken according to a Research Collaboration Agreement or Material Transfer Agreement entered into by the Darwin Tree of Life Partner, Genome Research Limited (operating as the Wellcome Sanger Institute), and in some circumstances other Darwin Tree of Life collaborators.

Table 2. Chromosomal pseudomolecules in the genome assembly of Diadumene lineata, jaDiaLine6.1.
INSDC accession    Chromosome    Size (Mb)    GC%
Members of the Tree of Life Core Informatics collective are listed here: https://doi.org/10.5281/zenodo.6125046. Members of the Darwin Tree of Life Consortium are listed here: https://doi.org/10.5281/zenodo.5638618.

1. "it is now established on almost every temperate and tropical coast worldwide" might be overstated?
2. "Under suitable conditions, it can quickly form large clonal aggregations." This should be more specific.
3. "of the assembly sequence was assigned to 11 chromosomal-level scaffolds, representing 16 autosomes" is confusing.