The Avon Longitudinal Study of Parents and Children - A resource for COVID-19 research: Home-based antibody testing results, October 2020. An emphasis on self-screening at a population level

The Avon Longitudinal Study of Parents and Children (ALSPAC) is a prospective population-based cohort study which recruited pregnant women in 1990-1992 and has followed these women, their partners (Generation 0; G0) and offspring (Generation 1; G1) ever since. The study reacted rapidly to the COVID-19 pandemic, deploying online questionnaires in March and May 2020. Home-based antibody tests and a further questionnaire were sent to 5220 participants during a two-week period of October 2020. 4.2% (n=201) of participants reported a positive antibody test (3.2% G0s [n=81]; 5.6% G1s [n=120]). 43 reported an invalid test, 7 did not complete and 3 did not report their result. Participants uploaded a photo of their test to enable validation: all positive tests, those where the participant could not interpret the result and a 5% random sample were manually checked against photos. We report 92% agreement (kappa=0.853). Positive tests were compared to additional COVID-19 status information: 58 (1.2%) participants reported a previous positive test, 73 (1.5%) reported that COVID-19 was suspected by a doctor, but not tested and 980 (20.4%) believed they had COVID-19 due to their own suspicions. Of those reporting a positive result on our antibody test, 55 reported that they did not think they had had COVID-19. Results from antibody testing and questionnaire data will be complemented by health record linkage and results of other biological testing, uniting Pillar testing data with home testing and self-report. Data have been released as an update to the original datasets released in July 2020. They comprise: 1) a standard dataset containing all participant responses to all three questionnaires with key sociodemographic factors and 2) individual participant-specific release files enabling bespoke research across all areas supported by the study. This data note describes the antibody testing, associated questionnaire and the data obtained from it.


Introduction
At the time of writing we are ten months into the coronavirus disease 2019 (COVID-19) pandemic and many countries have resorted to a second or third national lockdown in an attempt to control the spread of the virus 1 . It was noted early in the pandemic that antibody testing could give an indication of likely past exposure to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection 2 . Recent studies on the prevalence of infection from antibody tests are primarily from hospital patients; yet mass testing in general populations is vital for improving the understanding of the spread of infection given that many individuals, particularly younger members of society, appear to be asymptomatic 3 . The largest antibody testing study to date in England, the Real-time Assessment of Community Transmission (REACT), which used 100,000 home-based antibody tests performed in June and July 2020, suggests that 6% of the population had been infected with the virus 4 . Testing in longitudinal population studies would be beneficial, in order to objectively identify cases and improve the assessment of the impact of, and risk factors for, infection on individuals who have rich pre-pandemic data and planned follow up. Work is ongoing in REACT to better understand the limitations of antibody testing, given the relative unknowns of when antibodies may begin to decline after infection, particularly in those with mild illness 3 .
The Avon Longitudinal Study of Parents and Children (ALSPAC) is a unique multi-generational study: 'G0' includes the original pregnant women and their partners (mean age ~59 years); 'G1': the original index children (mean age ~28 years); and 'G2': the offspring of the original children [5][6][7][8] . ALSPAC has been able to collect self-reported data throughout the pandemic. This can be combined with data from clinical services based on linkage to medical and other records. In addition to collecting data about self-reported exposure to the infection and reporting on the impact of mitigation on participants (e.g. 9), we wanted to objectively estimate how many people in the study may have been infected with the virus that causes COVID-19. Antibody tests were therefore deployed to our participants.
This data note describes the data collected via our third online questionnaire in October 2020, which was complemented by home-based antibody testing. The update to the original dataset obtained from our first two online questionnaires 10,11 is described here, together with any variables that have been derived using all sets of questionnaire data. We also present a summary of the antibody testing results and summarise the agreement between participant reports of their test results and our own checking.

Methods
Setting
ALSPAC is an intergenerational longitudinal cohort that recruited pregnant women residing in Avon, UK with expected dates of delivery 1st April 1991 to 31st December 1992 5,6 . The initial cohort consisted of 14,541 pregnancies resulting in 14,062 live births and 13,988 children who were alive at 1 year of age. From the age of seven onwards, the initial sample was bolstered with eligible cases who had originally failed to join the study and there were subsequently 14,901 children alive at 1 year of age following this further recruitment 7 . Please note, the study website contains details of all the data that is available through a fully searchable data dictionary and variable search tool.
ALSPAC developed a data collection strategy in response to the pandemic which was practical and yielded data rapidly. We achieved this through online-only data collection, which meant we had to restrict invitations to those participants with a valid email address. This was coordinated alongside a systematic communications/outreach campaign to obtain updated information from our participants. Our questionnaires were developed and deployed using REDCap (Research Electronic Data CAPture tools 12 ), a secure web application for building and managing online data collection exercises, hosted at the University of Bristol.
Invitation and reminder strategy for antibody testing
As part of the second questionnaire we asked participants if they were happy to be contacted about future research projects involving testing or taking biological samples. Participants who responded positively to this question (n=5,828, 90% of those responding to the questionnaire) and those who completed the first questionnaire but not the second (n=1,178), and therefore did not complete this question, formed the basis of our invites to take part in antibody testing (Figure 1). An initial email was sent out to participants asking them to read a participant information sheet (PIS) and instruction booklet (which included a link to a brief video), containing details on the purpose of the research, what was involved and the risks of taking part. This information was based on a modified version of that used by the REACT study 13 . Participants were asked to log on and complete an online REDCap consent form, which confirmed that they had read and understood the information provided, had the opportunity to ask questions and that they agreed to take part in the study. They also provided an address for the kit to be sent to. This may have differed from that stored on our administrative database and was kept only for the purpose of sending the test. Finally, participants were asked a screening question about bleeding disorders. Antibody tests were not sent to those participants who reported a bleeding disorder nor to those who provided an address outside the UK; costs and timescales made sending kits overseas impractical, resulting in 6,828 participants eligible to take part.

Amendments from Version 1
In response to reviewers' comments, we have made some minor alterations to the text (primarily the strengths and limitations section). An addition of "An emphasis on self-screening at a population level" was made to the title.
Any further responses from the reviewers can be found at the end of the article

Between the 1st and 5th October 2020, 5,220 participants were sent testing kits through the post. An accompanying letter included detailed instructions and an invitation to complete an online questionnaire and upload a photo of the test result. Reminders were sent on the 9th October. The questionnaire survey was live on the online platform for just over two weeks (all questionnaires and tests were completed between the 3rd and 20th October 2020). Unlike our standard questionnaires (usually completed annually) we did not provide any incentive for completion; however, we did offer a prize draw (three prizes of £100) for those who completed their questionnaire by 14th October.

Antibody tests
Una Health and Fortress Diagnostics Ltd. (Stoke-on-Trent, UK) supported this study by providing the antibody test cassettes procured initially by the Department of Health and Social Care, UK 14 . The lateral flow test (Fortress Diagnostics, Antrim, Northern Ireland) was selected following evaluation of performance characteristics (sensitivity and specificity) against pre-defined criteria for detection of IgG 15 , and extensive public involvement and user testing 16 . Approval for the use of these kits for research purposes was obtained from The Medicines and Healthcare products Regulatory Agency (MHRA).
The lateral flow test (LFT) kits require the user to place a drop of blood onto the test, add some buffer solution and wait for around 15 minutes. The test could be negative, positive (showing antibodies; IgG and/or IgM) or invalid, meaning that the test didn't work. IgM antibodies appear first in infected individuals and levels typically fall rapidly, so their presence indicates recent infection, whereas IgG may remain detectable for many months. We therefore considered a test positive if it showed the presence of IgG antibodies (with or without IgM).
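The case definition above can be written as a small decision rule. The following Python sketch is illustrative only and is not the study's code: the function names and the `control_line` argument are assumptions (the text says only that an invalid test "didn't work"), while the category labels follow those used later in this data note.

```python
def classify_lateral_flow(igg_line: bool, igm_line: bool, control_line: bool = True) -> str:
    """Map the lines visible on a cassette to a result category.

    `control_line` is a hypothetical stand-in for "the test worked";
    the data note does not describe how invalid tests were recognised.
    """
    if not control_line:
        return "invalid"
    if igg_line and igm_line:
        return "IgG and IgM positive"
    if igg_line:
        return "IgG positive"
    if igm_line:
        return "IgM positive"
    return "negative"


def is_positive(result: str) -> bool:
    """A test counts as positive only when IgG is present, with or without IgM."""
    return result in {"IgG positive", "IgG and IgM positive"}


if __name__ == "__main__":
    for igg, igm in [(True, False), (True, True), (False, True), (False, False)]:
        result = classify_lateral_flow(igg, igm)
        print(igg, igm, result, is_positive(result))
```

Under this rule an IgM-only result is not counted as positive, which matches the grouping used when results are compared later in the note.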

Questionnaire content
The questionnaire was considerably shorter than our previous COVID-19 questionnaires and captured information on the following:
• Symptoms of COVID-19 and negative control symptoms since March 2020 (symptoms repeated from Q2)
• Diagnosis with COVID-19 and testing history
• Attempted and completed antibody test, with reasons why not attempted or completed
• Result of antibody test and confidence in own interpretation
The final questionnaire (REDCap PDF) used is available with the associated data dictionary (which includes frequencies of all variables that are available) and both are available as extended data 17 . In addition, participants were asked to take a photo of their test result and upload it through the online system if they wanted to (97% of responders did so).

Validation of test results
Following the REACT protocol 16 , two authors (KN and RH) reviewed photographs from all participants who reported a positive (IgG and IgG/IgM) result or stated they couldn't tell, alongside a 5% random sample of the remainder. The two authors examined a random set of photos and recorded their interpretation of the results. These results were then compared to the participants' reports. Agreement between authors and participants was assessed using kappa statistics.
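For readers unfamiliar with the kappa statistic, the sketch below shows how agreement between two sets of categorical readings can be computed. It assumes unweighted Cohen's kappa, which the text does not specify, and the labels in the example are invented for illustration; they are not study data.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over labels.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical readings, not taken from the ALSPAC dataset.
    participant = ["negative", "negative", "IgG positive", "IgG positive", "negative", "can't tell"]
    author = ["negative", "negative", "IgG positive", "negative", "negative", "negative"]
    print(f"kappa = {cohens_kappa(participant, author):.3f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than the raw percentage agreement when one category (here, negative results) dominates.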

Key results
Response rate
A total of 6,828 consent forms were sent to participants asking if they would be interested in completing a serology test and related questionnaire (Figure 1). This group is considered the invited group. Of these, 5,220 serology kits and questionnaires were sent out (76% of those invited), of which 4,819 participants returned a questionnaire (71% of those invited; 92% of those sent a test kit and questionnaire; see Figure 1 for a flow diagram of participant numbers).
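The three percentages quoted above differ only in their denominator. As a quick check, the short Python sketch below recomputes them from the counts given in the text; the variable and function names are illustrative.

```python
# Counts taken from the text above.
invited = 6828      # consent forms sent (the "invited group")
kits_sent = 5220    # serology kits and questionnaires posted
returned = 4819     # questionnaires returned


def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


if __name__ == "__main__":
    print("Kits sent, of those invited:", percent(kits_sent, invited), "%")
    print("Returned, of those invited:", percent(returned, invited), "%")
    print("Returned, of those sent a kit:", percent(returned, kits_sent), "%")
```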
Female participants completed a larger proportion of questionnaires (73%). However, there was only a small difference in response rates for the sexes, with 67% of males and 72% of females who were invited returning the questionnaire. Table 1 summarises the response rate within each group organised by cohort structure. Of those invited to take part, 52% were G0s and 48% were G1s. It should be noted that considering those who were sent an antibody kit as the baseline (as opposed to those who were sent a consent form), the response rate was 92%.
Characteristics of responders according to key variables that will be released with the complete dataset can be seen in Table 2. The population who responded were predominantly white (> 98%) and the majority had at least A-level qualifications (optional exams sat at the age of 18 years), with almost 80% of the G1 cohort in this category. G0 Fathers/partners were three years older on average than G0 mothers (61.1 years vs 58.5 years) and G1 partners were two years older than G1 participants on average (30.3 years versus 28.2 years).
As with the previous questionnaire we sought to assess potential reasons for non-completion of this third COVID-19 questionnaire, which could potentially bias comparisons between questionnaire waves. Associations between various sociodemographic factors and returning this questionnaire, of those invited, were examined (Figure 2). Returning the third questionnaire was strongly associated with age and generation whereby older, and therefore G0 participants, were more likely to complete compared to the younger/G1 participants. After adjusting for generation (G0 vs G1), there was evidence of social bias in those returning the questionnaire. Participants with higher education qualifications (a proxy for socioeconomic position) were much more likely to return the third questionnaire, while greater financial worry was associated with non-completion. However, physical and mental health were not strongly associated with the return of this questionnaire. As with the previous COVID-19 questionnaires 10,11 , women were more likely to return the questionnaire than men. Individuals who had previously self-reported that they had had COVID-19 (from either a positive test, doctor suspicions or own suspicions) were more likely to complete this third questionnaire. Participants from a non-white ethnic background were less likely to complete this questionnaire, although the 95% confidence intervals slightly overlap with the null.
Table 3 shows the agreement between each author's and the participant's interpretations for each category of result. Overall, there was 92% agreement between authors and participants (kappa=0.853). This can be broken down as follows: 99% agreement for the negative test results and 94% agreement for the IgG positive test results. The biggest disagreement was in the 'can't tell' category, where the authors interpreted the result in all but two cases (removing those results gives a kappa of 0.923).
Of those participants who reported a positive IgG result but where the author disagreed, nine were negative, two were IgM positive and in one case the authors could not interpret the result. Of those participants who reported they could not tell but where the author disagreed, 17 were negative, one was IgM positive and three were IgG positive. We have created a new variable that replaces the participant's report of the result with our own interpretation. Ten participants did not answer this question but did upload an image; these images were assessed, and the results have been added to the 'ALSPAC interpretation of test results' variable. Table 4a and Table 4b present the test results reported by participants (Table 4a) and then the ALSPAC-validated results (Table 4b). For the latter, 168 participants had a positive IgG result and 33 had a positive IgM and IgG result, meaning that 4.2% of participants reported a positive antibody test. 44 participants had an invalid result, one participant did not take a photo and could not remember their result when they came to complete the questionnaire, and even after validation by authors there were 10 cases where it was impossible to tell the result. Seven participants did not complete the test and three did not report on the test result. In total, 3.2% of G0s reported an IgG positive antibody test compared to 5.6% of G1s. Table 5 reports on the breakdown of key variables according to whether participants reported a positive test or not (with a positive result defined as either an 'IgG positive' or 'IgG and IgM positive' result, while a negative result is defined as either a 'negative' or 'IgM positive' result). No substantial differences were observed, although among the G0 generation participants with a positive result were slightly younger than participants with a negative result.
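The headline figure of 4.2% can be reproduced from the validated counts above. The sketch below assumes that the denominator is the 4,819 returned questionnaires, which the text does not state explicitly; the dictionary and variable names are illustrative.

```python
# ALSPAC-validated positive categories, as reported in the text.
validated_positive = {
    "IgG positive": 168,
    "IgG and IgM positive": 33,
}

# Assumption: all returned questionnaires form the denominator.
returned_questionnaires = 4819

positives = sum(validated_positive.values())
prevalence_percent = round(100 * positives / returned_questionnaires, 1)

if __name__ == "__main__":
    print(f"{positives} positive tests, {prevalence_percent}% of responders")
```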

Self-report of COVID-19
As with previous questionnaires, participants were asked whether they thought they had had COVID-19, prior to taking this antibody test. Options were: 'Yes, confirmed by a positive test', 'Yes, suspected by a doctor but not tested', 'Yes, my own suspicions' or 'No'. In this questionnaire 58 (1.2%) respondents reported that they had tested positive for COVID-19, 73 (1.5%) reported that COVID-19 had been suspected by a doctor but not tested and 980 (20.4%) believed they had had COVID-19 due to their own suspicions. Table 6 summarises the responses to this question by cohort structure.
Of those who reported a positive IgG result on the antibody test, 55 reported that they did not think they had had COVID-19. Further work is warranted to examine symptom reports in this group and assess the proportion who were truly asymptomatic.

Strengths and limitations of the data
This data collection has a number of strengths. Primarily, the study has been able to respond rapidly to the pandemic and collect several waves of data already. Secondly, the addition of antibody testing was unique in a longitudinal cohort study at the time of first publication. This puts us in an excellent position to identify true 'cases' of COVID-19 over and above self-report. We have undertaken linkage to Public Health England (PHE) Pillar I and II test results and are in the process of triangulating this data to obtain defined cases in our population. We achieved an excellent response rate despite a) the lack of incentives and b) the fact we were calling on our participants to take part in data collection for the third time in a matter of months. All the antibody tests were completed over a two-week period and therefore provide a relatively accurate snapshot in time. We will use the data obtained to identify changes over time in symptom experience and better understand the waxing and waning of antibody levels according to likely date of infection.
It should be noted, however, that a 2020 Cochrane review assessing the diagnostic accuracy of antibody tests identified a number of issues with the existing evidence 18 , including that: 1) sensitivity of the tests is too low in the first week after symptom onset (this means we may have missed some cases who were in the early stage of infection in October), 2) the duration of the rise of antibodies is unknown due to the lack of data (current thinking is around 35 days after symptom onset; this means we may have missed cases who were infected early in the pandemic and may have since lost their antibodies) and 3) sensitivity of the tests has primarily been tested in hospital patients. The accuracy of home-based antibody tests in the general population is therefore still relatively unknown. We are not in a position to compare the results presented here with immuno-assays, so population-specific sensitivity and specificity cannot be estimated. However, we have detailed survey data on symptoms, self-reported infection and also linkage to Pillar I and Pillar II testing as recorded by PHE for our G1 participants and those G0 mothers who have consented to such linkage. As noted above we will triangulate this data to identify true cases and when they may have occurred. Finally, all ALSPAC 'cases', together with both matched and random controls are currently being invited to take part in a sub-study as part of the UK Coronavirus Immunology Consortium (UK CIC, 19 ). This work will provide a better understanding of the immune response to the virus and help us with identifying true cases.
We were able to validate a sub-sample of test results using photographs of testing cassettes uploaded by the participants. We showed good agreement with participant reported results, which was comparable to that reported by REACT 15 . In particular, we reported high agreement in those tests that were negative. However, we only took a 5% sub-sample of negative results and there is a possibility that in the remainder of the sample there were results reported as negative that were in fact positive. Based on the fact that two of the 5% sample we checked were in fact positive, there is the potential for approximately 40 tests in the whole sample to be incorrectly reported as negative. As we are not solely basing our case definition for future studies on the antibody test results there is every chance that we would pick such cases up through self-report or linkage to PHE test results, unless those participants were completely asymptomatic. Our agreement analysis suggests that in this population, participants were able to correctly determine their COVID-19 infection status most of the time.
We were able to assess some key sociodemographic factors predicting questionnaire completion, which is important for assessing and quantifying the extent of possible selection and collider bias, which may bias both our prevalence estimates and associations between variables 20 . As reported both here and previously 10,11 , questionnaire response is socially patterned, with older, female, and higher-socioeconomic position participants more likely to respond. Additionally, in this questionnaire participants who previously reported that they had COVID-19 were more likely to respond. As those with COVID-19 were more likely to respond, our prevalence estimates may be somewhat inflated. Additionally, as previous COVID-19 status was associated with questionnaire completion (and therefore missing data), this could result in biases when assessing relationships between COVID-19 status and other risk factors which also predict questionnaire completion (e.g., age, sex, socioeconomic position). This study is therefore not generalisable to populations larger than ours that may contain the vulnerable groups most likely to benefit from self-screening programmes. Researchers using and interpreting this data should be aware of these potential biases 21 .
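The figure of approximately 40 potentially misreported negative tests, given earlier in this section, follows from scaling the two discrepancies found in the 5% sub-sample up to the full set of negative reports. A minimal Python sketch of that arithmetic is shown below; the function name is illustrative, and the calculation assumes the random sub-sample is representative of all tests reported as negative.

```python
def extrapolate_misreported(found_in_sample: int, sample_percent: float) -> float:
    """Scale a count found in a random sub-sample up to the whole group.

    Assumes the sub-sample is representative of the group it was drawn from.
    """
    if not 0 < sample_percent <= 100:
        raise ValueError("sample_percent must be in the range (0, 100]")
    return found_in_sample * 100 / sample_percent


if __name__ == "__main__":
    # Two reported negatives in the 5% sub-sample were read as positive on checking.
    print(extrapolate_misreported(found_in_sample=2, sample_percent=5))
```

With only two discrepancies observed, this is a rough order-of-magnitude estimate rather than a precise count.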
The UK Scientific Advisory Group for Emergencies (SAGE) have noted potential behavioural responses to both positive and negative antibody results 22 . This may have an impact on future research in this population. However, we were very careful to clearly explain to our participants that the test results were for research purposes and that they should not change their behaviour as a result. But this is not something we can guarantee.

Data availability
Underlying data
ALSPAC data access is through a system of managed open access. The steps below highlight how to apply for access to the data included in this data note and all other ALSPAC data:
1. Please read the ALSPAC access policy 23 which describes the process of accessing the data and samples in detail, and outlines the costs associated with doing so.
2. You may also find it useful to browse our fully searchable research proposals database 24 , which lists all research projects that have been approved since April 2011.
3. Please submit your research proposal 25 for consideration by the ALSPAC Executive Committee. You will receive a response within 10 working days to advise you whether your proposal has been approved.
Please note that a standard COVID-19 dataset will be made available at no charge (see description below); however, costs will apply for the required paperwork and for any bespoke datasets requiring additional variables.

COVID-19 Questionnaire 3 Data File
Data from the third ALSPAC COVID-19 questionnaire (known internally as the serology questionnaire) is available in two ways.
1. A freely available standard set of data containing all participants together with key sociodemographic variables (where available) is available on request (see above). This dataset also includes data obtained from the first two COVID-19 questionnaires. Subject to the relevant paperwork being completed (costs may apply to cover administration) this dataset will be made freely available to any bona fide researcher requesting it. Variable names will follow the format covid3_xxxx where xxxx is a four-digit number. A full list of variables released is available here: https://doi.org/10.17605/OSF.IO/6JR7E. Frequencies of variables and details of any coding/editing decisions and derived variables are also available in the data dictionary: https://doi.org/10.17605/OSF.IO/6JR7E.
2. Formal release files have been created for G0 mothers, G0 fathers and G1 participants in the usual way and now form part of the ALSPAC resource (due to the small number of G1 partners contributing we will not be formally releasing this data; however, it may be available on request for specific G2 projects). These datasets (or sections therein) can be requested in the usual way. Variable names will replicate those in 1) above but as each variable in ALSPAC is uniquely defined we have added markers to denote the source of the variable. For example, in the above dataset, the age of the participant at completion (in years) is denoted by covid3_9650. In the mother's dataset this will be denoted by covid3m_9650, for fathers/partners this will be covid3p_9650 and for the G1 generation it will be covid3yp_9650. Frequencies for all variables for each participant group are available in the data dictionary in the usual way 24 .
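The naming convention described above can be expressed as a simple mapping from a standard-dataset variable name to its release-file equivalent. The Python sketch below is a hypothetical helper for illustration only: the function name and group labels are assumptions, and only the markers m, p and yp and the example variable covid3_9650 come from the text.

```python
# Source markers described in the text; the group labels are illustrative.
MARKER_BY_GROUP = {
    "G0 mother": "m",
    "G0 father/partner": "p",
    "G1": "yp",
}


def release_variable_name(standard_name: str, group: str) -> str:
    """Convert a standard-dataset name (covid3_xxxx) to a release-file name."""
    prefix, _, number = standard_name.partition("_")
    if prefix != "covid3" or len(number) != 4 or not number.isdigit():
        raise ValueError(f"expected a name of the form covid3_xxxx, got {standard_name!r}")
    if group not in MARKER_BY_GROUP:
        raise ValueError(f"unknown participant group: {group!r}")
    return f"covid3{MARKER_BY_GROUP[group]}_{number}"


if __name__ == "__main__":
    for group in MARKER_BY_GROUP:
        print(group, "->", release_variable_name("covid3_9650", group))
```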
Text data and other potentially disclosive information will not be released until they have been coded appropriately. Data will be incorporated back into both file sets as they become available.

Consent
Participants consented electronically to take part in the antibody testing. Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees. The South Central - Berkshire Research Ethics Committee provided specific approval for this data collection (REC reference number: 20/SC/0361). Informed consent for the use of data collected via questionnaires and clinics was obtained from participants following the recommendations of the ALSPAC Ethics and Law Committee at the time. Study participants have the right to withdraw their consent for elements of the study or from the study entirely at any time. Full details of the ALSPAC consent procedures are available on the study website 25 .

Open Peer Review
To me, this paper comes across as a report on the execution of an observational questionnaire-based study of a sub-cohort of the ALSPAC cohort, with secondary reporting of results. The data is useful for clinical screening, and generates epidemiological interest, and thus should be presented primarily as the value of self-reporting of self-administered home screening kits to screen for current or past infections, reliability of self-screening and reporting based on this cohort, biases, what this means for community screening. The data is interesting from the health policy perspective and primary care perspective so should be restructured to appeal to a wider clinical audience. This ought to be reflected in a more specific title emphasizing self-screening, and the abstract should clearly state the aims of determining the value of self-screening from the outset. The paper should perhaps explain in greater depth the value of this intervention and how it can be generalised or adapted to more diverse populations.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Negatives:
"As the authors discussed, the study population is the main limitation, being almost entirely Caucasian and >50% completing at least A levels, with additional biases of higher socioeconomic position, female gender and previous self-reported COVID-19 infections, demonstrating strong motivation to take the test and complete the study. Therefore this study is not generalisable to larger heterogenous populations which contain the vulnerable groups more likely to benefit from such self-screening programmes. The data generated and influence on health policy is thus potentially limited to similar populations."
We completely agree and have now made this clear in the 'strengths and limitations' section (penultimate paragraph).
"To me, this paper comes across as a report on the execution of an observational questionnaire-based study of a sub-cohort of the ALSPAC cohort, with secondary reporting of results. The data is useful for clinical screening, and generates epidemiological interest, and thus should be presented primarily as the value of self-reporting of self-administered home screening kits to screen for current or past infections, reliability of self-screening and reporting based on this cohort, biases, what this means for community screening. The data is interesting from the health policy perspective and primary care perspective so should be restructured to appeal to a wider clinical audience. This ought to be reflected in a more specific title emphasizing self-screening, and the abstract should clearly state the aims of determining the value of self-screening from the outset. The paper should perhaps explain in greater depth the value of this intervention and how it can be generalised or adapted to more diverse populations."
The reviewer is correct in that the aim of the data note is to describe the data we have collected. However, the aim of that data collection was not to assess self-screening ability. We appreciate the helpful comments on restructuring to appeal to a wider audience and have made some minor changes as a result. We have amended the title, adding reference to self-screening, and have also added some detail throughout the paper as suggested: introduction (final sentence) and discussion (end of third para).

Jean-Luc Murk
Medical Microbiology and Immunology, Elisabeth-Tweesteden Ziekenhuis, Tilburg, The Netherlands
Dr. Northstone et al. have submitted a manuscript that describes their COVID-19 research in a British population-based cohort that has been followed since 1990. The COVID-19 study consists of elaborate online questionnaires and the results from a home-based SARS-CoV-2 IgM and IgG test.
The study was performed in 2020. The antibody test results give an indication of the percentage of SARS-CoV-2 infections up to the period of testing (October 2020). The research is technically sound and the datasets contain a lot of interesting information.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes

© 2021 Taheri M. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The author(s) is/are employees of the US Government and therefore domestic copyright protection in USA does not apply to this work. The work may be protected under the copyright laws of other jurisdictions when used in those jurisdictions.

Morteza Taheri
Department of Sport Sciences, Imam Khomeini International University, Qazvin, Iran

Abstract:
Statistical values should be stated in the results section of the abstract.
Justify and discuss the results more clearly.

○
Report recommendations for future studies based on study limitation(s).

We provide the comments of reviewers 1 and 3 below, with our responses in italics:

Reviewer 1: Abstract:
Statistical values should be stated in the results section of the abstract.
○ As the focus of this article is a description of the data, we do not feel this is appropriate.
Your abstract should be organized in such a way that the content of the aim, methodology, findings and conclusion are stated separately.
○ We feel that the abstract is an appropriate summary of the data note, follows the author guidelines and is aligned with all the other data notes we have published relating to Covid-19 data collection.
Key words should be checked against the MeSH standard.
○ Thank you, we have done that and are also ensuring consistency with our previous and subsequent data notes.
The general conclusion in the abstract should be more clearly stated.
○ As this is a description of the data, there is no firm conclusion.

Introduction:
In addition to the explanations related to the topic of Covid-19 epidemics and the need for diagnostic antibody tests, it is also worth mentioning their effectiveness in vaccination, for which you can use the following reference: Dergaa et al. (2021)¹.
○ At the time of writing and initiating the testing programme in our population, home-based tests were relatively new and our testing was carried out well before any vaccinations had been approved for use. As noted in our response to reviewer 3, the focus of the current study was to assist in establishing 'caseness' in our population (avoiding reliance on self-report when testing was not widespread) rather than building the case for the use of home testing in the community.
The introduction is very brief, it is better to be more complete and focus more on international survey study with respect to Covid-19. Use the following reference for showing the importance of survey by using a large number of questionnaires: Trabelsi et al. (2021)².
○ Again, our focus here is on describing the data we have. The current data note is one of several describing the detailed resource that ALSPAC now has, and it contributes one of five questionnaires that have now been completed by our participants. Given the time that has passed, we would prefer to keep the data note as it is so that our contribution flows in a succinct way.
You mentioned less about the importance of addressing the issue of Covid-19 in the text, so it is important to use up-to-date references regarding it.
○ As above, we are describing our dataset and our response to the pandemic.

Method:
The validity and reliability of questionnaires?
○ As the questionnaires had to be designed very quickly given the sudden onset of the pandemic, we were not able to formally test their validity or reliability. However, the questions used are very similar to those used by many other longitudinal population studies in the UK, with whom we are now working closely. Indeed, the need to collect more objective data was driven by the subjective nature of participants' 'self-suspicions' of COVID-19 infection at a time when testing was not widespread, and these antibody tests can be used to test the validity and reliability of the questionnaires (see Table 6).
Exclusion and inclusion criteria for participants?
○ This is described in the invitation strategy section. Inclusion: "As part of the second questionnaire we asked participants if they were happy to be contacted about future research projects involving testing or taking biological samples. Participants who responded positively to this question (n=5,828, 90% of those responding to the questionnaire) and those who completed the first questionnaire but not the second (n=1,178), and therefore did not complete this question, formed the basis of our invites to take part in antibody testing." Exclusions: "Antibody tests were not sent to those participants who reported a bleeding disorder nor to those who provided an address outside the UK". Please also see Figure 1.
Please kindly upload the questionnaire for consideration.
○ As described in the "Questionnaire content" section, the questionnaire is available in the uploaded data dictionary. Please see: https://osf.io/6jr7e/.

Any ethical committee code?
○ This is also provided in the consent section: "REC reference number: 20/SC/0361".

Result:
Use a graph to show the results more clearly.
○ We are not sure which results you are referring to. All the tables presented are simple, and one graph is used to present the more detailed information.

○ The tables that present the results of statistical tests already state clearly, in the table or in its title, which test is used (Table 3 uses kappa; Table 5 uses chi-squared or t-tests).
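
For readers unfamiliar with the kappa statistic mentioned for Table 3, the following is a minimal sketch of how Cohen's kappa can be computed from a square agreement table (for example, participant-reported result against photo-checked result). The function name `cohens_kappa` and the counts are hypothetical and purely illustrative; they are not ALSPAC data and do not reproduce the reported kappa.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of items placed in category i by rater A
    and in category j by rater B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed agreement: proportion of items on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the marginal totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2

    return (observed - expected) / (1 - expected)


# Hypothetical counts (NOT study data): rows = self-report, columns = photo check.
example_table = [[40, 5],
                 [10, 45]]
kappa = cohens_kappa(example_table)
print(round(kappa, 3))
```

Kappa corrects the raw percentage agreement for the agreement that would be expected by chance alone, which is why it is usually lower than the simple agreement figure.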

Discussion:
Justify and discuss the results more clearly.
○ We are not sure which results the reviewer is referring to specifically. In the discussion we have summarised the results and detailed key strengths and limitations as appropriate for a data note. If the reviewer has more specific comments we would be happy to update as appropriate.
Report recommendations for future studies based on study limitation(s).