Examining the longitudinal nature of depressive symptoms in the Avon Longitudinal Study of Parents and Children (ALSPAC)

Depression during adolescence is associated with a number of negative outcomes in later life. Research has examined the longitudinal nature of adolescent depression in order to identify patterns of depressive mood, the early antecedents and later consequences. However, rich longitudinal data is needed to better address these questions. The Avon Longitudinal Study of Parents and Children (ALSPAC) is an intergenerational birth cohort with nine repeated assessments of depressive symptoms throughout late childhood, adolescence and young adulthood. Depressive symptoms are measured using the Short Mood and Feelings Questionnaire (SMFQ). Many studies have used ALSPAC to examine the longitudinal nature of depressive symptoms in combination with the wealth of early life exposure and later outcome data. This data note provides a summary of the SMFQ data, where the data are stored in ALSPAC, the characteristics and distribution of the SMFQ, and highlights some considerations for researchers wanting to use the SMFQ data in ALSPAC.


Introduction
Depression during adolescence is associated with a number of negative outcomes in later life such as poorer mental health 1 , impaired educational attainment 2 and reduced social functioning 3 . Understandably, research has examined the aetiology of depression during and around adolescence in order to identify preventions and interventions that could reduce these impairments.
Adolescence marks a period where depression as a disorder first commonly onsets 4 , but this period is also characterised by dynamic changes in depressive mood 5 . Consequently, depression during and surrounding adolescence can fluctuate rapidly across short periods of time 6 , and it can be difficult to quantify the true nature of adolescent depression without longitudinal research. Several recent studies have suggested that examining depression within individuals over time may be a helpful method for 1) uncovering the nature of adolescent depression and how it changes over time, 2) identifying risk factors associated with greater adolescent depression, and 3) examining how greater depression during and across adolescence is associated with later outcomes 7 .
The Avon Longitudinal Study of Parents and Children (ALSPAC) is a unique intergenerational cohort study with a wealth of biological, genetic and phenotypic data from parents and children. One of the most important aspects of the ALSPAC study is its repeated assessments of psychiatric traits 8 . ALSPAC is one of the few cohorts that has repeated assessments of the Short Mood and Feelings Questionnaire (SMFQ) 9 , reported by the child themselves throughout childhood, adolescence and young adulthood. The SMFQ is a 13-item questionnaire designed for examining the presence of depressive symptoms in epidemiological studies 9-11 , and has been shown to be a strong predictor of depression 12 . The SMFQ sums these 13 items to give a score ranging between 0-26, where higher scores represent greater depression.
Many studies have used the SMFQ in ALSPAC to explore the nature of depression across adolescence 6 , risk factors for greater depression 13-16 , and how depression during this period can be associated with later outcomes 2,17 . However, there are still many unanswered questions regarding the longitudinal nature of depression during and across adolescence, and ALSPAC will play a role in answering these questions. Therefore, the aim of this data note is to provide the reader with a comprehensive overview of the SMFQ in ALSPAC. Primarily, this data note focuses on the location of these data within ALSPAC, the sample sizes, the validity of and correlations between SMFQ assessments, and the distribution of the SMFQ data.

ALSPAC data
The Avon Longitudinal Study of Parents and Children (ALSPAC) is an intergenerational longitudinal cohort that recruited pregnant women residing in Avon, UK with expected dates of delivery from 1st April 1991 to 31st December 1992 18,19 . The initial cohort consisted of 14,062 children, but has been increased to 14,901 with further recruitment 20 . The study website contains details of all the data that are available through a fully searchable data dictionary and variable search tool: http://www.bristol.ac.uk/alspac/researchers/our-data. Part of the depressive symptoms data were collected using REDCap.

The SMFQ
The SMFQ is a 13-item questionnaire that measures the presence of depressive symptoms in the last two weeks 9 . Table 1 shows the list of questions administered at each occasion in ALSPAC. For each question, the answer can be "not true" (scored 0), "sometimes" (scored 1) or "true" (scored 2). As each question is scored between 0-2, the resulting summary score of all the items can range between 0-26, with higher scores being more indicative of greater depression. As well as being used as a dimensional outcome, some studies have also shown that a cut-off point of 11 or more has good specificity for predicting depression 12 . As such, this binary threshold has also been used in several studies 10,13 .
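The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration only and is not the syntax supplied with this data note: the function names (`score_smfq`, `above_threshold`) are hypothetical, and returning no score whenever any item is missing is an assumption made for the example rather than a documented ALSPAC rule.

```python
from typing import Optional, Sequence

N_ITEMS = 13  # number of SMFQ questions
CUTOFF = 11   # totals of 11 or more are treated as above threshold


def score_smfq(items: Sequence[Optional[int]]) -> Optional[int]:
    """Sum 13 responses coded 0 (not true), 1 (sometimes), 2 (true).

    Returns None if any item is missing (an assumption of this sketch;
    other missing-item rules are possible).
    """
    if len(items) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} items, got {len(items)}")
    if any(x is None for x in items):
        return None
    if any(x not in (0, 1, 2) for x in items):
        raise ValueError("item responses must be coded 0, 1 or 2")
    return sum(items)


def above_threshold(total: Optional[int]) -> Optional[bool]:
    """Flag totals at or above the cut-off; a missing total stays missing."""
    return None if total is None else total >= CUTOFF
```

Keeping the dimensional score and the binary flag as separate steps means the same totals can feed either kind of analysis.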

SMFQ within ALSPAC
The SMFQ has been measured on nine occasions between the ages of 10 and 24 in the ALSPAC cohort. At each of these occasions, the SMFQ has been self-completed by the child/young person. However, there are an additional four occasions where the SMFQ has been completed by a parent or guardian for the child/young person; these data are not the subject of this data note.
The SMFQ was administered in ALSPAC via postal/email questionnaire or at research clinics. Table 2 shows how each questionnaire was collected. Across the nine occasions, the SMFQ has been collected via post/email on five occasions, and via a research clinic on the four other occasions. ALSPAC data are split between questionnaire files (post/email) and clinic files; Table 2 also gives the names of the files where the SMFQ data are stored, along with the names of the SMFQ questions. Syntax for creating the scores is provided as Extended data 21 .
The SMFQ in ALSPAC was not collected at regular age intervals. Table 3 shows the mean age of participants at each assessment. There is no obvious pattern for time between assessments but the longest period between assessments falls between the ages of 18.65 and 21.95 years. The shortest period between assessments falls between the ages of 17.84 and 18.65.

Amendments from Version 1
This update incorporates the comments from Reviewer 2 regarding the median and the IQR of the SMFQ. These have now been included in Table 3. I have also given some more information regarding additional measures of depressive mood available in ALSPAC such as the DAWBA and the CIS-R.

Characteristics of the SMFQ in ALSPAC
The sample size of the SMFQ also tends to vary in ALSPAC, with a maximum sample of 7,364 at the first occasion (age 10.65), compared to the lowest sample of 3,305 at the seventh occasion (age 21.95). Note that sample size has increased in the latter waves of data collection. However, the overall trend of decreasing sample size means that researchers should be aware of this attrition and take steps to address it, such as multiple imputation or full information maximum likelihood.
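To put the two sample sizes quoted above in proportion, the short calculation below expresses the smallest sample as a share of the largest. It uses only the two figures given in the text; the variable names are illustrative.

```python
# Sample sizes reported in the text for the largest and smallest occasions.
n_first = 7364   # first occasion (age 10.65)
n_lowest = 3305  # seventh occasion (age 21.95)

# Share of the first-occasion sample that is present at the smallest occasion.
retained = n_lowest / n_first
print(f"Smallest sample relative to first occasion: {retained:.1%}")  # 44.9%
```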
One of the benefits of assessing the SMFQ repeatedly over time is the ability to examine the nature of depressive symptoms across multiple stages of development (i.e., late childhood to adolescence, across adolescence, adolescence to young adulthood). Table 3 and Figure 1 both highlight how the SMFQ has changed over time. From initially low levels of depressive symptoms in late childhood, scores tend to increase until the age of 18. From here, depressive symptoms begin to decline until the age of 22, where symptoms then begin to rise again to greater levels than previously observed at age 18. There is much more heterogeneity in the data towards the later stages of data collection, with higher standard deviations observed. Likewise, the median and interquartile range tend to increase throughout the latter waves. Figure 2 shows histograms for the nine occasions of the SMFQ. The scores tend to be skewed towards smaller values across all occasions. However, the tails of the histograms tend to get larger over time as the distribution of scores slowly moves towards the tails. Relatedly, the number of individuals scoring 11 or above on the SMFQ also tends to increase over time, as shown in Table 3.
Validity and utility of the SMFQ
Within ALSPAC, the SMFQ has good internal reliability as assessed by Cronbach's alpha. Table 3 shows that the reliability is lowest on the first occasion (0.797) and highest on the seventh occasion (0.915). There are also strong correlations observed between each of the assessments (P values < 0.0001). As Table 4 shows, there tends to be a pattern where occasions measured more closely together have higher correlations (i.e., ages 10.65 and 12.81, compared to the correlations between ages 22.88 and 23.8), and these are particularly strong towards the last three assessments (r > 0.569). The strong correlation between all the assessments indicates that the SMFQ is a valid tool for examining depressive symptoms over time within ALSPAC.
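For readers unfamiliar with the reliability statistic reported above, the sketch below computes Cronbach's alpha from item-level responses using the standard formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total score). It is a minimal standard-library illustration that assumes complete cases; the function name `cronbach_alpha` and the input layout (one sequence of item scores per respondent) are assumptions for this example, and it is not the code used to produce Table 3.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for complete-case item responses.

    rows: one sequence of k item scores per respondent.
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(r) != k for r in rows):
        raise ValueError("all respondents must have the same number of items")

    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When every item gives the same answer within each respondent, the items are perfectly consistent and the function returns 1; values fall as items agree less.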
Demographics of the SMFQ
A brief exploration of these data shows that the demographic characteristics of individuals who have completed at least one assessment of the SMFQ differ from those of individuals who have not completed any assessments. Table 5 highlights these differences, but it is important to note that individuals without SMFQ measures are more likely to be male, have mothers with poorer educational attainment and lower socioeconomic status at birth, be the third-born or later child and have a younger mother.

Considerations for the data
There are several considerations that should be noted when using the SMFQ data in ALSPAC. The first is that, like all longitudinal studies, ALSPAC is subject to attrition and, as shown in Table 3, the sample size for using the SMFQ tends to decrease over time. As ALSPAC has a plethora of sociodemographic information and a number of other psychiatric assessments, it is possible to impute the missing data (for example, using multiple imputation with missing at random assumptions). Other longitudinal studies have used full information maximum likelihood to address patterns of missing data, but consideration should be given to the issue of missing data when using the SMFQ.
The second consideration is that exploring the distribution of data revealed an anomaly in the data, with a random spike occurring at the fifth assessment of the SMFQ (age 17.84). A closer inspection of this data revealed that 183 individuals answered "sometimes" to every question of the SMFQ at this age. Sensitivity analyses in one recent study found that removing these individuals had no effect on the interpretation of the results 2 . Still, researchers may choose to remove these individuals from analysis.
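One way to identify such response patterns before a sensitivity analysis is sketched below. It is a minimal illustration, assuming item responses are held as one sequence of 13 integers per respondent coded 0, 1 and 2; the function names (`is_uniform_response`, `flag_uniform`) are hypothetical and are not part of the ALSPAC syntax. A respondent who answers "sometimes" to all 13 items has a total score of 13.

```python
from typing import List, Sequence


def is_uniform_response(items: Sequence[int], value: int = 1) -> bool:
    """True if every item has the given response (1 = "sometimes")."""
    return len(items) > 0 and all(x == value for x in items)


def flag_uniform(responses: Sequence[Sequence[int]], value: int = 1) -> List[int]:
    """Return row indices of respondents who gave `value` to every item."""
    return [i for i, row in enumerate(responses) if is_uniform_response(row, value)]
```

The returned indices can be used to rerun an analysis with and without the flagged respondents and compare the results.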
The final consideration is that future assessments of the SMFQ may become available within ALSPAC throughout the duration of the study. A tenth occasion will be released shortly which will address depressive symptoms around the age of 26. If ALSPAC continues to assess the SMFQ past this age, this study will be one of the few longitudinal studies with repeated assessments of depressive symptoms, along with a host of exposure and outcome data. It is also important to highlight that ALSPAC has other measures of depressive mood such as the DAWBA 22 (assessed at ages 7, 10, 13 and 15) and the CIS-R 23 (assessed at ages 18 and 24). Together, these data will be vital for exploring the nature of depression across multiple periods of development.

Ethical approval and consent
Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees. Full details of the approvals obtained are available from the study website (http://www.bristol.ac.uk/alspac/researchers/research-ethics/).

Data availability
Underlying data
ALSPAC data access is through a system of managed open access. The steps below highlight how to apply for access to ALSPAC data.
1. Please read the ALSPAC access policy which describes the process of accessing the data in detail, and outlines the costs associated with doing so.
2. You may also find it useful to browse the fully searchable research proposals database, which lists all research projects that have been approved since April 2011.

Stephan Collishaw
Developmental Psychiatry Section, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff, UK
Depression is one of the most common mental health problems affecting young people. Depression is associated with major societal and individual burdens, including wide-ranging impacts on young people's distress, peer and family relationships, education and long-term health. A developmental perspective is essential given the rise in the incidence of depression across adolescence and young adulthood, emerging gender differences, and a need to understand underlying risk and protective mechanisms in order to inform effective prevention.
This data note provides a helpful and informative guide to the use of the short Mood and Feelings Questionnaire (sMFQ) in the Avon Longitudinal Study of Parents and Children (ALSPAC). The sMFQ is a widely used and well-validated measure of depression symptoms. Repeated assessment of depression across nine occasions spanning late childhood, adolescence and early adulthood is a major strength of the ALSPAC cohort, and this facilitates longitudinal investigation into depression across this crucial developmental period. The use of the same measure on each occasion is a unique strength facilitating application of trajectory-based methods of analysis including across the transition to adult life. The data note provides important background to the measure, helpful descriptive data, and useful practical help for researchers planning to use the measure (e.g. inclusion of variable names, syntax for scoring, missing data patterns).
A few suggestions for revision:
1. There are well-established gender differences in depression, and these change across development. I strongly recommend providing a breakdown of sMFQ scores in ALSPAC by gender and age.
2. Include reflection on the validity of self-reports (vs other informant reports) of depression, and whether this changes across age.
3. Comment on whether differences in the mode of assessment (postal questionnaire vs research clinic) and variation in the gaps between assessments might affect analysis, and if appropriate provide recommendations.
4. Indicate the proportion of item-level missing data within measurement occasions.
5. Change the heading "Demographics of the SMFQ" to "Predictors of response".
6. Table 3: what is the age range of participants at each assessment?

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

data should be presented by gender. This is a useful paper for investigators wanting to obtain this type of information and the presentation is well done.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
No competing interests were disclosed.