Protocol for an app-based affective control training for adolescents: proof-of-principle double-blind randomized controlled trial

Background: 75% of all mental health problems have their onset before the end of adolescence. Therefore, adolescence may be a particularly sensitive time period for preventing mental health problems. Affective control, the capacity to engage with goal relevant and inhibit distracting information in affective contexts, has been proposed as a potential target for prevention. In this study, we will explore the impact of improving adolescents’ affective control capacity on their mental health. Methods: The proof-of-principle double-blind randomized controlled trial will compare the effectiveness of an app-based affective control training (AffeCT) to a placebo training (P-Training) app. In total, 200 (~50% females) adolescents (11-19 years) will train for 14 days on their training app. The AffeCT will include three different n-back tasks: visuospatial, auditory and dual (i.e., including both modalities). These tasks require participants to flexibly engage and disengage with affective and neutral stimuli (i.e., faces and words). The P-Training will present participants with a perceptual matching task. The three versions of the P-Training tasks vary in the stimuli included (i.e., shapes, words and faces). The two training groups will be compared on gains in affective control, mental health, emotion regulation and self-regulation, immediately after training, one month and one year after training. Discussion: If, as predicted, the proposed study finds that AffeCT successfully improves affective control in adolescents, there would be significant potential benefits to adolescent mental health. As a free app, the training would also be scalable and easy to disseminate across a wide range of settings. Trial registration: The trial was registered on December 10th 2018 with the International Standard Randomised Controlled Trial Number (Registration number: ISRCTN17213032).


Abbreviations
AffeCT = Affective control training; P-Training = Placebo training; DERS = Difficulties in Emotion Regulation Scale

Background
75% of all mental health problems have their onset before the end of adolescence 1 . Many of these disorders, for example major depressive disorder, will be recurrent throughout the lifespan creating large costs in human suffering [2][3][4] . Adolescence (here defined as starting with puberty and ending with the attainment of an independent adult role; 10-24 years 5 ) thus may be a particularly sensitive time period for prevention of mental health problems 6 . In this study, we will explore the impact of improving adolescents' affective control capacity on their mental health.
Affective control, the capacity to flexibly engage and disengage from affective information as required by current goal-demands, is impaired across a wide range of mental health problems 7 . Poor affective control capacity has been shown to be associated with poor mental health outcomes over and above neutral 'cool' cognitive control during adolescence [8][9][10][11][12] . We have previously suggested that this association between affective control and mental health can be partially accounted for by emotion regulation 12 . That is, affective control constitutes the cognitive building blocks of successful emotion regulation. Emotion regulation refers to the automatic and volitional processes deployed to modify an individual's affective experiences 13 . Improving affective control in adolescents, whose everyday environments can include high levels of negative affect and affective fluctuations 14-17 , may then confer benefits to emotion regulation capacity and mental health.
Studies conducted in adults have shown that training affective control leads to improvements in both emotion regulation capacity and self-reported mental health [18][19][20][21] . Cool cognitive control training has also been shown to be effective in reducing symptoms of depression and anxiety, as well as improving emotion regulation [22][23][24][25][26][27] .
The cognitive training literature in children and adolescents has largely focused on remediation for learning difficulties as well as neurodevelopmental disorders such as attention deficit hyperactivity disorder (ADHD 28,29 ). Less is known about the impact of cognitive control training on young people's mental health 30 . A notable exception is the literature on cognitive training for adolescents with psychotic symptoms 31,32 , where synthesized evidence suggests promising effects on symptoms and functioning 33 . However, given that affective control in particular may be impaired in adolescents with high levels of mental health problems, we will trial the effect of an affective control training (AffeCT) paradigm in our forthcoming study. Specifically, we will explore whether AffeCT improves adolescents' mental health and emotion regulation capacity and whether the magnitude of these improvements differs as a function of age.

The present study
To investigate the potential of AffeCT in adolescents, the current study will include 200 adolescents (11-19 years). Including this age range will allow us to investigate the potential age-related differences in the effectiveness of training, as shown in studies using cool cognitive training paradigms 34 .
The training that will be used in the current study is a variant of a paradigm we applied successfully in adults and a preliminary study in adolescents with posttraumatic stress disorder 21,35,36 . The impact of the AffeCT will be assessed in a proof-of-principle double-blind randomized controlled trial. In the AffeCT, participants will be presented with visuospatial, auditory and dual (combined visuospatial and auditory) versions of the n-back task. The three versions will, respectively, require participants to continuously update faces or words or both. On the first three days of the 14-day training programme, participants will train on one version each day. On days 4-14, participants are free to select any or all of the training versions. However, both training groups will be provided with a rationale suggesting that training on version C is likely to confer more benefits than the other two versions. By providing a rationale for one training being associated with superior benefits compared to others, participants should be motivated to select C over the other two training versions. However, C is more cognitively demanding, therefore requiring self-regulation to engage in this task over the others for potential future benefits. We hypothesise that opting to engage in a more challenging but potentially more beneficial task is an index of self-regulation, which has been shown to be associated with mental health across the lifespan 37 . This will allow us to explore the role of self-regulation in cognitive training and any effects on mental health outcomes.
The tasks train affective control by requiring effective engagement and disengagement with affective information depending on task-demands. Affective valence is introduced to the training by including valenced stimuli. Specifically, the AffeCT will include 20% neutral, 20% positive and 60% negative stimuli. The rationale for including stimuli of different valences is that mental health problems can be characterised by difficulties disengaging from negative material 38 , avoidance of negative (e.g., threatening) information 39 or aberrant processing of positive information 40 . Moreover, we have recently shown in a meta-analysis that, in individuals with mental health problems, affective control, measured with working memory tasks such as the training task, is similar across positive and negative valence 7 .
The effectiveness of the AffeCT will be compared with an active placebo training (P-Training), which includes three versions: shapes, words and faces from a feature match task. The P-Training requires participants to indicate whether the items presented in two panels are matched or mismatched. The training was designed to be minimally demanding on cognitive control, while exposing participants to the same stimuli as the AffeCT.

Amendments from Version 1
We have addressed the reviewers' comments providing more methodological detail.
Any further responses from the reviewers can be found at the end of the article.

To investigate potential benefits of the AffeCT on mental health or emotion regulation, participants will complete self-report measures immediately after training, after one month, and after one year. Additionally, the three facets of affective control (inhibiting attention and responses toward goal-irrelevant affective information; updating affective information or updating information in the context of affective distraction; and shifting flexibly between affective and non-affective task demands) will be assessed using experimental paradigms. Inhibition will be assessed with a modified version of Preston and Stansfield's 41 affective Stroop task, which requires participants to categorize adjectives as either happy or sad. The adjectives are superimposed over task-irrelevant faces that are either congruent, incongruent or neutral (i.e., scrambled). Updating will be assessed with the affective digit backward span task. In this task digits are serially presented over either a neutral or affective background image and then recalled in reverse order (modified version of standard digit span task; 42). Finally, shifting will be assessed with an affective card sorting task, which requires participants to flexibly switch between affective and neutral sorting rules 12 .
This study will allow us to investigate the following four hypotheses:
1. Affective control can be improved in adolescents (affective control training hypothesis). To investigate this hypothesis, we will compare individuals' performances on the affective n-back task across the two training groups.
2. AffeCT compared to P-Training will lead to greater improvements in all facets of affective control as measured by non-trained affective control tasks, including affective inhibition, updating and shifting tasks (affective control facets hypothesis).
3. The benefits of AffeCT will vary as a function of age (age-related change hypothesis).
4. Increases in affective control from pre- to post-training will be associated with fewer self-reported mental health problems and emotion regulation difficulties, as well as higher levels of self-reported self-control, at each assessment time point (mental health hypothesis).

Study setting
The study will be run in schools in London, Cambridge and surrounding areas, and at the UCL Institute of Cognitive Neuroscience, UK.

Participants
In total, 200 adolescents (~50% female, 11-19 years) will be recruited through schools, advertisements on the lab website, the MQ research portal, the Anna Freud Centre "Schools in Mind" website, and social media. Recruitment will be stratified by age to ensure a proportional representation of each chronological year group. Including 200 participants (100 per training group) results in ≥ 93% power to detect an effect on our first hypothesis that affective control can be trained in adolescence. Power was established with time as within-subjects factor, training group as fixed factor and participants as a random factor. The effect size for the interaction was estimated as small to medium, d = .40, based on our previous training studies in adolescents 36 and adults 21,35 . The calculator used was https://jakewestfall.shinyapps.io/pangea/.
Eligibility criteria. To be included participants will have to be between 11-19 years old and speak English fluently. Participants will be excluded from the study if they have a history of traumatic head injury, a diagnosed neurological or neurodevelopmental disorder, or if they are currently enrolled in another cognitive training intervention.
Allocation procedure. Included participants will be randomized to either the AffeCT or the P-Training groups. Condition allocation will be concealed from experimental staff by using computer-generated condition assignment (using Sealed Envelope simple randomisation service) stratified by age (young adolescents 11-14 years and mid-late adolescents 15-19 years; in line with: 12). Allocation will be based on a blocked randomization sequence with randomly mixed block sizes (2-6), which prevents the experimenter from deducing any potential sequencing even with awareness of the randomization type 43 . One experimenter (SS) will conduct only the pre-training assessments and will not be involved in any further participant testing, because they will answer participants' queries about the training and any technical issues that arise during it.
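As an illustration only, the allocation scheme described above (age-stratified, blocked randomization with randomly mixed block sizes) could be sketched as follows. The trial itself relies on the Sealed Envelope service, so this is not the procedure that will be used. The even block sizes (2, 4, 6), the seed, and the figure of 100 participants per stratum are assumptions made for the example.

```python
import random

ARMS = ("AffeCT", "P-Training")
STRATA = ("young (11-14)", "mid-late (15-19)")


def age_stratum(age):
    """Map age in years to one of the two age strata named in the protocol."""
    if 11 <= age <= 14:
        return STRATA[0]
    if 15 <= age <= 19:
        return STRATA[1]
    raise ValueError("age outside the 11-19 eligibility range")


def blocked_sequence(n_participants, rng, block_sizes=(2, 4, 6)):
    """Build an allocation list from balanced blocks of randomly chosen size.

    Assumption: only even block sizes are used so that both arms appear
    equally often within every block.
    """
    sequence = []
    while len(sequence) < n_participants:
        size = rng.choice(block_sizes)      # randomly mixed block size
        block = list(ARMS) * (size // 2)    # equal number of each arm
        rng.shuffle(block)                  # random order within the block
        sequence.extend(block)
    return sequence[:n_participants]


rng = random.Random(2018)  # hypothetical seed, for reproducibility of the sketch
# Hypothetical: one independent sequence of 100 allocations per stratum.
sequences = {stratum: blocked_sequence(100, rng) for stratum in STRATA}
```

Because each block is balanced, the two arms can differ in size within a stratum only by the imbalance of a truncated final block.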
Blinding. For blinding procedures see extended data 44 . The procedures are uploaded to the trial registration page as a timestamped private document and will be made available online upon study completion. Following the final participant's follow-up assessment (one year after the second testing session, T2), all participants will receive an email describing the study purpose and giving them access to all training tasks for 12 months.

Training procedure and timeline
Participants will complete a pre-training assessment, followed by 14 days of training within a four-week period. The training will be completed individually by the participants on their own devices (any device that supports mobile apps). Within one week of the end of the four-week period, participants will complete the post-training assessment. Any deviations from the per-protocol timeline due to the constraints of school-based testing will be accounted for in our analyses (see Statistical analyses section). Thirty days after the post-training assessment, participants will be asked to complete an online follow-up assessment. A final follow-up assessment will be completed one year after the post-training session. For a schematic overview of the study timeline see Figure 1.
Training phase. During the training phase, participants from both training groups will be presented with three different training tasks (see below for descriptions). On the first three days they will complete a different version of the training task each day. The presentation order of the three versions will be fixed for days 1-3. The P-Training group will complete the shapes (A), words (B), and faces (C) versions on the first, second, and third day of training, respectively. The AffeCT group will complete the visuospatial (A), auditory (B), and dual (i.e., including both modalities; C) versions of the training task on the first, second, and third day of training, respectively. From the fourth day of training onward participants in both groups will be free to select any of the three versions of their training task.
At the beginning of the training, both groups will be told that they should spend as much time as possible training on version C due to its benefits to attention, memory and emotion regulation. Version C in the AffeCT will be significantly more cognitively demanding 45 than A and B, whereas there are no differences in cognitive demands between versions A, B or C in the P-Training. Emulating the design of established measures of academic diligence and self-regulation 46-48 , the ratio of time spent training on version C relative to A and B will be taken as a behavioural index of self-regulation.
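The behavioural index of self-regulation described above can be illustrated with a minimal sketch. The protocol does not give an exact formula, so the ratio used here (minutes on version C divided by the combined minutes on A and B) and the function name are assumptions.

```python
def self_regulation_index(minutes_by_version):
    """Ratio of training time on version C relative to versions A and B.

    Assumed formula: minutes on C / (minutes on A + minutes on B).
    Returns None when no time was spent on A or B, because the ratio
    is undefined in that case.
    """
    time_c = minutes_by_version.get("C", 0.0)
    time_ab = minutes_by_version.get("A", 0.0) + minutes_by_version.get("B", 0.0)
    if time_ab == 0:
        return None
    return time_c / time_ab


# Hypothetical participant: 30 min on A, 30 min on B, 120 min on C.
example_index = self_regulation_index({"A": 30, "B": 30, "C": 120})
```

Under this assumed definition, values above 1 indicate more time on version C than on A and B combined.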

Procedure on each training day.
On each training day tasks will be populated with a different set of stimuli. In both training groups participants will be given the option to end the training any time from 10 mins onward. The full training session will take between 20-30 mins depending on the level achieved. There will be no limit on the number of training sessions they can complete during a day. Training sessions that are less than 10 mins will not be considered as full training sessions, and will not be included in the analyses, nor will participants be compensated for these sessions.
Each time they start the training, participants will be asked four brief questions about their mood, affect regulatory intentions, social context and current activity. To assess current mood, participants will be asked, "How happy do you feel right now?". They will provide their mood rating by moving the cursor on a visual analogue scale ranging from "Very unhappy" to "Very happy". Affect regulatory intentions will be assessed with the question, "Are you trying to change the way you feel right now?". They will be offered nine answer options from a dropdown menu. Participants will be able to select "No." or "Yes, by …" followed by different types of regulatory strategies (i.e., distraction, problem-solving, behavioural activation, reappraisal, avoidance, social support, acceptance or other). Social context will be assessed by selecting from a dropdown menu to indicate whether right now they are: "Alone", "With others (friends/family)", or "With others (strangers)." Finally, participants will indicate their current activity from a selection of eleven options on a dropdown menu (e.g., commuting, school/work).
Adherence/retention. Participants will be compensated for each section of the study to incentivise enrolment and study completion. They will be paid £10 for each of the pre- and post-training assessments (T1 & T2) and £5 for both the online follow-up assessments (T3 & T4). Participants will additionally receive £2 per completed training day. If participants complete two or more sessions on a single day they will be paid £5.
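The training payments can be made concrete with a short sketch. It assumes that the £5 for two or more sessions is the total for that day (rather than an addition to the £2), and it applies the rule stated earlier that sessions shorter than 10 mins do not count as full sessions. The function and variable names are hypothetical.

```python
MIN_FULL_SESSION_MINUTES = 10  # sessions shorter than this are not counted


def training_payment(session_minutes_by_day):
    """Total training compensation in pounds across all training days.

    Assumed reading of the protocol: one full session on a day earns 2,
    two or more full sessions on the same day earn 5 in total.
    """
    total = 0
    for sessions in session_minutes_by_day.values():
        full_sessions = sum(1 for m in sessions if m >= MIN_FULL_SESSION_MINUTES)
        if full_sessions >= 2:
            total += 5
        elif full_sessions == 1:
            total += 2
    return total


# Hypothetical log: day 1 has one full session, day 2 has two,
# day 3 has only an 8-minute session that does not count.
example_total = training_payment({"day1": [25], "day2": [12, 20], "day3": [8]})
```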
Training. In addition to payments, retention will be optimised by sending participants a daily training reminder at 8am. Participants who have not completed at least 10 mins of training by 5pm will be sent an additional reminder, informing them that they have a training session waiting for them. A final reminder will be sent at 8pm for any participants who have not completed their minimum training requirement by then.
Follow-up. Two weeks after the initial request to complete the follow-up assessments, email reminders will be sent to incentivise follow-up completion. Reminder emails will be sent at weekly intervals, until the follow-up assessments are completed or until the maximum number of reminders (i.e., three) has been sent, whichever comes first.

Training tasks
Affective control training. The three versions of the AffeCT tasks are described below and depicted in Figure 2.
Figure 2. Affective control training tasks. The figure depicts sample trials for each of the three training tasks: A) visuospatial n-back, B) auditory n-back, and C) dual n-back task. Trials depicted with a light blue background require a "No Match" button press, whereas yellow backgrounds indicate "Match" (i.e., target) trials in the respective modality. The border provides feedback to participants: a green border indicates that the response was correct, whereas a red border appears for incorrect trials. Feedback is provided after each response or when a trial times out. The example block in Figure 2 is depicted for n = 1. Match trials for the visuospatial n-back training task are trials where the current face is presented in the same location as the face n positions back. For auditory n-back match trials, the same word is presented as the one n trials back. The dual n-back training task includes both modalities and both types of target trials (for additional buttons appearing on screen with the dual n-back see the task description below). 2500 ms = the maximal time between onset of one stimulus and the next (i.e., total trial time; duration is self-paced up to 2500 ms); 500 ms = face presentation time; 150 ms = feedback presentation time; 500-950 ms = word presentation time; 20 + n = number of trials per block.

Training progression. The first three days will start at n = 1. From day 4 onward participants will select one of the versions to train on and training will start at the average level of n-back achieved on the previous training session with the selected version. During each individual training session, the difficulty level will be titrated to each participant's maximum capacity with n increasing by 1 if performance reaches ≥ 70% accuracy and decreasing by 1 when accuracy is ≤ 30%. Accuracy feedback will be provided after each response. A red border will flash up around the grid for false alarms (participant presses Match on a non-target trial) or misses (participant presses No Match on a target trial or fails to provide any response). A green border will flash up for all correctly classified trials.
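The titration rule for the n-back level can be sketched as below. The sketch assumes that accuracy is evaluated once per block and that n never drops below 1; neither detail is stated explicitly in the protocol, and the function names are hypothetical.

```python
def next_n(current_n, block_accuracy, minimum_n=1):
    """Return the n-back level for the next block.

    n increases by 1 at >= 70% accuracy and decreases by 1 at <= 30%
    accuracy; otherwise it stays the same. The floor at minimum_n is
    an assumption of this sketch.
    """
    if block_accuracy >= 0.70:
        return current_n + 1
    if block_accuracy <= 0.30:
        return max(minimum_n, current_n - 1)
    return current_n


def block_length(n):
    """Each block consists of 20 + n trials (see the Figure 2 legend)."""
    return 20 + n
```

For example, a participant training at n = 2 who reaches 75% accuracy would move to n = 3, where a block contains 23 trials.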

Stimuli.
Each of the training versions will include 20% neutral stimuli and 80% affective stimuli to train the flexible engagement and disengagement from affective information. 30% of the trials will constitute target trials. The words included in the dual and auditory versions of the AffeCT are derived from the Affective Norms for English Words database 49 and with the exception of the positive words were included in previous versions of this training task 21,35 . Positive stimuli are included in the current training task because of the salience of positive material in adolescence 50 , as well as research showing a critical role of reward processing in the onset of mental health problems 51,52 . The words are 20% neutral, 30% positive and 50% negative.
The faces stimuli were selected from several different databases, which are licenced for use online, to provide a diverse stimulus set in terms of demographics and emotional expressions.
Visuospatial n-back. In the visuospatial n-back task faces appear for 500 ms on a 4×4 grid. The task requires participants to indicate within 2.5 s whether the face they are seeing in the current trial is presented in the same location as the face presented n trials back. Responses are provided via "No Match" or "Match" button press.
Auditory n-back. In the auditory version of the training task participants are presented with words over headphones. On each trial they have 2.5 s to indicate via button press (see visuospatial n-back), whether the word presented in the current trial is the same as they heard n trials back.
Dual n-back. The dual version of the task presents participants with the visuospatial and auditory n-back simultaneously. The task requires participants to indicate whether the location in which the face is appearing on the current trial is the same as the location in which a face appeared n trials back. At the same time, they indicate whether the word they are hearing on the current trial is the same as the word n trials back. The response options include four buttons: "No Match" for non-target trials, "Location Match", for trials including only a visuospatial target, "Word Match", for trials including only an auditory target, and "Both Match" for trials including both an auditory and visuospatial target. One third of the target trials are visuospatial targets, one third auditory targets, and one third dual targets.
Placebo training. The P-Training task requires participants to indicate via button press ("Match", "No Match") whether two panels display exactly the same stimuli in the same positions on a grid. In the shapes version the stimuli are random geometric shapes. The faces and words versions include the same stimuli as the AffeCT. Each trial is self-timed up to a maximum of 90 s after which participants are asked to respond more quickly. The initial trial includes 5 items per panel, the number of items per panel increases with participant's performance.
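For the dual n-back task described above, the mapping from the two match types to the four response buttons can be illustrated with a short sketch. Grid locations are represented as integers and the words are placeholders, not items from the actual stimulus set. Treating the first n trials as non-targets is an assumption.

```python
def expected_response(locations, words, trial, n):
    """Correct button for a dual n-back trial (0-based trial index)."""
    if trial < n:
        # Assumption: no stimulus exists n trials back, so no match is possible.
        return "No Match"
    location_match = locations[trial] == locations[trial - n]
    word_match = words[trial] == words[trial - n]
    if location_match and word_match:
        return "Both Match"
    if location_match:
        return "Location Match"
    if word_match:
        return "Word Match"
    return "No Match"


# Hypothetical 1-back sequence: grid cells 0-15 and placeholder words.
locations = [3, 7, 7, 2, 2]
words = ["word_a", "word_b", "word_a", "word_a", "word_a"]
responses = [expected_response(locations, words, t, n=1) for t in range(5)]
```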

Pre- and post-training session assessments
For an overview of all measures that will be included in T1-T4 see Figure 3.
Demographics. Self-identified gender, ethnicity and parental education level will be assessed. Parental education will be included as a proxy measure for socioeconomic status (SES). Parental education has been shown to be a robust indicator of SES 58 and has been previously used by our group in similar samples (e.g., 59).
Pubertal development. Pubertal development will be assessed with the well-validated, self-report Pubertal Development Scale 60 . The scale will be sent to participants via email link, so that they can complete it in private at home.

Self-reported mood and mental health.
All self-report measures will be administered on a computer screen.
Positive and negative affect. To assess current positive and negative affect participants will complete the state version of the Positive and Negative Affect Schedule 61 . The scale requires individuals to rate the extent to which 10 positive and 10 negative adjectives describe them. We will ask participants to rate the adjectives with respect to how well they describe them over the past week.
Mental health difficulties. Mental health problems will be assessed with the Strengths and Difficulties questionnaire. The questionnaire is a 25-item self-report measure, which is divided into five subscales 62,63 . Four of the subscales measure difficulties and one subscale measures a strength, prosocial behaviour. The difficulties subscales assess emotional symptoms (internalizing symptoms), conduct problems, hyperactivity/inattention and peer relationship problems. The measure has been shown to have good psychometric properties in the age group that will be recruited for the current study (Cronbach's α of 0.80), as well as good sensitivity, specificity and prospective utility 64-66 .
Emotion regulation. The 36-item Difficulties in Emotion Regulation Scale (DERS) will be administered to assess emotion regulation 67 . The DERS has six subscales that measure: nonacceptance, the propensity to experience secondary negative emotions in response to negative emotions; goals, difficulties engaging with goal-directed behaviours when upset; impulse, the ability to control one's behaviour when experiencing negative emotions; awareness, the tendency to attend to emotions; strategies, individuals' perception that emotions cannot be controlled; and clarity, individuals' ability to correctly identify their emotions 67 . The scale has shown high internal consistency (Cronbach's α = 0.93 67 ) and has been reliably used in the age range included in the current study 68 .
Self-regulation. The Brief Version 69 of the Self-Control Scale 70 will be administered to measure self-regulation. The scale has 13 items and has shown good internal consistency 69,71 .

Cognitive-affective task performance.
Three tasks will be included to assess the impact of AffeCT relative to P-Training on the different facets of affective control: inhibition, updating and set-shifting.
Inhibition. Inhibition of affective interference will be assessed using an affective Stroop task 41 . In this modified version of the task participants indicate whether adjectives are happy or sad. The words are superimposed on the image of a face, resulting in three trial types: congruent (emotions of the word and face are matched), incongruent (emotions of the word and face are mismatched) or neutral (the word is superimposed on scrambled face image). This modified version of the task includes only happy and sad as emotion categories, whereas the original task also included words and faces expressing anger. The current version also includes only four words per emotion category and the faces are from two adult actors and two child/ adolescent actors (50% female). These modifications were made to adapt the difficulty level of the task for younger participants and to make the stimuli age-appropriate. The face stimuli are derived from the same face databases as the training stimuli.
Feedback is provided after each trial with a red or green border appearing around the image for 200 ms, indicating an error or correct response, respectively. Trials are self-paced up to 4 s. If no response is detected a red border appears and the next trial is presented. There are 96 trials in total with each actor being paired with each of the eight adjectives in each condition.
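The trial count of the affective Stroop task follows from fully crossing actors, adjectives and conditions: 4 actors × 8 adjectives × 3 conditions = 96 trials. A minimal sketch of this enumeration is given below; the actor labels and adjectives are placeholders, since the actual stimuli are not listed here.

```python
from itertools import product

# Placeholder identifiers: two adult and two child/adolescent actors.
ACTORS = ("adult_1", "adult_2", "youth_1", "youth_2")
# Four adjectives per emotion category (placeholder names).
ADJECTIVES = [("happy", f"happy_word_{i}") for i in range(1, 5)] + [
    ("sad", f"sad_word_{i}") for i in range(1, 5)
]
CONDITIONS = ("congruent", "incongruent", "neutral")


def face_type(word_emotion, condition):
    """Face shown behind the word for a given condition."""
    if condition == "neutral":
        return "scrambled"
    if condition == "congruent":
        return word_emotion
    return "sad" if word_emotion == "happy" else "happy"


trials = [
    {
        "actor": actor,
        "word": word,
        "word_emotion": emotion,
        "condition": condition,
        "face": face_type(emotion, condition),
    }
    for actor, (emotion, word), condition in product(ACTORS, ADJECTIVES, CONDITIONS)
]
```

In practice the trial order would also be randomized, which this sketch leaves out.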
Shifting. The capacity to shift flexibly between task-demands will be assessed using an affective set-shifting task. The task is an affective version of the Madrid Card Sorting Task 72 . Participants are dealt a card, which they are asked to assign to one of four decks according to three possible sorting rules: card color, number of items and shape (neutral version) or emotional expression (affective version). Sorting rules switch randomly after 6 to 9 trials (on average after 8 trials). Each rule is presented twice in each of the neutral and affective versions, resulting in 96 trials. Participants are required to respond within 30 s, after which the trial is recorded as an error. The presentation order of the affective and neutral versions is randomized. Performance on the task is operationalized as random errors. These are errors that occur on any trial in the series after the initial two trials (needed to establish the correct sorting rule). Random errors are most reliably associated with mental health outcomes in adolescents on this version of the task 12 .
Updating. Updating will be assessed with an affective backward digit span task, where participants are presented with digits (1500 ms) in serial order. The task starts with two digits per trial. Following the final digit in each trial, a keypad appears and participants are required to enter the digits in reverse order. Each span level is presented twice on this task. At least one out of two correct trials per span level is needed for progression to the next level. If both trials are incorrect the task is terminated. To manipulate valence, the digits are presented over negative and neutral background images. The images are from the Geneva Affective Picture Database 73 .
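The progression rule of the backward digit span task can be sketched as follows. The protocol does not define the span score or a maximum span, so scoring by the highest level with at least one correct trial and the cap of nine digits are assumptions; the callback `is_correct` is a hypothetical stand-in for a participant's response on a given trial.

```python
def expected_answer(digits):
    """Digits must be entered in reverse order of presentation."""
    return list(reversed(digits))


def backward_span(is_correct, start_span=2, max_span=9):
    """Run the span procedure and return the highest level passed.

    is_correct(span, trial) -> bool reports whether trial 0 or 1 at the
    given span level was answered correctly. Each level is presented
    twice; one correct trial suffices to progress, and two incorrect
    trials end the task. max_span is an assumed upper bound.
    """
    highest_passed = 0
    span = start_span
    while span <= max_span:
        results = [is_correct(span, trial) for trial in (0, 1)]
        if not any(results):
            break  # both trials incorrect: task is terminated
        highest_passed = span
        span += 1
    return highest_passed
```

For example, a simulated participant who answers correctly up to five digits, `backward_span(lambda span, trial: span <= 5)`, passes level 5 and stops at level 6.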
Fluid intelligence. The 12-item version of the Raven's Advanced Progressive Matrices will be used to assess intelligence 74 . Participants will be told that they should complete the task as quickly as possible. The measure has good psychometric properties 75 . We chose the Raven's Advanced Progressive Matrices because it is sensitive in the wide age range included in the current study.
Benchmark training tasks. The benchmark versions of the training tasks will be identical to the training versions with a few exceptions noted below.
Visuospatial n-back tasks. The benchmark version of AffeCT is identical to the visuospatial n-back that will be used in the AffeCT with the exception that in the benchmarking version of the task only four blocks will be presented. Two of the blocks will include faces and two blocks will include scrambled faces.
Placebo task. For the benchmark version of the P-Training we will present the faces version of the feature match task. The version is identical to the training version with the exception that it is only 90 s long.

Data management
Following study completion all data will be linked-anonymized, with the linking documents being kept on separate encrypted drives. Fully anonymized data will be made open access through managed open access following the publication of our findings. That is, any researcher will be provided with our data if they consent to adhere to the General Data Protection Regulation and the British Psychological Society's Code of Ethics and Conduct.
Our consent procedure will inform participants of these data storage and sharing procedures.

Statistical analyses
Statistical analyses will be performed using R 76 . Prior to all hypotheses testing, the two training groups will be compared on age using a Bayesian t-test to ensure that stratification was successful. The groups will then be compared on the following potential confounds, using non-parametric Chi-square tests for binary variables and general linear modelling for continuous variables: gender, parental education, pubertal stage, intelligence, time interval between pre- and post-training (days), testing location, and the ratio of testing group size to experimenters. Any variables showing significant group differences at baseline will be added to all subsequent group comparisons as covariates.
Next, we will explore the structure of our outcome measures of interest at baseline. Specifically, we will explore the structure of affective and cognitive functioning using structural equation modelling (SEM). We hypothesize that cognitive control is best modelled using separate factors for affective, versus neutral, item content, such that the model in Figure 4 will outperform a single-factor model.
To investigate the first hypothesis (affective control training hypothesis) we will use mixed effects modelling with training group as fixed effect and time as within-subject effect. The outcome of interest will be d' achieved on the affective n-back task. Additionally, we will explore the impact of training on reaction time as a secondary outcome of interest. For the secondary analyses to be considered significant, we will apply a Bonferroni correction for two comparisons (accuracy and reaction time), reducing the threshold for statistical significance to α = .025. We will then explore whether any effects of group and time are moderated by total training time (mins), total number of individual training sessions and ratio of time spent training on training task C relative to A and B. We plan to include these overlapping measures separately as they arguably provide different types of information. Specifically, total training time will allow us to explore dose-response relationships. The other two measures, we propose, index more motivational factors such as diligence and/or motivation by the monetary incentive. To facilitate interpretation of any potential moderating effect, we will apply a SEM trees approach to these moderators and enter the relevant groupings as moderators. We will additionally investigate the effect of time spent on each training version separately on the outcome of interest and compare the effect of time spent on the single versions versus time spent on the dual version.
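For readers unfamiliar with the sensitivity index, the following minimal Python sketch shows how d' can be computed from n-back response counts, and how a Bonferroni-corrected threshold for two comparisons is obtained. The protocol does not specify how extreme hit or false-alarm rates will be handled; the log-linear correction used here, the function names and the example counts are assumptions for illustration only.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each cell) keeps both
    rates away from 0 and 1, where the z-transform is undefined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons


# Made-up counts for one participant on one n-back block.
print(round(d_prime(hits=18, misses=2, false_alarms=3, correct_rejections=27), 2))
# Two outcomes (accuracy and reaction time) tested at an overall alpha of .05.
print(bonferroni_alpha(0.05, 2))
```

A participant who responds at chance (equal hit and false-alarm rates) obtains d' = 0 under this definition; larger values indicate better discrimination of targets from non-targets.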
Our second hypothesis (affective control facets hypothesis), that AffeCT will improve inhibition, updating and shifting, will be tested with a multivariate mixed effects model. Time and group will be included as fixed effects and the three measures of affective control as outcomes of interest (i.e., working memory updating, inhibition, and set-shifting). The primary outcome of interest is accuracy, as this has been shown to be more sensitive than reaction time in dissociating between individuals with and without mental health problems 7 . As with the first hypothesis, we will investigate whether any training benefits are moderated by total training time (mins), total number of individual training sessions or time spent training on task C relative to A and B. The analyses will be repeated with reaction time as a secondary outcome of interest.
Our age-related change hypothesis will be tested by including age as a moderator in the multivariate mixed effects model used to test the affective control training hypothesis. The potentially moderating effect of age will be investigated using SEM trees, which identify age groupings that differ in the benefits conferred by training. While the exploration is data-driven, it is theoretically informed by the literature showing differential effects of cognitive training on younger compared to older adolescents 34 . A second exploratory analysis will include age as a continuous variable in the same model to investigate any linear or polynomial effects of age.
Fourth, to test the mental health hypothesis, we will use latent change score models, a subclass of SEM approaches 77 , which naturally allow for the integration of predictors of rates of change (e.g. improvements in mental health). Specifically, we will investigate whether pre- to post-training changes in the affective control factor established at baseline are associated with fewer self-reported mental health problems and emotion regulation difficulties, as well as higher levels of self-reported self-control at each assessment time point. The primary analyses will include the post-training and one-month follow-up assessment. Secondary analyses will include the one-year follow-up.
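The three training-dose moderators named in the analyses can be derived from a per-participant session log. The following Python sketch is one possible way to do this, under stated assumptions: the log format (one `(version, minutes)` pair per session), the function name and the definition of the ratio as minutes on version C divided by minutes on versions A and B combined are all hypothetical. The exclusion of sessions shorter than 10 minutes follows the rule described in the training procedure.

```python
MIN_SESSION_MINUTES = 10  # sessions shorter than this are not counted


def training_moderators(sessions):
    """Summarise one participant's training log.

    `sessions` is a list of (version, minutes) pairs with version in
    {"A", "B", "C"}. This log format is assumed for illustration.
    """
    counted = [(v, m) for v, m in sessions if m >= MIN_SESSION_MINUTES]

    minutes = {"A": 0.0, "B": 0.0, "C": 0.0}
    for version, mins in counted:
        minutes[version] += mins

    single_task_minutes = minutes["A"] + minutes["B"]
    return {
        "total_minutes": sum(minutes.values()),
        "n_sessions": len(counted),
        # Left undefined (None) when no counted time was spent on A or B.
        "c_to_ab_ratio": (
            minutes["C"] / single_task_minutes if single_task_minutes > 0 else None
        ),
    }


# Made-up log: the 9-minute session on version B is dropped.
log = [("A", 12.0), ("B", 9.0), ("C", 20.0), ("C", 25.0)]
print(training_moderators(log))
```

How participants who train exclusively on version C should be represented is a design choice that this sketch leaves open by returning `None` for the ratio.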

Research ethics approval
Ethical approval for this study has been conferred by the University College London (UCL) Research Ethics Committee on 23 April 2018; Project ID:12753/002. Any protocol amendments will be submitted to the UCL Research Ethics Committee Chair for approval and recorded on the Open Science Framework pre-registration documentation.

Consent and assent procedures
Assent and consent will be obtained from prospective participants and their legal guardians, respectively. For participants under 18 years, study information and consent forms will first be sent to the parents. Parents will then have the opportunity to read the information and contact the research team with any potential questions. Children of parents who provide consent will be asked to provide written informed assent. Participants aged 18 years and over will be asked to provide written informed consent. When participants are tested at the pre- and post-training assessments they will be reminded that they can withdraw consent at any time before, during or after the study without any consequence and that they will be compensated for any part of the study completed until withdrawal.

Availability of data
Consent from participants will be obtained to share data through managed access. Researchers wishing to access the data need to consent to storing and analyzing the data in line with the General Data Protection Regulation and the British Psychological Society's Code of Ethics and Conduct.

Dissemination policy
The study has a multi-component dissemination policy: academic, stakeholders and public.
Academic. Standard academic dissemination of the study results will be sought through journal publications. Findings will also be communicated at scientific conferences and, where permitted by journal regulations, published on pre-print archives.
Stakeholders. Findings will be communicated via email to all research participants in a newsletter-style communication. The main trial outcome paper will also be submitted as a Frontiers for Young Minds article and, if accepted, sent to all participants.
We will further present the findings during a school talk in any of the participating schools that are interested in this option.
Public. The findings will also be communicated to the public by presenting them at public talks as well as through social media and, if interest can be generated, conventional broadcast or print media.

Trial status
The trial data collection started 21 September 2018 and the funding end date for this trial is 08 January 2022. Pre-training assessments have been completed in 64 participants, but none of these participants have completed training or any post-training assessments. This is protocol version 1 (30 November 2018).

Conclusions
If, as predicted, the proposed study finds that AffeCT successfully improves affective control in adolescents, there would be significant potential benefits to adolescent mental health. As a free app, the training would also be scalable and easy to disseminate across a wide range of settings.

Underlying data
No data are associated with this article.

disorders in adolescents. An important advantage of this study is that it allows to explore long-term transfer effects. The manuscript is well-written, provides a clear overview of the design of the study and a transparent description of the to be conducted analyses. As such, I would like to compliment the researchers for the Open Science approach and fully support/endorse this timely study. However, some questions remain (at least partially) unaddressed in the current manuscript: The authors allow the participants to decide on which training task they complete following session three while suggesting them to conduct the dual n-back version. The rationale for this is not fully clear to me. In addition, this is likely to yield multiple training trajectories reflecting different difficulty levels (e.g., mixed single vs. double back tasks). This may form a confounding factor for some of the presented analyses. In addition, it complicates evaluation of training progress over time.
Related to this, the authors state that "opting to engage in a more challenging, but potentially more beneficial task is an index of self-regulation". From a developmental perspective the authors target a relatively wide age span (i.e., 11 -19 years), inherently resulting in a sample showing strong heterogeneity in executive functions. As such, multiple other factors are likely to drive this choice (e.g., difficulty of the task, [reduced] maturation of executive control regions [due to a history of internalizing psychopathology], age appropriateness and attractiveness of the selected training procedures for the age group, etc., each of which may interact with other motivational factors). In addition, it would be interesting to also explicitly assess and model factors such as user engagement and task motivation.
Moreover, the introduction lacks a clear rationale for the age-related change hypothesis.
Although mixed yet encouraging findings have been presented regarding emotional transfer effects of cognitive control interventions, establishing cognitive transfer has been more challenging. In particular, cognitive transfer effects are often found to be task-specific. In a recent meta-analysis, n-back tasks have been classified as indicators of updating ability (Zetsche et al., 2018). To what extent can cognitive transfer effects be expected for other executive control functions such as inhibition and shifting ability, and how does this relate to emotional transfer effects?
Throughout the training procedure, the authors reinforce participants to conduct multiple sessions per day. What is the rationale for this, what do the authors consider as the 'optimal dose' for this type of cognitive control training and sample?
Developing a placebo/control training task is a challenging endeavor. However, the presented "placebo training" seems to be a non-adaptive task and as such does not allow to fully account for motivational effects of undergoing training given that the training procedure is adaptive. The authors could potentially control for this by adding a measure such as the Credibility and Expectancy Questionnaire at baseline and following training. Sadly, data-collection has already commenced for this study, which limits the modifications that can be made to the design to take into account the concerns raised by the reviewers of the presented protocol.
Nonetheless, I very much look forward to the results of this interesting study.

References
1. Zetsche U, Bürkner PC, Schulze L: Shedding light on the association between repetitive negative thinking and deficits in cognitive control - A meta-analysis. Clin Psychol Rev. 2018; 63: 56-65.

Is the rationale for, and objectives of, the study clearly described?

Are sufficient details of the methods provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Not applicable

Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Depression, cognitive vulnerability, cognitive control training, emotion regulation.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 23 Sep 2019
Susanne Schweizer, University College London, London, UK
Dear Dr Hoorelbeke,
Thank you for your helpful feedback on our protocol. Please find our responses below to the issues you raised in your report. Your original comments are printed in bold and our responses are copied below. Where appropriate we copied the amended/added sections from the manuscript in italics.
The authors allow the participants to decide on which training task they complete following session three while suggesting them to conduct the dual n-back version. The rationale for this is not fully clear to me. In addition, this is likely to yield multiple training trajectories reflecting different difficulty levels (e.g., mixed single vs. double back tasks). This may form a confounding factor for some of the presented analyses. In addition, it complicates evaluation of training progress over time.
Response: Please see our response to Dr Cohen.
Related to this, the authors state that "opting to engage in a more challenging, but potentially more beneficial task is an index of self-regulation". From a developmental perspective the authors target a relatively wide age span (i.e., 11 -19 years), inherently resulting in a sample showing strong heterogeneity in executive functions. As such, multiple other factors are likely to drive this choice (e.g., difficulty of the task, [reduced] maturation of executive control regions [due to a history of internalizing psychopathology], age appropriateness and attractiveness of the selected training procedures for the age group, etc., each of which may interact with other motivational factors). In addition, it would be interesting to also explicitly assess and model factors such as user engagement and task motivation.
Response: We agree with the proposed age-related mechanisms, which is why we are explicitly investigating the moderating role of age on training effectiveness. Unfortunately, we do not have metrics of task motivation and user engagement other than time spent training, which is why we cannot model these factors as proposed by the reviewer.

Moreover, the introduction lacks a clear rationale for the age-related change hypothesis.
Response: Please see our response to Dr Cohen.
Although mixed yet encouraging findings have been presented regarding emotional transfer effects of cognitive control interventions, establishing cognitive transfer has been more challenging. In particular, cognitive transfer effects are often found to be task-specific. In a recent meta-analysis, n-back tasks have been classified as indicators of updating ability (Zetsche et al., 2018). To what extent can cognitive transfer effects be expected for other executive control functions such as inhibition and shifting ability, and how does this relate to emotional transfer effects?
Response: This is an important question and one which has not been investigated in the context of affective control, which is why we are exploring the effectiveness of transfer to other facets of affective control. In a previous study we have shown transfer to affective inhibition as measured by the included Stroop task (see our response to Dr Cohen). The effects of the training on affective shifting, however, remain unexplored.

Throughout the training procedure, the authors reinforce participants to conduct multiple sessions per day. What is the rationale for this, what do the authors consider as the 'optimal dose' for this type of cognitive control training and sample?
Response: Given the relative lack of work in this age group, an empirically driven "recommended dose" cannot be determined. As noted in our response to Dr Cohen, the rationale for including the option to engage in multiple shorter training sessions is to maximize potential engagement of this age group with the training.
Developing a placebo/control training task is a challenging endeavor. However, the presented "placebo training" seems to be a non-adaptive task and as such does not allow to fully account for motivational effects of undergoing training given that the training procedure is adaptive. The authors could potentially control for this by adding a measure such as the Credibility and Expectancy Questionnaire at baseline and following training.
Response: We apologize for omitting this information in the previous version. The placebo training is also adaptive and we now explicitly state this in the training description:

The proposed protocol outlines a double-blind and app-based training study seeking to improve affective cognitive control in adolescents (11-19 years). Training will consist of online practice on an adaptive and affective version of the n-back task, with participants allowed to choose between audio, visual and a dual audio-visual version across 11 days of a 14-day training period. In addition to the training task participants will complete a battery of mood and mental health questionnaires, as well as a battery of cognitive-affective tasks before and after training. The effects of training on changes in mood and mental health and task performance will be compared to a placebo control group completing one of three visual matching tasks consisting of shapes, words or faces. The authors will be testing four hypotheses predicting training-related change in affective control on the n-back task, improvement on cognitive-affective tasks relative to the control group, that the benefits of training will decrease with age and that increased training effects on affective control will be associated with lower self-reported mood and mental health problems.

Critique:
The protocol is written with a very clear rationale and objectives. The methods are clearly described and seem appropriate for the objectives of the training, however please also include the maximum n-back level training participants can achieve. The study design seems largely appropriate as well, and has several strengths, including the use of an active control group, multiple outcome measures and follow-up periods. However, centrally, the reasoning behind allowing participants to choose which version of the training to take could be more clearly supported. If the research question is to compare the effectiveness of affective control training, why have participants potentially only completed less effective versions of the training (i.e. audio and visual only) versus the dual version? Additionally, given the adaptive nature of the tasks, increased improvement would seem to suggest a degree of self-regulation, so this construct could use a clearer operationalization.

Is the study design appropriate for the research question? Partly

Competing Interests: No competing interests were disclosed.
Reviewer Expertise: My research explores the nature, and remediation of, cognitive dysfunction in depression.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 23 Sep 2019
Susanne Schweizer, University College London, London, UK
Dear Dr Owens,
Thank you for your helpful feedback on our protocol. Please find our responses below to the issues you raised in your report. Your original comments are printed in bold and our responses are copied below. Where appropriate we copied the amended/added sections from the manuscript in italics.

However please also include the maximum n-back level training participants can achieve.
Response: The maximum level of n-back is not capped.
However, centrally, the reasoning behind allowing participants to choose which version of the training to take could be more clearly supported. If the research question is to compare the effectiveness of affective control training, why have participants potentially only completed less effective versions of the training (i.e. audio and visual only) versus the dual version?
Response: Please see our response to Dr Cohen.
Additionally, given the adaptive nature of the tasks, increased improvement would seem to suggest a degree of self-regulation, so this construct could use a clearer operationalization.

Noga Cohen
Department of Special Education, University of Haifa, Haifa, Israel
The Edmond J. Safra Brain Research Center for the Study of Learning Disabilities, University of Haifa, Haifa, Israel
In this report, Schweizer and colleagues propose a protocol for an app-based 14 day training procedure that is predicted to improve affective control, mental health, emotion regulation and self-regulation among youths (11-19 years).
The authors propose an important and timely study, designed to test whether affective control training can be beneficial for youths. The design proposed in this protocol is based on the authors' prior work showing improved emotion regulation following an affective control (working memory) training.
The paper is well written and the proposed study is expected to advance our understanding on the mechanisms involved in emotion regulation and behavioral deficits among youths, as well as open new avenues for treatment. My comments are mainly related to clarification of tasks-related aspects.
What is the rationale for allowing participants to choose the version of the task on days 4-14?
Moreover, by reading the "The present study" paragraph it is not clear whether participants in the control group (placebo training) will receive the same instructions -allowing them to choose and prompting them to prioritize one of the versions over the other two versions. Although this information is mentioned later in the paper, I suggest referring to this issue already in "the present study" paragraph.
The authors plan to include both positive and negative stimuli in the affective control training. Are there studies showing beneficial outcomes for this type of training when positive stimuli are used (besides work on eating and addiction with the go/no-go and stop signal tasks)? Can the authors elaborate a bit more about their decision to include positive stimuli in the training?
Can the authors say something about the similarity/difference of the affective control and placebo training in regard to difficulty? Are accuracy rate and mean reaction time more or less similar in the working memory and perceptual tasks? It can be nice to show that training outcomes are not modulated by task difficulty (therefore strengthening the authors' notion that the outcomes are specific to improvement in affective control).
In line with the previous comment, assessing the links between training outcomes in terms of the cognitive/affective control tasks and the self-report measures can be highly valuable to the understanding of the processes by which the training influence emotion regulation, mood, as well as psychological and behavioral difficulties.
Pre/post changes in inhibition will be assessed using a modified Stroop task. Did the authors consider using a more "classic" inhibition task (e.g., go/no-go, stop signal)? Why Stroop?
In the predictions section the authors write that they plan to test improvement in affective control by using an affective n-back task (which is different from the training task). The authors mention later in the paper that this task will be administered before and after the training. I suggest mentioning this before the "Methods" section because when I read the predictions section it was not clear to me. In addition, why not use the training task itself to assess changes in affective control? Is it because participants choose the version of the task on days 4-14?

Dear Dr Cohen,
Thank you for your helpful feedback on our protocol. Please find our responses below to the issues you raised in your report. Your original comments are printed in bold and our responses are copied below. Where appropriate we copied the amended/added sections from the manuscript in italics.
What is the rationale for allowing participants to choose the version of the task on days 4-14?
Response: We are interested in exploring the role of self-regulation in cognitive training. Providing a rationale for one training (C) being associated with superior benefits compared to A and B should motivate individuals to select C more often. However, C is more cognitively demanding, therefore requiring self-regulation to engage in this task over the others for potential future benefits. We now elaborate on this further in the manuscript.
"On the first three days of the 14-day training programme, participants will train on one version each day. On days 4-14, participants are free to select any or all of the training versions. However, both training groups will be provided with a rationale suggesting that training on version C is likely to confer more benefits than the other two versions. By providing a rationale for one training being associated with superior benefits compared to others, participants should be motivated to select C over the other two training versions. However, C is more cognitively demanding, therefore requiring self-regulation to engage in this task over the others for potential future benefits. We hypothesise that opting to engage in a more challenging but potentially more beneficial task is an index of self-regulation, which has been shown to be associated with mental health across the lifespan. This will allow us to explore the role of self-regulation in cognitive training and any effects on mental health outcomes."
Moreover, by reading the "The present study" paragraph it is not clear whether participants in the control group (placebo training) will receive the same instructions - allowing them to choose and prompting them to prioritize one of the versions over the other two versions. Although this information is mentioned later in the paper, I suggest referring to this issue already in "the present study" paragraph.
Response: We have modified the section accordingly (see above).
The authors plan to include both positive and negative stimuli in the affective control training. Are there studies showing beneficial outcomes for this type of training when positive stimuli are used (besides work on eating and addiction with the go/no-go and stop signal tasks)? Can the authors elaborate a bit more about their decision to include positive stimuli in the training?
Response: We opted to train affective control across valences because aberrant reward processing is a core characteristic of mood disorders. Similarly, we have recently shown in a meta-analysis that affective control measured with working memory tasks such as the training task is equally impaired in individuals with mental health problems across valences. We now elaborate on this further in the background section:
Can the authors say something about the similarity/difference of the affective control and placebo training in regard to difficulty? Are accuracy rate and mean reaction time more or less similar in the working memory and perceptual tasks? It can be nice to show that training outcomes are not modulated by task difficulty (therefore strengthening the authors' notion that the outcomes are specific to improvement in affective control).
Response: While the placebo task is engaging and challenging, it cannot be directly compared to the affective control training task in terms of either accuracy or reaction time, as the former is a self-paced search task. We will consider the potentially confounding effect of task difficulty in our interpretation of our findings.
In line with the previous comment, assessing the links between training outcomes in terms of the cognitive/affective control tasks and the self-report measures can be highly valuable to the understanding of the processes by which the training influence emotion regulation, mood, as well as psychological and behavioral difficulties.
Pre/post changes in inhibition will be assessed using a modified Stroop task. Did the authors consider using a more "classic" inhibition task (e.g., go/no-go, stop signal)? Why Stroop?
Response: We have included the Stroop task to replicate findings from the adult literature, which have shown affective control training to improve affective Stroop performance (Schweizer et al., PLoS ONE, 2011).
In the predictions section the authors write that they plan to test improvement in affective control by using an affective n-back task (which is different from the training task). The authors mention later in the paper that this task will be administered before and after the training. I suggest mentioning this before the "Methods" section because when I read the predictions section it was not clear to me. In addition, why not use the training task itself to assess changes in affective control? Is it because participants choose the version of the task on days 4-14?
Response: We have removed the reference to the different versions as we appreciate the confusion this caused. As noted in the methods, the task is the same but includes non-trained stimuli sets as well as a different progression and termination rule. That is, each level of n is presented twice. The task terminates once both blocks at a given level are completed incorrectly.
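The termination rule described in this response can be sketched in a few lines of Python. This is an illustrative reading only: the response does not state the starting level, what counts as a block being completed correctly, or which score is recorded, so the starting level of n = 1, the advance-if-at-least-one-block-passes rule and the function name below are assumptions.

```python
def highest_level_passed(block_outcomes):
    """Apply the benchmark termination rule to a sequence of block results.

    `block_outcomes` holds one (first_block_ok, second_block_ok) pair per
    level of n, in ascending order starting at n = 1 (assumed). The task
    stops as soon as both blocks at a level are completed incorrectly.
    Returns the highest level with at least one correct block, or 0 if
    both blocks at the first level are failed.
    """
    highest = 0
    for level, (first_ok, second_ok) in enumerate(block_outcomes, start=1):
        if not first_ok and not second_ok:
            break  # both blocks incorrect: the task terminates here
        highest = level
    return highest


# Made-up run: level 3 is failed twice, so later levels are never presented.
outcomes = [(True, True), (True, False), (False, False), (True, True)]
print(highest_level_passed(outcomes))
```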
The authors predict that the benefits of AC-Training will decrease with age (age-related change hypothesis). I advise the authors to relate to this issue in the introduction. Is this prediction based on prior literature?
Response: We have modified the hypothesis to be non-directional, as we agree that there is not sufficient evidence to support a directional hypothesis. We now also refer to the age-related hypothesis earlier, stating:
"Specifically, we will explore whether AffeCT improves adolescents' mental health and emotion regulation capacity and whether the magnitude of these improvements differs as a function of age.
To investigate the potential of AffeCT in adolescents, the current study will include 200 adolescents (11-19 years). Including this age range will allow us to investigate the potential age-related differences in the effectiveness of training, as shown in studies using cool cognitive training paradigms."
The authors write that participants will complete 14 days of training within a four-week period. Moreover, they mention that there will be no limit on the number of training sessions participants can complete during a day. Do they mean that during the 4-week period there are 14 specific days in which participants can do the training? On a training day, can participants do a 2-minute task 5 times to complete the session? I find it hard to follow the training procedure. Moreover, what is the rationale for enabling participants to choose the number of training sessions?
Response: A full training regime is considered 14 days but participants can opt to train for more days. There will be no limit on the number of training sessions they can complete during a day. A full training session takes between 20-30 minutes but participants have the option to end the training after 10 minutes. Anything less than 10 minutes will not be counted as a training session. The reason for allowing participants to stop the training after 10 minutes is to increase engagement with the training in this age group. Allowing for multiple training sessions in a day enhances the possibility of maximizing training time in a sample that is unlikely to engage in an activity for extended amounts of time but that might be motivated by the monetary compensation to engage in a task repeatedly. In the methods we note:
"[…] participants will be given the option to end the training any time from 10 mins onward. The full training session will take between 20-30 mins depending on the level achieved. There will be no limit on the number of training sessions they can complete during a day. Training sessions that are less than 10 mins will not be considered as full training sessions, and will not be included in the analyses, nor will participants be compensated for these sessions."
On each training session, participants will be asked "Are you trying to change the way you feel right now?" and could choose a strategy - distraction, problem-solving, behavioural activation, reappraisal, avoidance, social support, acceptance or other. I guess that these strategies will be explained at the pre-training session. Can the authors relate to this? They can do that in the "extended data" document if they prefer not to include these details in the main paper.
I have a few comments/remarks that the authors might want to consider: First, would there be a problem in letting the participants 'choose' the single (auditory or visual) or the dual n-back training after day 3? I would assume that the single n-back can be less demanding on working memory resources than the dual n-back, which can be more engaging for the participant. In this way, how can we control for possible confounds as a matter of choice of training? How can we compare the efficacy of the single to the dual, which I am guessing is a noteworthy comparison to do anyway, but won't be possible given the unsystematic manipulation, if any?
Second, it isn't clear whether participants will do the training at home or at school if they are using an app. There is a mention of an app and that the study will run in schools: in the form of a group testing session, or individually?
I fully support this study and I hope that the findings can pave the way towards promoting as well as sustaining better mental health in a population in most need of it.

Are sufficient details of the methods provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes

Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Affective and Cognitive Neuroscience, Neurocognitive mechanisms of emotional vulnerability and resilience in anxiety and depression (adults and adolescents) and breast cancer survivorship.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 23 Sep 2019
Susanne Schweizer, University College London, London, UK
Dear Professor Derakshan, Thank you for your helpful feedback on our protocol. Please find below our responses to the issues you raised in your report. Your original comments are printed in bold and our responses are copied below. Where appropriate, we have copied the amended/added sections from the manuscript in italics.
First, would there be a problem in letting the participants 'choose' the single (auditory or visual) or the dual n-back training after day 3? I would assume that the single n-back can be less demanding on working memory resources than the dual n-back, which can be more engaging for the participant. In this way, how can we control for possible confounds as a matter of choice of training? How can we compare the efficacy of the single to the dual, which I am guessing is a noteworthy comparison to do anyway, but won't be possible given the unsystematic manipulation, if any?
Response: We agree that the effects of the training are likely to vary as a function of the training version selected. As suggested by Professor Derakshan, we will look at the effectiveness of training as a function of the training version selected. Specifically, we will look at the effects of time trained on each version on our outcomes of interest. We will additionally run a comparison of time trained on the dual version versus time trained on the single versions. We now explicitly state this in the analysis section:
Second, it isn't clear whether participants will do the training at home or at school if they are using an app. There is a mention of an app and that the study will run in schools: in the form of a group testing session, or individually?
Response: Participants will do the training in their own time, on their own devices (any device that supports mobile apps). We have now specified this further in the methods.
"Participants complete a pre-training assessment, followed by 14 days of training within a four-week period. The training will be completed by the participants on their own devices (any device that supports mobile apps) outside of school time."
Competing Interests: NA