An approach to identifying young children with developmental disabilities via primary care records

Background: Preschool aged children with developmental disabilities frequently receive a diagnosis of an indicator of disability, such as developmental delay, some time before receiving a definitive diagnosis at school age, such as autism spectrum disorder. The absence of a definitive diagnosis potentially underestimates the need for support by families with young disabled children, also delaying the access of families to condition-specific information and support. Our aim was to develop a strategy to identify children with probable and potential developmental disabilities before the age of five in primary care records for a UK birth cohort, considering how the identification of only probable or potential developmental disability might influence prevalence estimates. Methods: As part of a study of the effects of caring for young children with developmental disabilities on mothers’ health and healthcare use, we developed a two-part strategy to identify: 1) children with conditions associated with significant disability and which can be diagnosed during the preschool period; and 2) children with diagnoses which could indicate potential disability, such as motor development disorder. The strategy, using Read codes, searched the electronic records of children in the Born in Bradford cohort with linked maternal and child sociodemographic information. The results were compared with national and Bradford prevalence estimates. Results: We identified 83 children with disability conditions and 394 with potential disability (44 children had a disability condition and an indicator of potential disability). Combined they produced a developmental disability prevalence of 490 per 10,000 which is above the UK estimate for developmental disabilities in children under five (468 per 10,000) and within the 419-505 per 10,000 prevalence estimated for Bradford (for children aged 0-18). Conclusions: When disability prevalence is estimated only using conditions diagnosed as developmental disabilities, most young children with developmental disabilities likely to be diagnosed at later ages will be missed.


Introduction
Developmental disabilities are long term physiological impairments that significantly affect a child's ability to perform activities of daily living, such as independent feeding, mobility, and communication 1 (World Health Organization, Unicef, 2012). Globally in 2016, 840 per 10,000 of children under the age of five were estimated to have developmental disabilities 2 . However, the accurate prevalence estimation of this group of disabilities is influenced by taxonomic and diagnostic decisions and norms in clinical practice and academia and by how conditions recognised as developmental disabilities, e.g. Down syndrome and autism spectrum disorders (ASD), are recorded in healthcare systems. The reliable and accurate estimation of the prevalence and social context of both disabilities and diagnostic practices is necessary for understanding the extent of the burden of disability on individuals and their families for the provision of appropriate health, social care and other supportive services. Awareness of differences in how developmental disabilities are classified and prevalence estimates derived via healthcare systems also provides valuable information for making inter-and intra-country comparisons; and thus, identifying differences in need. The identification of young children with developmental disabilities can enable earlier support for these children and their families, as is recommended 3 .
There may be a great deal of inter-or even intra-country variation in prevalence estimates due to different age ranges and conditions being included in the classification of developmental disabilities. For example, the United Kingdom (UK) prevalence of developmental disabilities for children under the age of five years is estimated at 468 per 10,000 2 . It includes vision and hearing loss, epilepsy, and attention deficit hyperactivity disorder (ADHD) but excludes motor development disorders, except for cerebral palsy when learning disability is indicated. In the United States (US), the prevalence estimate for 3-17 year olds (an estimate for 0-5 year olds was unavailable) is up to 1,500 per 10,000 4 . In addition to the difference in the age range, the US estimate contains a greater range of conditions than the UK estimate, also including cerebral palsy; ASD; stuttering or stammering; learning disorders; and/or other developmental delays. Disaggregation of data by age and the conditions identified as developmental disabilities is helpful but not always presented, especially in small studies where participant identification must be avoided.
For research, electronic health records are an important source of data and clinical codes for diagnoses recorded in primary care records have been used to produce prevalence estimates for people in the UK with learning disabilities and for people who are likely to be disabled 5,6 . In the UK, diagnoses of developmental disabilities are usually made by a secondary care specialist (e.g. a Child Development Centre), communicated to the child's primary care provider via a consultant letter and recorded in the child's primary care record 7 . Disability describes how impairment affects function, but electronic health records are based on a system of clinical codes designed to classify disease and conditions, not function (World Health Organization, 2018). The extent of the impact of a condition on function can vary considerably from no impairment to profound. The degree of disability is not usually recorded alongside the diagnosis, unless specified as part of the clinical code e.g. profound learning disability 5 . Likewise, a child receiving a diagnosis of developmental delay could have a mild, profound or potentially transient disability, but this is not reflected in the clinical codes.
There are two approaches to identifying disability cases from health records: 1) identify those with conditions classified as developmental disabilities (hereafter referred to as disability conditions); or 2) identify those with indicators of potential disability. The first approach will inaccurately identify some, but presumably few, children who do not have disability (false positive) but will miss many children who might (false negative). The second approach will have a higher false positive rate and a lower false negative rate. Allgar et al. 6 provide an example of the first approach to case ascertainment as they sought to identify only people with a very high likelihood of learning disability, therefore arriving at a conservative estimate of the prevalence of people with learning disability. Lingam et al. produced a prevalence estimate for people who potentially have disability, which will have included an unknown number of people without disability and is an example of the second approach to case ascertainment. The preschool period (child age 0-5 years) is when parents usually start to notice developmental differences between their child and other children 8,9 . It is during this period that they often seek and receive either a diagnosis for a disability condition, such as ASD, or for developmental delay or a developmental disorder, which are indicators of potential disability but are not definitive 10 . For example, ASD and cerebral palsy can be diagnosed at age 3 years 10,11 . However, in practice, it is common for clinicians to wait until children are school age (above five years) to diagnose the disability condition 12-15 . For cerebral palsy and learning disability this may be because the diagnostic tests cannot be used accurately before the child is school aged 16 . For example, learning disability is underdiagnosed in preschool children because an IQ test, the standard assessment used to distinguish mild, moderate or profound learning disability, is not appropriate for use 17 . Instead, it is standard practice for children aged 0-5 years with developmental disabilities to first receive diagnoses that indicate potential rather than definitive disability, such as developmental delay or disorders relating to specific characteristics (e.g. delayed speech or social interaction) 18 . The only notable exceptions are a few congenital anomalies, such as Down or Edwards' syndromes, for which all pregnant women are offered routine pre-natal screening 19 .

Amendments from Version 1
We have made changes in response to all the second reviewer's comments. Chiefly, we have improved the presentation of Figure 2- Figure 5, strengthened the rationale for the study and elaborated on the limitation of not being able to use a gold standard assessment to verify disability in the study sample. We have also corrected some errors in the referencing. The updated version includes all the appropriate references.
Any further responses from the reviewers can be found at the end of the article

REVISED
To add further uncertainty, whether and which diagnosis is received during the preschool years is not a reliable indicator of disability severity. For example, a child under five can receive the same diagnosis of developmental delay for either a profound learning disability or if they simply fail to meet their developmental milestones but go on to catch up over time 20 .
There are relationships between sociodemographic factors and the diagnosis of disability conditions and indicators of potential disability which will affect prevalence estimates, perhaps particularly during the preschool period. For example, low socioeconomic status is associated with an increased risk of developmental delay 21 . There is a greater risk of Down syndrome in children of older mothers (who also often have high education and socioeconomic status) 22 ; and high maternal education is associated with higher rates of ASD diagnosis 12 . Pakistani ethnicity is associated with a higher prevalence of congenital anomaly 23 . Children of ethnic minority mothers are less likely to receive a diagnosis of ASD by age eight years than children of white British mothers (but the true prevalence is not expected to differ between these ethnic groups) 12 . As sociodemographic contexts vary by place, so too might the accuracy of prevalence estimates and risk of false negative and positive misclassification in the measurement of developmental disability via primary care records. Some of this variance has known biological explanations, while some may be due to inequalities in accessing healthcare and recording diagnoses. For example, ethnic minority mothers without English language fluency may find it harder to persist in seeking a specific disability diagnosis (e.g. ASD) than, in particular, white British mothers with high education. These factors may influence the extent of the false negative/positive error and thus bias any estimates of the prevalence of developmental disability. For example, children of ethnic minority and low socioeconomic status mothers may be both more likely to receive a diagnosis of an indicator of potential disability rather than a disability condition during the preschool period and less likely to receive any diagnosis.
To our knowledge, no previous research has looked at how many young children receive diagnoses of disability conditions versus indicators of potential disability and the relationship of these to sociodemographic factors. No existing strategies to identify people with disabilities via primary care data are appropriate. Allgar et al.'s list of clinical codes would not identify young children with developmental disabilities as even the children with severe learning disability would not yet have received a definitive condition diagnosis and codes for indicators of potential learning disability (e.g. developmental delay) were not included. Lingam et al.'s list extends beyond the scope of developmental disabilities. As such one strategy is too narrow and the other not narrow enough to estimate the prevalence of developmental disabilities during the preschool period.
Our aim was to develop a two-part strategy that identified children with probable and potential developmental disabilities diagnosed before the age of five years in primary care data for a UK birth cohort, considering how the identification of only probable or potential developmental disability might influence prevalence estimates.
This study was conducted as part of a PhD research project exploring the health and healthcare use of mothers of young children with developmental disabilities using primary care data linked with sociodemographic data from the Born in Bradford (BiB) cohort study 24 . As such, much of the research presented here is also available in the lead author's thesis published in the White Rose eThesis Online repository. In the wider research project, we found that mothers of young children with developmental disabilities are more likely to have ill-health than other mothers of children of the same age, with increased rates of symptoms of psychological distress, exhaustion, and musculoskeletal pain 24,25 . Parents report experiencing high emotional stress during the period of seeking and receiving a disability diagnosis for their child 9,26 . The absence of a definitive diagnosis delays families' access to condition-specific information and support and can affect their awareness of eligibility for financial support and social care. Where deferral of a definitive diagnosis, associated with assessment issues or parental sociodemographic factors occurs, the diagnostic process is protracted with a potentially great impact on families' health and access to supportive resources.

Methods
Women were recruited to the BiB cohort between March 2007 and December 2010. The cohort comprises of 12,453 mothers, 13,776 pregnancies and 3,448 fathers, and has been described elsewhere 27 . We used data from the BiB baseline questionnaire completed when women were recruited to the study linked with primary care records for mother-child dyads for the period 2007-2015.
The BiB study received ethical approval for data collection from the Bradford Research Ethics Committee (Ref 07/H1302/112). Our study received ethical scrutiny as part of our BiB data application, and we complied with all standards and policies of the University of York's Data Management Policy 28 . As our study was a secondary analysis of an existing data set, additional ethical approval was not needed.

Strategy development
We developed a two-part strategy to identify children aged 0-5 years via electronic primary health records: 1) with a disability condition; and 2) with an indicator of potential disability. The strategy was developed following consultation with paediatric clinical researchers at the University of York (Dr Bob Phillips and Professor Lorna Fraser) and paediatric clinicians in the Bradford Child Development Centre (Dr Stella Yeung and a Lead Nurse in the Child Development Service).
The first part aimed to identify children with the most common (prevalence of at least one in 10,000 children aged 0-18 years) conditions that cause significant long term variation in the child's capacity to achieve the expected developmental (functional performance) milestones for their age 1 and can be diagnosed below the age of five years. We used the developmental disabilities most frequently associated with paediatric disability complexity by Horridge et al. 29 : ASD, cerebral palsy, chromosomal syndromes and intellectual disability ( Table 1). The specific chromosomal syndromes of Down syndrome and Fragile X syndrome were specified as these are the two most common chromosomal syndromes which typically cause disability 6 . Learning disability is one of the few conditions classified by severity (from mild to profound) in the clinical coding hierarchy and was restricted to moderate-profound severity.
The second part of the strategy reflected the practice of deferred disability diagnosis identified by the Bradford-based clinicians, that whilst the disability conditions can be diagnosed in children under five, it is common practice for children in Bradford (and elsewhere) to receive these diagnoses later (age 5 years and above). Therefore, we also aimed to identify children with indicators of potential disability classified as: developmental delay; generalised developmental disorders; disorders relating to specific developmental characteristics; mild or unknown severity learning disability; and generic disability (e.g. on learning disability register and disability not otherwise specified).
Each part consisted of four code lists: four for the disability conditions (n=148 Read codes) (Table 2A) and four for the indicators of potential disability (n=103 Read codes) ( Table 2B).
The lists used the hierarchical clinical code language Clinical Terms Version 3 (commonly known as Read codes) as all primary care practices in Bradford use the SystmOne electronic record system 36 . They were developed using the National Health Service (NHS) Clinical Terminology Browser Clinical Terms Version 3 -Clinical 2017-10-01 Drugs 2016-04-01 (also known as a Read code browser). Only Read codes which positively identified the condition or indicator were included in the lists. They were identified by searching for the condition key term (e.g. Down syndrome), then using the step-up/step-down functions to identify all relevant Read codes in the 'Clinical findings: Disorders' hierarchy of the classification system. Drug, treatment and referral Read codes were not included. These codes indicate potential disability complexity, including chronic illness, but do not on their own provide enough information to deduce disability. Codes for assessment were included only when the outcome was a definitive diagnosis of one of the disability conditions. For example, the paediatric consultants recommended including the Gross Motor Function Classification System (GMFCS) for cerebral palsy. The codes for the Surveillance of Cerebral Palsy Europe (SCPE) classification system for cerebral palsy were excluded as the assessment is not used in preschool children and the GMFCS is the preferred assessment tool in Bradford. For ethical and resource reasons, it was not possible to access the free text in the children's medical records to look for descriptions of disability severity, or to independently verify the diagnoses by performing additional assessment.
The primary care records of all children in the BiB cohort were searched to identify every child who had one or more of the codes recorded in their primary care record during the period of birth to their fifth birthday. The clinical codes Table 1. United Kingdom (UK) prevalence estimates and disability characteristics for the disability conditions.

ASD
• 38 per 10,000 boys aged 8 years (3 years for girls) 32 • 103 per 10,000 children aged 5-8 years in Bradford 12 • Delayed speech and social interaction problems (typical) • Learning disability (if severe ASD) and behavioural problems (common) Cerebral palsy • 20 per 10,000 children aged 0-5 years 10 • Up to 41 per 10,000 children aged 0-5 years in Bradford 33 • Motor impairment (typical) • Learning disability and behavioural problems (common) Down syndrome • 9 per 10,000 children aged 0-5 years 34 • Learning disability (typical) • ASD and behavioural problems (common) Fragile X syndrome • 2 per 10,000 aged 0-10 years (3 years for boys, 1 year for girls) identified via pre-natal screening 35 • Learning disability (typical) • ASD and behavioural problems (common) Combined prevalence for the conditions • 419 per 10,000 • 505 per 10,000 for Bradford ASD; Autism Spectrum Disorders Mod-severe learning disability  and date of entry for every code were extracted and the age of the child when each code was recorded was calculated to explore differences between when disability condition and indicator codes were received. To protect the anonymity of the study participants, these calculations used the month and year of the child's birth, using the first date of the month for the calculation. Only one child per mother was included, with further exclusions for children who were withdrawn from the BiB study or died, did not have linked primary care data or a maternal BiB baseline questionnaire (n=2,469) 24 . For every child, we also extracted data on the child's sex, mother's age at the child's birth, ethnicity, measures of socioeconomic status, such as education. Where there were fewer than five children with any of the disability conditions, the children were excluded from the study to protect their anonymity.

Prevalence
Data analysis was performed using Stata 15 37 . Descriptive statistics were used to describe and compare the prevalence of developmental disabilities and sociodemographic differences between the two parts of the case ascertainment strategy.
There is not a gold standard strategy to identify developmental disability in primary care records against which to validate our strategy. As we do not have accurate estimates for the true prevalence in our dataset or the preschool age group in the UK, we compare the prevalence estimates in this dataset to the available Bradford and UK estimates for specific disability conditions presented in Table 1 and an estimate of developmental delay for three year olds in the Millennium Cohort 21 .
Based on the estimates in Table 1 for children under five, where available, the UK prevalence is 419 per 10,000. However, prevalence estimates also vary by country and region, with a higher prevalence of childhood disability found in Bradford 27 . A higher prevalence of ASD and cerebral palsy has been found for Bradford compared with other UK estimates; and a higher prevalence of chromosomal syndromes (per 10,000): BiB cohort 25 versus UK prevalence 15 13 . This estimate includes Down and Fragile X syndromes but is not disaggregated by condition, so the elevated prevalence of these conditions in the BiB cohort is unknown. Given the known higher prevalence of some conditions in Bradford (Table 1), the prevalence estimate for this geographical area is (at least) 505 per 10,000.
Most prevalence estimates, including all those presented in Table 1, are dependent on the children receiving diagnoses for the disability conditions before the age of five. Lingam et al. found a potential disability prevalence of 130 per 10,000 in children aged 0-4, increasing to 500 per 10,000 for the 5-9 age group. This suggests that we may find the prevalence of both disability conditions and indicators of potential disability in children aged 0-5 identified via primary care records to be substantially lower than both the UK and Bradford prevalence estimates (presented in Table 1).
The prevalence of developmental delay in high income countries is estimated at 300 per 10,000 of children 38 , and was 320 per 10,000 for children aged three years in the UK Millennium Cohort 39 . The prevalence of global developmental delay, where children have a delay in more than one area of development e.g. motor and speech, is 100-300 per 10,000 40 . The second part of our case ascertainment strategy was expected to identify at least 384 children in the BiB cohort (3% of 12,000), and at least 120 with more than one indicator of potential disability (as a measure for global delay). Given the clinical norm of initially diagnosing developmental delay or a generalised disorder, it was likely that a high proportion of the children identified by the primary strategy would also have indicators of potential disability. The number of codes and the code description found in the records of the children identified as having disability conditions were compared with those of the children with indicators of potential disability only.
We expected sociodemographic differences between the children and the parents identified via the two parts of the strategy: 1) mothers of children in the disability condition group were expected to be older on average (and have higher socioeconomic status) than the potential disability group due, in part, to the relationships between higher maternal age and the increased prevalence of Down syndrome and diagnosed ASD; 2) the age of the children when they received their condition or indicator diagnosis was expected to be lower in the condition group because Down syndrome and Fragile X syndrome are usually identified during pre-natal screening 41 and greater disability severity (including more visible disability) was expected to be associated with earlier diagnosis; and 3) the disability condition group was expected to have a higher proportion of boys than the potential disability group due to the higher prevalence of ASD and Fragile X syndrome in boys 32,35 . We performed tests of between group difference for the sociodemographic factors in which we expected the two groups to vary.

Results
Of the 9,727 children included in the linked study, 477 (4.9%) had either a disability condition (probable disability) or an indicator of potential disability or both ( Figure 1).
The two strategies combined produced a developmental disability prevalence of 490 per 10,000. This is within the 419-505 per 10,000 prevalence estimated for Bradford and above the UK estimate for developmental disabilities (468 per 10,000) (Table 3).

Probable disability
Of the 9,727 children, 83 (0.9%) had a Read code for ASD, cerebral palsy or Down syndrome recorded in their primary care record between birth and age five, giving a prevalence of 85 per 10,000. There were no children diagnosed with moderate-profound learning disability. To protect anonymity due to small numbers, the children with a diagnosis of Fragile X syndrome were excluded from the study.
Of the 148 Read codes searched for, 13 (recorded 97 times) were found in the primary care records (Figure 2).
No children had more than one of the disability conditions, but 53% (n=44/83) had at least one indicator of potential  disability (Figure 3). Of the 103 Read codes included in the secondary case ascertainment strategy, 16 (recorded 62 times) were found in the children's primary care records.
As anticipated, the children with Down syndrome received their diagnoses earliest (soon after birth) and the children with ASD received diagnoses latest (Table 4); and a large proportion of children with ASD and cerebral palsy received a diagnosis of developmental delay prior to receiving a condition diagnosis. There was considerable variability in the age at which children with ASD and cerebral palsy received their first diagnosis (of either a condition or indicator).
Compared with the other disability condition groups, the ASD group had a higher proportion of male than female children, mothers who were white British and educated above age 16 (Table 4). The average maternal age of the Down syndrome group was higher, but there was not a greater proportion of Pakistani (versus white British) or high (versus low) educated mothers compared with the other groups.

Potential disability
Of the study sample, 4.1% of the children had indicators of potential disability (n=394/9,727), a prevalence of 405 per 10,000 ( Figure 4). Just under a quarter (24.1%) had more than one indicator (from the same or different categories: developmental delay, developmental disorders, mild/unspecified learning disability or other unspecified disability) ( Figure 5).
Of the 103 Read codes in the secondary case ascertainment strategy, 33 (recorded 521 times) are found in the children's primary care records (Table 5).  Clinical codes for general developmental delay or delay in speech and language development occurred most frequently in the children with potential disability (Table 5) as well as those with disability conditions (Figure 3).

Between-group sociodemographic differences
As anticipated, the disability condition group had significantly more highly educated, older mothers and the children received an earlier diagnosis than the potential disability group (Table 6).   Although there is a greater proportion of males in the condition than potential disability group the difference is not significant.
There were no significant differences for the characteristics in which the groups were not expected to vary (Table 7).

Discussion
We developed a two-part strategy to identify children with probable and potential developmental disabilities diagnosed before the age of five in primary care data for a UK birth cohort. Using this strategy, we found that the prevalence of developmental disability in preschool children might be greatly underestimated if only disability conditions are used (85 rather than 419 per 10,000), as is usually the case in research 2,39 . The prevalence of the disability conditions was lower than anticipated (except for Down syndrome and ASD). However, when the disability condition strategy that identifies children with diagnosed developmental disability is used together with a strategy that identifies children with potential developmental disability, the resultant prevalence (490 per 10,000) is within the 419-505 per 10,000 prevalence estimated for Bradford and above the UK estimate for developmental disabilities (468 per 10,000).
Many of the children with the disability conditions (excluding Down syndrome) received an initial diagnosis of an indicator of potential disability (36%; n=17 of the ASD group; 50% of the cerebral palsy group). The prevalence of potential disability appeared superficially to be higher than in other samples, such as the 320 per 10,000 prevalence of developmental delay in the UK Millennium Cohort (n=12,689 children aged 3) 21 . However, that sample consisted of only monolingual English-speaking families as the multilingual families had extremely high rates of developmental delay. The BiB cohort includes multilingual families, and we used a different sampling strategy (clinical codes in electronic health records rather than cross-sectional assessment). Given these differences and the broader age range in our study, it is likely that the prevalence in the cohorts are roughly equivalent.
An additional finding of note was that fewer children in the potential disability group than expected had more than one indicator (n=90 versus the 120 expected) which gives an indication of global development delay 40 . This is highly unlikely to mean milder or more transient developmental delay than observed elsewhere, rather it may reflect issues with the identification of global developmental delay or of long intervals between the initial diagnosis of a delay and follow up assessment. It probably also reflects the paediatric clinicians' anecdotal evidence that when there are signs of developmental disability in a preschool child, an initial diagnosis of developmental delay is given, and a more definitive diagnosis sought after the age of five years. Whilst there may be clinical explanations for these findings, it could also be a red flag for long waiting times for child disability assessment, potential inequalities in access to assessment associated with sociodemographic factors, and the unmet needs of families for support.  (1) Developmental disorder NOS (1) Developmental language impairment (1) Developmental language disorder (1) Developmental speech disorder (1) 1 NOS, Not otherwise specified  The practice of deferring giving a definitive (condition) diagnosis until the child is older could explain why there were no or very few children with moderate-severe learning disability or Fragile X syndrome in the cohort. Accordingly, it was highly likely that some of the children in the sample who received indicator diagnoses before the age of five had, as yet, undiagnosed ASD, cerebral palsy and moderate-profound learning disability. It might reasonably be assumed, therefore, that the 83 children in our sample who did receive a disability condition diagnosis before the age of five either had severe disability or a very typical manifestation which made diagnosis straightforward. The possibility of greater disability severity in this group may be supported by the finding that over half (53%, n=44) of the children with disability conditions also had an indicator of potential disability compared with 24% (n=95) of the potential disability group having two or more indicators.
Alternatively, sociodemographic factors may have influenced the diagnosis. In particular, we found, as expected, that a greater number of mothers of children with ASD had higher education than mothers of children with other disability conditions or indicators. This may be due to higher educated mothers being more assertive or persistent in the pursuit of a diagnosis for their child 12,42 . An unexpected finding was that there were not more Pakistani than white British children with Down syndrome, despite the prevalence of other congenital anomalies being higher in Pakistani families in Bradford 13,23 . The explanation could be that Pakistani mothers in the cohort tended to be younger than the white British mothers. This would reduce the risk of Down syndrome in the babies born to Pakistani mothers given the known association between maternal age and Down syndrome.

Strengths and limitations
We developed a practical strategy for identifying preschool children with developmental disabilities via primary care records and have identified the practice of deferring the diagnosis of specific developmental disabilities. Without including indicators of potential disability in case ascertainment strategies, young children with developmental disabilities will not be identified, and therefore, would be underrepresented in any prevalence estimates or in research requiring the identification of these children. Only a hybrid strategy which includes Read codes for probable and potential disability could accurately identify the true number of children in the preschool age group with developmental disabilities via primary care records. Whilst our strategy aimed to achieve this, some limitations remained.
The two parts of the strategy were developed to try and balance the risk of including versus excluding an unknown number of children without disabilities. Neither strategy could eliminate the risk of false positives or negatives misclassification entirely, with a greater expected risk of misclassification for the potential disability strategy. However, in practice, this risk was low as it was expected that a disability condition or indicator of potential disability would, largely, only be diagnosed during the preschool period if the characteristics were distinct, which is more likely for moderate and severe than mild impairment. Sensitivity analysis to assess and compare the extent to which the case ascertainment strategies resulted in misclassification error (false positive and false negative) was not performed as this would have required the use of a gold standard comparison strategy. None of the existing strategies were suitable or could be swiftly adapted solely to gauge the extent of the misclassification error. Attempts were made to identify differences in disability severity by measuring the number of diagnoses and age of the child when the mother's symptoms were detected but no inferences about disability severity could be made.
For ethical and resource reasons, we could not access the free text in the children's medical records to look for descriptions of disability severity or to independently verify the diagnoses by performing additional assessment. As these are common challenges in using routinely collected data to produce disability estimates, our study provides an initial insight into the potential utility of such an approach and highlights the limitations which could be explored and addressed through further research. Although our two-part strategy identified a disability prevalence close to other prevalence estimates, further research is needed to assess the reliability of our approach and findings. A study is required that can perform independent clinical assessment of disability to verify the diagnoses in the primary care records including assessment of disability severity and the potential for making a definitive condition diagnosis when a disability indicator has been recorded. A longitudinal study could explore the journey of different children from diagnosis of a disability indicator to receiving a disability condition diagnosis and how this might vary between children based on different clinical characteristics, including disability severity/number of disability indicators, and sociodemographic characteristics.
We have highlighted the clinical practice of deferred disability diagnosis during the preschool period. For data systems with linked mother and child health records, our strategy could be used to investigate regional variation in time to diagnosis and thus variation in practice. This could include investigation of our finding that fewer children appeared to receive indicators of global development delay (more than one indicator of disability) in Bradford than in other study samples. Further our strategy can be used in the investigation of the impact of diagnostic uncertainty on caregiver health. Despite caregiver statements that the period of disability identification and diagnosis are highly stressful, there is little empirical research on this period in relation to caregiver ill-health. Studies have looked at caregiver adjustment but encompassing a wider child age range 43,44 . The longitudinal investigation of changes in caregiver adjustment and health over time, and at key points of disability identification, diagnosis, and transitions between preschool, school and adult services have not been investigated. By identifying key points of caregiver burden and whether these vary by disability diagnosis, services and interventions that support families at high-risk intervals across the life course could be developed.

Conclusion
We have developed a strategy for identifying preschool aged children with developmental disabilities via primary care records. We have shown that by using a two-part case ascertainment approach which combines strategies that identify probable and potential disability, a realistic estimate of developmental disability in children aged 0-5 can be obtained.
However, questions remain about misclassification error and without accessing additional information about the children, disability severity cannot be assessed using the strategy.

Data availability
Scientists are encouraged and able to use BiB data, which are available through a system of managed open access. The steps below describe how to apply for access to BiB data.
-Before you contact BiB, please make sure you have read our Guidance for Collaborators. Our BiB executive review proposals on a monthly basis and we will endeavor to respond to your request as soon as possible. -If your request is approved. we will ask you to sign a collaboration agreement and if your request involves biological samples we will ask you to complete a material transfer agreement.
resident, it is unclear what "White" represents as well as "Pakistani". What about those who immigrated from other continents (Africa, Asia, Latin America)? And are these distinctions necessary for such research or do they contribute to public discrimination?

Results:
In the first sentence under "Probable disability", it states 83 children were identified (or 85/10,000), yet in Figure 1, the numbers do not seem consistent. Please clarify the totals either in the text, tables, or figures. I assume the 83 = 39 + 44 in the figure but had to reread the numbers, tables, and figures to be sure, since each used a different denominator (Figure used total cases, I assume, while Table 3 used "per 10,000". Consistency or clear explanation are preferred. I struggled similarly with Figure 2 in comparing the "n"s on the figure with the numbers in the column. 1.

Discussion:
In the first sentence of the 3rd paragraph ("An additional finding..."), the authors identify a potential flaw in their method and discuss it as warranted. There is a reference to an article that their conclusion is based on. This highlights the weakness of the definition of "global developmental delay" which offers a refuge for imprecision in diagnosis and leads to public health research problems being addressed here. The authors are credited for some discussion to this point and the impact. 1.
"...a greater number of mothers of children with ASD had higher education..."-Is it that higher educated mothers are more assertive or persistent or do they have greater access to diagnostic care due to related wealth tied to their education? 2.
Strengths and limitations: First paragraph-By definition, developmental disabilities require time (development) to emerge and be identifiable, creating the research problem being examined here. Would the authors consider offering a "best age" for studying prevalence of DD, rooted in their findings? 3.
While these critiques highlight questions for the investigators, they deserve praise and close examination of the strengths and limits of their research. The Conclusion is perfect, as is this sentence under the Strengths and Limitations section ("As these are common challenges in using...") which sums up the value of this paper.

Is the study design appropriate and is the work technically sound? Yes
Are sufficient details of methods and analysis provided to allow replication by others?
clearly the response to the question "so what?". What difference does it make that the practice is to give a quite general code in the 0 -4 year olds, then a definitive diagnosis only later?
I can appreciate that to bring children in for an evaluation is another whole study, but this paper really suffers from its lack of a gold standard. Without that, it is essentially impossible to evaluate the utility of this method. The fact that the prevalence estimate comes close to other prevalence estimates is nice, but what does that really tell us? Do the children identified through these methods really have developmental disabilities and what is the severity for those who do? I would like to see a follow up study involving bringing in a sample of these children for clinical evaluation to provide a more objective view of the utility of the proposed method of identification. I am not necessarily saying that this sort of validation sub-study is a requirement for publication, but I would like to see this concept included in the discussion section for future directions for research.
These tables would not get a passing score in my class. Why do you run together different types of information in the same column? This just makes the tables more difficult to decipher. It also means that you have parentheses cluttering up the body of the table making it hard to read.

If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Thank you, Prof Pettygrove, for your review of our manuscript. We have made the recommended revisions and think the manuscript is greatly improved. A summary of the changes we have made in response to each of your comments is provided below. We acknowledge that the important 'why bother' message was missing from the abstract and the importance of early diagnosis for families could be strengthened in the introduction. In both places, we have added statements stating the potential impact of delayed diagnosis on families' access to appropriate support and resources. We believe that it is now clear from the introduction that it is valuable to researchers, health and social care commissioners and families that accurate estimates of disability prevalence in young children can be obtained from the analysis of primary care records, and of the benefit of early diagnosis of disability to families.
We have revised the sentence on page 3 identified as repetitious, identifying only the additional conditions included in the US estimate. In responding to your 'so what?' question of why it matters that fewer children in the potential disability group than expected had more than one indicator of disability, we have suggested possible explanations for the observed finding as we believe it could indicate issues in the diagnostic process. Thank you for identifying and suggesting the need for a clear statement of research needed to further explore and strengthen our approach and findings. Although it was not possible within the scope of our study to use a gold standard assessment, we acknowledge this as a significant limitation and elaborate on this in our strengths and limitations section. Hopefully we may be able to perform this study ourselves in future. Thank you for prompting us to improve the accessibility of our figures. We have revised the presentation of figures 2-5 -they are now side by side bar charts with the labels below and not on the bars).
would need to be able to map the Read coded terms to those available in the SNOMED-CT browser.
One of the challenges in this area of practice is the lack of harmonisation of terminologies used by clinicians and this is likely to vary between clinicians and localities. Thus, whilst in Bradford the term 'developmental delay' is used a lot, other clinicians and services avoid this, rather favouring Early Developmental Impairment or Provisional Intellectual Developmental Disorder (ICD-11). It would be unusual for a child aged four years or under to have a confirmed learning disability, as assessments at this age are unreliable due to developmental changes.
These differences in terminologies would need to be taken into consideration if others were wanting to replicate the study in different settings, but mapping should be possible, as long as definitions of the terms used are clear.
it is surprising that 'possible autism spectrum' was not part of the provisional language used, but again, it may be that the local clinical service uses different terms for children undergoing autism assessments in the early years.
In all, the authors are to be commended to tackling an area where there is a paucity of robust, population research and hopefully others will build on the knowledge gained.

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Yes

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes Are the conclusions drawn adequately supported by the results? Yes