Small for gestational age babies and depressive symptoms of mothers during pregnancy: Results from a birth cohort in India

Background: Annually, more than a million low birthweight (LBW) babies are born in India, often to disadvantaged families. Several studies have examined the association of poverty, nutritional status, and obstetric factors with LBW. In this study, we aimed to examine whether the Edinburgh Postnatal Depression Scale (EPDS) score measured during pregnancy is related to the incidence of babies born Small for Gestational Age (SGA). Methods: Pregnant women between 14 and 32 weeks of gestation attending the antenatal clinic at a public hospital were recruited from April 2016 to October 2017. The EPDS was administered through face-to-face interviews to assess depressive symptoms. Newborn anthropometry was performed post-delivery. For analysis, birth weight below the 10th percentile was classified as SGA. Results: The prevalence of depressive symptoms (EPDS score >11) was 16.5% (n=108/654) in antenatal mothers. These women delivered a higher proportion of SGA babies (21.3% vs 15.8%) than women with no symptoms. The odds of giving birth to an SGA child were twice as high for women with EPDS scores >11 (adjusted OR = 2.03; 95% CI = 1.12–3.70) as for women with EPDS scores ≤11. The EPDS 12 (adjusted OR = 1.96; 95% CI = 1.04–3.69) and EPDS 13 (adjusted OR = 2.42; 95% CI = 1.24–4.70) cut-off categories were also risk factors for SGA, with significant p-values (0.0006 and 0.0003), and women with EPDS scores above 13 had the highest odds of SGA. Conclusions: We found a strong association between antenatal depressive symptoms measured by the EPDS and SGA. We therefore recommend timely and effective screening, diagnostic services, and evidence-based antenatal mental health services to combat SGA and the associated metabolic syndromes.


Amendments from Version 2
Based on the reviewers' suggestions, amendments have been made to the second version. We have removed the entire analysis on the predictive capabilities of EPDS for SGA, as it has potential for a separate paper. Additionally, as Large for gestational age (LGA) was only briefly mentioned in the methods and results, we have considered removing it throughout. We have now included in the results an analysis with EPDS score as a continuous predictor. For better presentation of tables, we have included means and categories for continuous measures. Additionally, we have retained the cell count for gravidity and parity and have mentioned mean age; still birth and abortion have been excluded to reduce confusion. We have also considered the respondent's and her husband's income as a criterion for socioeconomic status and, in order to avoid the problem of multi-collinearity, have excluded socioeconomic status from our model. Instead of pre-pregnancy BMI, we included maternal adiposity measured through skinfold thickness as a confounder. Reported maternal substance use was very minimal (less than 1%) in the study sample; hence we have adjusted for husband's current tobacco and alcohol consumption as an indicator of domestic violence. In the supplementary table, we have also provided results of the effect modification interaction.

Introduction
Low birth weight (LBW; <2500 g), a marker of poor intrauterine growth, leads to the double burden of stunting in childhood and a predisposition to obesity in adolescence 1,2 . The pathways triggered by LBW lead to perpetuating, independent cycles of ill health 3,4 . More than one million babies are born with LBW in India every year. LBW often afflicts disadvantaged families, accentuating the risk of child mortality and morbidity 5 . Despite the high prevalence of LBW, its causes are poorly recognized. Infants with LBW comprise preterm babies (<37 weeks gestation), babies born Small for Gestational Age (SGA), or both 6 . SGA is defined as birth weight below the population-specific 10th percentile for the gestational age. Children who are born SGA have several short- and long-term adverse outcomes 7-9 .
Apart from the increased risk of mortality, infants with SGA might have a broad spectrum of adverse growth, morbidity, and developmental outcomes 10 . Due to poor nutritional status, a range of problems from malabsorption to growth retardation can affect the growing children 11 . The 'thrifty phenotype' hypothesis describes that adaptive mechanisms due to child undernutrition are on the rise and result in type 2 diabetes mellitus (T2DM), which is epidemic in low-and middle-income countries (LMICs). Confronted with undernutrition as a fetus and child, the compensatory adaptive mechanism stores excess energy as fat 12 . As a result, LBW in babies accentuates the risk of obesity, insulin resistance, cardiovascular diseases and T2DM 13 .
Over the past several decades, program interventions to reduce LBW in India have mostly focused on addressing poverty, maternal nutritional status, and obstetric factors. However, the proportion of children with LBW has remained stagnant or reduced only minimally over this period in LMICs such as India. The role of antepartum depression is often neglected as a determinant of SGA, despite evidence indicating that women with antepartum depression have an increased risk of preterm birth and LBW babies 14 . Meta-analyses also suggest that the magnitude of this association varies with how depression is measured, country of residence and socioeconomic status 14,15 . Almost all the evidence on the impact of antepartum depression on LBW is from developed countries. As an exception, a study from Bangladesh has suggested that a high Edinburgh Postnatal Depression Scale (EPDS) score in pregnant women may be associated with LBW 16 . Also, the role of the EPDS as a screening tool for antepartum depression is underexplored in most LMICs, and studies have used different cut-offs for different samples 17 .
This study aims to examine the relationship between the Edinburgh Postnatal Depression Scale (EPDS) score and SGA. Despite the high prevalence of SGA in LMICs such as India, awareness of mental health problems is low. Antenatal depression is not routinely screened for in LMICs, and whether it is a risk factor for poor intrauterine growth is rarely assessed. This is specifically relevant in metropolitan cities like Bangalore, which have relatively better socio-economic standards than several other regions but continue to experience persistently high proportions of children born SGA.

Study setting
Maternal antecedents of adiposity and studying the transgenerational role of hyperglycemia and insulin (MAASTHI) is a birth cohort established to prospectively identify risk factors in pregnancy associated with adverse infant outcomes, especially in predicting the possible risk markers of later chronic diseases 18 . The detailed protocol of the study has been published elsewhere 18 . Briefly, pregnant women with gestational age (GA) between 14 and 32 weeks were recruited. GA was determined from the ultrasonography record or, if that was not available, from the last menstrual period. Of the 1557 women enrolled, the 654 who had completed follow-up after delivery comprise the sample for the present study; stillbirths and twins were excluded from the data analysis (Figure 1).

Data collection
Data were collected from April 2016 to October 2017 at a secondary-level public hospital. Baseline data (second and third trimester of pregnancy) covered socioeconomic conditions (religion, education and occupation), the women's reproductive history, social support, depressive symptoms and consumption of tobacco and alcohol. The EPDS tool was translated into the local language (Kannada) and then back-translated to English to check accuracy, with the aim of producing a clear and conceptually accurate translation that was easily understood by the local population. The questionnaire was then administered to the respondents by trained research assistants, who conducted the interviews without altering the meaning of the items. Responses were scored according to the reported frequency of depressive symptoms over the preceding days. The respondent's weight, height, mid-upper arm circumference (MUAC), head circumference, and biceps, triceps and subscapular skinfold thickness were recorded. Birth data were collected through structured interviews and anthropometric assessment by trained female research staff in the hospital. Depressive symptoms were assessed during the second and third trimester, and the anthropometry of the newborn was recorded between 2 and 48 hours after delivery. Several birth outcomes were assessed, including the length of pregnancy, mode and place of delivery, complications during labour, live birth or stillbirth, birth weight, length, and head, chest, waist, hip and mid-upper arm circumferences of the newborn. Skinfold thickness was measured using Holtain callipers at biceps, triceps and subscapular sites.

Assessment of antepartum depressive symptoms. The Edinburgh Postnatal Depression Scale (EPDS) is a widely used self-reporting questionnaire developed specifically to screen for symptoms of perinatal depression 19,20 . EPDS has been validated by Fernandes et al. for prenatal depression in South India at a cut-off of ≥13 (sensitivity = 100%, specificity = 84.90%, and AUC = 0.95) 21 . Depressive symptoms are assessed by a 10-item scale, which determines the psychosocial stress level of pregnant women in the last seven days. Social support was measured using a questionnaire developed at St. John's Research Institute to evaluate a broad range of social support (i.e., emotional, instrumental, informational, and appraisal) 22 . This questionnaire has a total of 12 items, and each item is scored from 0 (definitely not enough) to 3 (definitely enough). The highest possible score of 36 indicates excellent social support and a score of 0 indicates low social support. The scale showed excellent internal consistency (Cronbach's alpha = 0.935), with all variables showing a high level of consistency. Trained research assistants administered the questionnaire using an Android tablet; the system is programmed to generate an EPDS score in real time, and if the woman scored >13 she was referred to the psychiatrist at the hospital. The correlates of EPDS have internal consistency exceeding 0.8. Pregnant women were classified into two groups based on their EPDS score: 0-11, without depressive symptoms; >11, with depressive symptoms. This 10-item scale has been translated into many different languages and validated in many countries, including India 23 . The cut-off values of the EPDS as a screening tool for antenatal depression in primary health care settings depend on the cultural setting. For example, the cut-off score for the Spanish version of the EPDS is 8/9, and for the Chinese version it is 9/10 24 . A cut-off score of 11/12 was found to detect perinatal depression with acceptable sensitivity and specificity in Goa, India 25 .
In concurrence with this evidence, we aimed to assess the exact EPDS score cut-off value (11, 12 or 13) as a better predictor of association between antenatal depression and SGA.
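The dichotomisation at the three cut-offs can be illustrated with a minimal sketch. It assumes the standard EPDS scoring (10 items, each scored 0 to 3, for a total of 0 to 30) and treats a score strictly above a cut-off as indicating depressive symptoms, following the "EPDS score >11" convention used in this paper. The function name and constants are hypothetical and are not part of the study software.

```python
from typing import Dict

# Cut-offs examined in this paper (11, 12 and 13).
EPDS_CUTOFFS = (11, 12, 13)
# Assumption: standard EPDS scoring, 10 items scored 0-3 each.
EPDS_MAX_SCORE = 30


def classify_epds(total_score: int) -> Dict[int, bool]:
    """Return, for each cut-off, whether the score indicates depressive symptoms.

    Assumption: a score strictly greater than the cut-off counts as
    "with depressive symptoms" (as in "EPDS score >11").
    """
    if not 0 <= total_score <= EPDS_MAX_SCORE:
        raise ValueError(f"EPDS total must be between 0 and {EPDS_MAX_SCORE}")
    return {cutoff: total_score > cutoff for cutoff in EPDS_CUTOFFS}
```

Under these assumptions, a woman with a total score of 12 would be flagged at the 11 cut-off but not at the 12 or 13 cut-offs.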
Other risk factors. Possible risk factors for SGA were assessed by a standardized questionnaire seeking information on women's medical and obstetric history (parity, abortion), socio-economic and demographic characteristics (age, education, and occupation), smoking habits and alcohol consumption. The research staff measured women's height, weight, MUAC. Skinfold thickness was measured using Holtain callipers at biceps, triceps and subscapular sites.
Anthropometry. Adult anthropometry: After ensuring that the scale was placed on level ground, the research staff would view 'zero' reading. After ensuring that the respondent would remove heavy outer clothing and shoes, two readings to the nearest 10 gram were taken. Further, we used SECA 213 portable stadiometer for measuring the height to the nearest 0.1 cm. This was measured by requesting the respondent to stand straight with her feet together, ensuring the posterior surface of the head and heels was applied to the stadiometer. The head was positioned in an imaginary line joining the upper margin of the external auditory meatus and the lower border of the orbit of the eye (Frankfurt plane). The head plate of the stadiometer would then be pulled down to ensure that it rests on the crown of the head 26 .
Baby anthropometry: Newborn anthropometry was performed using SECA 354 Weighing Scale and SECA 417 Infantometer. The baby was placed naked on the digital weighing scale, and readings are taken to the nearest 0.5g. For measuring infant length, the baby's head is held against the end of the head plate and the legs extended until they are flat. The footplate is brought up to the heels ensuring that feet and knees were flat, the length is recorded. Chasmors body circumference tape was used to measure the circumferences. Head circumference is measured with the baby's head on the side so that the maximum occipitofrontal circumference could be found. The tape was placed on the forehead, on the most anterior point (just above the eyebrows) and passed around the head to the most posterior part of the head making sure the maximum circumference is found. Waist circumference was taken by placing the tape around the abdomen immediately above the umbilicus, ensuring that it is horizontal and marked at the end of expiration. Chest circumference is measured by placing the tape around the chest at the level of xiphisternum, ensuring that it is placed horizontal and marked at the end of expiration. MUAC was recorded with the arm bent, allowing the measurement to be taken with the baby in its natural position. Skinfold thickness is measured on the left side of the body using the Holtain Calipers. Three readings to the nearest 0.2mm were taken unless this caused too much distress, in which case, a single measurement was taken. For triceps skinfold thickness, the tape is placed around the upper arm at the level of the mark done while measuring MUAC. With the tape in position, a horizontal line is drawn on the skin posteriorly at the level of the mark. Another vertical line is marked on this line at the most dorsal part of the upper arm. This level was determined by 'eyeballing' the mid-point. 
The point at which the fold is to be measured was then marked; the skin was lifted over the posterior surface of the triceps muscle, above the marked point, on a vertical line passing upward from the olecranon to the acromion. The callipers are applied below the fingers such that the marked cross was at the apex of the fold. Biceps skinfold is measured in the anterior midline of the arm over the biceps on the same level as the triceps skinfold. For subscapular skinfold thickness, the inferior angle of the scapula was identified, and the skin is marked immediately below the angle. The skinfold was picked up above the mark with the fold slightly inclined downward and laterally, in the natural cleavage of the skin. The calliper jaws are applied below the fingers, such that the marked point is at the apex of the fold 26 .
The weight of the infant was classified into percentiles based on the Indian standards for birth weights of newborns, according to the sex and order of the baby 27 . Birth weight below the 10th percentile was classified as SGA, between the 10th and 90th percentile as appropriate for gestational age (AGA), and above the 90th percentile as large for gestational age (LGA). Babies born before 37 weeks of gestation were considered premature. Other details of neonatal morbidity and hospitalization were obtained from the family members and medical records.
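The classification rule above can be written as a short sketch. It assumes the birth weight percentile has already been looked up in the reference standard for the baby's sex, order and gestational age; that lookup is not reproduced here, and the function names are hypothetical.

```python
def classify_size_for_gestational_age(percentile: float) -> str:
    """Map a birth weight percentile to SGA, AGA or LGA.

    Thresholds follow the text: below the 10th percentile is SGA,
    the 10th to 90th percentile is AGA, above the 90th is LGA.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile < 10:
        return "SGA"
    if percentile <= 90:
        return "AGA"
    return "LGA"


def is_premature(gestational_age_weeks: float) -> bool:
    """Babies born before 37 weeks of gestation are considered premature."""
    return gestational_age_weeks < 37
```

Note that the two classifications are independent: a baby can be both premature and SGA.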

Statistical analysis
We used logistic regression analysis to assess the association between SGA and EPDS score. The association with SGA was examined taking the EPDS score both as a continuous and as a categorical predictor. Three categorical variables were formed based on the cut-off scores of 11, 12 and 13. The models were adjusted for known confounders identified from a literature review: maternal age, religion, respondent's and husband's incomes, gravida, parity, husband's current consumption of tobacco and alcohol, and respondent's sum of skinfold thickness. These variables were selected based on a priori information 28-33 . The goodness of fit of the models was assessed using the Hosmer-Lemeshow statistic. Statistical analysis was performed using Stata/IC 14.2 for Mac (Revision 19 Dec 2017, Copyright 1985-2015 StataCorp LLC) and SPSS version 23. Descriptive analysis of maternal and neonatal characteristics was done for women with and without depressive symptoms.
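For readers less familiar with how a logistic regression coefficient relates to an odds ratio, the sketch below converts a coefficient and its standard error into an odds ratio with a Wald-type 95% confidence interval. It is an illustration only, not the Stata or SPSS code used in the study; the Wald interval, the function names and the back-calculated standard error are assumptions made for this example.

```python
import math
from typing import Tuple

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta: float, se: float, z: float = Z_95) -> Tuple[float, float, float]:
    """Convert a log odds ratio (coefficient) and its standard error
    into an odds ratio with a Wald confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Recover the standard error of a log odds ratio from a reported interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Illustration with the adjusted estimate reported for EPDS >11
# (OR = 2.03; 95% CI = 1.12-3.70). Published values are rounded,
# so the round trip reproduces the interval only approximately.
se = se_from_ci(1.12, 3.70)
odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(math.log(2.03), se)
```

The same conversion applies to each of the cut-off models; an interval that excludes 1 corresponds to a statistically significant association at the 5% level.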

Results
A total of 654 pregnant mothers who completed the EPDS questionnaire were taken into consideration for analysis in the present study. The mean maternal age of the study sample at baseline was 23.6 ± 3.9 years. Mothers with depressive symptoms had lower mean social support scores compared to mothers without depressive symptoms ( Table 1). The study found that overall, 16.51% (n=108) of the antenatal mothers had depressive symptoms (EPDS score of >11).
Among mothers with depressive symptoms (EPDS score >11), 43 (39.8%) were below the age of 22 years. Depressive symptoms predominantly affected young mothers, and the symptoms decreased with increasing age. The majority of the study sample comprised Muslim women, and they were the most afflicted with depressive symptoms (65.7%), followed by mothers belonging to the Hindu religion (32.4%). Pregnant women with high school education had a higher proportion of depressive symptoms (44.3%) than women with other levels of educational attainment. Depressive symptoms were most common among women in their first pregnancy (41.7%) and decreased with an increase in the number of times conceived and delivered.
The results indicate that 60% of husbands of pregnant women with depressive symptoms were consuming tobacco, and 21% were drinking alcohol (Table 1).
Women with depressive symptoms delivered a greater proportion of SGA babies (21.3% vs 15.8%) than women with no symptoms. While there were no major differences for normal term delivery, women with depressive symptoms had a slightly elevated proportion of caesarean section delivery (31.5% vs 24.2%) (Table 2).
Maternal and neonatal characteristics in relation to SGA and AGA status are summarized in Table 3.
No major variation was found in the mean and standard deviation of age, gravida, parity and abortion status of mothers between the SGA and AGA categories. A higher proportion of SGA was found in male babies compared to female babies. Mothers who delivered SGA babies had greater mean EPDS scores during pregnancy (6.27 vs 5.73%) and at the time of delivery (21.1 vs 14.5%) compared to the mothers who delivered AGA babies. Among the mothers who delivered SGA babies, a majority (68.8%) were younger (under 25 years), and the SGA proportion decreased with increasing age. Hindus had a higher proportion of SGA babies (49.5%), followed by Muslims (45.9%) and Christians (4.6%) (Table 3). Mothers whose partners were educated beyond high school level were less likely to deliver SGA babies than their counterparts.
Adjusted odds ratios (OR) and 95% confidence intervals (CI) for the association between SGA and the EPDS cut-offs of 11, 12 and 13 are presented in Table 4. The EPDS score as a continuous predictor did not show a statistically significant association with SGA. A significant association was found between the EPDS 11 cut-off and SGA. Women with EPDS scores above 11 had twice the odds of giving birth to an SGA child (adjusted OR = 2.03; 95% CI = 1.12–3.70) compared to women with EPDS scores of 11 and below. The EPDS 12 (adjusted OR = 1.96; 95% CI = 1.04–3.69) and EPDS 13 (adjusted OR = 2.42; 95% CI = 1.24–4.70) cut-off categories were also risk factors for SGA, with significant p-values (0.0006 and 0.0003), and women with EPDS scores above 13 were found to have the highest odds of SGA.
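As a rough point of comparison, a crude (unadjusted) odds ratio can be approximated from the SGA proportions reported above for women with and without depressive symptoms (21.3% vs 15.8%). This is a back-of-the-envelope calculation from rounded percentages rather than a result reported by the study, and because it ignores all covariates it need not match the adjusted OR of 2.03.

```latex
\[
\mathrm{OR}_{\text{crude}}
  = \frac{0.213 / (1 - 0.213)}{0.158 / (1 - 0.158)}
  \approx \frac{0.2706}{0.1876}
  \approx 1.44
\]
```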

Discussion
Using a longitudinal study, we found that a relationship may exist between symptoms of mental distress in pregnant women and SGA babies. Using a validated EPDS questionnaire appropriate for the Indian populace, we were able to capture scores from 654 expectant mothers during and post-pregnancy. We also found that the prevalence of depressive symptoms was relatively high (16.5%; n=108/654). This was higher than in our previous study using the Kessler-10 scale (prevalence of 8.7%) across Bangalore 34 , and is comparable to other Asian countries (20%) and LMICs (15.6%) 35,36 .
Further, more salient findings from our analysis showed that pregnant women with depressive symptoms in the second trimester exhibited an increased likelihood of giving birth to SGA infants when assessed using a cut-off value of 11 or above of the EPDS. This association was observed after adjusting for possible confounders: maternal age, religion, consanguineous marriage, respondent's and husband's education, occupation, and income, gravida, parity, anaemia, husband's current tobacco and alcohol consumption, and respondent's sum of skinfold thickness. A significant association between scores of 11 or above and SGA was noted (p≤0.005) and was further corroborated by the OR values, while lower EPDS scores were not significantly associated. We believe that mental health problems faced by pregnant women may not be simply and completely measured by the EPDS alone, as the perception of stressors may vary and there may be varying levels of buffer mechanisms 37,38 . Thus it is essential to explore these findings further in terms of perception, coping, and interpersonal attitudes 33,39,40 . Earlier studies have shown maternal nutrition to be an important predictor of LBW 48 . In our study, after adjusting for anaemia, . Further evidence that delineates the causative pathways leading to LBW and their interactions will provide a unique, compelling opportunity to inform the development of specific preventive interventions for childhood malnutrition. Since LBW is multifactorial in origin and can lead to childhood obesity and its complications, our results indicate the psychosocial environment as a potential, contextually relevant risk factor for LBW.
There is a need to establish the causal association, after which policymakers can prioritize screening pregnant women for mental health problems. Governments can modify and/or incorporate mental health screening within the existing provisions of the national health mission.
In summary, we were successful in using a simple screening method at the primary care level for screening depression in the antenatal population. Healthcare workers at primary health care levels can thus efficiently screen pregnant women for depression and refer those in need of further care.
There are three potential explanations for the association of antenatal depression and SGA. First, antenatal depression might result in dysregulation of the hypothalamic-pituitary-adrenocortical axis, thereby releasing stress hormones. For example, cortisol levels might mediate this association 49 , possibly resulting in decreased blood flow to the placenta and consequent restriction of oxygen and nutrients to the fetus, leading to intrauterine growth retardation 50-54 . To explore this possibility further, prospective study of mediation by cortisol and catecholamines is necessary. Second, it is possible that antenatal depression interacts with other maternal antecedents, such as maternal undernutrition, poor access to healthcare facilities, smoking, alcohol and substance abuse, which are known independent risk factors of LBW 55 . Such an association may be generally seen in women of disadvantaged social groups. Therefore poverty might confound the association between mental health and LBW. Although we have adjusted for income, residual confounding might still distort the association. Third, pre-conception depression and mental health status have also been shown to be associated with low birth weight 56 .

Strengths and limitations
Our study has several strengths. First, it is a birth cohort with real-time data quality monitoring. Second, our prospective examination of antenatal depression and SGA was carried out in a sufficiently large study sample. Third, we were able to adjust for several potential confounders. Fourth, we have demonstrated the usefulness of the 10-item EPDS in screening for antenatal depression, which can be used even at the primary care level. There are also a few limitations. First, since our study, like all other observational studies, is not immune to sources of systematic error, we do not draw any causal inference regarding the association between EPDS and SGA. Second, we did not record pre-pregnancy BMI. Third, we did not assess violence, which is a considerable risk factor. Finally, we did not evaluate anxiety as part of the screening, which might be a limitation given that anxiety and depression are known to be co-morbid 57,58 .

Conclusion
Our findings indicate that maternal distress due to depression can lead to the birth of SGA babies. There is a need to universally screen women for depression during pregnancy.
The causal links and mediation by other factors have to be delineated before policymakers can consider prioritizing screening and care for mental health, especially in the women belonging to vulnerable or lower socioeconomic backgrounds.

Ethics and consent
The study was reviewed and approved by the institutional ethical review board at the Bangalore campus of IIPH-H (
"First, the authors could better consider the association between EPDS as a continuous score, including the potential for non-linearity as they suggest may occur and present this as a main not supplemental finding (as it will also address their secondary aim), so that we get a sense of the distribution of SGA across the continuum and whether there is any indication of "dose-response" or whether those thresholds are not as strongly associated. A cross-tab by EPDS score, would be very useful descriptive information."
"Second, while it is heartening that authors specified a priori covariates, the high likelihood of residual confounding should direct them to consider other additional measures at their disposal in this cohort study including maternal BMI, blood pressure, gestational diabetes or dysglycemia (which, given its etiology, will precede their collection of EPDS in mid/late pregnancy), and maternal medical history, especially, if available, medication." I appreciate the addition of some measure of maternal adiposity (skinfold thickness) as a covariate, however I'm not sure why BMI could not be used/included as the authors spend an entire paragraph describing the protocol for "Adult Anthropometry" which includes standing height and weight. I am also still puzzled, given the context (MAASTHI) is an adiposity and metabolism cohort, why other pregnancy conditions and medical history were not included in models. I should note here that the methods now report different covariates than the discussion and Table 4 caption, so they need to be harmonized.
Also, it is not clear why concern with "multi-collinearity" would require removal of all measures of socioeconomic status from the model. Surely, at minimum, maternal education and household income could be retained? There is often important and critical variation between these factors relevant to maternal depressive symptoms (e.g. from mismatch between education and income). From Table 1, it is interesting to note, for example, that there are higher proportions of both middle school and graduate+ educated mothers among EPDS > 11 (vs <11), suggestive of a mixed SES group.
"Third, they had hypothesized that associations may differ based on socioeconomic status, yet they did not evaluate any interactions by such factors. If any were specified a priori, they would be interesting to present (even if "not significant")." The authors have not yet addressed potential interaction with SES despite it being alluded to in the Introduction as an important source of variation and therefore contribution to the literature regarding the local context. For example, consistent or different associations across SES groups would provide some insight as to why results from Bangalore are different from other contexts.
"Fourth, the cohort has commendably extensive measures of anthropometrics. It is surprising, given findings they have cited (e.g. Broekman et al. (2014)) that they did not investigate associations with parameters other than SGA." In addition, in light of all the effort put into collecting the other anthropometric measures and reporting of measurement methodology in the manuscript, I'm also not clear why associations between EPDS and those other anthropometric measures were not reported given, as above, past findings with respect to head circumference. These results should be provided as well to avoid the impression of selective reporting. In contrast, Table 3 which reports anthropometry (and other measures) stratified by SGA/AGA is not as informative to your main hypothesis.
"The flow chart is missing some information…Since mothers were recruited in 2nd and 3rd trimester, it is surprising that so large a fraction would be lost to follow-up. Since depression and fetal health may both be related to loss-to-follow-up there is a possibility that any associations found in your final sample may be subject to selection bias. At the very least, baseline characteristics of recruited participants should be reported, stratified by whether they are retained in your final sample." The authors still need to address why there is such a striking loss (794/1557 = 51% loss) from late pregnancy depression assessment (N = 1557) to live births (N = 763) when only 9 are accounted for from either twin or still birth. As noted before both maternal symptoms and fetal health may be related to loss-to-follow-up, so it is important to note why so few were followed up at birth. At minimum, it should be stated, e.g. how many subjects could not be traced or refused to participate vs. how many had pregnancy loss. Adding baseline data for those lost to follow-up in a supplement would also be helpful.
"In Table 4, it is not entirely clear which tests the two columns of p-values refer to. Assuming "EPDS score in the model" refer to the p-value for the specific coefficient and "model" refers to the p-value for the goodness-of-fit test including all covariate, this should be clarified in the caption text. Undue precision in the estimates is also discouraged. For example, 2 decimal places for odds ratios and 3 for p-values are probably sufficient."
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

This manuscript explores the association of maternal depression during pregnancy with infant birth size. The primary outcome was small for gestational age (SGA) defined based on population based birth weight percentiles. Notably the investigators also collected a broad range of infant anthropometric measures. A secondary aim of the study was to determine the utility of depression symptoms as measured by the Edinburgh Postnatal Depression Scale (EPDS) for predicting SGA. As noted by the authors, despite the high prevalence of both SGA and perinatal depression in low and middle income countries, the majority of the research on the associations of these outcomes has been in high income, Western countries (though there are notable exceptions). The authors further note that given the growing concern regarding diseases such as diabetes and cardiovascular disease in low and middle income countries and the potential links with low birth weight via the "thrifty phenotype" hypothesis, there are important long term population health implications for failing to understand and address the role of perinatal depression in fetal growth restriction. The cohort described in this analysis provides a potentially powerful data set to begin to address these questions.

Jodie G. Katon
Given the number of infant anthropometric measures and questions regarding the cut-off of the EPDS score raised by other reviewers I suggest the following:
1. Focus the manuscript on associations of EPDS and infant anthropometry (beyond just SGA).
2. Remove the analysis on the predictive capabilities of EPDS for SGA, as this could be further developed into a secondary manuscript.
3. Consider not just dichotomizing EPDS but also looking at it as a continuous predictor as well as a non-linear predictor.
4. Large for gestational age (LGA) is only mentioned briefly in the methods and results, but I find it interesting that LGA is also higher among those with symptoms of depression. Focusing the manuscript more on the associations of EPDS with infant anthropometry might enable further discussion of this finding. If you don't explore further I would consider removing it, as it is otherwise distracting and difficult to reconcile with the primary findings.

Minor points regarding organization and data presentation:
1. Tables include means and categories for continuous measures (e.g. age); either choose one (I would recommend categories) or, if both are retained, group them together in the tables (e.g. mean age followed by the age categories).
2. Move the Total column to be the first column in Table 1. This will enable the authors to first describe overall cohort characteristics and then to focus on descriptive comparisons based on EPDS score.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author Response 30 Jan 2020
Giridhara R Babu, Indian Institute of Public Health - Bangalore, Bengaluru, India
Focus the manuscript on associations of EPDS and infant anthropometry (beyond just SGA).
Response: Many thanks for the suggestion; however, at this point we would like to share that we have decided to present the associations of EPDS and infant anthropometry as a separate paper. Hence, adding it now is beyond the scope of this paper.
Remove the analysis on the predictive capabilities of EPDS for SGA, as this could be further developed into a secondary manuscript.
Response: Many thanks for this important suggestion as well. We have made sure to remove all the analysis on the predictive capabilities of EPDS for SGA and, as suggested, have planned another paper based on the same.
Consider not just dichotomizing EPDS but consider looking at it as a continuous predictor as well as a non-linear predictor.
Response: Thank you for the suggestion. We have included this in Table 4.
Large for gestational age (LGA) is only mentioned briefly in the methods and results, but I find it interesting that LGA is also higher among those with symptoms of depression. Focusing the manuscript more on the associations EPDS with infant anthropometry might enable further discussion of this finding. If you don't explore further I would consider removing as it is otherwise distracting and difficult to reconcile with the primary findings.
Response: Many thanks for this important suggestion. We duly accept the suggestion and have made sure to remove all the analysis of LGA.
Minor points regarding organization and data presentation: Tables include means and categories for continuous measures (e.g. age), either chose one (I would recommend categories) or if both are retained group them together in the tables (e.g. mean age followed by the age categories).
Response: Thank you for the suggestion. Changes are made accordingly.
Move the Total column to be the first column in Table 1. This will enable the authors to first describe overall cohort characteristics and then to focus on descriptive comparisons based on EPDS score.
This study seeks to demonstrate associations between reported maternal depression symptoms and risk for small-for-gestational-age (SGA) birth. The motivation is that SGA is related to substantial subsequent morbidity, the role of maternal depression in SGA is underexplored in the Indian context, and implicitly, this relationship in India may differ from more commonly studied countries. The authors should be commended for the strengths of their study particularly the live ascertainment of EPDS and the extensive anthropometry collected in neonates.
In general, this study supports a large, if inconsistent, body of past findings and meta-analyses (Szegda et al. (2013), Eastwood et al. (2017)) that support an association between depressive symptoms as measured by the EPDS and adverse birth outcomes such as LBW, PTB, or in this case, SGA. On the other hand, the study does not move towards filling the noted gaps in current understanding of the causal nature of maternal depressive symptoms, including 1) whether associations merely reflect the importance of pre-conception depressive symptoms, 2) whether key covariates such as medication usage and medical history (Zhao et al. (2018)) can be accounted for, and 3) as mentioned by other reviewers, whether associations remain across subclinical scores. Nor does it address some of the local needs they refer to in their introduction.
Within the constraints of their current design, the authors could potentially improve upon the literature and their analyses in a few key ways: First, the authors could better consider the association between EPDS as a continuous score, including the potential for non-linearity as they suggest may occur and present this as a main not supplemental finding (as it will also address their secondary aim).
Second, while it is heartening that authors specified a priori covariates, the high likelihood of residual confounding should direct them to consider other additional measures at their disposal in this cohort study including maternal BMI, blood pressure, gestational diabetes or dysglycemia (which, given its etiology, will precede their collection of EPDS in mid/late pregnancy), and maternal medical history, especially, if available, medication.
Third, they had hypothesized that associations may differ based on socioeconomic status, yet they did not evaluate any interactions by such factors. If any were specified a priori, they would be interesting to present (even if "not significant").
Fourth, the cohort has commendably extensive measures of anthropometrics. It is surprising, given findings they have cited (e.g. Broekman et al. (2014)) that they did not investigate associations with parameters other than SGA.
Finally, given the possibility that depressive symptoms may reflect pre-conception depression and the likelihood of residual confounding, discussion of "effects" or benefit of intervention during pregnancy should be minimized. The authors should be commended for mentioning this in their conclusions, however the extent of residual confounding and selection bias (see "major point" below) is not given sufficient weight in their "three potential explanations" paragraph.
A secondary aim appeared to be exploring whether a threshold for EPDS could be used for screening on the basis of SGA risk. Risk prediction models may be valid even with absent causal interpretations, however, the development of screening tools requires more inputs than simply the performance of a particular set of predictors, in particular giving due consideration for the contexts for decision making. Notably, an AUROC of 0.76 alone is not sufficient information to determine if the EPDS is useful as a diagnostic or screening tool for SGA. Other characteristics such as positive predictive value ("precision") and accuracy need to be presented, taking into account decision support considerations such as the cost/penalty of false positives, given for example, the 15% false positive fraction for depression at EPDS >= 13 the authors cited. Again, EPDS scores need not be dichotomized as a predictor of SGA and may be assessed continuously and non-linearly. Notably, given three dichotomous cutpoints and looking only at goodness-of-fit and AUROC, the authors have not well investigated whether "peak adversities of SGA" occur at a score of 11. A continuous measure would be more informative in this regard. Finally, one key missing piece of information from the current analyses is the degree to which addition of EPDS improves the classification of SGA over models without the score.
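To illustrate why a discrimination measure alone does not settle screening usefulness, the following minimal sketch computes positive predictive value from sensitivity, specificity and outcome prevalence using Bayes' rule. All inputs are hypothetical: the specificity of 0.85 merely echoes the 15% false positive fraction mentioned above (which was cited for depression, not for SGA), and the sensitivity and prevalence values are placeholders, not estimates from the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(outcome present | screen positive), by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical inputs, not estimates from the study.
SENSITIVITY = 0.70
SPECIFICITY = 0.85

for prevalence in (0.05, 0.15, 0.30):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}")
```

With test characteristics held fixed, the share of screen-positive women who actually have the outcome changes substantially with prevalence, which is why the context of use matters alongside the AUROC.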
Major points: The flow chart is missing some information. The count and percentage of live births do not seem to line up: 763 is 49% of 1557 recruited. After excluding the 5 twins and 4 still births, there are still 785 (50%) not accounted for. It is important to know the disposition of this group, what % withdrew, lost contact, miscarried, etc? Since mothers were recruited in 2nd and 3rd trimester, it is surprising that so large a fraction would be lost to follow-up. Since depression and fetal health may both be related to loss-to-follow-up there is a possibility that any associations found in your final sample may be subject to selection bias. At the very least, baseline characteristics of recruited participants should be reported, stratified by whether they are retained in your final sample. More rigorously, they could see the sensitivity of their findings using regressions that are weighted for an inverse probability of selection into the final sample. Such a weight could be created by using logistic regression with the outcome being an indicator of censorship (1 if lost to follow-up, 0 if observed) predicted by all available covariates. While this will not fully correct for selection based on unobserved characteristics, it can help demonstrate the direction of the bias.
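The inverse-probability weighting idea described above can be sketched on toy data. This is a simplified illustration, not the analysis the reviewer specifies: instead of a multi-covariate logistic regression it uses a single binary baseline covariate, for which a saturated logistic model would reproduce the stratum-specific retention proportions used here. The records and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical toy records: (depressed_at_baseline, retained_in_final_sample)
records = [
    (1, 1), (1, 0), (1, 0), (1, 0),
    (0, 1), (0, 1), (0, 1), (0, 0),
]

def retention_weights(rows):
    """Return {stratum: 1 / P(retained | stratum)} from (stratum, retained) pairs."""
    total = Counter(stratum for stratum, _ in rows)
    kept = Counter(stratum for stratum, retained in rows if retained == 1)
    # Strata with no retained participants cannot be weighted and are skipped.
    return {s: total[s] / kept[s] for s in total if kept[s] > 0}

weights = retention_weights(records)

# Each retained participant stands in for `weight` recruited participants,
# so the weighted retained sample recovers the size of the recruited cohort.
weighted_n = sum(weights[s] for s, retained in records if retained == 1)
print(weights, weighted_n)
```

In an applied analysis the weights would then be passed to the outcome regression, and the weighted and unweighted estimates compared to gauge the likely direction of selection bias.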
Minor points: There is not a strong need to report both mean and counts for the same variables (e.g. age, gravidity and parity) in the model. For gravidity and parity, cell counts can provide more information if they are exhaustive. In contrast, for age, the wide categories including truncation at 22 years are somewhat arbitrary when the mean (or likely median) suggest, for example, there is little difference in age distributions by EPDS threshold.
In the Results, if the goal is to describe differences in characteristics between mothers below/above the EPDS threshold (i.e. in line with Table 1), authors should provide both sets of results so readers can make the explicit comparison. For example, if authors wish to report that "60% of husbands [of women with EPDS > 11] were consuming tobacco" it would be helpful to present that only 42% of husbands of EPDS <= 11 women were. Why weren't abortions reported in Table 1?
In Table 4, it is not entirely clear which tests the two columns of p-values refer to. Assuming "EPDS score in the model" refers to the p-value for the specific coefficient and "model" refers to the p-value for the goodness-of-fit test including all covariates, this should be clarified in the caption text. Undue precision in the estimates is also discouraged. For example, 2 decimal places for odds ratios and 3 for p-values are probably sufficient.
If available, consideration of physical activity and plasma vitamin D may be beneficial.

If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? No

Are the conclusions drawn adequately supported by the results? No
No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.
Author Response 30 Jan 2020
Giridhara R Babu, Indian Institute of Public Health - Bangalore, Bengaluru, India
First, the authors could better consider the association between EPDS as a continuous score, including the potential for non-linearity as they suggest may occur and present this as a main not supplemental finding (as it will also address their secondary aim).
Response: Thank you for the suggestion. We have included the results of the association between EPDS as a continuous score and SGA (Table 4).
Second, while it is heartening that authors specified a priori covariates, the high likelihood of residual confounding should direct them to consider other additional measures at their disposal in this cohort study including maternal BMI, blood pressure, gestational diabetes or dysglycemia (which, given its etiology, will precede their collection of EPDS in mid/late pregnancy), and maternal medical history, especially, if available, medication.
Response: We do not have maternal pre-pregnancy BMI in this study but have adjusted for maternal skinfold thickness, which is a better marker for adiposity during pregnancy considering our cohort participants were recruited into the study at different stages of gestation (14 to 36 weeks).
Third, they had hypothesized that associations may differ based on socioeconomic status, yet they did not evaluate any interactions by such factors. If any were specified a priori, they would be interesting to present (even if "not significant").
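Response: We have considered the respondent's as well as the husband's income in the model. Since income is a criterion for socioeconomic status, we excluded socioeconomic status from the model in order to avoid the problem of multi-collinearity.
Fourth, the cohort has commendably extensive measures of anthropometrics. It is surprising, given findings they have cited (e.g. Broekman et al. (2014)) that they did not investigate associations with parameters other than SGA.
Response: Thank you for mentioning it. Since the aim of this study is to examine the relation between Edinburgh Postnatal Depression Scale (EPDS) score and SGA, we focused on SGA as the outcome.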
Finally, given the possibility that depressive symptoms may reflect pre-conception depression and the likelihood of residual confounding, discussion of "effects" or benefit of intervention during pregnancy should be minimized. The authors should be commended for mentioning this in their conclusions, however the extent of residual confounding and selection bias (see "major point" below) is not given sufficient weight in their "three potential explanations" paragraph.
Response: Thank you, we have now included this in the manuscript: "Thirdly, pre-conception depression and mental health status have also been showed to be associated with low birth weight"
A secondary aim appeared to be exploring whether a threshold for EPDS could be used for screening on the basis of SGA risk. Risk prediction models may be valid even with absent causal interpretations, however, the development of screening tools requires more inputs than simply the performance of a particular set of predictors, in particular giving due consideration for the contexts for decision making. Notably, an AUROC of 0.76 alone is not sufficient information to determine if the EPDS is useful as a diagnostic or screening tool for SGA. Other characteristics such as positive predictive value ("precision") and accuracy need to be presented, taking into account decision support considerations such as the cost/penalty of false positives, given for example, the 15% false positive fraction for depression at EPDS >= 13 the authors cited. Again, EPDS scores need not be dichotomized as a predictor of SGA and may be assessed continuously and non-linearly. Notably, given three dichotomous cut points and looking only at goodness-of-fit and AUROC, the authors have not well investigated whether "peak adversities of SGA" occur at a score of 11. A continuous measure would be more informative in this regard. Finally, one key missing piece of information from the current analyses is the degree to which addition of EPDS improves the classification of SGA over models without the score.
Response: Thank you for the suggestions. We have removed the analysis on the predictive capabilities of EPDS for SGA. We have included the analysis results using EPDS score as a continuous predictor.
Major points: The flow chart is missing some information. The count and percentage of live births do not seem to line up: 763 is 49% of 1557 recruited. After excluding the 5 twins and 4 still births, there are still 785 (50%) not accounted for. It is important to know the disposition of this group, what % withdrew, lost contact, miscarried, etc? Since mothers were recruited in 2nd and 3rd trimester, it is surprising that so large a fraction would be lost to follow-up. Since depression and fetal health may both be related to loss-to-follow-up there is a possibility that any associations found in your final sample may be subject to selection bias. At the very least, baseline characteristics of recruited participants should be reported, stratified by whether they are retained in your final sample.
Response: We have considered women assessed for depressive symptoms as the sample population for the study. This excludes twins and still births. Also, since the cohort is ongoing, not all the women had delivered when the current study was conducted. Only 50% of the mothers recruited into the study had delivered at the time of the current study. We have considered the women who had delivered so far and whose follow-up had been completed on time. For the analysis, we have considered 654 respondents whose EPDS score had been assessed during pregnancy, who had delivered, and whose follow-up had been completed on time. We have updated the figure.
More rigorously, they could see the sensitivity of their findings using regressions that are weighted for an inverse probability of selection into the final sample. Such a weight could be created by using logistic regression with the outcome being an indicator of censorship (1 if lost to follow-up, 0 if observed) predicted by all available covariates. While this will not fully correct for selection based on unobserved characteristics, it can help demonstrate the direction of the bias.
Response: We have removed the findings.
Minor points: There is not a strong need to report both mean and counts for the same variables (e.g. age, gravidity and parity) in the model. For gravidity and parity, cell counts can provide more information if they are exhaustive. In contrast, for age, the wide categories including truncation at 22 years are somewhat arbitrary when the mean (or likely median) suggest, for example, there is little difference in age distributions by EPDS threshold.
Response: Thank you, we have now modified the table and retained the cell count for gravidity and parity and have mentioned mean age.
In the Results, if the goal is to describe differences in characteristics between mothers below/above the EPDS threshold (i.e. in line with Table 1), authors should provide both sets of results so readers can make the explicit comparison. For example, if authors wish to report that "60% of husbands [of women with EPDS > 11] were consuming tobacco" it would be helpful to present that only 42% of husbands of EPDS <= 11 women were. Why weren't abortions reported in Table 1?
Response: Since gravida and parity are included, we excluded still birth and abortion to keep the information concise.
In Table 4, it is not entirely clear which tests the two columns of p-values refer to. Assuming "EPDS score in the model" refers to the p-value for the specific coefficient and "model" refers to the p-value for the goodness-of-fit test including all covariates, this should be clarified in the caption text. Undue precision in the estimates is also discouraged. For example, 2 decimal places for odds ratios and 3 for p-values are probably sufficient.
Response: Thank you for your suggestion. Changes are made accordingly.
If available, consideration of physical activity and plasma vitamin D may be beneficial.
Response: Thank you for the suggestion; however, we have not yet assessed the vitamin D levels in the stored samples. We will assess these in a future manuscript.

Howard Cabral
Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
This paper examines the association of maternal depressive symptoms during pregnancy and small for gestational age delivery in a birth cohort in India from April 2016 to October 2017. The paper is generally well written and the tables and figures are well done and informative.
There are several points raised in the prior review that have not been addressed in the revised text.
Among the important confounding variables not included in the analysis would indeed be exposure to violence, a factor that is often not included in similar studies, though it clearly should be if available given that depressive symptomatology is the primary independent variable here. Checking the effects of applying different cutoffs to the Edinburgh (EPDS) score is helpful from a clinical standpoint, though the intent of developing a score is to be able to identify risk that is subclinical. Hence, analyses that use the EPDS score as continuous would also be informative. Women with scores less than a cutoff are indeed not "without mental depressive symptoms". The authors note that they have performed analyses using the continuous score but this is not apparent in the Methods or Results but in a Supplemental file. If this is the accepted approach of the publishing platform, this is fine but a link to this information of results should be included in the main text also.
The authors state that additional statistical analyses checked for effect modification (interaction) with depressive symptoms for salient variables on intrauterine growth. The methods and results of these models are not shown in the main text. Are these included in the Supplemental File also? If so, the recommendation above applies here also. If the interactions were found to be statistically and clinically significant, then showing the main effects only model as the primary set of results is inappropriate.
As noted above with respect to exposure to violence, very important confounders are not included in the statistical models that could alter the estimation of the effect of depressive symptoms on intrauterine growth. These would include maternal pre-pregnancy weight or BMI, as well as maternal health habits that have been shown to have associations with depressive symptoms, including maternal substance use of various kinds and the quality of prenatal care. A list of the most important confounders that were not examined in this study should be included in the limitations.
The English grammar in the text should be thoroughly re-checked.

If applicable, is the statistical analysis and its interpretation appropriate? No
Are all the source data underlying the results available to ensure full reproducibility? No

Are the conclusions drawn adequately supported by the results? No
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Biostatistics, statistical modeling, maternal and child health.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author Response 30 Jan 2020
Giridhara R Babu, Indian Institute of Public Health - Bangalore, Bengaluru, India
Among the important confounding variables not included in the analysis would indeed be exposure to violence, a factor that is often not included in similar studies, though it clearly should be if available given that depressive symptomatology is the primary independent variable here. Checking the effects of applying different cutoffs to the Edinburgh (EPDS) score is helpful from a clinical standpoint, though the intent of developing a score is to be able to identify risk that is subclinical. Hence, analyses that use the EPDS score as continuous would also be informative. Women with scores less than a cutoff are indeed not "without mental depressive symptoms". The authors note that they have performed analyses using the continuous score but this is not apparent in the Methods or Results but in a Supplemental file. If this is the accepted approach of the publishing platform, this is fine but a link to this information of results should be included in the main text also.
Response: Thank you for the suggestion. We have included the analysis results using EPDS score as a continuous predictor.
The authors state that additional statistical analyses checked for effect modification (interaction) with depressive symptoms for salient variables on intrauterine growth. The methods and results of these models are not shown in the main text. Are these included in the Supplemental File also? If so, the recommendation above applies here also. If the interactions were found to be statistically and clinically significant, then showing the main effects only model as the primary set of results is inappropriate.
Response: The results of the interaction analysis are provided in the supplementary table. Interactions were not statistically significant.
As noted above with respect to exposure to violence, very important confounders are not included in the statistical models that could alter the estimation of the effect of depressive symptoms on intrauterine growth. These would include maternal pre-pregnancy weight or BMI, as well as maternal health habits that have been shown to have associations with depressive symptoms, including maternal substance use of various kinds and the quality of prenatal care. A list of the most important confounders that were not examined in this study should be included in the limitations.
Response: Unfortunately, we don't have data regarding exposure to violence and pre-pregnancy BMI; we have mentioned this in our manuscript as a limitation of the study. Instead of pre-pregnancy BMI, we included maternal adiposity measured through skinfold thickness as a confounder. Maternal substance use is very minimal (less than 1%) in the study sample; hence we have adjusted for the husband's current tobacco and alcohol consumption as an indicator of domestic violence.
The English grammar in the text should be thoroughly re-checked.
Response: We have thoroughly revised the grammar of the manuscript.
Competing Interests: No competing interests were disclosed.

Version 1
gestational age (SGA) birth in the MAASTHI birth cohort in India.
The stated study aim in the manuscript is to "replicate the association between antepartum depression and SGA in the setting of a public hospital in India", however the abstract conclusion seems to comment on the validity of using EPDS as a screening tool for antenatal depression. The study does not explicitly state the aim of examining the validity of EPDS as a screening tool. The abstract also reports values for the AUC using different cut-offs of EPDS for the diagnosis of antenatal depression. These values are only in relation to the SGA outcome examined in this study and do not compare EPDS to a 'gold standard' or another screening test for antenatal depression. Therefore, it is not accurate to comment on the "usefulness of using 10-item EPDS screening tool" in relation to other outcomes other than SGA, or for use as a screening tool in general. The manuscript needs to be clear about this, and if the authors would like to keep the 'prediction' element of EPDS in relation to SGA as an outcome, they need to be clear about this in the aims and methods.
Under the Methods section-Measurement, the authors state that they "aimed to assess the exact EPDS score cut-off value (11,12 or 13) as a better predictor of association between antenatal depression and SGA". Firstly, this statement needs to move to the aims section at the end of the Introduction section, and also needs to be clearly stated in the abstract. Secondly, this aim is not interchangeable with testing if EPDS is a valid screening tool for antenatal depression in the population the study is trying to generalise results to.
Under the Statistical Analysis section, it is not clear whether the association with SGA was examined using the continuous EPDS score or the 3 categorical variables based on the cut-off scores of 11, 12 and 13, or both.

Was maternal body mass index taken into account as a confounder?
Under the Results section, second paragraph: "among mothers with depressive symptoms…." using what EPDS cut-off? This applies to all the descriptive findings.
It is strange that the direction of effect is so different between using a cut-off of 11 versus 12 or 13 of the same scale (aOR 2.18 versus 0.46 and 0.41). Please check your categories and what you have assigned as a reference in your models.
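One way a reversed reference category would show up is as a reciprocal odds ratio, since swapping the exposed and reference groups inverts the estimate. The sketch below demonstrates this with hypothetical 2x2 counts and a hypothetical helper function; it does not establish what happened in the authors' models. It also prints the reciprocal of the aOR quoted for the cut-off of 11, purely as arithmetic.

```python
def odds_ratio(exposed_cases, exposed_noncases, ref_cases, ref_noncases):
    """Odds ratio of the outcome for the exposed group versus the reference group."""
    return (exposed_cases / exposed_noncases) / (ref_cases / ref_noncases)

# Hypothetical counts (SGA, not SGA) for women above and below an EPDS cut-off.
high = (30, 70)
low = (20, 100)

or_high_vs_low = odds_ratio(*high, *low)  # low scores as the reference
or_low_vs_high = odds_ratio(*low, *high)  # reference category flipped

print(round(or_high_vs_low, 2), round(or_low_vs_high, 2))
print(round(1 / 2.18, 2))  # reciprocal of the aOR quoted for cut-off 11
```

The two estimates multiply to 1, so an odds ratio above 1 under one coding necessarily falls below 1 under the other; whether this accounts for the values quoted above can only be confirmed from the authors' variable coding.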
Last paragraph of the results section, 'accuracy of EPDS scale' in relation to what? Are you saying that the strength of association with one outcome (SGA) is a measure of accuracy of the screening test? Please clarify. If you are trying to predict the outcome then that is a function of other factors accounted for in the prediction model (if it is adjusted), not just the EPDS cut-off.

Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? No
Are all the source data underlying the results available to ensure full reproducibility? No

Are the conclusions drawn adequately supported by the results? No
Competing Interests: No competing interests were disclosed.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.
Author Response 12 Feb 2019
Giridhara R Babu, Indian Institute of Public Health - Bangalore, Bengaluru, India
1. The stated study aim in the manuscript is to "replicate the association between antepartum depression and SGA in the setting of a public hospital in India", however the abstract conclusion seems to comment on the validity of using EPDS as a screening tool for antenatal depression. The study does not explicitly state the aim of examining the validity of EPDS as a screening tool. The abstract also reports values for the AUC using different cut-offs of EPDS for the diagnosis of antenatal depression. These values are only in relation to the SGA outcome examined in this study and do not compare EPDS to a 'gold standard' or another screening test for antenatal depression. Therefore, it is not accurate to comment on the "usefulness of using 10-item EPDS screening tool" in relation to other outcomes other than SGA, or for use as a screening tool in general.
Thank you for the comments. We have modified the abstract conclusion and result section as per the suggestion.

The manuscript needs to be clear about this, and if the authors would like to keep the 'prediction' element of EPDS in relation to SGA as an outcome, they need to be clear about this in the aims and methods.
We have used antenatal depression as the exposure and SGA as an outcome. We have mentioned it clearly in the aims and methods.
3. Under the Methods section (Measurement), the authors state that they "aimed to assess the exact EPDS score cut-off value (11,12 or 13) as a better predictor of association between antenatal depression and SGA". Firstly, this statement needs to move to the aims section at the end of the Introduction section, and also needs to be clearly stated in the abstract. Secondly, this aim is not interchangeable with testing if EPDS is a valid screening tool for antenatal depression in the population the study is trying to generalise results to.
We sincerely thank the reviewer for the comment. The aim of the study has now been modified as suggested. We agree with the reviewer that the aim is not interchangeable with testing whether EPDS is a valid screening tool for antenatal depression in the population. Clearly, we do not intend to do so. There is no external validity (generalization) without first meeting internal validity. Since our study is not immune to the sources of systematic error common to all observational studies, we are not making any causal inference regarding the association between EPDS and SGA. We have included this limitation in the revised manuscript.
4. Under the Statistical Analysis section, it is not clear whether the association with SGA was examined using the continuous EPDS score or the 3 categorical variables based on the cut-off scores of 11, 12 and 13, or both.
The table legends give the categorical classification of the EPDS score by the cut-offs 11, 12 and 13. Association with SGA was examined using the EPDS score as a categorical variable based on these cut-off values. We have updated the details in the Statistical Analysis section as well (Page 9, Line 6).

Was maternal body mass index taken into account as a confounder?
As we have no data on pre-pregnancy BMI, we have not considered the body mass index obtained during the different trimesters of pregnancy as a confounder, but we have taken the sum of skinfold thickness into account.

Under the Results section, second paragraph: "among mothers with depressive symptoms…." using what EPDS cut-off? This applies to all the descriptive findings.
Here, depressive symptoms are defined as an EPDS score >11, as mentioned in Table 1, and this applies to all descriptive findings. In the present study, the cut-off score of 13 showed the highest OR of the three categories; however, we have shown the descriptive statistics with the cut-off of 11, since it is the minimum value at which we obtained statistically significant results. We sincerely thank the reviewer for this input. Please note that there was a mistake in coding the variable (EPDS score cut-off 11, 12, 13). We recoded the entire data set and have thoroughly checked the entire analysis after redoing it. The resulting OR changes gradually from one cut-off category to the next. In our study, the use of the EPDS score without adjusting for confounders resulted in very low specificity in predicting SGA. The area under the ROC curve using the EPDS score alone in predicting SGA was 0.515. EPDS is a screening tool and hence may not fare well as a diagnostic test. However, after adjusting for confounders, the accuracy improved. Therefore, we meant that accuracy in predicting SGA using the EPDS scale improves after accounting for confounding variables. This section has been modified (Page 18, Line 1).

A number of points of concern, however, can be raised regarding this paper. Among these are points raised in a prior review by Dr. Desai, all of which are very pertinent. The inclusion of fetal loss deliveries would not be appropriate. If these were excluded the sample should be described as one comprised of livebirths only. Also, the inclusion of multiples would render as inappropriate analyses that assume independent observations. Not accounting for potential clustering by clinical site would additionally be inappropriate should such effects be observed (standard errors would likely be too small without such adjustment for site).
Among the important confounding variables not included in the analysis would indeed be exposure to violence, a factor that is often not included in similar studies, though it clearly should be if available given that depressive symptomatology is the primary independent variable here.
In terms of additional comments, the following can be listed: The data analyzed should be described as the "study sample" and not the "study population".
Checking the effects of applying different cutoffs to the Edinburgh (EPDS) score is helpful from a clinical standpoint, though the intent of developing a score is to be able to identify risk that is subclinical. Hence, analyses that use the EPDS score as continuous would also be informative. Women with scores less than a cutoff are indeed not "without mental depressive symptoms".
The statistical analyses did not include checks of effect modification (interaction) with depressive symptoms for salient variables on intrauterine growth. Such effects should be checked at a minimum to verify that the main effects only model is valid. Any effect modification identified would be useful in delineating the mechanism of how depressive symptoms affect intrauterine growth.
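One simple way to screen for the effect modification described in this comment is to compare the exposure odds ratio across strata of a candidate modifier. The sketch below uses Woolf's chi-square test for homogeneity of odds ratios, which is a swapped-in illustration and not the method the authors report using. The function names and all counts are hypothetical and are not study data; an interaction term in the logistic regression model would be the more usual approach when adjusting for other covariates.

```python
import math

def log_or_and_var(a, b, c, d):
    """Log odds ratio and its Woolf variance for one 2x2 table.

    a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases (all must be > 0).
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def woolf_homogeneity(tables):
    """Woolf chi-square for homogeneity of odds ratios across strata.

    Returns (chi-square, degrees of freedom, inverse-variance pooled OR).
    A large chi-square relative to its degrees of freedom suggests the
    odds ratio differs between strata, i.e. possible effect modification.
    """
    stats = [log_or_and_var(*t) for t in tables]
    weights = [1 / var for _, var in stats]
    pooled = sum(w * lor for w, (lor, _) in zip(weights, stats)) / sum(weights)
    chi2 = sum(w * (lor - pooled) ** 2 for w, (lor, _) in zip(weights, stats))
    return chi2, len(tables) - 1, math.exp(pooled)

# Hypothetical counts, NOT study data:
# (EPDS>11 & SGA, EPDS>11 & not SGA, EPDS<=11 & SGA, EPDS<=11 & not SGA)
strata = {
    "modifier level A": (12, 40, 40, 230),
    "modifier level B": (11, 45, 46, 230),
}
chi2, df, pooled_or = woolf_homogeneity(list(strata.values()))
print(f"Woolf chi-square = {chi2:.3f} on {df} df; pooled OR = {pooled_or:.2f}")
```

The statistic is referred to a chi-square distribution with (number of strata minus 1) degrees of freedom. The test has low power in small strata, so a non-significant result does not by itself establish that the main-effects-only model is adequate.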
Very important confounders are not included in the statistical models that could alter the estimation of the effect of depressive symptoms on intrauterine growth. These would include maternal pre-pregnancy weight or BMI, as well as maternal health habits that have been shown to have associations with depressive symptoms, including maternal substance use of various kinds and the quality of prenatal care.
The fit of the logistic regression models with respect to calibration should include the Hosmer-Lemeshow statistic and its associated degrees of freedom and p-value. A good fitting model should have both good calibration and discrimination.
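For readers unfamiliar with the calibration check requested here, the following is a minimal sketch of the Hosmer-Lemeshow statistic, assuming observed 0/1 outcomes and model-predicted probabilities are already available. The helper names are hypothetical, equal-size risk groups are one of several grouping conventions, and the closed-form p-value shown holds only for even degrees of freedom (such as the 8 df that result from 10 groups).

```python
import math

def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow chi-square and degrees of freedom.

    y: observed outcomes (0/1); p: predicted probabilities.
    Observations are sorted by predicted risk and split into `groups`
    roughly equal-size bins. Assumes len(y) >= groups and that no bin
    has a mean predicted probability of exactly 0 or 1.
    """
    pairs = sorted(zip(p, y), key=lambda t: t[0])
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(pi for pi, _ in chunk)   # expected events in the bin
        observed = sum(yi for _, yi in chunk)   # observed events in the bin
        chi2 += (observed - expected) ** 2 / (expected * (1 - expected / size))
    return chi2, groups - 2

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Hypothetical, perfectly calibrated toy data (NOT study data).
y_obs = [0, 1] * 10
p_hat = [0.5] * 20
stat, df = hosmer_lemeshow(y_obs, p_hat)
print(stat, df, chi2_sf_even_df(stat, df))
```

A small statistic (large p-value) indicates no evidence of miscalibration. Reporting the statistic together with its degrees of freedom and p-value, as the comment asks, lets readers judge this for themselves.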
The discrimination abilities of the models (c statistics or area under the ROC curve) are poor and barely above the null value of 0.5. The lack of additional confounding control also likely contributed to this under-fitting. In addition, there must be some recoding of the data that somehow has resulted in c statistics less than 0.5. The authors should carefully check this. There should not be values less than 0.5. Moreover, such a coding problem has likely resulted in the stark change in the direction of the odds ratios as shown in Table 4. There should not be such a drastic change from an odds ratio of 2.18 for the EPDS cutoff of 11 that indicates higher risk of SGA to one of 0.46 for a cutoff of 0.46. This kind of error markedly reduces the confidence of the reader in the overall analysis.
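The link between a c statistic below 0.5 and a coding error can be illustrated with a small sketch. It computes the area under the ROC curve as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case, and shows that reversing the outcome coding turns an AUC of A into 1 - A. The function name and the data are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random case (label 1) has a higher
    score than a random non-case (label 0); ties count as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical predicted probabilities and outcomes (NOT study data).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

correct = auc(scores, labels)
flipped = auc(scores, [1 - y for y in labels])  # outcome coded the wrong way round
print(correct, flipped, correct + flipped)
```

Because the two values always sum to 1, an AUC clearly below 0.5 is consistent with the outcome or exposure having been coded in reverse at some step, which is the kind of check the comment asks the authors to make.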

If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? No

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author Response 12 Feb 2019
Giridhara R Babu, Indian Institute of Public Health - Bangalore, Bengaluru, India
1. This paper examines the association of maternal depressive symptoms during pregnancy and small for gestational age delivery in a birth cohort in India from April 2016 to October 2017. The paper is generally well written and the tables and figures are well done and informative.
We sincerely thank the reviewer for the encouraging review with very constructive suggestions.

2. A number of points of concern, however, can be raised regarding this paper. Among these are points raised in a prior review by Dr. Desai, all of which are very pertinent. The inclusion of fetal loss deliveries would not be appropriate. If these were excluded the sample should be described as one comprised of livebirths only. Also, the inclusion of multiples would render as inappropriate analyses that assume independent observations. Not accounting for potential clustering by clinical site would additionally be inappropriate should such effects be observed (standard errors would likely be too small without such adjustment for site).
Thank you for the very useful comment. We have provided the responses for each point.
Twin deliveries and stillbirths were excluded from the study analysis. We have now mentioned this in the Methods (Page 6, Line 10). Women with multiple viable wombs are excluded from the study and analysis. We have conducted the study in only one hospital. Therefore, there is no possibility of errors induced by clustering.
3. Among the important confounding variables not included in the analysis would indeed be exposure to violence, a factor that is often not included in similar studies, though it clearly should be if available given that depressive symptomatology is the primary independent variable here.
We understand and agree that exposure to domestic violence was not measured in our study. However, our assessment of the psychosocial environment of the pregnant women was directed at stress/depression as the end result of many possible factors, domestic violence among them. For example, if a woman is a victim of domestic violence, her answers to the questionnaire would likely indicate that she has not slept well, has felt low, or has had suicidal tendencies, etc. We did not assess domestic violence as an antecedent, as doing so would have required also including other sources of maternal stress/depression such as job stress, social settings, poverty, etc.
In terms of additional comments, the following can be listed:
4. The data analyzed should be described as the "study sample" and not the "study population".
Thank you for the comment, we have made the necessary change.
5. Checking the effects of applying different cutoffs to the Edinburgh (EPDS) score is helpful from a clinical standpoint, though the intent of developing a score is to be able to identify risk that is subclinical. Hence, analyses that use the EPDS score as continuous would also be informative. Women with scores less than a cutoff are indeed not "without mental depressive symptoms".
We sincerely appreciate this comment and do agree that it is useful to examine the risk of a sub-clinical group. In this regard, we have provided a graph indicating the relation between EPDS as a continuous variable and the proportion of women delivered with SGA (Supplementary File: Figure 1, Page 2).
6. The statistical analyses did not include checks of effect modification (interaction) with depressive symptoms for salient variables on intrauterine growth. Such effects should be checked at a minimum to verify that the main effects only model is valid. Any effect modification identified would be useful in delineating the mechanism of how depressive symptoms affect intrauterine growth.
We sincerely thank the reviewer for this suggestion. As advised, we have run separate models including interaction effects. The results are provided in the Supplementary File (Table 1, Page 1). We considered skinfold thickness as a continuous variable and excluded BMI to avoid the problem of multicollinearity.
7. Very important confounders are not included in the statistical models that could alter the estimation of the effect of depressive symptoms on intrauterine growth. These would include maternal pre-pregnancy weight or BMI, as well as maternal health habits that have been shown to have associations with depressive symptoms, including maternal substance use of various kinds and the quality of prenatal care.
We have not measured maternal pre-pregnancy weight; however, we have adjusted for the maternal sum of skinfold thickness. Maternal substance use is very minimal (less than 1%) in the study sample; hence we have adjusted for the husband's current tobacco and alcohol consumption.
8. The fit of the logistic regression models with respect to calibration should include the Hosmer-Lemeshow statistic and its associated degrees of freedom and p-value. A good fitting model should have both good calibration and discrimination. The discrimination abilities of the models (c statistics or area under the ROC curve) are poor and barely above the null value of 0.5. The lack of additional confounding control also likely contributed to this under-fitting. In addition, there must be some recoding of the data that somehow has resulted in c statistics less than 0.5. The authors should carefully check this. There should not be values less than 0.5. Moreover, such a coding problem has likely resulted in the stark change in the direction of the odds ratios as shown in Table 4. There should not be such a drastic change from an odds ratio of 2.18 for the EPDS cutoff of 11 that indicates higher risk of SGA to one of 0.46 for a cutoff of 0.46. This kind of error markedly reduces the confidence of the reader in the overall analysis.
We sincerely thank you for pointing out this error; it is a very useful insight, and we realized that there was a mistake in coding the variable (EPDS score cut-off 11, 12, 13). We recoded the entire data set and have thoroughly checked the entire analysis after redoing it. The resulting OR changes gradually from one cut-off category to the next, and the AUROC curves obtained from the predicted probabilities of each model are above the null value. We sincerely apologize for the mistake. The Hosmer-Lemeshow test statistic indicated that the model is a good fit. Overall model predictability is 83.6% for EPDS cut-off category 11. We tried performing discriminant analysis, but the factors were found to deviate significantly from the multivariate normal distribution.
Competing Interests: No competing interests were disclosed.