Does recall time matter in verbal autopsies? Evidence from urban informal settlements in Nairobi, Kenya

Background: Verbal autopsies (VAs) are widely used to assign a cause of death to deaths that are not medically certified. The time between the death and the VA interview, also referred to as recall time, varies depending on social and operational factors surrounding the death. We investigated the effect of recall time on the assignment of causes of death by VA. Methods: This is a secondary analysis of 2002-2015 survey data of the Nairobi Urban Health Demographic Surveillance System (NUHDSS). The independent variable, recall time, was derived from the date of death and the date when the VA was conducted. Univariate and multivariate logistic regression methods were used to calculate odds ratios of assigning a cause of death in defined categories of recall time. Results: There were 6218 deaths followed up between 2002 and 2016, of which 5495 (88.3%) had VAs done. Recall time varied from 1 to 3001 days (median 92 days, IQR 44-169 days). The largest share of the VAs (45.7%) was conducted between 1 and 3 months after death. The effect of recall time varied for different diseases. Compared to VAs conducted between 1 and 3 months, the odds of identifying HIV/AIDS as the cause of death were 24% higher for VAs conducted 4-6 months after death (AOR 1.24; 95% CI 1.01-1.54; p-value = 0.043) and the odds of identifying other infectious diseases as the cause of death were 40% higher for VAs conducted <1 month after death (AOR 1.4; 95% CI 1.02-1.92; p-value = 0.024). Conclusions: Recall time affected the assignment of VA cause of death for HIV/AIDS, other infectious diseases, maternal/neonatal and indeterminate causes. Our analysis indicates that in the urban informal setting, VAs should be conducted from one month up to 6 months after the death to improve the probability of accurately assigning the cause of death.


Introduction
Mortality data collected as part of vital registries, disease surveillance systems and epidemiological studies is essential for decision making 1 . The need for its accuracy and reliability cannot be overemphasized 2,3 . Currently, the gold standard method for assigning the specific cause of death is complete diagnostic autopsy 4 . The alternative, where this is not possible, is certification by a medical practitioner using guidelines stipulated in the International Classification of Diseases and Related Health Problems (ICD), currently available in its eleventh version 5 . In countries where the vital registration systems are not fully developed, or in situations where there are challenges in medical certification of deaths, such as a home death, verbal autopsies (VAs) are an important means of assigning the cause of death 6,7 . The VA consists of an interview, using a standardized questionnaire, with a close relative or caregiver of the deceased person who is aware of the circumstances leading to the death 7 . The most widely used VA questionnaires are by the World Health Organization 7 and the Population Health Metrics Research Consortium 8 .
After completion of the VA interview, data from the VA questionnaire is interpreted to assign the cause of death. Methods for assigning the cause of death from the VA interview data vary and include the use of either physician certified verbal autopsies (PCVA) or computer coded verbal autopsy (CCVA) systems that utilise algorithms, statistical techniques, machine learning and deep learning approaches 9,10 . A systematic review comparing PCVAs with various CCVA systems found that although the methods differed in the cause of death output, none of the VA interpretation methods reviewed was superior to the others 9 . Use of CCVA may, however, decrease the time and cost associated with PCVAs and improve consistency and comparability 1 .
A key element in the reliability of cause of death data as determined through VAs is the time between when the death occurred and when the verbal autopsy was conducted, also referred to as the recall time 11 . The optimal recall time for maximum VA validity reported in the literature varies, ranging from as soon as the death occurs up to 12 months and beyond 12-14 . Data from a study in South Asia indicates that the probability of assigning a correct cause of death by VA methods decreases by 0.55% for each additional month of recall time 11 . In contrast, a study in South Africa found that, apart from neonatal causes, recall times of up to one year had no impact on the validity of VA 15 . It is important that the timing of the VA takes account not only of validity but also of social and cultural factors 16 . As such, it may be imperative to set context-specific optimal timing for VAs. In this paper, we aim to determine the optimal timing of conducting a VA to achieve higher odds of assigning the accurate cause of death in a low-income setting. We use mortality data collected from the Nairobi Urban Health Demographic Surveillance System (NUHDSS). The NUHDSS is run by the African Population and Health Research Center (APHRC) in the informal settlements of Korogocho and Viwandani in Nairobi, Kenya 17 .

Study setting
This paper utilizes mortality data collected from two informal settlements in Nairobi, Kenya, that form the NUHDSS, a DSS run by APHRC. Korogocho is located in Ruaraka Sub-County in Nairobi and covers an area of 0.9 km² with a total population of 36,900 and a density of 42,401 persons per km² as per the 2019 Kenya population and housing census 18 . Viwandani is located in Makadara Sub-County and covers an area of 5 km² with a population of 43,070 and a density of 8554 persons per km² 18 . Viwandani forms part of Nairobi's industrial region and its inhabitants are mainly casual labourers within the industries in the area, while Korogocho is inhabited by residents mainly engaged within the informal job sector. Maps for the NUHDSS can be accessed elsewhere 19 . Key challenges in Korogocho and Viwandani, as in many other urban informal settlements, include minimal formal infrastructure such as roads and piped water networks, as well as rising cases of communicable and non-communicable diseases, rampant insecurity, environmental pollution, and alcohol and drug abuse 17 .

Design
This secondary analysis uses VA data collected through a series of surveys by the APHRC in the NUHDSS. The general methodology of the NUHDSS is summarized elsewhere 20 . In summary, these were a series of surveys conducted from 2002 to 2015 using methodologies described by the International Network for the Demographic Evaluation of Populations and their Health (INDEPTH) 21 . Surveys were done every four months. Information on the events surrounding the death within the NUHDSS was collected by field interviewers and later independently validated by three physicians to assign a probable cause of death. Data from the questionnaire was then used as inputs in the InterVA-4 software, a tool that uses probabilistic models based on Bayes' theorem to interpret symptom data and determine possible causes of death 22 . The output from the InterVA-4 model consists of up to three likelihoods per case attributed to different causes. An indeterminate residual portion is assigned when the sum of the three likelihoods is less than 100% 22 . For this study, we utilized the output from the InterVA-4 model to carry out the analysis of optimal recall time.

Amendments from Version 1
This manuscript has been updated to address comments from Reviewers 1 and 2. The main changes are:
- Calculation of the cause-specific mortality fractions has been redone taking into consideration all three probable causes of death assigned by the InterVA-4 output.
- We clarify the cut-off points for recall time.
- We provide more details for using the 1-3 months recall time as the reference category in the regression analysis.
- We provide more discussion around the social and economic circumstances affecting the timing of conducting verbal autopsies.
Any further responses from the reviewers can be found at the end of the article.

Data analysis
We used STATA version 15.1 in the analysis and included all records where a successful VA was conducted. The main independent variable of interest in the analysis, recall time, is derived from the date of death and the date when the VA was conducted. Based on consideration of the average mourning time, and the acceptable time to conduct a VA as per the Kenya Verbal Autopsy Guidelines 23 , we categorized recall time into less than 1 month, 1 to 3 months, 4 to 6 months, 7 to 12 months and greater than 12 months. The upper limit integer month for each category included all the possible decimals before the next category (for example, 1-3 months included all possible days from 1 month up to 3.99 months). In the descriptive analysis, we tabulate the recall time against other background characteristics to calculate their frequencies in the categories of recall time.
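The categorization rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Stata code used in the analysis: the function name `recall_category` and the conversion of days to months (365.25 / 12 days per month) are assumptions, since the exact day cut-offs are not stated here, and the boundary between the last two categories follows the same "up to .99" reading given for the 1-3 months category.

```python
from datetime import date, timedelta

# Assumption: one month is taken as 365.25 / 12 (about 30.44) days.
DAYS_PER_MONTH = 365.25 / 12


def recall_category(death: date, interview: date) -> str:
    """Return the recall-time category for one VA record (hypothetical helper)."""
    days = (interview - death).days
    if days < 0:
        raise ValueError("interview date precedes date of death")
    months = days / DAYS_PER_MONTH
    if months < 1:
        return "<1 month"
    if months < 4:   # 1 month up to 3.99 months
        return "1-3 months"
    if months < 7:   # 4 months up to 6.99 months
        return "4-6 months"
    if months < 13:  # 7 months up to 12.99 months (assumed by analogy)
        return "7-12 months"
    return ">12 months"


# Example: a VA conducted 110 days (about 3.6 months) after the death.
example = recall_category(date(2010, 1, 1), date(2010, 1, 1) + timedelta(days=110))
```

Under these assumptions, a VA conducted about 3.6 months after the death falls in the 1-3 months category, in line with the rule that each category includes all decimals up to the next integer month.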
For each death, there were three possible causes as an output from the InterVA-4 model. The InterVA-4 output was converted into cause-specific mortality fractions considering all three probable causes of death assigned. The dependent variables were derived from each specific probable cause of death, transformed into a binary variable of yes/no. There were 22 most probable causes of death as an output of the InterVA-4 model. To reduce data sparsity in our modelling, we collapsed these 22 probable causes of death into 12 categories guided by their counts and "reasonable relatedness". Frequently occurring diseases such as tuberculosis and HIV/AIDS were retained in their original categories. Asthma, diabetes mellitus, chronic obstructive pulmonary disease (COPD) and other chronic diseases were reclassified as other chronic/non-communicable diseases. Injuries were grouped together with other external causes of death. Malaria and anaemia were grouped together given their limited numbers and the fact that malaria is a major cause of anaemia in this region 24 . Direct obstetric and neonatal causes were combined into one category as maternal/neonatal causes. Details of this re-categorization of the probable causes of death are found in Table 1. This approach to recategorization has also been applied in other studies 15,20 .
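As a rough illustration of how cause-specific mortality fractions can be computed from up to three likelihoods per death, with the unassigned remainder counted as indeterminate, consider the sketch below. It is an assumption-laden example and not the authors' code: the function name, the cause labels, the partial collapse map and the two example deaths are all hypothetical.

```python
from collections import defaultdict

# Hypothetical, partial collapse map reflecting the groupings described above.
COLLAPSE = {
    "Malaria": "Malaria/anaemia",
    "Anaemia": "Malaria/anaemia",
    "Asthma": "Other chronic/NCD",
    "Diabetes mellitus": "Other chronic/NCD",
    "COPD": "Other chronic/NCD",
}


def cause_specific_mortality_fractions(cases, collapse=None):
    """cases: one list per death of (cause, likelihood) pairs, likelihoods in [0, 1].

    Each death contributes its likelihoods to the (collapsed) causes; whatever
    is left up to 1.0 is counted as indeterminate.
    """
    if not cases:
        raise ValueError("no deaths supplied")
    collapse = collapse or {}
    totals = defaultdict(float)
    for causes in cases:
        assigned = sum(likelihood for _, likelihood in causes)
        if assigned > 1.0 + 1e-9:
            raise ValueError("likelihoods for one death exceed 100%")
        for cause, likelihood in causes:
            totals[collapse.get(cause, cause)] += likelihood
        totals["Indeterminate"] += max(0.0, 1.0 - assigned)
    return {cause: total / len(cases) for cause, total in totals.items()}


# Two made-up deaths, for illustration only.
example_cases = [
    [("HIV/AIDS", 0.6), ("Tuberculosis", 0.3)],
    [("Malaria", 0.5), ("Anaemia", 0.5)],
]
fractions = cause_specific_mortality_fractions(example_cases, COLLAPSE)
```

In this sketch the collapse is applied to likelihoods before summing, so the fractions across all categories, including the indeterminate one, add up to one.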
To investigate the effect of recall time on determining the cause of death outcome of the VA, logistic regression methods were used to calculate odds ratios of assigning a probable cause of death using the categories of recall time defined above. The 1 to 3 months recall time was used as the reference category for the logistic regression as the highest number of VAs were conducted within this period and this falls within the Kenya VA guidelines on appropriate time to conduct a VA. In the univariate logistic regression analysis, association of recall time and probable cause of death outcome of the VA was tested using chi square tests while in the multivariate analysis, likelihood ratio tests were carried out. Wald tests were used to determine the effect of various categories of recall time. The backward modelling approach was used in the multivariate analysis to adjust for the background characteristics.
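For readers less familiar with odds ratios, the sketch below computes a crude (unadjusted) odds ratio with a Wald 95% confidence interval from a 2x2 table comparing one recall category against the 1-3 months reference. It is a simplified illustration with made-up counts: it does not reproduce the adjusted estimates, which come from multivariable logistic regression, and the function name is hypothetical.

```python
import math


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table.

    a: deaths in the comparison recall category assigned the cause
    b: deaths in the comparison recall category not assigned the cause
    c: deaths in the reference category (1-3 months) assigned the cause
    d: deaths in the reference category not assigned the cause
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts, for illustration only.
or_, ci_low, ci_high = crude_odds_ratio(20, 80, 10, 90)
```

An interval that excludes 1 corresponds to a Wald test p-value below 0.05 for that category against the reference.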

Distribution of deaths in the NUHDSS 2002-2016
There were a total of 6218 deaths followed up between 2002 and 2016, of which 5495 (88.3%) had VAs conducted. The highest number of deaths occurred in 2011 (533; 8.6%). Most of the deceased were between the ages of 15 and 49 (3147, 50.6%). The distribution of deaths by year and age is shown in Figure 1.
The most common cause of death in the period of 2002 to 2016 was tuberculosis and the least commonly identified cause of death was anaemia. The cause-specific mortality fractions are shown in Table 2.

Description of VA recall time
The recall time for deaths with VAs done ranged from 1 day to 3001 days, with a median recall time of 92 days (interquartile range 44-169 days). The largest share of the verbal autopsies (45.7%) was conducted between one and three months after death. The distribution of socio-demographic characteristics varied across the verbal autopsy recall periods, as shown in Table 3. There were more VAs conducted within the first month of death between 2010 and 2013 (26%) as compared to the other years under study.
The effect of recall time on VA cause of death assignment
The effect of recall time varied for different causes of death (Table 4). We found strong evidence (p-value = 0.0124) that identifying HIV/AIDS as the likely cause of death varied based on the timing of the VA. Conducting the VA 4 to 6 months after the death increased the likelihood of identifying HIV/AIDS as the cause of death by 24% compared to conducting it between 1 and 3 months (AOR 1.24; 95% CI 1.01-1.54; p-value = 0.043).
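The percentages quoted alongside the adjusted odds ratios follow from a simple conversion, sketched below. The helper name is hypothetical; note that the result is a change in odds, which only approximates a change in probability when the outcome is uncommon.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Convert an odds ratio into a percentage change in odds versus the reference."""
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    return (odds_ratio - 1.0) * 100.0


# AOR 1.24 for HIV/AIDS at 4-6 months versus 1-3 months.
hiv_change = percent_change_in_odds(1.24)
```

An odds ratio below 1 gives a negative value, read as a percentage reduction in odds.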
Both the crude and the adjusted analysis showed that an assignment of malaria/anaemia, meningitis, malignancies and other chronic diseases as the probable cause of death did not depend on the timelines within which the VA was conducted.
We observed that assignment of tuberculosis as the probable cause of death was dependent on the recall time in the unadjusted analysis (p-value = 0.0442) but this effect was lost in the adjusted analysis.
In the crude analysis, recall time did not affect the identification of other infectious diseases as the probable cause of death, but we found an effect of recall time in the adjusted analysis (p-value = 0.024). Compared to doing the VA one to three months after the death, conducting it in less than one month increased the chance of identifying other infectious diseases as the probable cause of death by 40 per cent (AOR 1.4; 95% CI 1.02-1.92). Similarly, identifying maternal/neonatal causes of death did not depend on the recall time in the crude analysis, but in the adjusted analysis there was an effect of recall time, with a 43% lower chance of assigning the probable cause of death to maternal/neonatal causes in VAs done in less than one month compared to those done between one and three months.

Discussion
In this secondary analysis, we investigate the effect of recall time on the cause of death assigned through the InterVA-4 software. There was a notable variation in the timing of VAs, with a higher percentage of VAs being conducted within three months of death in the latter years of follow up (2010-2016) as compared to the earlier years (2002-2009), suggesting an improved death notification and follow-up system in the latter years.
The recall time in a VA is crucial as it determines whether the respondent(s) can accurately recount the prevailing symptoms, signs and probable diagnosis before the deceased person's death 25 . InterVA-4, as well as other VA systems, relies on the accuracy of this information to assign the possible cause of death. In our adjusted analysis, we found a significant difference in the odds of the assignment of a cause of death as HIV/AIDS, other infectious disease and maternal/neonatal causes by recall time. We found that there was a 40% higher chance of identifying other infectious diseases as the cause of death for VAs conducted within one month of death as compared to those conducted between one and three months.
The timing of VAs should consider the circumstances surrounding the death and ensure that the appropriate mourning period is observed, based on the accepted cultural norms and religion. In a study 26 looking at the effects of culture and ethics on VAs in Ghana, it was noted that the timing should take into consideration the respondent's emotional distress, for example from the death of an only child or a maternal death. The mourning period may also vary based on the relationship between the interviewee(s) and the deceased as well as the nature of the death and the role of the deceased person in the household 26 . The Kenya 2019 verbal autopsy guidelines provide for a 30-40-day period of mourning before conducting a VA, with a maximum allowable period of one year after the death 23 .
In the NUHDSS, the VA was preceded by the field supervisor's visit to the deceased's home to determine the appropriate time to conduct the VA, based on an assessment of the situation and the availability of a credible respondent 20 . Up to five visits were conducted to contact the deceased's household members and, where it was established that they were no longer residents of the area, a willing credible neighbour was interviewed.
Some of the factors that could have affected the timing of the VA are migration of the deceased's family following the death due to changes in the economic situation and other factors. Mourning periods in Kenya also vary based on tribe and culture, with some cultures having more extensive periods of mourning than others, while in some cultures the length of the mourning period depends on marital status, age, gender, role in the community, and birth order 27 . Data on tribe/ethnicity was, however, not available for further analysis in this study.
Our results, in general, find no effect of recall time on identifying the cause of death by VA for most chronic diseases (except HIV/AIDS) and indeterminate causes as compared to acute conditions. We observe that VA respondents in situations where the deceased had a chronic disease are more likely to remember the symptoms, signs and the probable cause of death over time.
Delay of VA for periods beyond 12 months has been hypothesised to lead to a less accurate recall of symptoms 14 . Our analysis did not show any significant differences for any of the identified causes of death for recall periods above 12 months apart from the indeterminate causes of death. Compared to a recall period of 1-3 months, cases with recall periods of >12 months were 41% less likely to be classified as indeterminate. In comparison, in a study in South Africa, the only significant difference for recall periods over 12 months was for neonatal deaths 15 .
In the regression analyses, we do not adjust for seasonality: seasons and weather patterns in Kenya have varied considerably over the period of analysis in the paper 28 , so it would be less prudent to include seasonality in the model. Additionally, the NUHDSS is located in an urban informal settlement where disruptions of participant availability for reasons like farming are less common than in the setting of a similar study done in South Africa 15 .
The InterVA-4 output consists of three likelihoods. Other approaches to the analysis include using all the possible causes of death and their likelihoods as weights in the regression models. However, where there is a need to recategorize the causes of death, assumptions have to be made about how the likelihoods combine before and after the categorisation, whether additively, multiplicatively or in some other, more complex form. We avoid these assumptions by conducting the regression analysis with the most probable cause of death as assigned by InterVA-4. We acknowledge that there might be alternative preferences to this.
The ideal method of identifying the effect of recall time would have been a comparison of the cause of death generated from gold standard death certification to the VA-assigned cause of death for the different recall times. However, in the absence of the gold standard comparator, as in our case, the alternative approach used in this paper was to compare probabilities or odds of making a specific VA cause of death assignment for different recall periods. To the best of our knowledge, this is the first paper within the East African region to expound on the effects of recall time in VAs.

Conclusion
Recall time affected the assignment of VA cause of death for HIV/AIDS, other infectious diseases, maternal/neonatal and indeterminate causes. Our analysis indicates that in the urban informal setting, VAs should be conducted from one month up to 6 months after the death to improve the probability of accurately assigning the cause of death.

Data availability
The KENYA - NUHDSS - Verbal Autopsy, Causes of deaths 2002-2015 dataset was used in this secondary analysis. The dataset is available upon request from the APHRC Microdata Portal http://microdataportal.aphrc.org/index.php/catalog. Access to the dataset can be obtained through submission of a written request in the APHRC portal following creation of an account. Further information on how to apply for access to the data can be found here.

Open Peer Review
My thanks to the authors for their careful and detailed responses and my apologies for the delay in re-reviewing. I am happy with the changes made and have no further concerns. Congratulations on the paper!
If possible, please could the authors provide the exact cut-offs (in days) used for the recall time categories. At present it is not entirely clear if a VA conducted after ~3.5 months would be categorised as "1-3 months" or "4-6 months" (same for recall times between 6 and 7 months).

There may be a good reason for using 1-3 months as a reference instead of <1 month but the reason for this approach is not described in the methods. In my opinion this makes it more difficult to interpret the results and to look at trends across recall periods (regardless of statistical 'significance').

Results
Page 4: it may be more interesting to show the median + IQR for recall time, given that it is unlikely to be normally distributed?
Table 3: Typo in the 95% CI for the aOR for HIV/AIDS deaths with a recall time of 7-12 months ("10.89").

If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Partly
expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Major comments:
1) Calculation of CSMFs: The method used to calculate CSMFs is potentially problematic. InterVA output (up to three causes with likelihoods +/- residual likelihood) is intended to be converted into CSMFs in a way that takes account of all causes of death assigned (see Appendix 4 of the InterVA-4 user guide), not only the most likely cause, as described by the authors. The 'collapsing' process described seems perfectly justifiable but may have been better applied after the calculation of CSMFs to reflect better the true output of InterVA. Adopting this approach would likely have produced different CSMFs for each recall period, and may therefore have affected the results of the regression.
Response: Thank you for the comment. The cause-specific mortality fractions have been recalculated as suggested and are presented in Table 2 of the revised paper. However, we have not used the CSMFs for the regression analysis, as adding the cause-specific weights/likelihoods to a model where the causes of death have been collapsed requires assumptions on how the weights/likelihoods combine before and after the re-categorization. To avoid making these assumptions, we prefer to use the most likely cause of death in the logistic regression analysis. However, we have included this limitation in the discussion.
2) Logistic regression: Although most factors likely to affect the relationship between recall time and cause of death were adjusted for, time of year/seasonality was not. This is potentially very important, as it may have affected mortality patterns but also affected the ability of VA teams to access respondents (e.g. this may have been harder in the rainy season) and therefore affected time between death and VA. This may be less relevant in the urban setting where this study was conducted (compared with the rural setting in the study by Hussain-Alkhateeb et al.) but this should at least have been discussed and the omission justified by the authors.
Response: Seasons and weather patterns in Nairobi have varied considerably over the period of analysis in the paper, so it would have been less prudent to include seasonality in the model. Additionally, we agree with the reviewer that this variable is less relevant in this context, unlike in the Hussain-Alkhateeb et al. paper. We have included text in the discussion section to reflect our rationale for not including this variable in the model.
3) Interpretation of results -My apologies if I have misunderstood, but I do not entirely follow some of the statements made in the discussion and I am not sure the conclusions flow from the results presented. The authors suggest that respondents may be more likely to remember the symptoms of an acute illness if the VA is done in a shorter time after death, yet in this analysis the odds of being assigned a respiratory tract infection CoD were higher for VAs done after >12 months compared with 1-3 months.
Response: In our analysis, there is no overall association between recall time and RTI based on the likelihood ratio test (p-value 0.0585) and that is why we did not focus on RTIs. Our general observation that respondents are more likely to remember acute illness if the VA is done in a shorter time still holds.
4) I also do not entirely follow the possible explanation for why maternal deaths were less likely to be assigned <1 month after death than 1-3 months. Are the authors suggesting that fewer maternal deaths were assigned because VAs may have been delayed when a particular cause of death was suspected? It would seem important to know if this was done when any other cause of death was suspected, or if it was only for maternal deaths. The authors also state that the Kenyan guidelines recommend 30-40 days mourning prior to conducting a VA, but there is no description of how often these guidelines were followed (and/or if they were followed more closely in one district than another). This is potentially a major source of bias and should ideally have been explored in more detail in the analysis.
Response: Thank you for the comment. We have updated the discussion section to further explain the potential sources of delay to conducting a VA in the NUHDSS. This was not specific to maternal deaths but to the prevailing circumstances. We have edited the discussion to focus less on the maternal/neonatal finding. While the numbers for combined direct obstetric and neonatal causes were low, the majority of these were neonatal deaths. The effect observed could therefore have been a chance finding due to the small numbers, or a result of the higher proportion of neonatal deaths to direct obstetric causes (6 to 1), especially in cases where the neonatal deaths were not reviewed promptly.
Minor comments: Introduction: 5) Paragraph 1: The 'gold standard' for cause of death is one assigned by complete diagnostic autopsy, not by a medical practitioner, though of course for the majority of deaths a practitioner-assigned CoD is often the best that can be hoped for. It may be worth making this point a little more clearly in the first paragraph.
Response: This has been updated.
6) Paragraph 2: This is a very minor point, but the multiple plurals used here ('interviews', 'questionnaires', 'relatives', 'caregivers') suggest that a standard VA involves interviewing more than one person, when this is rarely the case. This pre-print may provide a more up-to-date list (InterVA is also misspelled in this paragraph, as "inter-VA").

Response: We have updated the list of CCVA methods based on the updated reference; however, we have retained our original citation as well since the Mapundo et al paper is a preprint.
Methods:
9) If possible, please could the authors provide the exact cut-offs (in days) used for the recall time categories. At present it is not entirely clear if a VA conducted after ~3.5 months would be categorised as "1-3 months" or "4-6 months" (same for recall times between 6 and 7 months).
Response: Thank you for the comment. The highest cut-off integer month for our categories included all the possible decimals before the next category; for example, 1-3 months includes all possible days up to 3.99 months. We have updated this in the methods section.
10) There may be a good reason for using 1-3 months as a reference instead of <1 month but the reason for this approach is not described in the methods. In my opinion this makes it more difficult to interpret the results and to look at trends across recall periods (regardless of statistical 'significance').
Response: The 1 to 3 months recall time was used as the reference category for the logistic regression for two reasons: first, the highest number of VAs were conducted within this period and, secondly, this falls within the Kenya VA guidelines on the appropriate time to conduct a VA. We therefore set out to compare all the results against this recall period. We have updated this in the methods section.
Results:
11) Page 4: it may be more interesting to show the median + IQR for recall time, given that it is unlikely to be normally distributed?
Response: Thanks, we concur. The median and IQR for recall time have been included.
Response: Implies that the response given did not apply to the context of the deceased under question. We have added a footnote in the table to explain.

Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.
Are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions drawn adequately supported by the results? Partly
regression analysis. In the discussion, we have expounded on some of the potential sociocultural factors around recall period not available in the dataset.
Methods: Paragraph 1, Study setting: A map would be good to include, if possible.
Response: A reference to where the NUHDSS map can be accessed has been provided in the revised version.
Paragraph 3, Design: The term demographic health system may be misleading, suggest replace and use HDSS consistently.
Response: This has been updated.
Paragraph 3, Design: The InterVA-5 model is currently available; it may be worth considering applying InterVA-5 to these data.
Response: Thank you for the comment. The data set was provided following the InterVA-4 analysis; therefore it would not be possible to analyze it using InterVA-5. InterVA-5 was developed to accommodate the WHO 2016 standard VA tool, which was not utilized for this data set.
Page 4, Table 1: the title of this table might be rephrased -strictly speaking, criteria are not presented. Recategorization of InterVA CoDs to constructed CoD categories might be more accurate.
Response: Thank you. This has been updated.
Page 4, Table 1: The collapsed CoD categories are not quite clear. The authors state that category classifications were guided by counts and reasonable relatedness. This requires some clarification, especially around anaemia and malaria. The authors also state that the collapsed categories are based on classifications used in other work; however, there are 8 categories in the Hussain et al. study and 15 in the Oti and Kyobutungi paper. The relationship of these to the 12 categories used in this analysis would similarly benefit from some clarification.

Response: We have updated the text to further detail how we did the recategorization. The approach to this classification has been used by other studies as cited; however, the final number of categories varies depending on the data and the specific circumstances in the study areas. The collapsing of malaria and anaemia is based on the premise that malaria is a major cause of anaemia in tropical areas.
Page 4, Table 1: It is not customary to cite AIDS/HIV as a category, rather HIV/AIDS, as is done later in the paper. Suggest consistency with the commonly used term.
Response: This has been updated. Thank you.
Page 4, paragraph 1: CSMFs typically account for all probable CoD assigned per case. Suggest recalculating the CSMF using all probable CoDs from InterVA.
Response: CSMFs have been recalculated as suggested.
It is not entirely clear why 1-3 months recall time is used as the reference period. The paper indeed presents research from the same region where 12 months recall is found to be suitable. The authors may wish to clarify this element of the analysis.
Response: The 1 to 3 months recall time was used as the reference category for the logistic regression for two reasons: first, the highest number of VAs were conducted within this period and, secondly, this falls within the Kenya VA guidelines on the appropriate time to conduct a VA. We therefore set out to compare all the results against this recall period. We have updated this in the methods section.
Results: While I am not well-qualified to comment on the statistical analyses, it would appear that not all significant findings from the adjusted regression analysis are highlighted in the narrative; please see (d) and (e) below. The main findings would appear to be as follows, compared to a recall period of 1-3 months: (a) Cases with recall periods of 4-6 months are 24% more likely to be classified as due to HIV; (b) Cases with recall periods of <1 month are 40% more likely to be classified as due to other infectious causes; (c) Cases with recall periods of <1 month are 43% less likely to be classified as due to maternal causes; (d) Cases with recall periods of >12 months are 70% more likely to be classified as due to RTI; (e) Cases with recall periods of >12 months are 41% less likely to be classified as indeterminate.
Response: Thank you. We have updated the missing significant results for the indeterminate, however we have left out the RTIs as the association with recall time is not significant on the overall likelihood ratio test of association.

Discussion:
The discussion could be further developed by considering the plausibility of the main findings offering reasons as to why the recall time effects on some cause of death categories were found. For example, that cases with recall periods of <1 month are 43% less likely to be classified as due to maternal causes is counter-intuitive -maternal deaths are relatively rare, generally avoidable and therefore likely to be highly memorable.
Response: While the numbers for combined direct obstetric and neonatal causes were low, the majority of these were neonatal deaths. The effect observed could therefore have been a chance finding due to the small numbers, or a result of the higher proportion of neonatal deaths to direct obstetric causes (6 to 1), especially in cases where the neonatal deaths were not reviewed promptly.
Page 9, paragraph 5: the statement 'our analysis did not show any significant differences for any of the identified causes of deaths for recall periods above 12 months' needs to be checked. Again, from a general perspective, Table 3 appears to suggest that deaths investigated >12 months are (a) 70% more likely to be classified as due to RTI (AOR 1.70 (CI 1.21-2.41) P=0.002) and (b) 41% less likely to be diagnosed by InterVA as indeterminate (AOR 0.59 (CI 0.39-0.88) P=0.011).
Response: Thank you for the observation. We have revised the statement and included the