State of the science on implementation research in early child development and future directions

We summarize the state of the field of implementation research and practice for early child development and propose recommendations. First, conclusions are drawn regarding what is generally known about the implementation of early childhood development programs, based on papers and discussions leading to a published series on the topic. Second, recommendations for short‐term activities emphasize the use of newly published guidelines for reporting data collection methods and results for implementation processes; knowledge of the guidelines and a menu of measures allows for planning ahead. Additional recommendations include careful documentation of early‐stage implementation, such as adapting a program to a different context and assessing feasibility, as well as the process of sustaining and scaling up a program. Using existing implementation information by building on and improving past programs and translating them into policy are recommended. Longer term goals are to identify implementation characteristics of effective programs and determinants of these characteristics.


State of the science and future directions
The field of implementation research on early childhood development (ECD) is coming of age. There is considerable evidence that nurturing care programs for caregivers of young children and early learning centers for young children can be effective, provided that implementation quality is assured. The pressing question now is: What are these programs doing that makes them so effective? If the answer to this question were known, the reach and impact of ECD programs might expand. The future of ECD programs' quality and impact depends, in part, on a published body of knowledge that systematically answers this question. (This paper concludes the 2018 special issue "Implementation Research and Practice for Early Childhood Development," Ann. N.Y. Acad. Sci. 1419: 1-271.) Although slightly different definitions of "implementation research" exist, here we use the version proposed by Peters and colleagues, namely, that it is scientific inquiry into questions concerning activities undertaken as planned with the intention of producing an effect; it may consider the factors affecting implementation, the process itself, the indicators of how well it worked, and the strategies used to improve, sustain, and expand its reach. 1,2 The current situation leaves many organizations and governments in the position of knowing that the need for nurturing care programs exists, while not knowing exactly how to implement them. Nurturing care is defined as a stable environment created by parents and other caregivers that ensures children's good health and nutrition, protects them from threats, and gives young children opportunities for early learning through interactions that are emotionally supportive and responsive. 3 What kind of parenting or preschool program would provide a suitable solution to promote nurturing care in their context? What kind of supports (e.g., policy, budget, and workforce) does this program need in order to be effective?
Other than importing a program developed in a high-income country, organizations might try to replicate programs from Latin America and the Caribbean that are insufficiently described in the literature, or connect with one of the researchers of such programs. This has arguably been a common but limited approach. Instead, the field should offer a menu of options, each with information on how to adapt the content and format, train a workforce, arouse demand, monitor and evaluate such programs in one's own setting, and estimate the costs. Although some of this information exists in published articles, systematic reviews, and the gray literature, it is not easily accessed. Moreover, it is incomplete and unsystematic.
The special issue of Ann. N.Y. Acad. Sci. 4 on implementation research and practice for ECD programs begins to fill this gap. One key paper provides a blueprint for systematizing knowledge by proposing initial guidelines for reporting implementation information. 5 Another describes how to measure important implementation information. 6 Together, these two articles inform program managers and monitoring and evaluation (M&E) experts about what information others need to know about their program and how to collect and report that information. The special issue also includes case studies of specific programs implemented around the world, detailing how they undertook the process of adapting their program, training and supervising their workforce, evaluating its quality, and using the information to improve the program. Specific papers focus on issues that occupy implementers' minds when trying to arouse demand, estimate costs, scale up an effective program, and tailor one to humanitarian settings. Overall, the papers, based on tacit knowledge from personal experiences, monitoring information, and systematically collected implementation data, provide an update on the state of ECD implementation research and set the stage for how early childhood programs can move forward.
The conclusions we detail in this paper are informed by both the published papers and discussions among participants who attended a 2-day workshop sponsored by and held at the New York Academy of Sciences, December 4 and 5, 2017, in New York City. The workshop brought together several members of an expert panel on implementation and authors of the special issue. 4

A report of implementation by itself has value but becomes more valuable when linked to outcomes. This holds for small- and large-scale programs. If outcomes are not positive, then a description of the implementation process allows for identifying components that did not work well, barriers to implementing those components, and decisions on how to improve the program. If outcomes are positive, then improvements are still an option, along with decisions on how to sustain and scale the program and how to adapt and apply it in another setting. In some cases, as in a pilot or feasibility study, a report of implementation has value by itself; it allows one to judge whether the program is feasible and desirable to implement in that setting.
We divide our remarks on the state of the science into three sections (see Box 1). The first addresses conclusions that we feel are generally known and accepted by ECD implementers and researchers. Some examples to support these conclusions are provided, though conclusions may change as new evidence emerges. The second section provides guidance on what implementers need to do in the coming years to build a systematic body of useful knowledge about program activities, adaptation, measurement, improvement, and expansion, among other topics. The third section states some longer term goals for implementation research on ECD in terms of identifying what facilitates implementation and what leads to effective outcomes. Because most ECD programs are broad in scope, with several components (several messages and several delivery formats), individual studies may not be able to identify the specific component responsible for a positive or negative outcome. Unless impact studies are purposely designed to compare packages of interventions, it is only by looking across many studies that we can identify the components most frequently associated with positive or negative outcomes. Likewise, integrating findings from implementation studies will provide an opportunity to identify how well such components are implemented and the factors that enable good implementation.

Box 1. A summary of implementation research in ECD
What is known
- Descriptions of the implementation process during feasibility and pilot studies have benefited decisions about program modification.
- Evaluations of the quality of a program in relation to outcomes can help to identify ways to improve the program.
- The content and methods of teaching/learning relate to outcomes.
- The capacity of delivery agents is considered a central determinant of program effectiveness and so warrants careful documentation.
- Reporting guidelines and a menu of measures help to systematize the collection of implementation process information and its reporting.
- The field has advanced more rapidly since people started to use existing knowledge on implementation to build on currently existing programs and to improve programs.

Recommendations for the immediate future
- Before implementing a new program, the best strategy is to modify an existing evidence-based program rather than to develop an entirely new one.
- Reporting the implementation process using reporting guidelines will inform implementers and researchers in the field of ECD.
- Reporting on the use of quantitative and qualitative (mixed-method) measures and their reliability will enhance comparability across studies.
- Follow-ups to implementation findings will inform program improvement and policy.
- Implementation reports on how small-scale effective programs are scaled up at regional and national levels will be critical as we move into a new stage with expanded reach.
- Practical guides are needed for assessing implementation at different stages of program development.
- Integrated programs for child development are seen as preferable.

Longer term goals
- With more data available, we can identify characteristics of effective programs by associating implementation features with outcomes.
- With more data available, systematic reviews of implementation studies will analyze determinants of common implementation parameters.
- Publish implementation research findings in visible outlets.

First known
Descriptions of the implementation process have benefited decisions about the feasibility of a program and the features to be modified. Some programs can be replicated as is, using similar messages, methods of delivery, and workforce. However, when moving to a new continent or culture, the principles of adaptation and flexibility are more important than exact replication. Most programs have benefited from adaptation followed by a feasibility study, with careful attention to the concerns of delivery agents who must fit the new activity into their existing workload. For example, the feasibility of implementing an adapted version of the program Care for Child Development in Malawi was based on implementation research with two delivery formats. The program was well received by caregivers despite moderate attendance, but the main barrier to reach was the workload of providers. 7 Likewise, work overload among health providers was an issue when implementing the Reach Up parenting visits in Brazil. 8 For these and other reasons, it is generally accepted that pilot, feasibility, and other forms of formative research provide necessary insights for program design. 9

Second known
Evaluations of the quality of a program in relation to outcomes can help to identify ways to improve the program. Flexibility for the purpose of quality improvement trumps fidelity at early stages of program development. For example, measures of preschool quality have been used in many parts of the world to improve program pedagogy and resources. The case study of aeioTU in Colombia provided details on how teacher training made better use of interactions with children to enhance reasoning and language. 10 Similarly, Bangladeshi and Indonesian preschool programs underwent improvements over several years based on findings that showed low quality of literacy and math teaching/learning. 11,12 As quality improved, so did outcomes. The quality of parenting programs, both home visiting and group sessions, lacks a standard measure; however, implementers have been converging on similar components. 13 These include ensuring that providers address mothers directly and that mothers have opportunities to practice new behaviors with their child and to solve problems they encounter when trying to do so. Other useful indicators of implementation include whether the dosage, reach, acceptability, and cost were appropriate.

Third known
The content and methods of teaching/learning relate to outcomes. If the desired outcome is enhancement of mental development, input should include provision of responsive stimulation and warmth. This is most clear in nurturing care programs for caregivers, where nutrition and hygiene by themselves are unlikely to have the impact that provision of stimulation has. 14,15 The content of parenting programs should therefore include messages about responsive talk and play with young children, and the best method of adult teaching/learning includes active practice and problem-solving. Likewise, the content of preschool programs should include language, numbers, nature, and motor activities, as well as integration across content areas. The methods of teaching/learning in preschool should not be based on rote repetition, but based on stimulating teacher-child interactions enhanced through the use of varied groupings throughout the day (e.g., individual, small group, and large group). Another critical method includes free-choice indoor play with a variety of stimulating materials; creative, motivated, self-directed, and engaged play is enhanced with playmates and playthings. Logic models are one way of outlining the proposed theory of change, but designers have benefited from more precisely outlining the psychosocial processes that connect activities to short-term outcomes (e.g., active learning activities lead to new habitual behaviors being acquired and perfected, thus resulting in sustained new parenting practices).

Fourth known
The capacity of delivery agents is considered a central determinant of program effectiveness and so warrants careful documentation in terms of training, supervision, and assessed competencies. Countries differ in the level of education and experience of their workforces; consequently, delivery agents will not all meet the same standard when implementing an ECD program. Training manuals and competency assessments have proved useful when comparing across contexts, as has information about the conditions under which delivery agents work (e.g., wages and workload). Analyses show that the competence of delivery agents, bolstered by supportive supervision, is associated with quality implementation and effective outcomes. 10,16

Fifth known
Reporting guidelines and a menu of measures help to systematize the collection of implementation process information and its reporting. Reporting guidelines for randomized, nonrandomized, and observational research have helped to ensure that all the information required to evaluate a study is reported. 17 Nonetheless, the information is usually insufficient to replicate an intervention, adapt it to another setting, or explain different outcomes. In current implementation papers, only partial information is available. With reporting guidelines, implementers can plan ahead which variables to measure, when, and with what method. 5,6 Outcomes are becoming more comparable as researchers use a menu of child development measures based on direct assessment or parent report. 18 The same progress on implementation measures should follow once ECD implementers start to use similar measures of process quality, provider competence, and stakeholder engagement.

Sixth known
The field has advanced more rapidly since people started to use existing knowledge on implementation to build on currently existing programs and to improve programs. Effective programs exist for disadvantaged populations living in low- and middle-income countries (LMICs). When implementers and researchers build on each other's programs, they benefit from knowing what was well implemented and what was poorly implemented. We continue to learn from mistakes, from programs that were incorrectly implemented, 19 and from programs that were not effective. 20,21 There is a great deal of room for innovation in different contexts, but practice, like science, benefits from understanding the details of what has gone on before.

First recommendation
Before implementing a new program, the best strategy is to modify an existing evidence-based program rather than to develop an entirely new one. Learning how others have embedded their program in an existing platform, such as health, education, cash transfers, or social entrepreneurship, is also helpful. Curriculum and operational manuals exist or can be obtained from practitioners and studied before selecting the one most suitable to the context. A sufficient number of programs have been implemented in LMICs to allow organizations to consider what is unique about these contexts, what distinguishes them from highly resourced contexts, and what could be usefully adapted from a high-resource program. A situation analysis of the needs and enablers in the context will help determine the goals of the program, from which a logic model can be developed. Engaging with stakeholders will be important in order to adapt the program to the local context and to leverage existing resources. To help others in their attempts to select and adapt a program, the field requires many more replication and/or adaptation publications with details on how and why decisions were made at the front end of program delivery.

Second recommendation
Reporting the implementation process using specific reporting guidelines will inform implementers and researchers in the field of ECD. 5 Such guidelines are aspirational, and not all variables will have been accounted for. Planning what and how to measure each step in the process of implementation will enhance the value of a given report. 6 Comparisons with other similar programs help to draw attention to implementation features that overlap, and others that are unique.

Third recommendation
Reporting on the use of quantitative and qualitative (mixed-method) measures and their reliability will enhance comparability across studies. Because most measures and methods have not yet been standardized in this field, detailed information on who collected the data, when, from whom, and how will be important not only to inform decisions but also to help other implementers select measures and train data collectors. In particular, implementers search for measures that are concise and easy to use, yet provide valid summary information that can be compared with other implementations, can be used to inform decisions and improve program quality, and is associated with outcomes.

Fourth recommendation
Follow-ups to implementation findings will inform program improvement and policy. Specifically, it is important to know how and what decisions at the program and policy levels were made based on implementation findings. For example, improvements made to programs in the wake of outcome and implementation findings are helpful to others, especially after a pilot or feasibility study and even after an effectiveness or transition-to-scale study. Organizations and governments need to know the data and findings that warned of a problem, when the problem was identified, what kind of course correction was envisioned, and who was involved in making the decision. Policies regarding the training and supervision of service providers, delivery platforms, and administrative responsibilities may change based on implementation findings. Case studies are often the format used to report such follow-ups, though a streamlined set of guidelines for reporting on the use of implementation findings to inform policy and programs would systematize knowledge. Existing publications on follow-up programs that were evaluated and found to be more effective than the initial one rarely elaborate on the process of change.

Fifth recommendation
Implementation reports on how small-scale effective programs are scaled up at regional and national levels will be critical as we move into a new stage with expanded reach. Issues of workforce training, government involvement, and monitoring to maintain quality are challenges when scaling and sustaining these programs. Early childhood education at the preschool level is currently receiving attention as governments act to meet Sustainable Development Goal 4.2, which calls for quality experiences for children in the 3- to 6-year age group before they enter primary school. Measures of child development and preschool quality are being trialed by governments, and teacher certification programs are being implemented. Similar progress for programs addressing child development from birth to 3 years is on the horizon. Programs for these very young children may be integrated within the health system, where an aligned set of programs targets mothers during antenatal visits, community workers during the postnatal year, and fathers throughout.

Sixth recommendation
Practical guides are needed for assessing implementation at different stages of program development. Therefore, we need details on program designs and on indicators of implementation from the initial stages of program development (e.g., pilot and feasibility) through effectiveness and transition-to-scale studies. Different stages of program development and scale may require moving away from indicators of program quality that are now based on efficacy and effectiveness trials and emphasize internal validity, toward ones that include external validity, reach, feasibility, and cost. 22 In other words, while minimizing biases is important, implementation studies also need to be evaluated in terms of external validity. External validity concerns how well the program fits the context and needs of the population: What are the situational conditions in which the program is embedded and which, in turn, define the scope of its application? Assessing external validity requires a description and evaluation of the system before an ECD program is introduced (e.g., workforce, demand for ECD, and policies) and of how the system responded to the program's introduction. To assess the generalizability of the program, researchers and implementers need a list of specific indicators that allow for comparability across countries. The indicators might include, but are not limited to, the Human Development Index, proportion living in rural areas, wealth/assets, parental schooling, antenatal care, facility delivery, women's empowerment, children's stunting, breastfeeding, preprimary enrollment, and government expenditure on health and education.

Seventh recommendation
Integrated programs for child development are seen as preferable. Reports on integrated, or bundled, programs that combine, for example, stimulation, nutrition, WASH (water, sanitation, and hygiene), and maternal well-being 23 will be useful to practitioners, allowing the cost-benefits of integration to be assessed. This will permit an evaluation of critical components as programs expand their goals and reach, such as the workload for providers and caregivers, cost savings, and links among children's health, growth, and mental development.

Longer term goals
First long-term goal
With more data available, we can identify characteristics of effective programs by associating implementation features with outcomes through estimation methods. This will help identify the core elements of a program that are most strongly and consistently associated with positive outcomes. 24 Very preliminary correlational analyses identified features of programs in LMICs that enhanced child development outcomes, including formats that provided small media and opportunities for demonstrations, practice, and problem-solving. 13 Kirby and colleagues took this a step further when they undertook an analysis of group curriculum-based HIV prevention programs for youth. 25,26 They identified features of the curriculum and its implementation that could be individually scored and related to behavioral outcomes. The common characteristics of effective programs included adaptation to the context based on a situation analysis, a minimal and clear statement of the behavior to be changed, pilot testing of the program, instructional activities for active learning, high fidelity, and ongoing activities to expand reach. The findings yielded a tool whereby practitioners could ask themselves questions about the characteristics of a program they were selecting or had already implemented, to increase their chances of success. Although the findings may be unique to HIV programs, the process of identifying implementation features of effective ECD programs should be similar. More direct comparisons can be made between core elements of programs if they are built into the design of the impact evaluation study. This has been accomplished in studies comparing nutrition and stimulation interventions, and in studies comparing different delivery formats and providers for stimulation programs.

Second long-term goal
With more data available, systematic reviews of implementation studies will analyze determinants of common implementation parameters, such as delivery mode and the ratio of providers to families. This is the science of implementation, which seeks to identify features associated with good implementation. Durlak and DuPre 27 examined 23 factors that affect the implementation of child and adolescent health promotion programs, covering community, provider, and innovation features of the program itself. Fidelity and dosage were the common implementation features measured in this research, and they were consistently determined by policy, funding, providers' skill proficiency, adaptability of the program, and its compatibility with the context. It will likewise be important to identify determinants of ECD program implementation. Although fidelity and dosage are two important implementation variables for all programs, quality and reach must also be considered as programs are scaled up.

Third long-term goal
The field of ECD needs consistent outlets where implementation studies are shared with other researchers and policymakers. The set of guidelines proposed in this series provides standards for the field to use. We seek to foster a culture of high-quality implementation evaluations in publications and advocate for greater visibility of implementation research in peer-reviewed journals.

Conclusion
We hope that a new generation of nurturing care programs will benefit from these recommendations. The majority of recommendations are ones that all researchers and implementers can adopt in their current work. Future goals are worth keeping in mind as the field progresses. We are mindful of the many challenges that implementers face as they seek to expand scale and sustainability of programs while enhancing demand and program quality and attending to costs. Reporting these challenges and how they were managed will further future work. The systematic reporting of implementation process findings should in turn lead to greater understanding of past successes and barriers. The result will be higher quality programs with stronger impact, reach, and sustainability.

Acknowledgments
The contribution of F.E.A. was to write the first draft of the manuscript. A.K.Y. and M.N. provided substantive input to its intellectual content. P.B. made comments on the penultimate draft.
This paper was invited to be published individually and as one of several others as a special issue of Ann. N.Y. Acad. Sci. (1419: 1-271, 2018). The special issue was developed and coordinated by Aisha K. Yousafzai, Frances Aboud, Milagros Nores, and Pia Britto with the aim of presenting current evidence and evaluations on implementation processes, and to identify gaps and future research directions to advance effectiveness and scale-up of interventions that promote young children's development. A workshop was held on December 4 and 5, 2017 at and sponsored by the New York Academy of Sciences to discuss and develop the content of this paper and the others of the special issue. Funding for open access of the special issue is gratefully acknowledged from UNICEF and the New Venture Fund.