Wishing for Ex-Post Evaluation Christmas Lights
Rather than Needles in Haystacks
This is what the life of most ex-post evaluation researchers looks like, usually without anyone congratulating us on the count:
I recently spent three days searching nearly a dozen organizations for ex-post evaluations for a client, and was hard-pressed to find 16 actual ones. ‘Impact evaluations’ done in the middle of implementation tell us nothing about what was sustained after we leave, and neither do delayed final evaluations that simply happen to be completed after closure. While these (rightly) focus on cost-effectiveness, relevance, and efficiency, their measures of sustained impact are projections, not actual measures of which outcomes and impacts stood the test of time. I weeded out desk studies that never returned to ask anyone who participated. Others titled ‘ex-post’ were barely midterms (I can only assume they misconstrued ‘ex-post’ as after the start of implementation?), and a few more reports only recommended doing an ex-post evaluation after the final evaluation. For more lessons on how random and misconstrued ‘ex-posts’ can be, see Valuing Voices’ research for Scriven. None of these 16 actual ex-posts told us anything about what emerged (as we examine in Sustained and Emerging Impacts Evaluations) from local efforts in the years after assistance ended.
This is what I wish my ex-post haystack looked like: bountiful treasures of ex-post-project evaluations, as numerous as these Christmas lights here in Tabor, Czech Republic.
If we had more ex-posts, we could learn from what lasted. What could locals sustain? Why, or why not? How can we do better next time? We could compare across sectors and countries, and we could see which conditions and processes during implementation supported sustainability and, importantly, why some failed, so we don’t repeat those mistakes.
We could move from our current orange project-cycle slices that end at closure to the green of sustainability:
I will be adding the ones I found to our Catalysts list soon, but when my client asked me who held databases of ex-post evaluations, I had to say only Valuing Voices and Japan’s JICA (which, since 1993, even differentiates its ex-posts between Technical Grants and ODA Loans). This is not to say some cannot be found by trawling the OECD or the World Bank, but that is needle-in-a-haystack work again, so there are only two databases to learn from. Isn’t that shocking?
Now JICA has really upped the illumination ante, so to speak: they are now doing what they call ‘Ex-post Monitoring’, which was like Christmas come early! They return to learn at least 7 years after the ex-post evaluation (itself done 1-3 years after closure), as in this case of ex-post monitoring and learning from 10 projects (2007). They have done ex-post monitoring for a total of 91 cases, evaluating the sustained impacts of results, seeing whether JICA’s recommendations to their partners had been implemented, learning how partners had adapted to changes over a decade post-closure, and drawing lessons for new programming. “Ex-post monitoring is undertaken 7 years after a project was completed in principle in order to determine whether or not the expected effects and impacts continue to be generated, to check that there are no sustainability-related problems with the technical capacities, systems and finances of the executing agency nor with the operation and management of developed facilities, etc., and to ascertain what action has been taken vis-a-vis the lessons learned and recommendations gleaned during the ex-post evaluation.” While it was unclear why these specific projects were selected, it is amazing that they are doing 5-10 per year.
They are my ex-post gods and goddesses, and I fawned over two JICA evaluators at the last European Evaluation Society Conference. Why do I fawn? JICA lists 2,273 results under ex-post evaluations of Technical Cooperation, Grant Aid, and ODA Loans! They are literally the only organization I know whose reports found under an ‘ex-post’ search actually are ex-post.
What we can learn from returning again is illustrated by one of JICA’s water project loans in South Africa (RSA), which ended in 2003, had an ex-post in 2006, and was followed by monitoring of sustainability in 2013. While the report noted data-access issues and the evaluators expressed caution in attributing positive changes to the project, the project not only continued functioning; the South African government also solved barriers found at the ex-post:
- “Data for the supply and demand of water pertaining to the Kwandebele region could not be obtained. However, considering the calculation from the water supplied population and supplied volume and the result from the DWAF interview, water shortage could not be detected in the four municipalities studied by this project…” 
- “The ex-post evaluation indicated that the four components were not in the state to be operated and managed effectively. Currently, the components are operated and managed effectively and are operating under good condition [and] concerning sustainability, improvement can be seen from the time of ex-post evaluation. Shortage of employees and insufficient technical knowledge has been resolved…” 
- “Compared to the time of ex-post evaluation, improvement was seen in the under-five mortality and life expectancy. However, since the components implemented by this project are limited in comparison with the scope of the project, it is impossible to present a clear causal relationship” .
In another, JICA returned to Indonesia’s air-quality testing laboratories, a project which involved capacity building and equipment maintenance, 6 years after the ex-post; they found that training and equipment use mostly continued despite organizational changes and maintenance challenges:
- “After the ex-post evaluation, many of the target laboratories changed their affiliation from the Ministry of Public Works (MOPW) and MOH to provincial governments. While the relocation of equipment has been carried out in a handful of provinces, in other provinces equipment is still located at the laboratories where it was originally installed and these laboratories still have the right of use” 
- In spite of some irregularities ”As the Ministry of Environment (MOE) still has ownership of the equipment, some laboratories have inappropriate audit results that show allocation of O&M budget to equipment which is not included in their accounting…” 
- “Out of 20 laboratories where the questionnaire survey confirmed that equipment still remained, 15 laboratories replied that spare parts for equipment are still available but are difficult to obtain…It takes several months to one year to obtain spare parts, occasionally out of Indonesia, even if a repair service is available” .
In this case, there were lessons for JICA and Indonesia’s Ministry of Environment programs about equipment ownership and right use, and about retiring obsolete equipment. Talk about a commitment to learning from the ongoing success or failure of one’s projects!
As you have read here on Valuing Voices for more than six years, unless we include post-project evaluation that asks our participants and partners how well their lives and livelihoods were sustained, and even how resilient they were to shocks like political upheaval or climate change, we cannot say we are doing Sustainable Development. We need such lessons about what could be sustained and why.
We can prepare better to foster sustainability. In the coming months we are working on checklists to consider during funding, design, implementation, and M&E pre- and post-exit, to foster sustainability. We will keep you posted, but as World Vision also found: “Measuring sustainability through ex-posts requires setting clear benchmarks to measure success prior to program closure, including timelines for expected sustainment.”
And as my gift to you this Holiday Season, let me share World Vision’s Learning Brief about sustainability, with wise and provocative questions to ponder about dynamic systems, benchmarking, continuous learning, attribution, and managing expectations. World Vision shares how infrastructure, community groups, and social cohesion fared well, yet its lessons circle back to the need for JICA-like ‘monitoring’ and mirror rich ex-post lessons from FFP/Tufts (Rogers, Coates) and Hiller et al. that explain why we do ex-posts at all: “Project impact at the time of exit does not consistently predict sustainability.”
Now my gift: a few big lessons from six years of researching sustainability across the development spectrum. I have found no evaluations that were only positive. Most results trended downwards, a few held steady, and all were mixed. We cannot assume the sustainability of results at closure, nor trust optimistic projections, as we have seen in the climate arena. We should be:
- Designing with our participants and partners so that what we do is what they want and can sustain,
- Implementing with partners far longer to make sure things still work,
- Adapting exit based on benchmarks to see how well the resources, partnerships, capacities, and ownership have been transferred,
- Using control or comparison groups to make sure ‘success’ was due to you and being careful about attributing results to your projects while considering how you contributed to a larger whole of ongoing country progress or stagnation,
- Being willing to jettison what is unlikely to be sustained and learn from what we designed and implemented poorly (due to our design, their implementation, external conditions),
- Learning fast and adaptively, revising quickly as climate change shifts conditions,
- Remembering that without knowing what has been sustained we can neither replicate nor scale up,
- Sharing lessons with your leaders – for people’s lives depend on our work,
- Learning from what emerged as our participants and partners refashioned implementation in new ways they could sustain (without the millions we brought),
- Refocusing ‘success’ from how much we have spent, to how much was sustained.
Please make our next Christmas merry. Do MANY ex-post evaluations, Learn TONS, Share WIDELY WHAT WORKED AND FAILED TO WORK (you will be praised!), and let’s CHANGE HOW WE DO SUSTAINABLE DEVELOPMENT.
May 2020 bring health, happiness, and to all of us a more sustainable world!
 Cekan, J., Zivetz, L., & Rogers, P. (2016). Sustained and Emerging Impacts Evaluation (SEIE). Retrieved from https://www.betterevaluation.org/en/themes/SEIE
 JICA. (n.d.). Ex-post Monitoring. Retrieved December, 2019, from https://www.jica.go.jp/english/our_work/evaluation/oda_loan/monitoring/index.html
 Matsuyama, K. (2012). Ex-Post Monitoring of Japanese ODA Loan Project: South Africa, Kwandebele Region Water Augmentation Project. Retrieved from https://www.jica.go.jp/english/our_work/evaluation/oda_loan/monitoring/c8h0vm000001rdlp-att/2012_full_03.pdf
 Kobayashi, N. (2009, August). Ex-post Monitoring of Completed ODA Loan Project: Indonesia, The Bepedal Regional Monitoring Capacity Development Project. Retrieved from https://www.jica.go.jp/english/our_work/evaluation/oda_loan/monitoring/c8h0vm000001rdlp-att/indonesia2008_01.pdf
 Trandafili, H. (2019). Learning Brief: What does sustainability look like post-program? Retrieved from https://valuingvoices.com/wp-content/uploads/2019/12/Sustainability-Learning-Brief_final_WV-icons.pdf
 Rogers, B. L., & Coates, J. (2015, December). Sustaining Development: A Synthesis of Results from a Four-Country Study of Sustainability and Exit Strategies among Development Food Assistance Projects. Retrieved from https://www.fsnnetwork.org/ffp-sustainability-and-exit-strategies-study-synthesis-report
What should projects accomplish… and for whom?
An unnamed international non-profit client contacted me to evaluate their resilience project mid-stream, to gauge prospects for sustainable handover. EUREKA, I thought! After email discussions with them, I drafted an evaluation process that included learning from a variety of stakeholders (Ministries, local government, and the national university who were to take over the programming) about what they thought would be most sustainable once the project ended, and about how, over the next two years, the project could best foster self-sustainability by country nationals. I projected several weeks for in-depth participatory discussions with local youth groups and sentinel communities, people directly affected by the food-security and climate-change onslaught who had benefited from resilience activities, to learn what had worked, what hadn’t, and who would take what responsibility locally going forward.
Pleased with myself, I sent off a detailed proposal. The non-profit soon answered that I had not fully understood my task. In their view, the main task at hand was to determine what the country needed the non-profit to keep doing, so the donor could be convinced to extend their (U.S.-based) funding. The question became: how could I change my evaluation to feed back this key information for their next proposal design?
Maybe it was me, maybe it was the autumn winds, maybe it was my inability to sufficiently subsume long-term sustainability questions under shorter-term non-profit financing interests that led me to drop this. Maybe the often-unspoken elephant in the room is the need for some non-profits to prioritize their own organizational sustainability, ‘doing good’ via donor funding, rather than working for community self-sustainability.
Maybe donors and funders share this blame, needing to push funding out and prove success at any cost to get more funding, and so the cycle goes on. As a Feedback Labs feature on a Center for Effective Philanthropy report recently stated: “Only rarely do funders ask, ‘What do the people you are trying to help actually think about what you are doing?’ Participants in the CEP study say that funders rarely provide the resources to find the answer. Nor do funders seem to care whether or not grantees are changing behavior and programs in response to how the ultimate beneficiaries respond.”
And how much responsibility do communities themselves hold for not balking? Why are they so often ‘price-takers’ (in economic terms) rather than ‘price-makers’? As wise Judi Aubel asked in a recent evaluation list-serve discussion “When will communities rise up to demand that the “development” resources designed to support/strengthen them be spent on programs/strategies which correspond to their concerns/priorities??”
We can help them do just that by creating good conditions for them to be heard. We can push advocates to ensure the incoming Sustainable Development Goals (post-MDGs) reflect what recipient nations, more than funders, feel is sustainable. We can help their voices be heard via systems that enable donors and implementers to learn from citizen feedback, such as Keystone’s Constituent Voice practice (in January 2015 it is launching an online feedback data-sharing platform called the Feedback Commons) or GlobalGiving’s new Effectiveness Dashboard (see Feedback Labs).
We can do it locally in our fieldwork, shifting the focus from our expertise to theirs, from our powerfulness to theirs. In field evaluations we can use Empowerment Evaluation. We can fund feedback loops pre-RFP (request for proposals), during project design, implementation, and beyond, with the right incentives and tools for learning from community, local, and national-level input, so that country-led development becomes actual, not just a nice platitude. We can fund Valuing Voices’ self-sustainability research on what lasts after projects end. We can conserve project content and data in open data formats for long-term learning by country nationals.
Most of all, we can honour our participants as experts, which is what I strive to do in my work. I will leave you with a story from Mali. In 1991 I was doing famine-prevention research in Koulikoro, Mali, where average rainfall is 100mm (4 inches) a year. I accompanied women I was interviewing to a well 100m (300 feet) deep. They used pliable plastic buckets, and the first five women each drew up a bucket 90% full. When I asked to try, they seriously gave me a bucket. I laughed, as did they, when we saw that my bucket was only 20% full; I had splashed the other 80% out on the way up. Who’s the expert?
How are we helping them get more of what they need, rather than what we are willing to give? How are we prioritizing their needs over our organizational income? How are we #ValuingVoices?
 The Center for Effective Philanthropy. (2014, October 27). Closing the Citizen Feedback Loop. Retrieved December 2014, from https://web.archive.org/web/20141031130101/https://feedbacklabs.org/closing-the-citizen-feedback-loop/
 Better Evaluation. (n.d.). Empowerment Evaluation. Retrieved December 2014, from https://www.betterevaluation.org/plan/approach/empowerment_evaluation
 Sonjara. (2016). Content and Data: Intangible Assets Part V. Retrieved from http://www.sonjara.com/blog?article_id=135
Pineapple, Apple: what differentiates Impact from self-Sustainability Evaluation?
There is great news. Impact evaluation is getting attention and funding to do excellent research, by the International Initiative for Impact Evaluation (3ie) and by donors such as the World Bank, USAID, UKAid, and the Bill and Melinda Gates Foundation, in countries around the world. Better Evaluation tells us: “USAID, for example, uses the following definition: ‘Impact evaluations measure the change in a development outcome that is attributable to a defined intervention; impact evaluations are based on models of cause and effect and require a credible and rigorously defined counterfactual to control for factors other than the intervention that might account for the observed change.’”
William Savedoff of CGD reports in the Evaluation Gap newsletter that whole countries are setting up such evaluation institutes: "Germany's new independent evaluation institute for the country's development policies, based in Bonn, is a year old. DEval has a mandate that looks similar to Britain's Independent Commission for Aid Impact (discussed in a previous newsletter) because it will not only conduct its own evaluations but also help the Federal Parliament monitor the effectiveness of international assistance programs and policies. DEval's 2013-2015 work program is ambitious and wide-ranging, from specific studies of health programs in Rwanda to overviews of microfinance and studies regarding mitigation of climate change and aid for trade." There is even a huge compendium of impact evaluation databases.
There is definitely a key place for impact evaluations in analyzing which activities are likely to have the most statistically significant impact (that is, change unlikely to be due to chance alone). One such study in Papua New Guinea found that including SMS (mobile text) in teaching made a significant difference in student test scores compared to a non-participating 'control group' that did not get the texts. Another study, the Tuungane I evaluation by a group of Columbia University scholars, showed clearly that an International Rescue Committee program on community-level reconstruction did not change participant behaviors. The study was as well designed as an RCT can be, and its conclusions are very convincing. But as the authors note, we don't actually know why the intervention failed. To find that out, we need the kind of thick, descriptive qualitative data that only a mixed-methods study can provide.
Economist Michael Kremer of Harvard says: “The vast majority of development projects are not subject to any evaluation of this type, but I’d argue the number should at least be greater than it is now.” Impact evaluations use randomized controlled trials, comparing the group that got project assistance to a similar group that didn't, to gauge the change. A recent article on treating poverty as a science experiment says: "nongovernmental organizations and governments have been slow to adopt the idea of testing programs to help the poor in this way. But proponents of randomization—“randomistas,” as they’re sometimes called—argue that many programs meant to help the poor are being implemented without sufficient evidence that they’re helping, or even not hurting." However we get there, we want to know the real (or at least likely) impact of our programming, helping us focus funds wisely.
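For readers curious about what a claim of 'statistical significance' rests on, the arithmetic behind a treatment-versus-control comparison is simple enough to sketch. This toy example uses entirely hypothetical test scores (not data from any study cited here, and assuming an invented 5-point treatment effect) and computes Welch's t statistic for the difference in group means:

```python
import random
import statistics

def welch_t(treat, ctrl):
    """Welch's t statistic: difference in means over its standard error."""
    m1, m2 = statistics.mean(treat), statistics.mean(ctrl)
    v1, v2 = statistics.variance(treat), statistics.variance(ctrl)
    se = (v1 / len(treat) + v2 / len(ctrl)) ** 0.5
    return (m1 - m2) / se

random.seed(42)
# Hypothetical scores: control averages 60; treatment gets a +5-point boost.
control = [random.gauss(60, 10) for _ in range(200)]
treatment = [random.gauss(65, 10) for _ in range(200)]

t = welch_t(treatment, control)
print(round(t, 2))  # a |t| well above ~2 suggests the gap is unlikely to be chance
```

The point the critics above make still holds: even a large, convincing t value tells us *that* the groups differ, not *why*, and nothing about whether the difference survives once the project ends.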
Data gleaned from impact evaluations is excellent information to have before design and during implementation. While impact evaluations are a thorough addition to the evaluation field, experts recommend they be planned from the beginning of implementation. And while they ask “Are impacts likely to be sustainable?”, “To what extent did the impacts match the needs of the intended beneficiaries?”, and, importantly, “Did participants/key informants believe the intervention had made a difference?”, they focus only on possible sustainability, using indicators we expect to see at project end, rather than tangible proof of the sustainability of activities and impacts that communities define themselves and that we actually return to measure 2-10 years later.
That is the role for something that has rarely been used in 30 years: post-project (ex-post) evaluations, looking at:
The resilience of expected impacts of the project 2, 5, 10 years after close-out
Which activities the communities and NGOs were able to self-sustain themselves
Positive and negative unintended impacts of the project, especially 2 years after, while still in clear living memory
Kinds of activities the community and NGOs felt were successes which could not be maintained without further funding
Lessons across projects on what was most resilient, what communities valued enough to continue themselves or NGOs valued enough to find other funding for, as well as what was not resilient.
Where is this systematically happening already? There are our Catalysts, ex-post evaluation organizations drawing on communities' wisdom. Here and there are other glimpses of Valuing Voices, mainly used to inform current programming, such as these two interesting approaches:
Vijayendra Rao describes how a social observatory approach to monitoring and evaluation in India’s self-help groups leads to “Learning by Doing”– drawing on material from the book Localizing Development: Does Participation Work? The examples show how groups are creating faster feedback loops with more useful information by incorporating approaches commonly used in impact evaluations. Rao writes: “The aim is to balance long-term learning with quick turnaround studies that can inform everyday decision-making.”
Ned Breslin, CEO of Water For People, talks about “Rethinking Social Entrepreneurism: Moving from Bland Rhetoric to Impact (Assessment)”. His new water and sanitation program, Everyone Forever, does not focus on inputs and outputs such as water provided or girls returning to school. Instead it centers on attaining the ideal vision of what a community would look like with improved water and sanitation, and working to achieve that goal. Rather than focusing on fundraising alone, Breslin wants to redefine success as a world in which everyone has access to clean water.
We need a combination. We need to know how good our programming is now, through rigorous randomized controlled trials, and we need to ask communities and NGOs how sustainable the impacts are. Remember: 99% of all development projects, worth hundreds of millions of dollars a year, are not currently evaluated for long-term self-sustainability by their ultimate consumers, the communities they were designed to help.
We need an Institute of Self-Sustainable Evaluation and a Ministry of Sustainable Development in every emerging nation, funded by donors who support national learning to shape international assistance. We need a self-sustainability global database, mandatory to be referred to in all future project planning. We need to care enough about the well-being of our true client to listen, learn and act.