La valutazione da vicino [Evaluation Up Close] – politichepubbliche.org
"Evaluation is an objective process of understanding how a policy or other intervention was implemented, what effects it had, for whom, how and why."

1.1. Choosing the policies, programmes and projects (ppp) to evaluate

The choice is generally left to a political body. The technical expert can nevertheless steer it.

Evaluability Assessment: an assessment of the extent to which an intervention can be evaluated in a reliable and credible fashion.

Criteria: the evaluation must be
• useful (useful and used)
• comprehensive
• proportionate
• practicable
1 Planning and Designing Useful Evaluations
Time and resources for analysis are limited. Criteria for setting priorities:
• materiality or the size of the program;
• risk to clients, stakeholders, the agency and Government;
• alignment with agency and government priorities;
• complexity of delivery or uncertainty about program results; (..)
• external requirements for review (such as programs subject to a Sunset Clause); (..)
1.2. Criteria for designing the evaluation: Rigour, Utility, Feasibility and Ethics in Program Evaluation

Once the ppp to evaluate have been selected, the most frequently cited criteria for the design of the evaluation are:
• 'Rigour' in evaluation refers to the quality of the evidence, and the validity and certainty around the findings. For results-driven evaluations in particular, rigour includes assessing the extent to which observed results were due to the program.
• 'Utility' refers to the scope for evaluation users to actually use the findings, particularly when information is needed at a certain time to inform decisions.
• 'Feasibility' refers to the practicalities of collecting evidence in relation to the maturity of the program, and to the availability of time, skills and relevant data.
• 'Ethics' refers to reducing the risk of harm from the evaluation and also doing the evaluation in a culturally appropriate way.
1.3. Prerequisite for a good evaluation: Building Evaluation into the Design of Programs
OMB Director Peter Orszag: new initiatives should ideally have“evaluation standards built into their DNA” (OMB 2009b)
Legislation can encourage stronger and more cost-effective evaluation in many ways. One is through language that recognizes the importance of conducting rigorous evaluations. Another is by making sure already-collected program data are made available for such statistical and analytical purposes.
To be able to evaluate a policy ex post, the initial phases must yield:

1. a clear 'theory of change', a policy theory: that is, a precise definition of which mechanisms should be activated to produce the expected results, and why
2. SMART results (see the next slides)
3. OVI (Objectively Verifiable Indicators). Measurements matter: "If you cannot measure it, you cannot improve it" (Lord Kelvin, first president of the International Electrotechnical Commission (1906) and inventor of the Kelvin scale)

The initial design must therefore already include some indicators to be used in the ex post evaluation phase.
Prerequisite: the importance of policy design
'SMART' results: a describable and measurable change that is derived from a cause-and-effect relationship. 'SMART' results are the same as outcomes and are defined as Specific, Measurable, Attainable, Relevant and Time-bound.

Criteria and description:
• Specific: clear and well defined.
• Measurable: the need for concrete criteria for measuring progress and to know when it has been achieved.
• Attainable: is there a realistic path to achievement? Neither out of reach nor below standard performance.
• Relevant: choosing results that matter within the constraints of resources, knowledge and time. That is, results that will drive the program forward.
• Time-bound: a reasonable timeframe to achieve the goal. A time-bound result is intended to establish a sense of urgency. For example, can data be collected to ensure that it aligns with the required reporting timelines?

Wikipedia 2013, 'SMART' Criteria http://en.wikipedia.org/wiki/SMART_criteria
Indicators: the RACER criteria

• Relevant: is there a clear link between the indicator and the objective to be reached?
• Accepted: have they been discussed with staff, and their shortcomings and interpretation agreed?
• Credible: are they credible for reporting purposes? Easy to interpret?
• Easy: is data/info cheaply available and easy to monitor?
• Robust: are indicators safe against manipulation?
Choosing the methods:
• quantitative
• qualitative
• mixed

Mixed methods — SINGLE METHOD IS NOT SUFFICIENT ANY MORE

Public policies, programmes and service delivery operate in increasingly complex and ever-changing social, economic, ecological and political contexts. No single M&E methodology can adequately describe and analyze the interactions among all of these different factors.

Mixed methods allow for triangulation – or comparative analysis – which is better suited to capture complex realities and to provide different perspectives on the effect of public policies, programmes or service delivery.
The principals of evaluation: three different clients

Who should formulate the evaluation questions?
Who should design and implement the evaluation research process and take responsibility for its outputs?
Three alternative models of evaluation:
1. External evaluation
2. Internal evaluation
3. Participatory evaluation
1. External and internal evaluation
Both are based on a sharp distinction between the evaluators and the evaluated, and both rely on conventional social science research methods.
External evaluation
Independent evaluations are based on a clear demarcation between those who conduct the evaluation and those who are the object of evaluation.
• The evaluators stand outside the evaluated activities and have no stake in the outcome of the evaluation.
• Stakeholder participation tends to be limited.
In a stronger sense, an evaluation is independent when the organisation that formulates the evaluation questions and recruits the evaluators is also independent of the evaluated activity.
Internal evaluations
The evaluators are organisationally attached to the evaluated activities. In many cases there are institutional safeguards protecting the independence and integrity of internal evaluators.
Advantages:
• internal evaluators tend to have a better understanding of the organisation to be evaluated.
• they are in a better position to facilitate processes of use, learning, and follow-up.
Disadvantages:
• they have less credibility in relation to external audiences. When accountability is the purpose of the evaluation they cannot replace external evaluators.
Participatory evaluation
The distinction between expert and layperson, researcher and researched, is de-emphasised and redefined. Participatory evaluations are led by professionals: they are mainly facilitators and instructors helping others to make the assessment.
Participation means putting ordinary people first, and redefining the roles of experts and laypersons.
– 1. People have a voice in matters that affect their interests.
– 2. Participation helps mobilise local knowledge.
Participatory evaluation
• serves as an instrument of downward accountability and popular empowerment.
• increases the effectiveness of efforts by mobilising popular knowledge.
• strengthens the participants' sense of ownership with regard to the evaluated activities.
As it allows participants to engage in open and disciplined reflection on questions concerning the public good, it is seen as a kind of self-education in democratic governance. (See policy inquiry: analysis as reflexivity.)

Distinction between:
• the general concept of stakeholder participation
• a more narrow concept of popular participation.
The best way to promote popular participation is to strengthen the element of participation in the preceding stages of the intervention process.
Necessary conditions for the successful use of participatory approaches:
• Shared understanding among beneficiaries, programme staff and other stakeholders of programme goals, objectives and methods.
• Willingness among programme partners to allocate sufficient time and resources to participatory monitoring and evaluation.
• A bottom-up methodology: adopting a participatory approach can be difficult if the intervention has been planned and managed in a top-down fashion.
• A reasonably open and egalitarian social structure: where local populations are internally divided by distinctions of class, power and status, a participatory approach is not likely to be successful.
Types of evaluation (see US, 2014)

1. "Process" evaluation
Analyzes the effectiveness of how programs deliver services relative to program design, professional standards, or regulatory requirements. (..) Process evaluations help ensure that programs are running as intended, but in general these evaluations do not directly examine whether programs are achieving their outcome goals (GAO 2011).

2. "Performance" measurement
Performance measurement is a broader category that encompasses "the ongoing monitoring and reporting of program accomplishments, particularly progress toward pre-established goals" (GAO 2011). Typically, performance measures provide a descriptive picture of how a program is functioning and how participants are faring on various "intermediate" outcomes, but do not attempt to rigorously identify the causal effects of the program. For instance, performance measures for a job training program might capture how many individuals are served, what fraction complete the training, and what fraction are employed a year later. But these measures will not answer the question of how much higher these individuals' employment rates are as a result of having completed the training. Nonetheless, performance measures serve as important indicators of program accomplishments and can help establish that a program is producing apparently promising (or troubling) outcomes.

3. "Impact" evaluation
Aims to measure the causal effect of a program or intervention on important program outcomes.
2 Approaches to Evaluation and Key Questions
From analysis to evaluation: the logic model
[Diagram: the evaluated policy transforms INPUTS through PROCESSES into OUTPUTS, OUTCOMES and IMPACT, with EXTERNAL VARIABLES intervening. Economy applies to inputs, efficiency to outputs, effectiveness to outcomes; inputs, processes and outputs fall under the agency's internal control.]
The aspects under evaluation fall into the following categories:

• Input — the capacity to acquire the necessary resources according to criteria of economy: financial resources, human resources, ICT, knowledge resources, procurement of goods and services (direct, outsourcing..), communication..
• Process — the capacity to manage complexity: interactions with stakeholders, allocative conflicts, the surprises of implementation..
• Products (Outputs) — the capacity to deliver the required products according to criteria of efficiency: goods, services, capacities, information..
• Results (Outcomes) — the capacity to provide solutions to problems according to criteria of effectiveness: adequacy, timeliness, satisfaction of the beneficiaries.
• Impact — the capacity to change the context or the beneficiaries' living conditions in a stable way: effects lasting over time, effects that withstand adverse conditions...
Given the importance of the logic model, here is a second visualization to make the underlying scheme clear:

• Input — financial resources, human resources, the formal clarity of the mandate, the arrangements for procuring goods and services from outside, investments in ICT
• Process — leadership, consensus, coordination, timeliness...
• Output — goods, services, regulations, maintenance...
Input, process and output are the object of internal performance evaluation (efficiency, output; short time horizon).
• Results — for the beneficiaries, for the socio-economic context, for the physical environment, for the administration's capacities...
• Impact — long-run, 'net of' external dynamics
Results and impact are the object of programme evaluation (effectiveness, results, impact; long time horizon).
2. Performance measurement
This exercise can be carried out from analytical perspectives other than that of ppp evaluation:

• the 'budget' perspective — performance budgeting: the "Pay for Success" approach, a form of budgeting that links allocated funds to measurable results;
• the 'management' perspective — organizational performance: methods for increasing the organisation's alignment with the objectives it has set for itself.
2. Performance measurement: risks
Goodhart's law is named after the banker who originated it, Charles Goodhart: "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes" (Goodhart's original 1975 formulation, reprinted on p. 116 in Goodhart 1981). Its most popular formulation is: "When a measure becomes a target, it ceases to be a good measure." The law is implicit in the economic idea of rational expectations (Wikipedia).
Risks — an example: the method for funding universities
[Diagram, from a slide by Lori Beaman, Impact Evaluation: An Overview: the government/program production function turns INPUTS into OUTPUTS; users meet service delivery at the level of OUTCOMES; IMPACTS are confounded by local, national and global effects — hence the difficulty of showing causality.]
The hardest, decisive evaluation is the final one: impact evaluation.

Can we really attribute to our intervention the changes observed among the beneficiaries, or should these effects rather be attributed to factors external to the project?
Impact evaluation = a form of stable outcomes evaluation that assesses the net effect of a program: compare program outcomes with an estimate of what would have happened in the absence of the program.

S. Kay Rockwell, A Hierarchy for Targeting Outcomes and Evaluating Their Achievement http://deal.unl.edu/TOP/
Impact evaluation
This part of the evaluation tries to answer precisely the hardest question: is it justified to attribute to the policy the changes observed after its implementation?

It is wrong to infer the success of the policy from changes in the outcome: no post hoc ergo propter hoc.

Counterfactual logic seeks to avoid this error: the effect of an intervention is the difference between what is observed in the presence of the intervention and what would have been observed in its absence (the counterfactual).
Estimation of Causal Effects of a Program or Intervention
The 'treatment' metaphor
Consider a treatment delivered at the individual level: either the individual received the treatment, or did not. The difference between the potential outcome if the individual received the treatment and the potential outcome if the individual did not is the effect of the treatment on the individual.
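In the standard potential-outcomes notation (added here for clarity; the symbols are not in the original slides), the individual treatment effect is:

```latex
\tau_i = Y_i(1) - Y_i(0)
```

where \(Y_i(1)\) is individual \(i\)'s outcome with the treatment and \(Y_i(0)\) the outcome without it. For any given individual only one of the two terms is ever observed.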
Estimation of Causal Effects of a Program or Intervention
The problem of not observing the counterfactual outcome
The challenge of estimating this treatment effect stems from the fact that any given individual either receives the treatment or does not (for example, a child either does or does not attend preschool). Thus, for any given person only one of two potential outcomes can be observed. The fact that we cannot directly observe the counterfactual outcome (for example, the earnings a person who went to preschool would have had if they had, in fact, not gone to preschool) implies that we cannot directly measure the causal effect. Randomization provides a solution.
Estimation of Causal Effects of a Program or Intervention
Randomized experiments
Benefits and costs:
• it is difficult to select adequate control groups
• operational and regulatory constraints

Quasi-experiments
Because randomized experiments can be expensive or infeasible, researchers have also developed methods to use as-if random variation in what is known as a quasi-experiment. The necessary condition for a high-quality quasi-experimental design is that people are assigned to a treatment or control group in a way that mimics randomness. This can be done by forming treatment and control groups whose individuals have similar observable characteristics.
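A minimal simulated sketch of why randomization solves the missing-counterfactual problem (all numbers and variable names here are invented for illustration, not from the slides): with random assignment, treated and control groups are comparable on average, so the difference in observed group means estimates the average treatment effect even though no individual's counterfactual outcome is observed.

```python
import random
import statistics

random.seed(42)

# Each simulated unit has two potential outcomes, y0 (untreated) and y1
# (treated); in a real study only one of the two would ever be observed.
TRUE_EFFECT = 2.0
units = [{"y0": random.gauss(10, 3)} for _ in range(10_000)]
for u in units:
    u["y1"] = u["y0"] + TRUE_EFFECT  # constant effect, for simplicity

# Random assignment: a coin flip decides who is treated.
treated, control = [], []
for u in units:
    (treated if random.random() < 0.5 else control).append(u)

# Difference in observed means: treated outcomes vs control outcomes.
estimate = (statistics.mean(u["y1"] for u in treated)
            - statistics.mean(u["y0"] for u in control))
print(round(estimate, 2))  # close to TRUE_EFFECT
```

With 10,000 units the estimate lands within a few hundredths of the true effect; with small samples the same estimator would be noisy, which is one reason adequate sample sizes matter in randomized evaluations.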
Counterfactual analysis
[Diagram: results plotted over time from t=0 to t=1; the gap at t=1 between the observed effect and the counterfactual trajectory is the impact.]
An example. A local government allocates 10 million euros to offer grants to firms that convert their project-based collaborators to permanent contracts (5,000 euros per conversion). Within a few weeks all the funds are requested and disbursed, for 2,000 actual conversions. It looks like a great success.
But a researcher proposes a check against the number of conversions that would probably have occurred in the absence of any intervention: the number of conversions exceeding the trend drops to 700.
Another, even more curious researcher compares these with the conversions that, over the same period, occurred in economic contexts comparable to that of the intervention: the number attributable to the intervention falls to 200.
Each conversion attributable to the intervention therefore cost 50,000 euros (10 million / 200). In all likelihood, the money could have been spent better.
(Adapted from Valut-AZIONE – CAPIRe, December 2012, Gli incentivi dati alle imprese riescono a ridurre il precariato?, from a study by Alberto Martini and Luca Mo Costabella, http://www.capire.org/capireinforma/scaffale/valut-azione6122012.pdf)
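The arithmetic of the example can be restated in a few lines (the figure of 200 attributable conversions is the one implied by the slide's "10 mil./200" cost calculation):

```python
# Figures from the slide example (Valut-AZIONE / CAPIRe, December 2012).
budget = 10_000_000          # euros allocated by the local government
subsidy = 5_000              # euros paid per permanent-contract conversion
conversions_funded = budget // subsidy   # all funds disbursed: 2,000 conversions
above_trend = 700            # conversions exceeding the pre-existing trend
attributable = 200           # remaining after comparison with similar contexts

cost_naive = budget / conversions_funded  # cost per funded conversion
cost_true = budget / attributable         # cost per conversion the policy caused
print(conversions_funded, cost_naive, cost_true)  # 2000 5000.0 50000.0
```

The naive reading prices each conversion at 5,000 euros; the counterfactual reading prices each conversion the policy actually caused at 50,000 euros, which is the slide's point.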
Recent trends
1. Social Problem Solving
Criteria 7 (Participation) and 8 (Collaboration), above all, deserve particular attention.
Since the second half of the 1980s, problem-solving models in the field of public policy too have paid great attention to the assessments of the beneficiaries: individual people, families, firms, associations.
Citizens, as users and experimenters of public policies, accumulate extraordinary expertise about:
- the nature of problems
- their causes
- possible solutions
- how implementation works
- the assessment of results.
Their experiences can count as thousands of experimental observations made in the field.
3 Recent Trends
2. Closer attention to latent results
Methodologically, correctives have been formulated to compensate for the tendency, implicit in some applications of AVP, to give a negative balance of the programmes examined.
Appreciative evaluation, for example, makes it possible to bring latent positive results to light, so as to strengthen trust within the organisation, foster the spread of good practices, and increase credibility towards the outside.
In this logic, a third function joins the traditional functions of evaluation, learning and accountability: the creation of trust and social capital.
THE MICRO-NARRATIVE
What is it?
• the collection and aggregation of thousands of short stories from citizens, using special algorithms to gain insight into real-time issues and changes in society
Why is it innovative?
• information collected in the shape of stories is interpreted by the person who has told a story, therefore removing the need for - and the potential bias of - a third party to interpret the data; this meets a core challenge for M&E by reducing or eliminating potential biases of monitoring staff and evaluators (=process improvement)
• by using a large number of stories, this approach turns previously mostly qualitative data (e.g., in the form of a limited number of non-representative case studies included in an evaluation) into aggregated statistical data; the approach has the potential to replace traditional monitoring tools like surveys and focus groups (=catalytic)
• pattern-detection software for analyzing micro-narratives exists and the approach is already implemented in a number of countries and projects (=concrete)
How and when best to use it:
• when real-time quantitative information from a large number of beneficiaries is required that cannot otherwise be collected
General quality criteria:
CLEAR STATEMENT OF THE EVALUATION QUESTIONS
The readers can understand how the information in the report should be interpreted.

CLEAR PRESENTATION OF CRITERIA AND STANDARDS OF PERFORMANCE
The grounds for value judgements made in the report should be explicitly stated.

TRANSPARENT ACCOUNT OF RESEARCH METHODS
The report should include an account of sources of data and methods of data collection.

JUSTIFIED CONCLUSIONS
It should be possible for readers to follow each step of the argument leading from question to answer. Supporting evidence should be clearly presented, and alternative explanations of findings explicitly considered and eliminated.

IMPARTIAL REPORTING
The perspectives of all major stakeholder groups should be impartially reflected in the report. It must cover both strengths and weaknesses, and should not be written in a manner that suggests that it is totally unbiased and represents the final truth.

CLEAR STATEMENT OF LIMITATIONS
All studies have limitations. Therefore, an account of major limitations should normally be included in reports.
4. Reporting and Dissemination
Reporting format
from Sida's Evaluation Manual (Swedish International Development Cooperation Agency, Sida)
Recommended Outline
EXECUTIVE SUMMARY
INTRODUCTION
THE EVALUATED INTERVENTION
FINDINGS
EVALUATIVE CONCLUSIONS
LESSONS LEARNED
RECOMMENDATIONS
ANNEXES
EXECUTIVE SUMMARY
Summary of the evaluation, with particular emphasis on:
• main findings,
• conclusions,
• lessons learned and
• recommendations.
Should be short!
FINDINGS
Factual evidence, data and observations that are relevant to the specific questions asked by the evaluation.
EVALUATIVE CONCLUSIONS
Assessment of the intervention and its results against:
• given evaluation criteria,
• standards of performance,
• and policy issues.
LESSONS LEARNED
General conclusions that are likely to have a potential for wider application and use.
RECOMMENDATIONS
Actionable proposals to the evaluation's users for improved intervention cycle management and policy.
ANNEXES
• Terms of reference,• methodology for data gathering and analysis,• references, etc.
Some questions to be asked
Was there a specific objective for the evaluation – also to be found in the Terms of Reference (ToR)?
Were the ToR attached to the evaluation?
Were the qualifications of the evaluators explicitly stated?
Were there OVIs (Objectively Verifiable Indicators)?
Were there any specific references to Guidelines, Manuals, Methods in the ToR and in the Evaluation itself?
Is it clear from the document when, where and by whom the evaluation was made?
Was a base-line study needed? If so, was it carried out?
Has the issue of cost-effectiveness been dealt with? Is there any discussion of costs and benefits in the evaluation? (...)
Bibliography
Associazione per lo Sviluppo della Valutazione e l'Analisi delle Politiche Pubbliche (ASVAPP) (2012), Counterfactual impact evaluation of cohesion policy: Impact and cost-effectiveness of investment subsidies in Italy, http://www.prova.org/studi-e-analisi/ASVAPP%20CIE%20WP1%20FINAL%20REPORT.pdf
Beaman L., Impact Evaluation: An Overview, UC Berkeley,http://cega.berkeley.edu/assets/cega_events/31/Impact_Evaluation_Overview.ppt.
Hallam, A. (2011), Harnessing the Power of Evaluation in Humanitarian Action: An initiative to improve understanding and use of evaluation, The Active Learning Network for Accountability and Performance in Humanitarian Action (ALNAP), http://www.alnap.org/resource/6123.aspx
International Federation of Red Cross and Red Crescent Societies (2011), IFRC Framework for Evaluation, http://www.ifrc.org/Global/Publications/monitoring/IFRC-Framework-for-Evaluation.pdf
Médecins Sans Frontières (MSF) (2013), Evaluation Manual. A handbook forinitiating, managing and conducting evaluations in MSF,http://evaluation.msf.at/fileadmin/evaluation/files/documents/resources_MSF/Evaluation_Manual_April_2013_online.pdf
Molund, S. and Schill, G. (2004), Looking Back, Moving Forward: Sida Evaluation Manual, http://gametlibrary.worldbank.org/FILES/244_Evaluation%20Manual%20for%20Evaluation%20Managers%20-%20SIDA.pdf