10:20am Welcome and Introduction
Institutional welcome from IMT School for Advanced Studies Lucca Director Prof. Pietro Pietrini

Methods Session (Chair Prof. K. De Witte, KU Leuven)

10:30am Selecting Subpopulations for Causal Inference in Regression-Discontinuity Studies
Prof. Fabrizia Mealli (University of Florence and Florence Center for Data Science), [Slides]

Extracting causal information from regression-discontinuity (RD) studies, where the treatment assignment rule depends on some type of cutoff formula, may be challenging, especially in the presence of big data. Following Li, Mattei and Mealli (2015), we formally describe RD designs as local randomized experiments within the potential outcome approach. Under this framework, causal inference concerns units belonging to some subpopulation where a local overlap assumption, SUTVA and a local randomization assumption (RD assumptions) hold. Unfortunately, we do not usually know the subpopulations for which we can draw valid causal inference. We propose to use a model-based finite mixture approach to clustering in a Bayesian framework to classify observations into subpopulations for which we can draw valid causal inference and subpopulations from which we can extract no causal information on the basis of the observed data and the RD assumptions. This approach has important advantages: it explicitly accounts for the uncertainty about subpopulation membership; it does not impose any constraint on the shape of the subpopulation; and it works properly in high-dimensional settings. We illustrate the framework in a high-dimensional RD study concerning the effects of the Bolsa Família program, a social welfare program of the Brazilian government, on leprosy incidence.

11:30am Recurrent Individual Treatment Assignment: A Treatment Policy Approach to Account for Heterogeneous Treatment Effects
Prof. Chris Van Klaveren (Vrije Universiteit Amsterdam), [Slides]

Longitudinal clinical trials generally assign individuals randomly to treatments at baseline and then evaluate how differential average treatment effects evolve over time. This study argues that longitudinal settings could benefit from focusing on Recurrent Individual Treatment Assignment (RITA) instead, particularly in the face of such (dynamic) heterogeneous treatment effects. Such a focus on optimizing treatment assignment, rather than treatment effects, ensures that unobserved heterogeneous treatment effects can be acknowledged. Overall treatment response then improves compared to treatment policies in longitudinal settings based on clinical-trial derived average treatment effects. This study develops a RITA algorithm and evaluates its performance in a multiperiod simulation setting, considering two alternative treatments and varying the extent of unobserved heterogeneity in individual treatment response. The results show that RITA learns quickly and adapts individual assignments effectively. If treatment heterogeneity exists, the inherent focus on both exploitation and exploration enables RITA to outperform a conventional assignment strategy that relies on clinical-trial derived average treatment effects.
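
As a rough illustration of the exploit-and-explore idea in the abstract above, the sketch below simulates two treatments over several periods and compares a per-individual epsilon-greedy rule with a fixed policy that gives everyone the treatment with the higher average response. It is not the RITA algorithm presented in the talk: the epsilon-greedy rule, the 40/60 split of responders, the noise level, and all names (simulate, epsilon, responds_to_a) are assumptions made for this example.

```python
import random


def simulate(n_individuals=200, n_periods=50, epsilon=0.1, seed=0):
    """Compare a per-individual epsilon-greedy policy with a fixed policy.

    Returns (adaptive_mean, fixed_mean): the average response per
    individual-period under each policy. All settings are hypothetical.
    """
    rng = random.Random(seed)
    adaptive_total = 0.0
    fixed_total = 0.0
    for _ in range(n_individuals):
        # Assumed unobserved heterogeneity: 40% respond to A, 60% to B,
        # so B has the higher average treatment effect.
        responds_to_a = rng.random() < 0.4
        true_mean = {"A": 1.0 if responds_to_a else 0.0,
                     "B": 0.0 if responds_to_a else 1.0}
        counts = {"A": 0, "B": 0}
        estimates = {"A": 0.0, "B": 0.0}
        for t in range(n_periods):
            if t < 2:
                arm = "AB"[t]  # try each treatment once
            elif rng.random() < epsilon:
                arm = rng.choice("AB")  # explore
            else:
                arm = max(estimates, key=estimates.get)  # exploit
            response = rng.gauss(true_mean[arm], 0.5)
            counts[arm] += 1
            # Running mean of this individual's observed responses per arm.
            estimates[arm] += (response - estimates[arm]) / counts[arm]
            adaptive_total += response
            # Fixed policy: everyone always receives B.
            fixed_total += rng.gauss(true_mean["B"], 0.5)
    n = n_individuals * n_periods
    return adaptive_total / n, fixed_total / n


adaptive_mean, fixed_mean = simulate()
print(f"adaptive: {adaptive_mean:.2f}, fixed: {fixed_mean:.2f}")
```

Under these assumed settings the adaptive rule should do better because it can discover the individuals who respond to the treatment with the lower average effect; without heterogeneity the two policies would be expected to perform similarly.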

12:30pm Lunch Break

Empirical Applications Session (Chair Prof. M. Smet, KU Leuven)

2:30pm Machine Learning for Zombie Hunting. Firms' Failures and Financial Constraints.
Prof. Massimo Riccaboni (IMT School for Advanced Studies Lucca), [Slides]

In this contribution, we exploit machine learning techniques to predict the risk of failure of firms. Then, we propose an empirical definition of zombies as firms that persist in a status of high risk, beyond the highest decile, after which we observe that the chances of transitioning to lower risk are minimal. We implement a Bayesian Additive Regression Tree with Missing Incorporated in Attributes (BART-MIA), which is especially useful in our setting as we provide evidence that patterns of undisclosed accounts correlate with firms' failures. After training our algorithm on 304,906 firms in Italy in the period 2008-2017, we show how it outperforms proxy models like the Z-scores and the Distance-to-Default, traditional econometric methods, and other widely used machine learning techniques. We document that zombies are on average 21% less productive and 76% smaller, and that they increased in times of financial crisis. In general, we argue that our application helps in the design of evidence-based policies in the presence of market failures, for example optimal bankruptcy laws. We believe our framework can help to inform the design of support programs for highly distressed firms after the recent pandemic crisis.
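
The zombie definition in the abstract above (firms that persist in the highest-risk decile) can be sketched as a simple flagging rule over predicted failure risks. This is only an illustration under stated assumptions: the three-year persistence window, the reading of "beyond the highest decile" as "above the ninth-decile cutoff", the toy risk scores, and the names flag_zombies and min_years are all hypothetical and are not taken from the talk. The sketch also assumes the years are consecutive and every firm appears in every year.

```python
from statistics import quantiles


def flag_zombies(risk_by_year, min_years=3):
    """Flag firms above the ninth-decile risk cutoff for min_years in a row.

    risk_by_year maps year -> {firm_id: predicted failure risk}.
    """
    streak = {}
    zombies = set()
    for year in sorted(risk_by_year):
        scores = risk_by_year[year]
        # Last of the nine cut points returned for n=10 is the ninth decile.
        cutoff = quantiles(scores.values(), n=10)[-1]
        for firm, risk in scores.items():
            # Extend the streak while the firm stays above the cutoff.
            streak[firm] = streak.get(firm, 0) + 1 if risk > cutoff else 0
            if streak[firm] >= min_years:
                zombies.add(firm)
    return zombies


# Toy data: 20 firms with made-up risk scores over three years.
base = {f"f{i}": i / 20 for i in range(20)}
recovered = dict(base)
# In 2016, firm f18 drops to low risk and firm f0 takes its place.
recovered["f18"], recovered["f0"] = base["f0"], base["f18"]
risk_by_year = {2015: base, 2016: recovered, 2017: base}

zombies = flag_zombies(risk_by_year)
print(sorted(zombies))
```

In the toy data, a firm that leaves the top decile for one year has its streak reset, so only firms with uninterrupted high risk are flagged.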

3:30pm Refining Pre-disaster Strategic Preparedness: a Causal Machine Learning Model for Identification of Communities facing the Highest Health Risks from an Impending Tropical Cyclone
Prof. Rachel C. Nethery (Harvard University), [Slides]

Climate change is expected to increase the intensity of tropical cyclones (TCs), making them an escalating threat to human health over this century. The risks are exacerbated by our current incomplete understanding of the full spectrum of TC health impacts. The goal of our work is to create a predictive tool that provides information in real time about the areas of highest health risk and the types of health risks anticipated for an impending TC threatening the United States, in order to maximize the protective impact of strategic preparedness efforts. Motivated by our enormous database of historic TC exposures and Medicare health records, we build a causal machine learning model that simultaneously (1) estimates county-specific causal effects of past TCs on numerous health outcomes and (2) uses these effect estimates to construct a predictive model that captures relationships between county and TC features and health impacts. This model enables prediction of county-specific health effects of an impending TC, accompanied by measures of predictive uncertainty, which can inform on-the-ground preparedness and response.

4:30pm Virtual Coffee & Tea Break

Doctoral Defense Session (Chair Prof. M. Riccaboni)

5:00pm Machine Learning in Social and Health Sciences
Falco J. Bargagli Stoffi (IMT School for Advanced Studies Lucca and KU Leuven), [Slides]

In this Dissertation, we deal with a series of applications of machine learning in the fields of social and health sciences. In particular, we introduce a set of novelties in the traditional usage of machine learning algorithms for predictive and causal inference tasks. In Part 1, we explore the field of machine learning for causal inference and we introduce two innovative techniques that combine state-of-the-art machine learning algorithms with causal inference methodologies. In the first Chapter, we introduce a novel Bayesian tree-based methodology to draw causal inference on heterogeneous effects in quasi-experimental scenarios. In the second Chapter, we account for possible drawbacks of tree-based methodologies by proposing a composite algorithm with a high level of interpretability and precision. In Part 2, we introduce applications of machine learning predictive power to forecast students' financial literacy scores and firms' financial distress. In the third Chapter, we innovate the applied machine learning literature by proposing a novel sensitivity analysis for predictions. Finally, in the fourth Chapter, we show how economic intuition can boost the performance of machine learning algorithms. The Dissertation contributes to the literatures on causal and predictive machine learning mainly by: (i) extending the current framework to novel scenarios and applications (Chapter 1 - Chapter 3); (ii) introducing interpretability in the learning models (Chapter 2 - Chapter 4); (iii) developing a novel methodology to assess the robustness of predictions (Chapter 3); (iv) informing the choice of technique with specific economic knowledge of the field of investigation (Chapter 4).