Program
Monday, 12 May
morning (9.30-12.30)
Introduction to complex networks (Airoldi):
Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve collections of measurements on pairs of objects. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active “network community” and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online “networking communities” such as Facebook and LinkedIn, along with a host of more specialized professional network communities, have intensified interest in the study of networks and network data. In this workshop, I will review a few ideas that are central to this burgeoning literature. I will emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. I will conclude by describing open problems and challenges for machine learning and statistics.
RELATED PAPERS:
– https://arxiv.org/abs/0912.5410
– https://www.jmlr.org/papers/v9/airoldi08a.html
– https://proceedings.mlr.press/v22/azari12.html
– https://proceedings.neurips.cc/paper/2013/hash/b7b16ecf8ca53723593894116071700c-Abstract.html
– https://proceedings.mlr.press/v32/chan14.html
afternoon (14.30-17.30)
Machine learning and signal processing on graphs and networks (Schaub):
In this lecture we will provide a short introduction to machine learning on graphs and graph signal processing, covering both the fundamental theory and more pragmatic aspects, as well as some future directions.
Tuesday, 13 May
morning (9.30-12.30)
Design and analysis of experiments on social, healthcare and information networks (Airoldi):
Designing experiments that can estimate causal effects of an intervention when the units of analysis are connected through a network is the primary interest, and a major challenge, in many modern endeavors at the nexus of science, technology and society. Examples include HIV testing and awareness campaigns on mobile phones, improving healthcare in rural populations using social interventions, promoting standard of care practices among US oncologists on dedicated social media platforms, gaining a mechanistic understanding of cellular and regulation dynamics in the cell, and evaluating the impact of tech innovations that enable multi-sided market platforms. A salient technical feature of all these problems is that the response(s) measured on any one unit likely also depend on the intervention given to other units, a situation referred to as “interference” in the parlance of statistics and machine learning. Importantly, the causal effect of interference itself is often among the inferential targets of interest. On the other hand, classical approaches to causal inference largely rely on the assumption of “lack of interference”, and/or on designing experiments that limit the role of interference as a nuisance. Classical approaches also rely on additional simplifying assumptions, including the absence of strategic behavior, that are untenable in many modern endeavors. In the technical portion of this talk, we will formalize issues that arise in estimating causal effects when interference can be attributed to a network among the units of analysis, within the potential outcomes framework. We will introduce and discuss several strategies for experimental design in this context centered around a useful role for statistics and machine learning models. In particular, we wish for certain finite-sample properties of the estimator to hold even if the model catastrophically fails, while we would like to gain efficiency if certain aspects of the model are correct. We will then contrast design-based, model-based and model-assisted approaches to experimental design from a decision-theoretic perspective.
RELATED PAPERS:
– https://www.science.org/doi/10.1126/science.adi5147
– https://academic.oup.com/biomet/article/105/4/849/5066791
– https://www.pnas.org/doi/full/10.1073/pnas.2208975119
afternoon (14.30-17.30)
No lectures
Wednesday, 14 May
morning (9.30-12.30)
The Analysis of Diffusion Dynamics on Communication Networks (Gonzalez-Bailon):
Networks allow information to flow from person to person. Analyzing the structure that channels information flows can help uncover propagation mechanisms and pathways for diffusion, which can, in turn, help explain the chain reactions driving events like the spread of misinformation or calls for mobilization. In the digital realm, however, the structure of networks is not the only thing that matters: algorithmic curation and content moderation can both affect how things propagate online. In this lecture, I will discuss empirical research analyzing diffusion on networks, with a focus on social media and the ways in which they create the possibility for rapid, viral spread of content, contingent on interventions designed to control information flows. I will consider factors that are endogenous and exogenous to the network, and how data access (and the limitations imposed by existing data architectures) affects the conclusions we can reach with our empirical investigations.
RELATED PAPERS:
– Bakshy, Eytan, Itamar Rosenn, Cameron Marlow, and Lada A. Adamic. 2012. “The Role of Social Networks in Information Diffusion.” Proceedings of the 21st International Conference on World Wide Web. 519–28. https://doi.org/10.1145/2187836.2187907
– Baribi-Bartov, Sahar, Briony Swire-Thompson, and Nir Grinberg. 2024. “Supersharers of Fake News on Twitter.” Science 384:979–82. https://doi.org/10.1126/science.adl4435
– Goel, Sharad, Ashton Anderson, Jake Hofman, and Duncan J. Watts. 2016. “The Structural Virality of Online Diffusion.” Management Science 62:180. https://doi.org/10.1287/
– González-Bailón, Sandra, David Lazer, Pablo Barberá et al. 2024. “The Diffusion and Reach of (Mis)Information on Facebook during the U.S. 2020 Election”, in press.
– Juul, Jonas L. and Johan Ugander. 2021. “Comparing Information Diffusion Mechanisms by Matching on Cascade Size.” Proceedings of the National Academy of Sciences 118:e2100786118. https://doi.org/10.1073/pnas.2100786118
– Liben-Nowell, David and Jon Kleinberg. 2008. “Tracing Information Flow on a Global Scale Using Internet Chain-Letter Data.” Proceedings of the National Academy of Sciences 105:4633. https://doi.org/10.1073/pnas.0708471105
– Vosoughi, Soroush, Deb Roy, and Sinan Aral. 2018. “The Spread of True and False News Online.” Science 359:1146. https://doi.org/10.1126/science.aap9559
afternoon (14.30-17.30)
Short talks by students
evening (20.00)
Social dinner
Thursday, 15 May
morning (9.30-12.30)
Computational social science /2 – Signed Networks in Computational Social Science (Teixeira):
In this lecture, we will explore how signed networks provide a powerful framework for studying the interplay of positive and negative relationships in social systems, covering theory, methods, and applications. We will begin with the foundational theories and explore their implications for understanding human behavior and social dynamics. We will discuss how peer influence mechanisms shape triadic relationships and explore innovative methods, such as Multiscale Semiwalk Balance, for quantifying polarization and structural balance. The applications of signed network analysis will span diverse contexts, including the cognitive and emotional patterns expressed in suicide notes, the equilibrium properties of dynamically evolving social networks, and the evolution of international relationships, among others. Together, these topics will illustrate the relevance of signed networks for addressing real-world challenges, from modeling collective behavior to unfolding polarization and emotional dynamics in human interactions.
afternoon
No lectures
Friday, 16 May
morning (9.30-12.30)
Computational social science /3 – Collective Behavior: From Digital Traces to Generative Agents (Garcia):
This lecture will provide an overview of how social media data can be used to study and model human behavior. First, we will focus on the case of individual and collective emotions: how to collect data on emotional expression, how to measure emotions based on text, how to validate those measurements, and how to use them to test theories about the role of emotions in society. We will then continue with the case of polarization, in particular with respect to issue alignment and its manifestation in positive and negative interactions in online discussions. This will build the basis for the creation of generative models of human behavior, combining agent-based models with Large Language Models to calibrate the behavior of agents. We will explore this new generative agent approach with two examples: the growth of online social networks and the spontaneous emergence of cooperation and its limits.
RELATED PAPERS:
– https://arxiv.org/abs/2107.13236
– https://www.nature.com/articles/s41598-022-14579-y
– https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-023-00427-0
– https://academic.oup.com/pnasnexus/advance-article/doi/10.1093/pnasnexus/pgae276/7713083
– https://arxiv.org/abs/2409.02822