talks
upcoming talks
Latent contagion in low-rank networks
2023-11-27 @ 4 pm, Location TBD, Statistics Grad Student Seminar
Alex Hayes and Keith Levin
Last updated on 2023-10-20.
past talks
Latent contagion in low-rank networks
2023-10-11 @ 2 pm, SMI 133, Levin Lab Meeting Alex Hayes and Keith Levin
slides (private for the time being)
Peer diffusion over uncertain networks
2023-09-18 @ 12:30 pm, WID 1145, IFDS Ideas Seminar
Alex Hayes and Keith Levin
slides (private for the time being)
Estimating network-mediated causal effects via spectral embeddings
2023-08-09 @ 10:30 am, Recent Developments in Causal Inference, JSM 2023
Alex Hayes, Mark M. Fredrickson and Keith Levin
Causal inference for network data is an area of active interest in the social sciences. Unfortunately, the complicated dependence structure of network data presents an obstacle to many causal inference procedures. We consider the task of mediation analysis for network data, and present a model in which mediation occurs in a latent embedding space. Under this model, node-level interventions have causal effects on nodal outcomes, and these effects can be partitioned into a direct effect independent of the network, and an indirect effect induced by homophily. To estimate network-mediated effects, we embed nodes into a low-dimensional space and fit two regression models: (1) an outcome model describing how nodal outcomes vary with treatment, controls, and position in latent space; and (2) a mediator model describing how latent positions vary with treatment and controls. We prove that the estimated coefficients are asymptotically normal about the true coefficients under a sub-gamma generalization of the random dot product graph, a widely-used latent space model. We show that these coefficients can be used in product-of-coefficients estimators for causal inference. Our method is easy to implement, scales to networks with millions of edges, and can be extended to accommodate a variety of structured data. slides, pre-print
Estimating network-mediated causal effects via spectral embeddings
2023-05-24 @ 5:30 pm, Poster Session 1, ACIC 2023
Alex Hayes, Mark M. Fredrickson and Keith Levin
We consider the task of mediation analysis for network data, and present a model in which mediation occurs in a latent embedding space. Under this model, node-level interventions have causal effects on nodal outcomes, and these effects can be partitioned into a direct effect independent of the network, and an indirect effect induced by homophily. poster, pre-print
Thanks to the University of Wisconsin for funding a portion of my travel to ACIC with a Student Research Grants Competition-Combined Award.
Estimating network-mediated causal effects via spectral embeddings
2023-04-24 @ 12:30 pm, Orchard View @ the WID, IFDS Ideas Seminar
Alex Hayes, Mark M. Fredrickson and Keith Levin
Causal inference for network data is an area of active interest in the social sciences. Unfortunately, the complicated dependence structure of network data presents an obstacle to many causal inference procedures. We consider the task of mediation analysis for network data, and present a model in which mediation occurs in a latent embedding space. Under this model, node-level interventions have causal effects on nodal outcomes, and these effects can be partitioned into a direct effect independent of the network, and an indirect effect induced by homophily. To estimate network-mediated effects, we embed nodes into a low-dimensional space and fit two regression models: (1) an outcome model describing how nodal outcomes vary with treatment, controls, and position in latent space; and (2) a mediator model describing how latent positions vary with treatment and controls. We prove that the estimated coefficients are asymptotically normal about the true coefficients under a sub-gamma generalization of the random dot product graph, a widely-used latent space model. We show that these coefficients can be used in product-of-coefficients estimators for causal inference. Our method is easy to implement, scales to networks with millions of edges, and can be extended to accommodate a variety of structured data. pre-print, slides
Estimating network-mediated causal effects via spectral embeddings
2022-10-14 @ 4:15 pm in MSC 1210, Statistics Graduate Student Association Seminar
The last several years have seen a renewed and concerted effort to incorporate network data into standard tools for regression analysis, and to make network-linked data legible to practicing scientists. Thus far, this literature has primarily developed tools to infer associative relationships between nodal covariates and network structure. In contrast, we augment a statistical model for network regression with counterfactual assumptions and show how causal effects on a network can be partitioned into a direct effect that is uninfluenced by the network, and an indirect effect that is induced by homophily. slides
Estimating indirect effects induced by homophily via spectral network regression
2022-07-07, Tianxi Li and Can Le Joint Lab Meeting
The last several years have seen a renewed and concerted effort to incorporate network data into standard tools for regression analysis, and to make network-linked data legible to practicing scientists. Thus far, this literature has primarily developed tools to infer associative relationships between nodal covariates and network structure. In contrast, we augment a statistical model for network regression with counterfactual assumptions and show how causal effects on a network can be partitioned into a direct effect that is uninfluenced by the network, and an indirect effect that is induced by homophily. slides
distributions3
: From basic probability to probabilistic regression
2022-06-23, UseR 2022
Achim Zeileis, Moritz Lang and Alex Hayes
The distributions3
package provides a beginner-friendly and lightweight interface to probability distributions. It allows to create distribution objects in the S3 paradigm that are essentially data frames of parameters, for which standard methods are available: e.g., evaluation of the probability density, cumulative distribution, and quantile functions, as well as random samples. It has been designed such that it can be employed in introductory statistics and probability courses. By not only providing objects for a single distribution but also for vectors of distributions, users can transition seamlessly to a representation of probabilistic forecasts from regression models such as GLM (generalized linear models), GAMLSS (generalized additive models for location, scale, and shape), etc. We show how the package can be used both in teaching and in applied statistical modeling, for interpreting fitted models, visualizing their goodness of fit (e.g., via the topmodels
package), and assessing their performance (e.g., via the scoringRules
package). video, slides
The Low Hanging Fruit of the Twitter Following Graph
2021-08-11, Joint Statistical Meetings
Alex Hayes and Karl Rohe
In recent applied work on the Twitter media ecosystem, we have found that Twitter metadata (such as follows, likes, quotes, retweets, mentions, etc) is often more informative than the actual content of tweets themselves. The metadata, in some sense, is the right data to use for many inference tasks. In particular, we find that embedding the Twitter following graph is highly informative. However, collecting the following graph is rather challenging due to API rate limits, and storing graphs can also be challenging. We present some computational infrastructure to make access and storage of this high signal data more straightforward, and suggest that research progress would be well served by an increased focus on instrumentation. slides
Solving the model representation problem with broom
2019-01-25, rstudio::conf(2019)
The R objects used to represent model fits are notoriously inconsistent, making data analysis inconvenient and frustrating. The broom package resolves this issue by defining a consistent way to represent model fits. By summarizing essential information about fits in tidy tibbles, broom makes it easy to programmatically work with model objects. Combining broom with list-columns results in an especially powerful way to work with many model fits at once. This talk will feature several case studies demonstrating how broom resolves common problems in data analysis. video, slides
Solving the model representation problem with broom
2018-09-19, Statistics Graduate Student Seminar
slides
Convenient data analysis with broom
2018-11-30, RStudio Webinar Series
Broom is a package that converts statistical objects into tibbles. This consistent structure makes it easier to accomplish many standard modelling tasks. In this webinar I’ll demonstrate how to use to broom to work with many models at once. We’ll see how broom makes it easier to visualize models, work with bootstrapped fits and assess model diagnostics. video, slides
Solving the model representation problem with broom
2018-09-19, Madison R User Group
slides
slides from various informal presentations
Identifiability of homophily and contagion in social networks
2022-02-23, Madison Networks Reading Group
slides
Triangles & networks
2021-02-17, STAT 992 Seminar on Tensors
slides
A new way to think about citations
2020-11-17, Rohe Lab Group Meeting
slides
The linear probability model
2019-11-19, STAT 992 Seminar Course presentation
slides
rstudio internship progress update
2018-07-23, RStudio tidyverse team
slides
Locally Interpretable Model-Agnostic Explanations
2018-03-29, Rice DataSci club
slides
Your First R Package
2018-02-22, Rice DataSci club
slides