Invited 02

## Scaling Limits of Disordered Systems and Disorder Relevance (Organizer: Rongfeng Sun)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

### Exceptional geodesic pairs in the directed landscape

Within the Kardar-Parisi-Zhang universality class, the space-time Airy sheet is conjectured to be the canonical scaling limit for last passage percolation models. In recent work of Dauvergne, Ortmann, and Virág, this object was constructed and shown to be the limit after parabolic correction of one such model: Brownian last passage percolation. This limit object, called the directed landscape, admits geodesic paths between any two space-time points $(x,s)$ and $(y,t)$ with $s < t$. Here we examine fractal properties of the set of these paths. Our main results concern exceptional endpoints admitting disjoint geodesics. First, we fix two distinct starting locations $x_1$ and $x_2$, and consider geodesics traveling $(x_1,0)\to(y,1)$ and $(x_2,0)\to(y,1)$. We prove that the set of $y\in\mathbb{R}$ for which these geodesics coalesce only at time 1 has Hausdorff dimension one-half. Second, we consider endpoints $(x,0)$ and $(y,1)$ between which there exist two geodesics intersecting only at times 0 and 1. We prove that the set of such $(x,y)\in\mathbb{R}^2$ also has Hausdorff dimension one-half. The proofs require several inputs of independent interest, including (i) connections to the so-called difference weight profile studied by Basu, Ganguly, and Hammond; and (ii) a tail estimate on the number of disjoint geodesics starting and ending in small intervals. The latter result extends the analogous estimate proved for the prelimiting model by Hammond.

This talk is based on joint work with Shirshendu Ganguly and Alan Hammond.

### Disorder relevance and the continuum random field Ising model

Since its introduction by Lenz in 1920, the Ising model has been one of the most studied statistical mechanics models. It has been particularly central in the theory of critical phenomena since Peierls famously proved that it undergoes a phase transition in dimension at least 2. We discuss the long-considered question of whether this picture is changed by the addition of disorder acting as a small random external field, and whether the model admits a disordered continuum limit.

### A CLT for KPZ on torus

Yu Gu (Carnegie Mellon University)

I will present joint work with Tomasz Komorowski proving a central limit theorem for the KPZ equation on the torus.

### Q&A for Invited Session 02

This talk does not have an abstract.

###### Session Chair

Rongfeng Sun (National University of Singapore)

Invited 07

## High-dimensional Robustness (Organizer: Stanislav Minsker)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

### Distribution-free robust linear regression

Nikita Zhivotovskiy (Swiss Federal Institute of Technology Zürich)

We study random design linear regression with no assumptions on the distribution of the covariates and with a heavy-tailed response variable. When learning without assumptions on the covariates, we establish boundedness of the conditional second moment of the response variable as a necessary and sufficient condition for achieving deviation-optimal excess risk rate of convergence. In particular, combining the ideas of truncated least squares, median-of-means procedures and aggregation theory, we construct a non-linear estimator achieving excess risk of order d/n with the optimal sub-exponential tail. While the existing approaches to learning linear classes under heavy-tailed distributions focus on proper estimators, we highlight that the improperness of our estimator is necessary for attaining non-trivial guarantees in the distribution-free setting considered in this work. Finally, as a byproduct of our analysis, we prove an optimal version of the classical bound for the truncated least squares estimator due to Györfi, Kohler, Krzyzak, and Walk.

### Algorithmic high-dimensional robust statistics

Fitting a model to a collection of observations is one of the quintessential questions in statistics. The standard assumption is that the data was generated by a model of a given type (e.g., a mixture model). This simplifying assumption is at best only approximately valid, as real datasets are typically exposed to some source of contamination. Hence, any estimator designed for a particular model must also be robust in the presence of corrupted data. This is the prototypical goal in robust statistics, a field that took shape in the 1960s with the pioneering works of Tukey and Huber. Until recently, even for the basic problem of robustly estimating the mean of a high-dimensional dataset, all known robust estimators were hard to compute. Moreover, the quality of the common heuristics degrades badly as the dimension increases. In this talk, we will survey the recent progress in algorithmic high-dimensional robust statistics. We will describe the first computationally efficient algorithms for robust mean and covariance estimation and the main insights behind them. We will also present practical applications of these estimators to exploratory data analysis and adversarial machine learning. Finally, we will discuss new directions and opportunities for future work.

### Robust estimation of a mean vector with respect to any norm: a minimax MOM and a Stahel-Donoho Median of means estimators

Guillaume Lecué (Center for Research in Economics and Statistics (CREST))

### Q&A for Invited Session 07

This talk does not have an abstract.

###### Session Chair

Stanislav Minsker (University of Southern California)

Invited 08

## Functional Data Analysis (Organizer: Aurore Delaigle)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

### Partially specified covariance operators and intrinsically functional graphical models

Victor Panaretos (École polytechnique fédérale de Lausanne)

Motivated by the problem of covariance recovery from functional fragments, we consider the problem of completing a partially specified covariance kernel on the unit square. By representing the underlying stochastic process as an undirected graphical model with uncountable vertices and edges, we show that a canonical completion always exists and can be explicitly described, under weak assumptions. For partial covariances specified on nearly banded domains containing the diagonal, we present necessary and sufficient conditions for unique completion, and characterise all completions under non-uniqueness. Finally, we show how the estimation of the canonical completion reduces to a system of ill-posed linear inverse problems in the space of Hilbert-Schmidt operators, and derive rates of convergence under standard source conditions.

Based on joint work with K. Waghmare (EPFL).

### Domain selection for functional linear models: a dynamic RKHS approach

Jane-Ling Wang (University of California at Davis)

In the conventional scalar-on-function linear regression model, the entire trajectory of the predictor process on the whole domain is used to model the response variable. However, the response may only be associated with the covariate process X on a subdomain. We consider the problem of estimating the domain of association when assuming that the regression coefficient function is nonzero on a subinterval. We propose a solution based on the reproducing kernel Hilbert space (RKHS) approach to estimate both the domain and the regression function. A simulation study illustrates the effectiveness of the proposed approach. Asymptotic theory is developed for both estimators.

### Simultaneous Inference for function-valued parameters: A fast and fair approach

Dominik Liebl (University of Bonn)

Quantifying uncertainty using confidence regions is a central goal of statistical inference. Despite this, methodologies for confidence bands in Functional Data Analysis are underdeveloped compared to estimation and hypothesis testing. This work represents a major leap forward in this area by presenting a new methodology for constructing simultaneous confidence bands for functional parameter estimates. These bands possess a number of striking qualities: (1) they have a nearly closed-form expression, (2) they give nearly exact coverage, (3) they have a finite sample correction, (4) they do not require an estimate of the full covariance of the parameter estimate, and (5) they can be constructed adaptively according to a desired criterion. One option for choosing bands we find especially interesting is the concept of fair bands, which allows us to do fair (or equitable) inference over subintervals and could be especially useful in longitudinal studies over long time scales. Our bands are constructed by integrating and extending tools from Random Field Theory, an area that has yet to overlap with Functional Data Analysis.

### Q&A for Invited Session 08

This talk does not have an abstract.

###### Session Chair

Yunjin Choi (University of Seoul)

Invited 32

## Statistical Learning (Organizer: Yichao Wu)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

### Equivariant Variance Estimation for Multiple Change-point Model

Ning Hao (University of Arizona)

The variance of noise plays an important role in many change-point detection procedures and the associated inferences. Most commonly used variance estimators require strong assumptions on the true mean structure or normality of the error distribution, which may not hold in applications.  In this talk, we introduce a framework of equivariant variance estimation for multiple change-point models. In particular, we characterize the set of all equivariant unbiased quadratic variance estimators for a family of change-point model classes, and develop a minimax theory for such estimators.

### A forward approach for sufficient dimension reduction in binary classification

Seung Jun Shin (Korea University)

Since the seminal sliced inverse regression (SIR) was proposed, inverse-type methods have been canonical in sufficient dimension reduction (SDR). However, they often suffer in binary classification since the binary response yields two slices at most. In this article, we develop a forward approach for SDR in binary classification based on weighted large-margin classifiers. We first show that the gradient of a large-margin classifier is unbiased for SDR as long as the corresponding loss function is Fisher consistent. This leads us to propose what we call the weighted outer-product of gradients (wOPG) method. The wOPG method can recover the central subspace exhaustively without the linearity or constant variance conditions routinely required for inverse-type methods. We analyze the asymptotic behavior of the proposed estimator, and demonstrate its promising finite-sample performance for both simulated and real data examples.

### Variable Selection for Global Fréchet Regression

Danielle Tucker (University of Illinois at Chicago)

Global Fréchet regression is an extension of linear regression to cover more general types of responses, such as distributions, networks and manifolds, which are becoming more prevalent. In such models, predictors are Euclidean while responses are metric space valued. Predictor selection is of major relevance for regression modeling in the presence of multiple predictors but has not yet been addressed for Fréchet regression. Due to the metric space valued nature of the responses, Fréchet regression models do not feature model parameters, and this lack of parameters makes it a major challenge to extend existing variable selection methods for linear regression to global Fréchet regression. In this talk, we share our recent work which addresses this challenge and proposes a novel variable selection method with good practical performance. We provide theoretical support and demonstrate that the proposed variable selection method achieves selection consistency. We also explore the finite sample performance of the proposed method with numerical examples and data illustrations.

### Q&A for Invited Session 32

This talk does not have an abstract.

###### Session Chair

Yichao Wu (University of Illinois at Chicago)

Invited 41

## Bernoulli Paper Prize Session (Organizer: Bernoulli Society)

Conference
10:30 PM — 11:00 PM KST
Local
Jul 20 Tue, 9:30 AM — 10:00 AM EDT

### Bernoulli Prize for an outstanding survey article in Probability: From infinite random matrices over finite fields to square ice

Leonid Petrov (University of Virginia)
(Chair: Ofer Zeitouni)

Asymptotic representation theory of the symmetric group is a rich and beautiful subject with deep connections with probability, mathematical physics, and algebraic combinatorics. I will discuss a one-parameter deformation of this theory related to infinite random matrices over a finite field, which has an interesting connection to the six vertex (square ice) model and traffic systems on a 1-dimensional lattice.

### Bernoulli Journal Read Paper Award: A general frequency domain method for assessing spatial covariance structures

(Chair: Richard Samworth)

When examining dependence in spatial data, it can be helpful to formally assess spatial covariance structures that may not be parametrically specified or fully model-based. That is, one may wish to test for general features regarding spatial covariance without presupposing any particular, or potentially restrictive, assumptions about the joint data distribution. Current methods for testing spatial covariance are often intended for specialized inference scenarios, usually with spatial lattice data. We propose instead a general method for estimation and testing of spatial covariance structure, which is valid for a variety of inference problems (including nonparametric hypotheses) and applies to a large class of spatial sampling designs with irregular data locations. In this setting, spatial statistics have limiting distributions with complex standard errors depending on the intensity of spatial sampling, the distribution of sampling locations, and the process dependence. The proposed method has the advantage of providing valid inference in the frequency domain without estimation of such standard errors, which are often intractable, and without particular distributional assumptions about the data (e.g., Gaussianity). To illustrate, we develop the method for formally testing isotropy and separability in spatial covariance and consider confidence regions for spatial parameters in variogram model fitting. A broad result is also presented to justify the method for application to other potential problems and general scenarios with testing spatial covariance. The approach uses spatial test statistics, based on an extended version of empirical likelihood, having simple chi-square limits for calibrating tests. We demonstrate the proposed method through several numerical studies.

### Q&A for Invited Session 41

This talk does not have an abstract.

###### Session Chair

Ofer Zeitouni (Weizmann Institute of Science) / Richard Samworth (University of Cambridge)