Organized 02

Nonlocal Operators Related to Probability (Organizer: Ildoo Kim)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 18 Sun, 10:30 PM — 11:00 PM EDT

A Sobolev space theory for time-fractional stochastic PDE driven by Levy processes

Daehan Park (Korea Advanced Institute of Science and Technology (KAIST))

There are many studies of time-fractional equations, both deterministic and stochastic. In this talk, the speaker will present a Sobolev space theory for time-fractional stochastic PDEs driven by Levy processes. The existence of a kernel that represents solutions provides a way to control the derivatives of solutions. Precisely, we find a suitable decay of the Fourier transform of the kernel and use Littlewood-Paley theory. From this, we find a suitable condition that gives regularity to solutions.

A regularity theory for stochastic modified Burgers' equation driven by multiplicative space-time white noise

Beom-Seok Han (Pohang University of Science and Technology)


General Law of iterated logarithm for Markov processes

Jaehun Lee (Korea Institute for Advanced Study)

In this talk, we discuss general criteria and forms of laws of the iterated logarithm (LIL) for continuous-time Markov processes. We consider minimal assumptions for LILs to hold at zero (at infinity, respectively) in general metric measure spaces. We establish LILs under local assumptions near zero (near infinity, respectively) on uniform bounds of the first exit time from balls and uniform bounds on the tails of the jumping measure. We provide a general formulation of liminf and limsup LILs, which covers a large class of subordinated diffusions, jump processes with mixed polynomial local growths, jump processes with singular jumping kernels, and random conductance models with long-range jumps. We also introduce our recent results on the laws of the iterated logarithm for occupation times of balls and local times for continuous-time Markov processes.

This talk is based on joint work with Soobin Cho and Panki Kim.
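
For orientation, one-dimensional Brownian motion $B$ is the model case of the limsup and liminf LILs at infinity and at zero mentioned in the abstract. The statements below are the classical textbook results, given only as an illustration and not as results of the talk:

$$\limsup_{t\to\infty} \frac{|B_t|}{\sqrt{2t\log\log t}} = 1, \qquad \limsup_{t\to 0} \frac{|B_t|}{\sqrt{2t\log\log(1/t)}} = 1 \quad \text{a.s.},$$

$$\liminf_{t\to\infty} \sqrt{\frac{\log\log t}{t}}\,\sup_{s\le t}|B_s| = \frac{\pi}{\sqrt{8}} \quad \text{a.s.}$$

The first line is Khinchin's law at infinity and at zero; the second is Chung's liminf law.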

Heat kernel estimates for subordinate Markov processes and their applications

Soobin Cho (Seoul National University)

In this talk, we discuss sharp two-sided estimates for transition densities of a large class of subordinate Markov processes. As applications, we show that the parabolic Harnack inequality and Hölder regularity hold for parabolic functions of such processes, and we derive sharp two-sided Green function estimates.

A maximal $L_p$-regularity theory to initial value problems with time measurable nonlocal operators generated by additive processes

Jae-Hwan Choi (Korea University)


Q&A for Organized Contributed Session 02

This talk does not have an abstract.

Session Chair

Ildoo Kim (Korea University)

Organized 15

Network-related Statistical Methods and Analysis (Organizer: Donghyeon Yu)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 18 Sun, 10:30 PM — 11:00 PM EDT

Estimation of particulate levels using deep dehazing network and temporal prior

SungHwan Kim (Konkuk University)

Particulate matter (PM) has become one of the important pollutants that deteriorate public health. Since PM is ubiquitous in the atmosphere, it is closely related to quality of life in many different ways. Thus, a system to accurately monitor PM in diverse environments is imperative. Previous studies using digital images have relied on individual atmospheric images and have not benefited from the spatial and temporal effects of image sequences. This weakness undermines predictive power. To address this drawback, we propose a predictive model using a deep dehazing cascaded CNN and temporal priors. The temporal prior accommodates instantaneous visual moves and estimates PM concentration from residuals between the original and dehazed images. The present method also provides, as a by-product, high-quality dehazed image sequences superior to those of nontemporal methods. The improvements are supported by various experiments under a range of simulation scenarios and assessments using standard metrics.

Graph-regularized contextual bandits with scalable Thompson sampling and semi-parametric reward models

Young-Geun Choi (Sookmyung Women's University)

Graph-based bandit algorithms for multiple users have received attention because the relationship among users captured by a social network or graph can improve personalized content recommendation. The graph-based Thompson sampling algorithm by Vaswani et al. (2017) is one of the state-of-the-art methods, in which the relationship between users is represented by a simple undirected graph; however, a large graph poses computational challenges. We propose a novel Thompson sampling algorithm for multiple users with a graph. We show that the proposed algorithm improves the regret bound by $\sqrt{n}$ over the algorithm by Vaswani et al. (2017), where $n$ is the number of users. Furthermore, we propose a method for a semi-parametric bandit problem with multiple users and a graph, which is the first algorithm proposed in this setting. We show that the upper bound of the cumulative regret has the same order as in the setting without the semi-parametric term. In establishing the proposed algorithms, novel local estimators play a crucial role in improving the bound with reduced computational cost.

INN: a stable method identifying clean-annotated samples via consistency effect in deep neural networks

Dongha Kim (Sungshin Women's University)

In classification problems with deep neural networks, researchers have struggled to collect massive clean-annotated data, and much effort has gone into handling data with noisy labels. Many recent solutions for noisy label problems share a key idea, the so-called memorization effect. While the memorization effect is a powerful tool, performance is sensitive to the choice of the training epoch needed to exploit it. In this paper, we introduce a new method called INN (Integration with the Nearest Neighborhoods) for refining noisy labels, which is more stable as well as more powerful. Our method is based on a new finding, called the consistency effect: the discrepancies between predictions in neighboring regions of clean labeled data and noisy labeled data are consistently observed regardless of training epochs. By applying the INN to the DivideMix algorithm, we propose a new learning framework called INN-DivideMix which improves the INN. Through various experiments, including both performance tests and an ablation study, we demonstrate the superiority and stability of our two proposed methods.

An efficient parallel block coordinate descent algorithm for large-scale precision matrix estimation using graphics processing units

Donghyeon Yu (Inha University)

Large-scale sparse precision matrix estimation has attracted wide interest from the statistics community. The convex partial correlation selection method (CONCORD) developed by Khare et al. (2015) has recently been credited with some theoretical properties for estimating sparse precision matrices. CONCORD obtains its solution by a coordinate descent algorithm (CONCORD-CD) based on the convexity of the objective function. However, since a coordinate-wise update in CONCORD-CD is inherently serial, scaling it up is nontrivial. In this paper, we propose a novel parallelization of CONCORD-CD, namely, CONCORD-PCD. CONCORD-PCD partitions the off-diagonal elements into several groups and updates each group simultaneously without harming the computational convergence of CONCORD-CD. We guarantee this by employing the notion of edge coloring in graph theory. Specifically, we establish a nontrivial correspondence between scheduling the updates of the off-diagonal elements in CONCORD-CD and coloring the edges of a complete graph. It turns out that CONCORD-PCD simultaneously updates off-diagonal elements whose associated edges can be given the same color. As a result, the number of steps required for updating off-diagonal elements reduces from $p(p-1)/2$ to $p-1$ (for even $p$) or $p$ (for odd $p$), where $p$ denotes the number of variables. We prove that the number of such steps is irreducible. In addition, CONCORD-PCD is tailored to single-instruction multiple-data (SIMD) parallelism. A numerical study shows that the SIMD-parallelized PCD algorithm implemented on graphics processing units (GPUs) speeds up the CONCORD-CD algorithm multiple times.
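
The step counts quoted in the abstract ($p-1$ for even $p$, $p$ for odd $p$) match a proper edge coloring of the complete graph on $p$ vertices. The sketch below is only an illustration of that scheduling idea using the standard round-robin (circle) construction; it is not the speakers' GPU implementation, and the function name `edge_coloring_rounds` is hypothetical.

```python
def edge_coloring_rounds(p):
    """Partition the edges (i, j), i < j, of the complete graph K_p into
    groups of vertex-disjoint edges using the round-robin (circle) method.

    Each group corresponds to one color: no two edges in a group share a
    vertex, so the matching off-diagonal entries could be updated together.
    """
    if p < 2:
        return []
    # For odd p, pad with a dummy vertex (index p) and discard its edges.
    m = p if p % 2 == 0 else p + 1
    rounds = []
    for r in range(m - 1):
        # Vertex m-1 stays fixed; the other m-1 vertices rotate around it.
        pairs = [(r, m - 1)]
        for k in range(1, m // 2):
            pairs.append(((r + k) % (m - 1), (r - k) % (m - 1)))
        # Keep only edges between real vertices, stored as (small, large).
        group = [tuple(sorted(e)) for e in pairs if max(e) < p]
        rounds.append(group)
    return rounds


if __name__ == "__main__":
    for p in (4, 5):
        schedule = edge_coloring_rounds(p)
        print(f"p={p}: {len(schedule)} rounds")
        for color, group in enumerate(schedule):
            print(f"  color {color}: {group}")
```

Each round pairs every vertex at most once, so all index pairs $(i, j)$ in a round touch distinct rows and columns, which is the property a simultaneous update relies on in this illustration.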

Q&A for Organized Contributed Session 15

This talk does not have an abstract.

Session Chair

Donghyeon Yu (Inha University)

Organized 23

Recent Advances in Statistical Methods for Large Scale Complex Data (Organizer: Seyoung Park)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 18 Sun, 10:30 PM — 11:00 PM EDT

Multivariate responses quantile regression for regional quantiles with applications to CCLE data

Seyoung Park (Sungkyunkwan University)

The Cancer Cell Line Encyclopedia (CCLE) is a large-scale project that has generated resources with cancer cell lines characterized by high-dimensional molecular profiles along with pharmacological profiles. In CCLE, identifying gene-drug interactions is important for elucidating the mechanisms of drug action. Considering interrelations between the pharmacological responses, multivariate responses regression can be applied to identify meaningful gene-drug interactions. Quantile regression, as an alternative to classical linear regression, may better reveal the relationship between molecular profiles and pharmacological responses because quantile regression permits investigation of heterogeneity across quantiles. In this study, we propose a new multivariate responses quantile regression framework considering an interval of quantile levels. We aim to select variables relevant to the $\tau$-th conditional quantiles of multiple responses for any $\tau \in \Delta$, where $\Delta$ is an interval of quantile levels of interest. We propose a penalized composite quantile regression framework with a double group Lasso penalty to estimate the quantile coefficient function. In theory, we show the oracle property of the proposed estimator, in combination with a novel information criterion with theoretical guarantees. Numerical examples and applications to CCLE data demonstrate the effectiveness of the proposed method.

On sufficient graphical models

Kyongwon Kim (Ewha Womans University)

We introduce a Sufficient Graphical Model by applying the recently developed nonlinear sufficient dimension reduction techniques to the evaluation of conditional independence. The graphical model is nonparametric in nature, as it does not make distributional assumptions such as the Gaussian or copula Gaussian assumptions. However, unlike a fully nonparametric graphical model, which relies on the high-dimensional kernel to characterize conditional independence, our graphical model is based on conditional independence given a set of sufficient predictors with a substantially reduced dimension. In this way, we avoid the curse of dimensionality that comes with a high-dimensional kernel. We develop the population-level properties, convergence rate, and variable selection consistency of our estimate. By simulation comparisons and an analysis of the DREAM 4 Challenge data set, we demonstrate that our method outperforms the existing methods when the Gaussian or copula Gaussian assumptions are violated, and its performance remains excellent in the high-dimensional setting.

Principal component analysis in the wavelet domain

Yaeji Lim (Chung Ang University)

In this paper, we propose a new method of principal component analysis in the wavelet domain that is useful for dimension reduction of multiple non-stationary time series and for identifying important features. The proposed method is constructed by a novel combination of eigen analysis and the local wavelet spectrum defined in the locally stationary wavelet process. The proposed method can therefore be expected to handle more general non-stationary time series, beyond the limited types of signals that existing methods address. We investigate the theoretical properties of the estimated principal components and their loadings. Results from numerical examples, including analysis of real seismic data and financial data, show the promising empirical properties of the proposed approach.

Bayesian inference of evolutionary models from genomic data

Yujin Chung (Kyonggi University)

The evolutionary history of a group of organisms explains the process of their genetic variation over time. Due to recent sequencing and computing advances, statistical inference has become an essential discipline in the study of evolutionary history from genomic data. However, typical analyses are either limited to a small amount of data or fail to estimate complex and diverse evolutionary models. In this talk, I will present a Bayesian method for estimating population/species-level history, including population sizes, the splitting time of two populations, and migration rates. The method resolves statistical limitations and overcomes major roadblocks to analyzing genome-scale data. Using importance sampling and a Markov chain representation of genealogy, the method scales to genomic data without mixing difficulty in a Markov chain Monte Carlo simulation. The method also provides for the calculation of the joint posterior density for all model parameters, thus resolving the problem of high false positive rates that arises in likelihood ratio tests for migration rates using other existing Bayesian approaches. I will demonstrate the method with simulated data and real DNA sequences.

Q&A for Organized Contributed Session 23

This talk does not have an abstract.

Session Chair

Seyoung Park (Sungkyunkwan University)