Organized 07

## Anomalous Diffusions and Related Topics (Organizer: Zhen-Qing Chen)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 19 Mon, 10:30 PM — 11:00 PM EDT

### Lp-Kato class measures for symmetric Markov processes under heat kernel estimates

Kazuhiro Kuwae (Fukuoka University)

### Green function estimates and Boundary Harnack principles for non-local operators whose kernels degenerate at the boundary

Panki Kim (Seoul National University)

In this talk, we discuss the potential theory of Markov processes with jump kernels decaying at the boundary of the half-space. The boundary part of the kernel is comparable to the product of three terms, with parameters appearing as exponents in these terms. The constant c in the killing term can be written as a strictly increasing function of a parameter p. We establish sharp two-sided estimates on the Green functions of these processes for all admissible values of p and of the parameters in the boundary part of the kernel. The estimates take three different forms, depending on the region to which p and the parameters belong. As an application, we completely determine the region of the parameters in which the boundary Harnack principle holds. This talk is based on joint works with Renming Song and Zoran Vondraček.

### Heat kernel upper bounds for symmetric Markov semigroups

Jian Wang (Fujian Normal University)

It is well known that Nash-type inequalities for symmetric Dirichlet forms are equivalent to on-diagonal heat kernel upper bounds for the associated symmetric Markov semigroups. In this talk, we show that both imply (and hence are equivalent to) off-diagonal heat kernel upper bounds under some mild assumptions. Our approach is based on a new generalized version of Davies's method. Our results considerably extend those of Carlen-Kusuoka-Stroock for Nash-type inequalities of power order, and also extend those of Grigor'yan for second-order differential operators on a complete non-compact manifold. The talk is based on joint work with Z.-Q. Chen (Seattle), P. Kim (Seoul) and T. Kumagai (Kyoto).

### Inverse local time of one-dimensional diffusions and its comparison theorem

Lidan Wang (Nankai University)

It is well known that, for a reflecting Bessel process, the inverse local time at $0$ is an $\alpha$-stable subordinator, and the corresponding subordinate Brownian motion is a $2\alpha$-stable process. Using certain transforms and the regenerative theory of general diffusions, we obtain a comparison result between the inverse local times of Bessel processes and those of perturbed Bessel processes. An immediate application is the stability of Green function estimates for trace processes.
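
The passage from an $\alpha$-stable subordinator to a $2\alpha$-stable process can be checked with a one-line computation. The following is only a sketch under assumed notation that does not appear in the abstract: $\tau_t$ denotes the inverse local time at $0$, with Laplace exponent $c\lambda^{\alpha}$ for some constant $c>0$ and $\alpha\in(0,1)$, and $B$ is a one-dimensional Brownian motion independent of $\tau$.

$$
\mathbb{E}\left[e^{i\xi B_{\tau_t}}\right]
= \mathbb{E}\left[e^{-\tau_t\,\xi^{2}/2}\right]
= \exp\left(-c\,t\,\left(\xi^{2}/2\right)^{\alpha}\right)
= \exp\left(-c\,2^{-\alpha}\,t\,|\xi|^{2\alpha}\right).
$$

The first equality conditions on $\tau_t$ and uses $\mathbb{E}[e^{i\xi B_s}] = e^{-s\xi^{2}/2}$; the second is the Laplace transform of $\tau_t$ evaluated at $\lambda = \xi^{2}/2$. The right-hand side is the characteristic function of a symmetric $2\alpha$-stable process.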

### Archimedes' principle for ideal gas

Krzysztof Burdzy (University of Washington)

I will present Archimedes' principle for a macroscopic ball in an ideal gas consisting of point particles with non-zero mass. The main result is an asymptotic theorem, as the number of point particles goes to infinity and their total mass remains constant. Asymptotically, the gas has an exponential density as a function of height. The asymptotic inverse temperature of the gas is identified as the parameter of the exponential distribution.

Joint work with Jacek Małecki.

### Q&A for Organized Contributed Session 07

This talk does not have an abstract.

###### Session Chair

Zhen-Qing Chen (University of Washington)

Organized 17

## The Advances in Time Series and Spatial Statistics (Organizer: Wei-Ying Wu)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 19 Mon, 10:30 PM — 11:00 PM EDT

### Interpretable, predictive spatio-temporal models via enhanced pairwise directions estimation

ShengLi Tzeng (National Sun Yat-sen University)

Spatio-temporal phenomena are often complicated, yet the kriging methods widely used to model such data assume only a very simple mean structure. We instead develop a novel approach based on supervised dimension reduction in order to capture nonlinear mean structures without requiring a prespecified parametric model. Beyond prediction, which is a common interest, our approach focuses on exploring the geometric information in the data. The method of Pairwise Directions Estimation (PDE) is incorporated in our approach to carry out a data-driven search for functions describing spatial structures and temporal patterns, which is useful in exploring data trends. We further enhance PDE, referring to the result as PDE+, by using resolution adaptive fixed rank kriging to estimate the random effects not explained by the mean structure. Our proposal not only produces more accurate and explainable predictions but also improves computational efficiency in model building. Illustrative applications to two real datasets are also presented. The results demonstrate that the proposed PDE+ method is very useful for exploring and interpreting trend patterns in spatio-temporal data.

### Model selection with a nested spatial correlation structure

Chun-Shu Chen (National Central University)

In spatial regression analysis, a suitable specification of the mean regression model is crucial for unbiased analysis. Suitably accounting for the underlying spatial correlation structure of the response variables is also an important issue. Here, we focus on selecting an appropriate mean model in spatial regression analysis under a general anisotropic nested spatial correlation structure. We propose a distribution-free model selection criterion, which is an estimate of the weighted mean squared error based on assumptions only on the first two moments of the response data. Simulations under covariate selection settings reveal that the proposed criterion performs well for covariate selection in the mean model regardless of whether the underlying spatial correlation structure is nested or non-nested, isotropic or anisotropic. The proposed criterion also accommodates both continuous and count response data. Finally, a real data example on fine particulate matter concentration is analyzed for illustration.

### Consistent order selection for ARFIMA models

Kun Chen (Southwestern University of Finance and Economics)

Estimating the orders of the autoregressive fractionally integrated moving average (ARFIMA) model has been a long-standing challenge in time series analysis. This paper tackles the challenge by establishing the consistency of the Bayesian information criterion (BIC) in the ARFIMA model with independent errors. Since we allow the model's memory parameter to be any unknown real number, our consistency result applies simultaneously to short-memory, long-memory, and non-stationary time series. We further extend BIC's consistency to the ARFIMA model with conditional heteroskedastic errors, thereby broadening the criterion's range of applications. Finally, the finite-sample implications of our theoretical results are illustrated using numerical examples.

### Whittle likelihood for irregularly spaced spatial data

Under some regularity conditions, including that the process is Gaussian, the sampling region is rectangular, and the parameter space $\Theta$ is compact, Matsuda and Yajima (2009) showed that the Whittle estimator $\widehat{\theta}_{n}$ minimizing their version of the Whittle likelihood is consistent (for $d\leq 3$), and that one can construct large-sample confidence regions for the covariance parameters $\theta$ using the asymptotic normality of $\widehat{\theta}_{n}$. However, this requires estimating the asymptotic covariance matrix, which involves integrals of the spatial sampling density. Moreover, nonparametric estimation of the quantities in the asymptotic covariance matrix requires specifying a smoothing parameter and is subject to the curse of dimensionality. In comparison, we propose an approach based on the spatial frequency domain empirical likelihood method (cf. Bandyopadhyay et al. (2015), Van Hala et al. (2017)), which can be employed to produce asymptotically valid confidence regions and tests for $\theta$ without requiring explicit estimation of such quantities.

### Q&A for Organized Contributed Session 17

This talk does not have an abstract.

###### Session Chair

Wei-Ying Wu (National Dong Hwa University)

Organized 24

## Advanced Statistical Methods for Complex Data (Organizer: Jongho Im)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 19 Mon, 10:30 PM — 11:00 PM EDT

### On the verifiable identification condition in NMAR missing data analysis

Kosuke Morikawa (Osaka University and The University of Tokyo)

Missing data often cause undesirable properties such as bias and loss of efficiency. The problem can be addressed by modeling the distribution of the complete data and the missing-data mechanism and incorporating both into the likelihood. However, especially when the missing-data mechanism is NMAR (not missing at random), two problems arise: (i) sufficient conditions on the distribution of the complete data cannot be verified; (ii) guaranteeing model identifiability is difficult even for relatively simple models. Some recent studies tackle problem (i) by modeling the distribution of the observed data rather than the complete data, which is impossible to obtain. As for problem (ii), we derive sufficient conditions for model identifiability under nonignorable nonresponse when the outcome model is specified as a normal or normal mixture distribution, while the missing-data mechanism may be any parametric model. With the new conditions, one can check from the observed data whether the assumed models are identifiable under NMAR missingness.

### Bayesian hierarchical spatial model for small-area estimation with non-ignorable nonresponses and its application to the NHANES dental caries data

Ick Hoon Jin (Yonsei University)

The National Health and Nutrition Examination Survey (NHANES) is a major program of the National Center for Health Statistics, designed to assess the health and nutritional status of adults and children in the United States. The analysis of NHANES dental caries data faces several challenges: (1) the data were collected using a complex, multistage, stratified, unequal-probability sampling design; (2) the sample size of some primary sampling units (PSUs), e.g., counties, is very small; (3) the measures of dental caries have a complicated structure and correlation; and (4) there is a substantial percentage of nonresponses, which are expected to be not missing at random, that is, non-ignorable. We propose a Bayesian hierarchical spatial model to address these challenges. We develop a two-level Potts model that closely resembles the caries evolution process and captures complicated spatial correlations between teeth and between tooth surfaces. By adding Bayesian hierarchies to the Potts model, we account for the multistage survey sampling design while also enabling information borrowing across PSUs for small-area estimation. We incorporate sampling weights by including them as a covariate in the model and adopt flexible B-splines to achieve robust inference. We account for non-ignorable missing outcomes and covariates using the selection model. We use data augmentation coupled with the noisy Monte Carlo algorithm to overcome the numerical difficulty caused by doubly-intractable normalizing constants and to sample from the posterior. Our analysis shows strong spatial associations between teeth and tooth surfaces, and indicates that dental hygienic factors, such as fluorosis and sealant, reduce dental disease risks.

### Raking-based relabeling classification method for highly imbalanced data

Seunghwan Park (Kangwon National University)

We consider binary classification on imbalanced data. A dataset is called imbalanced if the class proportions are heavily skewed. Classification on imbalanced data is often challenging, especially for high-dimensional data, because the unequal class sizes degrade the performance of classifiers. Undersampling the majority class and/or oversampling the minority class are popular ways to construct balanced samples, which helps to improve classification performance. However, many existing sampling methods cannot be easily extended to high-dimensional data or to mixed variables, because they often require approximating the distribution of the attributes, which becomes a critical issue in its own right, separate from classification. In this paper, we propose a new sampling strategy, called counter-matching sampling, in which the attribute values of the major class are imputed for the values of the minor class when constructing balanced samples. The proposed algorithm performs the same as or similarly to existing methods but is more flexible with respect to the shape of the data and the number of attributes. Our sampling algorithm is attractive in practice because it does not require any density estimation to generate synthetic data in oversampling and is not hindered by mixed variables. The proposed sampling strategy can also be easily combined with many existing classifiers.
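
For orientation, plain random undersampling of the majority class can be sketched as follows. This is a generic baseline under stated assumptions, not the counter-matching sampling proposed in the talk; the function name `random_undersample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly keep, for every class, as many rows as the smallest class has."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Size of the smallest (minority) class.
    n_min = min(len(members) for members in by_class.values())

    # Draw n_min rows without replacement from each class.
    balanced = []
    for label, members in by_class.items():
        for row in rng.sample(members, n_min):
            balanced.append((row, label))

    rng.shuffle(balanced)
    new_rows = [row for row, _ in balanced]
    new_labels = [label for _, label in balanced]
    return new_rows, new_labels


# Toy data (hypothetical): 5 minority rows (label 1) and 95 majority rows (label 0).
rows = [[i, i % 3] for i in range(100)]
labels = [1 if i < 5 else 0 for i in range(100)]

bal_rows, bal_labels = random_undersample(rows, labels)
class_counts = Counter(bal_labels)
```

Because rows are only dropped and never synthesized, this baseline needs no density estimate of the attributes, but it discards most of the majority class.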

### Imputation approach for outcome dependent sampling design

Jongho Im (Yonsei University)

Outcome dependent sampling (ODS) has been widely used to enhance study efficiency in epidemiological and biomedical studies. We consider a biased two-phase sampling design in which the second-phase samples are selected based on the outcome variable and the covariates x are observed only at the second phase. Many methods have been proposed that incorporate the estimated inclusion probabilities into the target score function of the outcome model. In this paper, we propose an imputation method that is essentially implemented by data augmentation. The predictive distribution is nonparametrically estimated, and a Bayesian bootstrap method is then used to generate imputed values. The proposed method employs Rubin's variance formula for variance estimation of the imputation estimators. A limited simulation study shows that the proposed method performs well and is comparable to previous methods.
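
Rubin's variance formula combines the results from M imputed datasets as T = W + (1 + 1/M) B, where W is the average within-imputation variance and B is the between-imputation variance of the point estimates. A minimal sketch follows; the function name `rubin_combine` and the numbers are hypothetical and are not taken from the talk.

```python
from statistics import mean, variance


def rubin_combine(estimates, within_variances):
    """Combine M imputation results with Rubin's rules.

    estimates        : point estimate from each imputed dataset
    within_variances : estimated variance of each of those point estimates
    Returns (combined point estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)          # combined point estimate
    w_bar = mean(within_variances)   # average within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)

    total = w_bar + (1 + 1 / m) * b  # Rubin's total variance
    return q_bar, total


# Hypothetical results from M = 3 imputed datasets.
point, total_var = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The factor (1 + 1/M) inflates the between-imputation component to account for using a finite number of imputations.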

### Q&A for Organized Contributed Session 24

This talk does not have an abstract.

###### Session Chair

Seung Hwan Park (Kangwon National University)