**Statistics Seminar**

**Seminars for Fall 2022**

Hume 321 at 4:30 PM

**Jialin Zhang**

Mississippi State University

**Unfolding Entropic Statistics**

This talk is organized into three parts.

1) Entropy estimation from Turing’s perspective is described. Given an iid sample from a countable alphabet under a probability distribution, Turing’s formula (introduced by Good (1953), hence also known as the Good-Turing formula) is a mind-bending non-parametric estimator of the total probability associated with letters of the alphabet that are NOT represented in the sample. Some interesting facts and thoughts about entropy estimators are introduced.

2) Turing’s formula brought about a new characterization of probability distributions on general countable alphabets that provides a new way to do statistics on alphabets, where the usual statistical concepts associated with random variables (on the real line) no longer exist. The new perspective, in turn, inspires some thoughts on the characterization of probability distributions when the underlying sample space is unclear. An application example of authorship attribution is provided at the end.

3) Shannon’s entropy is only finitely defined for distributions with fast decaying tails on a countable alphabet. The unboundedness of Shannon’s entropy over thick-tailed distributions on an alphabet prevents its potential utility from being fully realized. Zhang (2020) proposed generalized Shannon’s entropy (GSE), which is finitely defined everywhere. Some interesting results about GSE and a new test of independence inspired by GSE are introduced. The new test does not require the knowledge of cardinality, and it is consistent and would detect any form of dependence structure in the general alternative space given a sufficiently large sample.
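As a rough illustration of the Turing's formula described in part 1, the sketch below estimates the total probability of unseen letters as the fraction of the sample made up of letters observed exactly once. The function name `turing_missing_mass` and the toy sample are illustrative assumptions, not taken from the talk.

```python
from collections import Counter


def turing_missing_mass(sample):
    """Turing's formula: N1 / n, where N1 is the number of distinct
    letters seen exactly once and n is the sample size.
    (Hypothetical helper name, for illustration only.)"""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)
    # Count the letters that appear exactly once (singletons).
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / n


# Toy sample: the letters of a single word.
sample = list("abracadabra")
estimate = turing_missing_mass(sample)
print(f"Estimated probability of unseen letters: {estimate:.4f}")
```

In this toy sample only the letters c and d occur once, so the estimate is 2 out of 11 observations.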

**Magda Peligrad**

University of Cincinnati

**The CLT for stationary Markov chains with trivial tail sigma field**

In this talk we consider stationary Markov chains with trivial two-sided tail sigma field and present the tools leading to the following result: Any additive functional of such a Markov chain satisfies the central limit theorem provided the variance of partial sums divided by n is bounded.

The method is based on martingale decomposition using a new idea involving conditioning with respect to both the past and the future of the chain. No assumption of irreducibility or aperiodicity is needed.

**Huybrechts Bindele**

University of South Alabama

**Robust estimation and selection for single-index regression model**

In this talk, we will consider a single-index regression model, from which we will discuss a robust estimation procedure for the model parameters and an efficient variable selection of relevant predictors. The proposed approach known as the penalized generalized signed-rank procedure will be introduced. Asymptotic properties of the resulting estimators will be discussed under mild regularity conditions. Extensive Monte Carlo simulation experiments will be carried out to study the finite sample performance of the proposed approach. The simulation results will demonstrate that the proposed approach dominates many of the existing ones in terms of robustness in estimation and efficiency of variable selection. Finally, a real data example will be discussed to illustrate the method.

**Ngongo Isidore Seraphin**

ENS, Universite de Yaounde 1, Cameroun.

**Inference for nonstationary time series of counts with application to change-point problems**

We consider an integer-valued time series $Y = (Y_t)_{t \in \mathbb{Z}}$ where the model after a time $k^*$ is Poisson autoregressive with the conditional mean that depends on a parameter $\theta^* \in \Theta \subset \mathbb{R}^d$. The structure of the process before $k^*$ is unknown; it could be any other integer-valued time series, that is, the process $Y$ could be nonstationary. It is established that the maximum likelihood estimator of $\theta^*$ computed on the nonstationary observations is consistent and asymptotically normal. Subsequently, we carry out the sequential change-point detection in a large class of Poisson autoregressive models. We propose a monitoring scheme for detecting change in the model. The procedure is based on an updated estimator, which is computed without the historical observations. The asymptotic behavior of the detector is studied, in particular, the above results of inference in a nonstationary setting are applied to prove the consistency of the proposed procedure. A simulation study as well as a real data application are provided.

Keywords: Time series of counts, Poisson autoregression, likelihood estimation, change-point, sequential detection, weak convergence.
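The setting in this abstract can be pictured with a small simulation. The sketch below is only an assumption-laden illustration: it uses a linear conditional mean, lam_t = omega + alpha * Y_{t-1} + beta * lam_{t-1}, as one possible Poisson autoregressive model, and an arbitrary uniform count series before the change time. The function names, parameter values, and pre-change regime are all hypothetical and not taken from the talk.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method
    (adequate for moderate lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_with_change(n, k_star, omega, alpha, beta, seed=0):
    """Simulate n counts: an arbitrary regime before k_star, then a
    linear Poisson autoregression (an assumed model form)."""
    rng = random.Random(seed)
    ys = []
    lam = omega / (1.0 - alpha - beta)  # start at the stationary mean
    for t in range(n):
        if t < k_star:
            # Unknown pre-change structure: here, uniform counts on 0..20.
            y = rng.randint(0, 20)
        else:
            prev = ys[-1] if ys else 0
            lam = omega + alpha * prev + beta * lam
            y = sample_poisson(lam, rng)
        ys.append(y)
    return ys


series = simulate_with_change(n=200, k_star=50, omega=1.0, alpha=0.3, beta=0.2, seed=1)
print("mean before change:", sum(series[:50]) / 50)
print("mean after change:", sum(series[50:]) / 150)
```

With these illustrative parameters the post-change stationary mean is omega / (1 - alpha - beta) = 2, which differs clearly from the pre-change level, so the two printed averages should look different.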

**Xinyuan Chen**

Assistant Professor of Statistics

Mississippi State University

**A Bayesian Machine Learning Approach for Estimating Heterogeneous Survivor Causal Effects: Applications to a Critical Care Trial**

Assessing heterogeneity in the effects of treatments has become increasingly popular in the field of causal inference and carries important implications for clinical decision-making. While extensive literature exists for studying treatment effect heterogeneity when outcomes are fully observed, there has been limited development of tools for estimating heterogeneous causal effects when patient-centered outcomes are truncated by a terminal event, such as death. Due to mortality occurring during study follow-up, the outcomes of interest are unobservable, undefined, or not fully observed for specific subgroups of participants, therefore requiring the principal stratification framework to draw valid causal conclusions. Motivated by the Acute Respiratory Distress Syndrome Network (ARDSNetwork) ARDS respiratory management (ARMA) trial, we developed a flexible Bayesian machine learning approach to estimate the average causal effect and heterogeneous causal effects among the always-survivors stratum when clinical outcomes are subject to truncation. We adopted Bayesian additive regression trees (BART) to flexibly specify separate models for the potential outcomes and latent strata membership. In the analysis of the ARMA trial, we found that the low tidal volume treatment had an overall benefit for participants sustaining acute lung injuries on the outcome of time to returning home, but substantial heterogeneity in treatment effects among the always-survivors, driven most strongly by sex and the alveolar-arterial oxygen gradient at baseline (a physiologic measure of lung function and source of hypoxemia). These findings illustrate how the proposed methodology could guide the prognostic enrichment of future trials in the field. 
We also demonstrated through a simulation study that our proposed Bayesian machine learning approach outperforms other parametric methods in reducing the estimation bias in both the average causal effect and heterogeneous causal effects for always-survivors.

**Louis Aimé FONO**

Research Group in Applied Mathematics for Social Science

University of Douala-Cameroon

**On Some Probability Distributions of Customer Sensitivity for Premium Renewal in Non-life Insurance**

Every year, non-life insurers face the recurring problem of adjusting premiums. The problem arises from the trade-off between increasing the company's overall revenue and retaining the existing customers in the portfolio. Traditional pricing methods (General Linear Model or Credibility Theory) solve this problem with a static approach and do not take into account customer sensitivity and/or the prices offered by competing companies. Elena et al. [1] formalized and solved the renewal pricing problem of a non-life insurance company with a dynamic approach based on reinforcement learning (Markov Decision Problem). The insurer has a portfolio of customers and therefore a total turnover (initial state). At the time of contract renewal, the insurer (agent) offers a renewal premium to the first insured (we say that the agent takes an action). Whether or not the insured accepts the renewal premium, the decision leads the company to a new state (new income and new retention). Then, taking into account the company's new situation, the insurer sequentially repeats the same action for all the other insureds in the portfolio.

This paper extends and improves the model of Elena et al. in various circumstances. More precisely, we propose some families of probability distributions that take into consideration the sensitivity of customers to the new premiums. We rewrite the model of Elena et al. by replacing the regression probability with the obtained probability distributions, which yields our new pricing models. We find the best strategy for the insurer to set the renewal price through reinforcement learning algorithms. Implementing the newly obtained reinforcement models on a portfolio of contracts with a backward SARSA(λ) learning agent yields better results than those obtained by Elena et al. [1].

Keywords: Pricing renewal in Non-life insurance; Reinforcement learning; Customer sensitivity; Customer renewal probabilities.

References

[1] Elena K., Garcia J., Maestre R. and Fernandez F. (2019) Reinforcement learning for pricing strategy optimization in the insurance industry. Engineering Applications of Artificial Intelligence, 80 (C) 8-19. https://doi.org/10.1016/j.engappai.2019.01.010

[2] Ngnié F.C., Mbama E.B., Fotso S. and Fono L.A. (2021) On the study of premium renewal problem in non-life insurance based on two families of customer renewal probability through reinforcement learning. Online Astin Colloquia.


**Previous Statistics Seminars**

**Martial Longla**

University of Mississippi

**Sometimes, Disorder Helps** (pdf)

**Timothy Fortune**

University of Mississippi

**Local Limit Theorem for Linear Random Fields** (pdf)

**Dongsheng Wu**

University of Alabama-Huntsville

**Weak Convergence of Martingales and its Application to Nonlinear Cointegrating Model** (pdf)

**Xin Dang**

University of Mississippi

**Gini Distance Correlation and Feature Selection** (pdf)

**Qian Zhou**

Mississippi State University

**Model Misspecification in Statistical Analysis** (pdf)

**Tung-Lung Wu**

Mississippi State University

**Tests for High-Dimensional Covariance Matrices Using Random Matrix Projection** (pdf)

**Dao Nguyen**

University of California-Berkeley

**Iterated Filtering and Iterated Smoothing Algorithms** (pdf)

**David Mason**

University of Delaware

**Bootstrapping the Student t‐Statistic** (pdf)

**Yichuan Zhao**

Georgia State University

**Jackknife Empirical Likelihood Methods for the Gini Index** (pdf)

**Junying Zhang**

Taiyuan University of Technology, Taiyuan, P. R. China

**Marginal Empirical Likelihood Independence Screening in Sparse Ultrahigh Dimensional Additive Models** (pdf)

**Yimin Xiao**

Michigan State University

**On the Excursion Probabilities of Gaussian Random Fields** (pdf)

**Charles Katholi**

University of Alabama at Birmingham

**Estimating Proportions by Group Testing: A Frequentist Approach** (pdf)

**Cuilan Gao**

St. Jude Children’s Research Hospital

**Evaluate Agreement of Differential Expression for Translational Cross-Species Genomics** (pdf)

**Yang Cheng**

Mississippi State University

**Orbit Uncertainty Propagation Using Sparse Grid-Based Method** (pdf)

**Meng Zhao**

Mississippi State University

**Local Linear Regression with Censored Data** (pdf)

**Pradeep Singh**

Southeast Missouri State University

**A Modified Approach in Statistical Significance for Genome Wide Studies** (pdf)

**Ebenezer Olusegun George**

University of Memphis

**On the Exchangeable Multinomial Distribution** (pdf)

**Deo Kumar Srivastava**

St. Jude Children’s Research Hospital

**Robust Multiple Regression based on Winsorization and Bootstrap Methods** (pdf)

**Paul Schliekelman**

University of Georgia

**Integrating Genome-wide Expression Information into Genome Scans for Complex Traits** (pdf)

**Justin Shows**

Mississippi State University

**Sparse Estimation and Inference for Censored Median Regression** (pdf)

**Hanzhe Zheng**

Merck Research Laboratories

**Adaptive Design in Clinical Trials** (pdf)

**Stan Pounds**

St. Jude Children’s Research Hospital

**Reference Alignment of SNP Microarray Signals for Copy Number Analysis of Tumors** (pdf)

**Russell Stocker**

Mississippi State University

**Optimal Goodness-of-Fit Tests** (pdf)

**Gauri Sankar Datta**

University of Georgia

**Bayesian approach to survey sampling** (pdf)

**Dawn Wilkins**

University of Mississippi

**Supervised and Unsupervised Learning with Microarray Data** (pdf)

**Hemant K. Tiwari**

University of Alabama at Birmingham

**Issues & Challenges in Genetic Analysis of Complex Disorders** (pdf)

**Ajit Sadana**

University of Mississippi

**A Fractal Analysis of Binding and Dissociation Kinetics of Glucose and Related Analytes on Biosensor Surfaces** (pdf)

**Jane L. Harvill**

Mississippi State University

**Modeling and Prediction for Nonlinear Time Series** (pdf)

**Fenghai Duan**

Yale School of Public Health

**Probe-level Correction in Analysis of Affymetrix Data** (pdf)

**J. Sunil Rao**

Case Western Reserve University

**Spike and slab variable selection: frequentist and Bayesian strategies (in DNA microarray data analysis)** (pdf)

**Warren May**

University of Mississippi Medical Center

**On Being a Statistician in a Medical Center Environment** (pdf)

**Malay Ghosh**

University of Florida

**Hierarchical Bayesian Neural Networks: An Application to Prostate Cancer Study** (pdf)

**Pranab K. Sen**

University of North Carolina at Chapel Hill

**Constrained Inference in Statistical Practice** (pdf)

**Ebenezer Olusegun George**

University of Memphis

**Statistical Methods for Analyzing Clustered Discrete Data: Applications to Teratology Studies** (pdf)

**Haimeng Zhang**

Concordia College

**Estimating Survival Functions In Koziol-Green Models** (pdf)

**Deo Kumar Srivastava**

St. Jude Children’s Research Hospital

**Impact of Censoring in Survival Analysis** (pdf)

**Z. Govindarajulu**

University of Kentucky

**Robustness of Small Sample Size Re-estimation Procedures** (pdf)

**Xueqin Wang**

University of Mississippi

**Asymptotics of the Theil-Sen Estimator in Simple Linear Regression Model With a Random Covariate** (pdf)

**Xueqin Wang**

University of Mississippi

**Unbiasedness of the Theil-Sen Estimator** (pdf)

**Patrick D. Gerard**

Mississippi State University

**Estimating Population Density in Line Transect Sampling Using Kernel Methods** (pdf)