Probability & Statistics Seminar
Seminars for Fall 2023
9:00am – 9:50am
Wilfried Youmbi
Department of Economics at the University of Western Ontario
Nonparametric Analysis of Random Coalitional Multi-Utility Models
In this paper, we study a method for testing the rational behavior of a population of consumers when observed choice data reveal a non-transitive preference relation. To this end, we develop a stochastic version of the coalitional multi-utility (CMU) model developed in Aguiar et al. (2022). The motivating application is to test the null hypothesis that a sample of cross-sectional demand distributions was generated by a population of rational consumers with complete, but not necessarily transitive, preference relations. We test a necessary and sufficient condition that does not rely on any restriction on unobserved heterogeneity or the number of goods. We provide an empirical characterization of this novel stochastic choice model and show that this characterization can be tested statistically. We also show how to evaluate the welfare implications of an observed price change. This work generalizes the nonparametric test of random utility models (RUM) for finite choice sets in Kitamura & Stoye (2018) to situations where preferences are not transitive. We apply the new test to the UK Family Expenditure Survey (FES) and find evidence against RUM, while the random CMU model is not rejected in the dataset.
9:00am – 9:50am
Dr. Melik Masarifoglu (Turkey)
Senior Data Analyst at NMQ digitals
To Be Announced
9:00am – 9:50am
Dr. Qingyang Zhang
University of Arkansas
To Be Announced
9:00am – 9:50am
Mr. Thierry Taning Longla (PhD Student and Data analyst, Turkey)
University of Arkansas
Applying Data Science in Internet of Things
9:00am – 9:50am
Dr. Ngartelbaye Guerngar (Serge)
University of North Alabama
To Be Announced
9:00am – 9:50am
Dr. Guy-Vanie Miakonkana
CVP at Life Insurance New York
To Be Announced
9:00am – 9:50am
Dr. Kalimuthu Krishnamoorthy
University of Louisiana
To Be Announced
Previous Statistics Seminars
11:00am – 12:00pm
Dr. Theophile Bougna Lonla
World Bank Economist
Poverty and transport modeling: Perspectives offered by Big Data and Machine Learning
Data and good models are at the forefront of all efficient decision-making processes, especially for poverty alleviation and transport planning. Technological advancement and recent developments in ‘Big Data’ and machine learning provide useful information and methods that complement data collected through conventional methods and traditional models. Key challenges and current knowledge gaps in poverty and transport modeling are explored. Practical examples are presented of how machine learning and big data are combined with statistical and economic models to tackle poverty and transport challenges. Promising areas for future opportunities and research, including new data collection, data analytics, and application development to support and inform policymakers’ decisions, are also discussed.
11:00am – 12:00pm
Chathurika Abeykoon
University of Mississippi
The Double Descent Behavior In Two Layer Neural Network For Binary Classification
Recent studies have observed a surprising behavior of the test error, called the double descent phenomenon, in which increasing model complexity first decreases the test error, then increases it, and then decreases it again. To observe this, we worked on a two-layer neural network model with a ReLU activation function designed for binary classification under supervised learning. Our aim was to observe the double descent behavior of the test error in the model, and to find the mathematics behind it, for varying over-parameterization and under-parameterization ratios. We derived a closed-form solution for the test error of the model and a theorem to find the parameters with optimal empirical loss as model complexity increases. Using these theorems, we proved the existence of the double descent phenomenon in our model for the square loss function.
11:00am – 12:00pm
Dr. Jeremy Clark
University of Mississippi
On two-dimensional Brownian motion singularly tilted through a point potential
A well-known but interesting characteristic of two-dimensional Brownian motion is that it will (almost surely) never return exactly to the origin even though it will reenter any given small neighborhood of the origin infinitely many times. I will discuss a two-dimensional diffusion process closely connected to Brownian motion that has just enough drift towards the origin to enable it to return there. This opens up the possibility of formulating a theory of its local time, a characterization of the time spent in the vicinity of the origin. The transition probabilities for this diffusion process are defined through an integration kernel that has arisen in recent articles on the two-dimensional stochastic heat equation. The work that I will present is in collaboration with Barkat Mian.
11:00am – 12:00pm
Dr. Xin Dang
University of Mississippi
Feature screening for ultrahigh-dimensional classification via Gini distance correlation
Gini distance correlation (GDC) was recently proposed to measure dependence between a categorical variable and a numerical random vector. In this talk, we utilize the GDC to establish a feature screening procedure for ultrahigh-dimensional classification where the response variable is categorical. It can be used for screening individual features as well as grouped features. The proposed procedure possesses several appealing properties. It is model-free: no model specification is needed. It has the sure independence screening property and the ranking consistency property. The proposed screening method can deal with the case where the response has a diverging number of categories. Simulation and real data applications are presented to compare the performance of the proposed screening procedure.
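To make the screening idea concrete, here is a minimal sketch, not the speaker's implementation. It assumes one commonly stated sample form of the Gini distance correlation for a single numerical feature: the overall Gini mean difference minus the class-weighted within-class Gini mean differences, divided by the overall Gini mean difference. The function names and the toy data are hypothetical, and the restriction to univariate features is a simplification made for this illustration.

```python
from itertools import combinations


def gini_mean_difference(values):
    """Average |a - b| over all unordered pairs; 0.0 if fewer than two points."""
    pairs = list(combinations(values, 2))
    if not pairs:
        return 0.0
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def gini_distance_correlation(x, y):
    """Sample GDC between a numerical feature x and categorical labels y.

    Assumed form: (overall GMD - sum_k p_k * within-class GMD_k) / overall GMD,
    with p_k the observed class proportions.
    """
    n = len(x)
    overall = gini_mean_difference(x)
    if overall == 0:
        return 0.0  # a constant feature carries no information about y
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(yi, []).append(xi)
    within = sum(len(g) / n * gini_mean_difference(g) for g in groups.values())
    return (overall - within) / overall


def screen_features(features, y, keep):
    """Rank features (dict: name -> list of values) by GDC and keep the top ones."""
    scores = {name: gini_distance_correlation(col, y) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Hypothetical toy data: "f1" separates the classes, "f2" does not.
labels = ["a", "a", "b", "b"]
toy = {"f1": [0, 0, 1, 1], "f2": [0, 1, 0, 1]}
print(screen_features(toy, labels, keep=1))
```

Screening then amounts to computing one score per feature and keeping the highest-ranked ones, which is why no model for the response needs to be specified.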
11:00am – 12:00pm
Mathias Muia Nthiani & Mous-Abou
University of Mississippi
A point on discrete vs continuous state-space Markov chains / A comparison of estimation techniques for copula-based Markov chains
In this talk, a Bernoulli Markov chain based on the Mardia copula family is considered. We obtain estimators for the parameters in the structure of the Markov chain and provide their confidence intervals. Moreover, for Markov chains generated by symmetric copulas with uniform marginals, we provide new estimators and confidence intervals for copula parameters by considering several families of copulas introduced in Longla (2023). A simulation study is provided with a comparison to other known estimators such as the MLE and that of Longla and Peligrad (2021). We then make a comparison of discrete versus continuous state-space Markov chains.
11:00am – 12:00pm
Dr. Olivier Menoukeu Pamen
University of Liverpool, UK
A uniqueness and smoothness result for multidimensional SDE’s on the plane with nondecreasing coefficient
In this talk, we discuss path-by-path uniqueness for multidimensional stochastic differential equations driven by the Brownian sheet. We assume that the drift coefficient is unbounded, satisfies a spatial linear growth condition, and is componentwise nondecreasing. We first show the result for bounded and measurable drift. Our proofs rely on a local time-space representation of the Brownian sheet and a type of law of the iterated logarithm for the Brownian sheet. The result in the unbounded case then follows by using Gronwall’s lemma on the plane. Under boundedness of the solution, we also prove that the obtained solution is Malliavin smooth. This talk is based on joint work with A. M. Bogso and M. Dieye.
11:00am – 12:00pm
Dr. Martial Longla
University of Mississippi
Exchangeable copulas and m-dependent copulas
I will talk about a new set of copulas that I have been dealing with. I obtained these copulas while searching for conditions to have a Markov chain that is exchangeable. In the process, I was dragged into m-dependent Markov chains, and ended up providing a characterization of some families of copulas that I call m-dependent copulas, idempotent copulas and exchangeable copulas. Exchangeable copulas remind me of De Finetti’s theorem. The large sample theory of parameter estimators has been developed for these families under the assumption that the data have a uniform marginal distribution.
Louis Aimé FONO
Research Group in Applied Mathematics for Social Science
University of Douala-Cameroon
On Some Probability Distributions of Customer Sensitivity for Premium Renewal in Non-life Insurance
Every year, non-life insurers face the recurring problem of adjusting premiums. This problem comes from the trade-off between the need to increase the global revenue of the company and the need to retain the existing customers of the portfolio. Traditional pricing methods (General Linear Model or Credibility Theory) solve this problem by a static approach, and they do not take into account customer sensitivity and/or the prices offered by competing companies. Elena et al. [1] formalized and solved the pricing renewal problem of a non-life insurance company by using a dynamic approach based on reinforcement learning (Markov Decision Problem). The insurer has a portfolio of customers and therefore a total turnover (initial state). At the time of contract renewal, the insurer (agent) offers a renewal premium to the first insured (we say that the agent takes an action). Whether or not the insured accepts the renewal premium, his decision leads the company to a new state (new income and new retention). Then, taking into account the new situation of the company, the insurer sequentially repeats the same action for all the other insureds in the portfolio.
This paper extends and improves the model of Elena et al. in various circumstances. More precisely, we propose some families of probability distributions that take into consideration the sensitivity of the insured to the new premiums. We rewrite the model of Elena et al. by replacing the regression probability by the obtained probability distributions, and we obtain our new pricing models. We find the best strategy for the insurer to set the renewal price through reinforcement learning algorithms. The implementation of the newly obtained reinforcement models on a portfolio of contracts using a backward SARSA(λ) learning agent yields better results than those obtained by Elena et al. [1].
Keywords: Pricing renewal in non-life insurance; Reinforcement learning; Customer sensitivity; Customer renewal probabilities.
References
[1] Elena K., Garcia J., Maestre R. and Fernandez F. (2019) Reinforcement learning for pricing strategy optimization in the insurance industry. Engineering Applications of Artificial Intelligence, 80 (C), 8-19. https://doi.org/10.1016/j.engappai.2019.01.010
[2] Ngnié F.C., Mbama E.B., Fotso S. and Fono L.A. (2021) On the study of premium renewal problem in non-life insurance based on two families of customer renewal probability through reinforcement learning. Online Astin Colloquia.
Xinyuan Chen
Assistant Professor of Statistics
Mississippi State University
A Bayesian Machine Learning Approach for Estimating Heterogeneous Survivor Causal Effects: Applications to a Critical Care Trial
Assessing heterogeneity in the effects of treatments has become increasingly popular in the field of causal inference and carries important implications for clinical decision-making. While extensive literature exists for studying treatment effect heterogeneity when outcomes are fully observed, there has been limited development of tools for estimating heterogeneous causal effects when patient-centered outcomes are truncated by a terminal event, such as death. Due to mortality occurring during study follow-up, the outcomes of interest are unobservable, undefined, or not fully observed for specific subgroups of participants, therefore requiring the principal stratification framework to draw valid causal conclusions. Motivated by the Acute Respiratory Distress Syndrome Network (ARDSNetwork) ARDS respiratory management (ARMA) trial, we developed a flexible Bayesian machine learning approach to estimate the average causal effect and heterogeneous causal effects among the always-survivors stratum when clinical outcomes are subject to truncation. We adopted Bayesian additive regression trees (BART) to flexibly specify separate models for the potential outcomes and latent strata membership. In the analysis of the ARMA trial, we found that the low tidal volume treatment had an overall benefit for participants sustaining acute lung injuries on the outcome of time to returning home, but substantial heterogeneity in treatment effects among the always-survivors, driven most strongly by sex and the alveolar-arterial oxygen gradient at baseline (a physiologic measure of lung function and source of hypoxemia). These findings illustrate how the proposed methodology could guide the prognostic enrichment of future trials in the field. 
We also demonstrated through a simulation study that our proposed Bayesian machine learning approach outperforms other parametric methods in reducing the estimation bias in both the average causal effect and heterogeneous causal effects for always-survivors.
Ngongo Isidore Seraphin
ENS, Universite de Yaounde 1, Cameroun.
Inference for nonstationary time series of counts with application to change-point problems
We consider an integer-valued time series $Y = (Y_t)_{t \in \mathbb{Z}}$ where the model after a time $k^*$ is Poisson autoregressive with the conditional mean that depends on a parameter $\theta^* \in \Theta \subset \mathbb{R}^d$. The structure of the process before $k^*$ is unknown; it could be any other integer-valued time series, that is, the process $Y$ could be nonstationary. It is established that the maximum likelihood estimator of $\theta^*$ computed on the nonstationary observations is consistent and asymptotically normal. Subsequently, we carry out the sequential change-point detection in a large class of Poisson autoregressive models. We propose a monitoring scheme for detecting change in the model. The procedure is based on an updated estimator, which is computed without the historical observations. The asymptotic behavior of the detector is studied; in particular, the above results of inference in a nonstationary setting are applied to prove the consistency of the proposed procedure. A simulation study as well as a real data application are provided.
Keywords: Time series of counts, Poisson autoregression, likelihood estimation, change-point, sequential detection, weak convergence.
Huybrechts Bindele
University of South Alabama
Robust estimation and selection for single-index regression model
In this talk, we will consider a single-index regression model, from which we will discuss a robust estimation procedure for the model parameters and an efficient variable selection of relevant predictors. The proposed approach known as the penalized generalized signed-rank procedure will be introduced. Asymptotic properties of the resulting estimators will be discussed under mild regularity conditions. Extensive Monte Carlo simulation experiments will be carried out to study the finite sample performance of the proposed approach. The simulation results will demonstrate that the proposed approach dominates many of the existing ones in terms of robustness in estimation and efficiency of variable selection. Finally, a real data example will be discussed to illustrate the method.
Magda Peligrad
University of Cincinnati
The CLT for stationary Markov chains with trivial tail sigma field
In this talk we consider stationary Markov chains with trivial two-sided tail sigma field and present the tools leading to the following result: Any additive functional of such a Markov chain satisfies the central limit theorem provided the variance of partial sums divided by n is bounded.
The method is based on martingale decomposition using a new idea involving conditioning with respect to both the past and the future of the chain. No assumption of irreducibility or aperiodicity is needed.
Hume 321 at 4:30 PM
Jialin Zhang
Mississippi State University
Unfolding Entropic Statistics
This talk is organized into three parts.
1) Entropy estimation in Turing’s perspective is described. Given an iid sample from a countable alphabet under a probability distribution, Turing’s formula (introduced by Good (1953), hence also known as the Good-Turing formula) is a mind-bending non-parametric estimator of total probability associated with letters of the alphabet that are NOT represented in the sample. Some interesting facts and thoughts about entropy estimators are introduced.
2) Turing’s formula brought about a new characterization of probability distributions on general countable alphabets that provides a new way to do statistics on alphabets, where the usual statistical concepts associated with random variables (on the real line) no longer exist. The new perspective, in turn, inspires some thoughts on the characterization of probability distribution when the underlying sample space is unclear. An application example of authorship attribution is provided at the end.
3) Shannon’s entropy is only finitely defined for distributions with fast decaying tails on a countable alphabet. The unboundedness of Shannon’s entropy over thick-tailed distributions on an alphabet prevents its potential utility from being fully realized. Zhang (2020) proposed generalized Shannon’s entropy (GSE), which is finitely defined everywhere. Some interesting results about GSE and a new test of independence inspired by GSE are introduced. The new test does not require the knowledge of cardinality, and it is consistent and would detect any form of dependence structure in the general alternative space given a sufficiently large sample.
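The estimator described in part 1) can be illustrated with a few lines of code. This is only a sketch of the basic form of Turing's formula, which estimates the total probability of letters not represented in the sample as the number of letters seen exactly once divided by the sample size. The function name and the example sample are hypothetical and are not taken from the talk.

```python
from collections import Counter


def turing_missing_mass(sample):
    """Turing's formula: estimate the total probability of unseen letters.

    The estimate is N1 / n, where N1 is the number of distinct letters that
    appear exactly once in the sample and n is the sample size.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n


# Hypothetical sample of 11 letters: 'c' and 'd' each appear exactly once.
print(turing_missing_mass("abracadabra"))  # 2 / 11
```

The estimate depends only on the letters seen once, so it does not require knowing the cardinality of the alphabet.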
Martial Longla
University of Mississippi
Sometimes, Disorder Helps (pdf)
Timothy Fortune
University of Mississippi
Local Limit Theorem for Linear Random Fields (pdf)
Dongsheng Wu
University of Alabama-Huntsville
Weak Convergence of Martingales and its Application to Nonlinear Cointegrating Model (pdf)
Xin Dang
University of Mississippi
Gini Distance Correlation and Feature Selection (pdf)
Qian Zhou
Mississippi State University
Model Misspecification in Statistical Analysis (pdf)
Tung-Lung Wu
Mississippi State University
Tests for High-Dimensional Covariance Matrices Using Random Matrix Projection (pdf)
Dao Nguyen
University of California-Berkeley
Iterated Filtering and Iterated Smoothing Algorithms (pdf)
David Mason
University of Delaware
Bootstrapping the Student t‐Statistic (pdf)
Yichuan Zhao
Georgia State University
Jackknife Empirical Likelihood Methods for the Gini Index (pdf)
Junying Zhang
Taiyuan University of Technology, Taiyuan, P. R. China
Marginal Empirical Likelihood Independence Screening in Sparse Ultrahigh Dimensional Additive Models (pdf)
Yimin Xiao
Michigan State University
On the Excursion Probabilities of Gaussian Random Fields (pdf)
Charles Katholi
University of Alabama at Birmingham
Estimating Proportions by Group Testing: A Frequentist Approach (pdf)
Cuilan Gao
St. Jude Children’s Research Hospital
Evaluate Agreement of Differential Expression for Translational Cross-Species Genomics (pdf)
Yang Cheng
Mississippi State University
Orbit Uncertainty Propagation Using Sparse Grid-Based Method (pdf)
Meng Zhao
Mississippi State University
Local Linear Regression with Censored Data (pdf)
Pradeep Singh
Southeast Missouri State University
A Modified Approach in Statistical Significance for Genome Wide Studies (pdf)
Ebenezer Olusegun George
University of Memphis
On the Exchangeable Multinomial Distribution (pdf)
Deo Kumar Srivastava
St. Jude Children’s Research Hospital
Robust Multiple Regression based on Winsorization and Bootstrap Methods (pdf)
Paul Schliekelman
University of Georgia
Integrating Genome-wide Expression Information into Genome Scans for Complex Traits (pdf)
Justin Shows
Mississippi State University
Sparse Estimation and Inference for Censored Median Regression (pdf)
Hanzhe Zheng
Merck Research Laboratories
Adaptive Design in Clinical Trials (pdf)
Stan Pounds
St. Jude Children’s Research Hospital
Reference Alignment of SNP Microarray Signals for Copy Number Analysis of Tumors (pdf)
Russell Stocker
Mississippi State University
Optimal Goodness-of-Fit Tests (pdf)
Gauri Sankar Datta
University of Georgia
Bayesian approach to survey sampling (pdf)
Dawn Wilkins
University of Mississippi
Supervised and Unsupervised Learning with Microarray Data (pdf)
Hemant K. Tiwari
University of Alabama at Birmingham
Issues & Challenges in Genetic Analysis of Complex Disorders (pdf)
Ajit Sadana
University of Mississippi
A Fractal Analysis of Binding and Dissociation Kinetics of Glucose and Related Analytes on Biosensor Surfaces (pdf)
Jane L. Harvill
Mississippi State University
Modeling and Prediction for Nonlinear Time Series (pdf)
Fenghai Duan
Yale School of Public Health
Probe-level Correction in Analysis of Affymetrix Data (pdf)
J. Sunil Rao
Case Western Reserve University
Spike and slab variable selection: frequentist and Bayesian strategies (in DNA microarray data analysis) (pdf)
Warren May
University of Mississippi Medical Center
On Being a Statistician in a Medical Center Environment (pdf)
Malay Ghosh
University of Florida
Hierarchical Bayesian Neural Networks: An Application to Prostate Cancer Study (pdf)
Pranab K. Sen
University of North Carolina at Chapel Hill
Constrained Inference in Statistical Practice (pdf)
Ebenezer Olusegun George
University of Memphis
Statistical Methods for Analyzing Clustered Discrete Data: Applications to Teratology Studies (pdf)
Haimeng Zhang
Concordia College
Estimating Survival Functions In Koziol-Green Models (pdf)
Deo Kumar Srivastava
St. Jude Children’s Hospital
Impact of Censoring in Survival Analysis (pdf)
Z. Govindarajulu
University of Kentucky
Robustness of Small Sample Size Re-estimation Procedures (pdf)
Xueqin Wang
University of Mississippi
Asymptotics of the Theil-Sen Estimator in Simple Linear Regression Model With a Random Covariate (pdf)
Xueqin Wang
University of Mississippi
Unbiasedness of the Theil-Sen Estimator (pdf)
Patrick D. Gerard
Mississippi State University
Estimating Population Density in Line Transect Sampling Using Kernel Methods (pdf)