Organized by Center for Data Science, Zhejiang University;
Operated by Hangzhou Qizhen Exhibition Service Co., Ltd.
Peter L. Bühlmann
ETH Zürich
Title: Causality-inspired Statistical Machine Learning
Abstract:
Reliable, robust, and interpretable machine learning is a major emerging theme in data science and statistics, complementing the development of pure black-box prediction algorithms. New connections between distributional robustness, external validity, and causality provide methodological paths for improving the reliability and understanding of machine learning algorithms, with wide-ranging prospects for various applications.
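The abstract names the theme rather than a specific method, so the following is only a minimal sketch of one published instance of the robustness-causality connection from this line of research: anchor regression (Rothenhäusler, Bühlmann, Meinshausen and Peters, 2021), which trades least-squares fit against stability under shift interventions acting through "anchor" variables. The toy data and the choice of gamma are illustrative, not taken from the talk.

```python
# Hedged sketch of anchor regression; data and gamma are illustrative only.
import numpy as np

def anchor_regression(X, Y, A, gamma):
    """OLS after the anchor transform W = I + (sqrt(gamma) - 1) * P_A,
    where P_A projects onto col(A). gamma = 1 recovers plain OLS; larger
    gamma buys robustness against shifts acting through the anchors."""
    P = A @ np.linalg.pinv(A)                      # projection onto col(A)
    W = np.eye(len(Y)) + (np.sqrt(gamma) - 1.0) * P
    return np.linalg.lstsq(W @ X, W @ Y, rcond=None)[0]

# Toy example: anchor A drives X, a hidden H confounds X and Y.
rng = np.random.default_rng(0)
n = 500
A = rng.normal(size=(n, 1))
H = rng.normal(size=n)
X = (A[:, 0] + H + rng.normal(size=n)).reshape(-1, 1)
Y = 2.0 * X[:, 0] + H + rng.normal(size=n)
print(anchor_regression(X, Y, A, gamma=10.0))      # closer to 2.0 than OLS (gamma = 1)
```

As gamma grows, the estimator moves from the ordinary least-squares solution toward an instrumental-variables-type solution, which is one concrete sense in which robustness to distributional shifts and causal structure meet.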
Huazhen Lin
New Cornerstone Science Laboratory,
Center of Statistical Research and School of Statistics,
Southwestern University of Finance and Economics
Title: Deep regression learning with optimal loss function*
Abstract:
In this paper, we develop a novel efficient and robust nonparametric regression estimator under a feedforward neural network (FNN) framework. The proposed estimator has several interesting characteristics. First, the loss function is built upon an estimated maximum likelihood function, which integrates information from the observed data as well as from the data structure. Consequently, the resulting estimator enjoys desirable optimality properties, such as efficiency. Second, unlike traditional maximum likelihood estimation (MLE), the proposed method avoids specifying the distribution and is thus flexible enough to accommodate distributions of any kind, including heavy-tailed, multimodal, or heterogeneous ones. Third, the proposed loss function relies on probabilities rather than on direct observations, as in the least-squares loss, thereby contributing to the robustness of the proposed estimator. Finally, the proposed loss function involves only a nonparametric regression function. This enables the direct application of existing packages, simplifying computational and programming requirements. We establish the large-sample properties of the proposed estimator in terms of its excess risk and near-optimal minimax rate. The theoretical results show that the proposed estimator is equivalent to the true MLE in which the density function is known. Our simulation studies show that the proposed estimator outperforms existing methods in terms of prediction accuracy, efficiency, and robustness. In particular, it is comparable to the true MLE and even improves upon it as the sample size increases. This suggests that an adaptive, data-driven loss function based on the estimated density may offer an additional avenue for capturing valuable information. We further apply the proposed method to four real data examples, obtaining significantly lower out-of-sample prediction errors than existing methods.
*Joint work with Xuancheng Wang and Ling Zhou
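The abstract does not include an implementation, so the sketch below is only a guess at the overall structure under stated assumptions: the estimated-likelihood loss is approximated by alternating between (i) a Gaussian-kernel density estimate of the current residuals and (ii) minimizing the resulting negative log estimated likelihood over the FNN weights, warm-started from least squares. The function names, the bandwidth rule, and the alternating schedule are all hypothetical, not the authors' actual algorithm.

```python
# Hedged sketch: deep regression with an estimated-likelihood loss.
# Assumption: the residual density is re-estimated by Gaussian-kernel KDE,
# then the FNN is trained on the mean of -log f_hat(y - m(x)).
import math
import torch
import torch.nn as nn

def kde_neg_log_lik(resid, centers, h):
    """Mean -log f_hat(resid), where f_hat is a Gaussian KDE on `centers`."""
    z = (resid.unsqueeze(1) - centers.unsqueeze(0)) / h       # (n, m)
    log_kernel = -0.5 * z ** 2 - 0.5 * math.log(2 * math.pi)
    log_f = torch.logsumexp(log_kernel, dim=1) - torch.log(centers.numel() * h)
    return -log_f.mean()

def fit_fnn_mle(x, y, hidden=64, outer=5, inner=200, lr=1e-3):
    """Alternate KDE of residuals with FNN updates on the estimated likelihood."""
    net = nn.Sequential(nn.Linear(x.shape[1], hidden), nn.ReLU(),
                        nn.Linear(hidden, hidden), nn.ReLU(),
                        nn.Linear(hidden, 1))
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(inner):                        # least-squares warm start
        opt.zero_grad()
        ((net(x).squeeze(-1) - y) ** 2).mean().backward()
        opt.step()
    for _ in range(outer):
        with torch.no_grad():                     # step (i): re-estimate density
            centers = y - net(x).squeeze(-1)
            h = 1.06 * centers.std() * len(y) ** (-0.2)   # Silverman's rule
        for _ in range(inner):                    # step (ii): minimize -log f_hat
            opt.zero_grad()
            kde_neg_log_lik(y - net(x).squeeze(-1), centers, h).backward()
            opt.step()
    return net

# Toy usage: heavy-tailed noise, where plain least squares is inefficient.
x = torch.randn(500, 3)
y = torch.sin(x[:, 0]) + 0.5 * x[:, 1] \
    + 0.3 * torch.distributions.StudentT(2.0).sample((500,))
net = fit_fnn_mle(x, y)
```

Freezing the KDE centers inside each outer step keeps the density estimate fixed while the network is updated, mirroring the separation the abstract draws between estimating the likelihood and minimizing it; whether this matches the authors' actual procedure is an assumption.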