Significance tests of feature relevance for a black-box learner
Speaker |
Ben Dai |
Abstract |
<p>Deep neural networks are increasingly used across scientific fields where the main objective is outcome prediction, despite their black-box nature. Significance testing is a promising way to address the black-box issue and to draw scientific insights and interpretations from the decision-making process of a deep learning model. However, testing for a neural network is challenging because of its black-box nature and the unknown limiting distributions of its parameter estimates, while existing methods require strong assumptions or excessive computation. In this article, we derive one-split and two-split tests that relax the assumptions and reduce the computational cost of existing black-box tests, and that extend to testing the significance of a collection of features of interest in a dataset of possibly complex type, such as images. The one-split test estimates and evaluates a black-box model on estimation and inference subsets obtained by sample splitting, using data perturbation. The two-split test further splits the inference subset into two but requires no perturbation. We also develop combined versions of both tests by aggregating the p-values from repeated sample splitting. By deflating the bias-sd-ratio, we establish the asymptotic null distributions of the test statistics and their consistency in terms of Type II error. Numerically, we demonstrate the utility of the proposed tests on seven simulated examples and six real datasets. Accompanying this article is our Python library dnn-inference (https://dnn-inference.readthedocs.io/en/latest/), which implements the proposed tests.</p> |
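To make the two-split idea in the abstract concrete, here is a minimal sketch under loud assumptions. It does not use the dnn-inference API. A one-feature least-squares fit stands in for the black-box learner, the "masked" model simply drops the feature (intercept only), the loss is squared error, and the studentized statistic is a generic one-sided z-test rather than the exact statistic from the article. The names `fit_line` and `two_split_pvalue` are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def fit_line(xs, ys):
    """Least-squares fit y ~ a + b*x; stands in for the black-box learner."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def two_split_pvalue(xs, ys, n_est):
    """Hypothetical helper: one-sided p-value for H0 'x is irrelevant'."""
    # Estimation subset: fit the full model and the masked model.
    a, b = fit_line(xs[:n_est], ys[:n_est])
    masked_pred = mean(ys[:n_est])  # feature masked out: intercept only

    # Inference subset, split into two halves of equal size m.
    m = (len(xs) - n_est) // 2
    half1 = range(n_est, n_est + m)
    half2 = range(n_est + m, n_est + 2 * m)

    # Squared-error losses: full model on half 1, masked model on half 2.
    loss_full = [(ys[i] - (a + b * xs[i])) ** 2 for i in half1]
    loss_mask = [(ys[i] - masked_pred) ** 2 for i in half2]

    # Studentized mean of the paired loss differences (assumed z-statistic).
    diffs = [lf - lm for lf, lm in zip(loss_full, loss_mask)]
    z = math.sqrt(m) * mean(diffs) / stdev(diffs)
    return NormalDist().cdf(z)  # small when the full model predicts better


# Simulated data in which the feature is clearly relevant: y = 2x + noise.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [2 * xi + rng.gauss(0, 1) for xi in x]
p_relevant = two_split_pvalue(x, y, n_est=1000)
print(p_relevant)
```

Because the two losses are computed on disjoint halves, the differences do not degenerate when the feature is irrelevant, which is why this variant needs no perturbation. The combined tests mentioned in the abstract would repeat this over several random splits and aggregate the resulting p-values.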
Time |
2025-03-07 (Friday) 16:40-18:00 |
Venue |
Economics Building N302 |
Lecture language |
Chinese |
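Organizers |
School of Economics, Xiamen University; Wang Yanan Institute for Studies in Economics (WISE); Paula and Gregory Chow Institute for Studies in Economics |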
Hosting unit |
|
Type |
Standalone lecture |
Contact information |
|
Host |
王晨笛 |
Series website |
|
Series |
|
Speaker bio |
<p><span style="font-family: 等线; font-size: 10.5pt;">Dr. Ben Dai is an Assistant Professor in the Department of Statistics at The Chinese University of Hong Kong. His primary research interests include statistical consistency, theory-driven machine learning methods, the theoretical foundations of machine learning, black-box significance testing, statistical computing, and software development.</span></p> |
Session number |
|