Speaker: 王汉生
Abstract:
<p>Deep learning is a method of fundamental importance for many AI-related applications. From a statistical perspective, deep learning can be viewed as a regression method with a complicated input X and output Y. One unique characteristic of deep learning is that the input X can be highly unstructured data (e.g., images and sentences), while the output Y is a usual response (e.g., a class label). Another unique characteristic is that deep learning has a highly nonlinear model structure with many layers. This provides a basic theoretical foundation for understanding deep learning from a statistical perspective. We believe that classical statistical wisdom combined with modern deep learning techniques can yield novel methodology with outstanding empirical performance. In this talk, I report three recent advances made by our team in this regard.</p>
<p>The first advance concerns the stochastic gradient descent (SGD) algorithm. SGD is the most popular optimization algorithm for training deep learning models. Its practical implementation relies on the specification of a tuning parameter: the learning rate. In current practice, the choice of learning rate is highly subjective and depends on personal experience. To solve this problem, we propose a local quadratic approximation (LQA) idea for an automatic and nearly optimal determination of this tuning parameter.</p>
<p>The second advance focuses on model compression for convolutional neural networks (CNNs). The convolutional neural network is one of the representative models in deep learning and has shown excellent performance in the field of computer vision. However, it is extremely complicated, with a huge number of parameters. We apply the classical principal component analysis (PCA) method to each convolutional layer, which leads to a new method called progressive principal component analysis (PPCA). Our PPCA method brings a significant reduction in model complexity without significantly sacrificing out-of-sample forecasting accuracy.</p>
<p>The last advance considers factor modeling. The input feature of a deep learning model is often of ultrahigh dimension (e.g., an image). In this case, a strong factor structure of the input feature can often be detected. This means a significant proportion of the variability in the input feature can be explained by a low-dimensional latent factor, which can be modeled in a very simple way. We therefore develop a novel factor normalization (FN) methodology. We first decompose an input feature X into two parts: a factor part and a residual part. Next, we reconstruct the baseline DNN model into a factor-assisted DNN model. Lastly, we provide a new SGD algorithm with adaptive learning rates for training the new model. Our method leads to superior convergence speed and excellent out-of-sample performance compared with the original model trained by a series of gradient-type algorithms.</p>
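<p>For the first advance, the following is a minimal, hypothetical sketch of the local quadratic approximation idea: fit a quadratic to the loss along the negative-gradient direction and use its minimizer as the learning rate. The function names and trial step sizes are illustrative assumptions, not the speaker's actual implementation.</p>
<pre><code>import numpy as np

# Hypothetical sketch of an LQA-style learning-rate choice: approximate the
# loss along the negative-gradient direction by a quadratic in the step size.
def lqa_learning_rate(loss_fn, w, grad, trial_rates=(0.01, 0.1, 0.5)):
    """Fit loss(w - a * grad) with a quadratic in a and return its minimizer."""
    a = np.asarray(trial_rates, dtype=float)
    losses = np.array([loss_fn(w - ai * grad) for ai in a])
    # Least-squares fit of c0 + c1 * a + c2 * a^2 to the trial losses.
    c2, c1, _ = np.polyfit(a, losses, deg=2)
    if c2 > 0:
        return -c1 / (2.0 * c2)          # minimizer of the fitted quadratic
    return float(a[np.argmin(losses)])   # not convex: fall back to best trial

# Toy usage: the loss is exactly quadratic, so the approximation is exact.
A = np.diag([1.0, 10.0])
loss = lambda w: 0.5 * w @ A @ w
w0 = np.array([1.0, 1.0])
g0 = A @ w0
print(lqa_learning_rate(loss, w0, g0))
</code></pre>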
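<p>For the second advance, here is a hedged sketch of compressing a single convolutional layer with PCA on its output channels, in the spirit of the PPCA idea. Expressing the projection as a 1x1 convolution and estimating it from a calibration batch are assumptions made for illustration only.</p>
<pre><code>import torch
import torch.nn as nn

def pca_compress_layer(conv, calib_x, k):
    """Append a 1x1 conv that keeps the top-k principal channel directions,
    estimated from the calibration inputs calib_x (mean offset omitted)."""
    with torch.no_grad():
        feats = conv(calib_x)                              # (N, C, H, W)
        c = feats.shape[1]
        obs = feats.permute(0, 2, 3, 1).reshape(-1, c)     # channel vectors
        obs = obs - obs.mean(dim=0, keepdim=True)
        _, _, v = torch.pca_lowrank(obs, q=k)              # v: (C, k)
        proj = nn.Conv2d(c, k, kernel_size=1, bias=False)
        proj.weight.copy_(v.T.reshape(k, c, 1, 1))
    return nn.Sequential(conv, proj)

# Toy usage: compress a 16-channel layer down to 4 principal channels.
layer = nn.Conv2d(3, 16, kernel_size=3, padding=1)
x = torch.randn(8, 3, 32, 32)
compressed = pca_compress_layer(layer, x, k=4)
print(compressed(x).shape)   # torch.Size([8, 4, 32, 32])
</code></pre>
<p>Applied layer by layer through the network, such a projection reduces the channel count, and hence the parameter count of subsequent layers, which is the sense in which the compression is progressive.</p>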
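<p>For the last advance, the following is a rough illustration of the factor-normalization idea: split an input X into a low-dimensional factor part plus a residual part and let the network see both. The SVD-based decomposition and the two-branch model are illustrative assumptions, not the talk's exact formulation.</p>
<pre><code>import torch
import torch.nn as nn

class FactorAssistedNet(nn.Module):
    """Toy factor-assisted model: separate branches for factors and residual."""
    def __init__(self, p, n_factors, hidden=64, n_classes=10):
        super().__init__()
        self.factor_branch = nn.Linear(n_factors, hidden)
        self.residual_branch = nn.Linear(p, hidden)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, factors, residual):
        h = torch.relu(self.factor_branch(factors) + self.residual_branch(residual))
        return self.head(h)

def factor_decompose(x, n_factors):
    """Project x onto its top principal directions (factor part) and keep the
    leftover variation as the residual part."""
    x_c = x - x.mean(dim=0, keepdim=True)
    _, _, v = torch.pca_lowrank(x_c, q=n_factors)
    factors = x_c @ v                 # (n, n_factors) latent factor scores
    residual = x_c - factors @ v.T    # variation not explained by the factors
    return factors, residual

# Toy usage with a 100-dimensional input and 5 latent factors.
x = torch.randn(256, 100)
f, r = factor_decompose(x, n_factors=5)
model = FactorAssistedNet(p=100, n_factors=5)
print(model(f, r).shape)   # torch.Size([256, 10])
</code></pre>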
<p>This lecture requires advance registration and seats are limited. If you wish to attend online, please register via the link below; if you register but are unable to attend, please cancel your registration.</p>
<p>Registration link: <a href="https://us02web.zoom.us/webinar/register/WN_M9C8piKgRD6b95mhQINp7g">Click to register</a></p>
<p>After successful registration, Zoom will send the webinar link by email. Please add no-reply@zoom.us to your whitelist, or revisit the registration page to retrieve the webinar link.</p>