Abstract:
To study the mechanism of activation functions in depth and to identify the properties a good activation function should have in order to improve the generalization ability of convolutional neural network models, this article reviews the development of activation functions and analyzes the properties a good activation function should possess. Activation functions can be roughly divided into "S-shaped" activation functions, "ReLU-type" activation functions, combined activation functions, and other types. In the early stage of deep learning, "S-shaped" activation functions were widely used; as network models deepened, their "vanishing gradient" problem was gradually discovered. The emergence of the ReLU activation function alleviated this problem, but setting ReLU's negative half-axis to 0 introduced the "dying neuron" problem. Most subsequently improved activation functions modify the negative half-axis of ReLU to mitigate dying neurons. Finally, taking the multilayer perceptron as an example, the article derives the role of a good activation function in forward and backward propagation and the properties it should possess.
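The two failure modes named above can be made concrete with a minimal numerical sketch (a plain NumPy illustration written for this summary, not code from the paper itself):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # bounded above by 0.25

def relu_grad(x):
    # derivative of max(0, x): 1 for positive inputs, 0 otherwise
    return float(x > 0)

# Backpropagation multiplies one activation derivative per layer.
# Through 20 sigmoid layers the product shrinks toward zero -- the
# "vanishing gradient" problem of S-shaped activations.
chain_sigmoid = np.prod([sigmoid_grad(0.0)] * 20)  # 0.25**20, vanishingly small

# Through 20 active ReLU units the gradient passes intact.
chain_relu = np.prod([relu_grad(1.0)] * 20)        # 1.0

# But a ReLU unit with negative input passes zero gradient and stops
# updating -- the "dying neuron" problem on the negative half-axis.
dead = relu_grad(-1.0)                             # 0.0
```

This contrast is exactly why most of the improved functions the article surveys modify only ReLU's negative half-axis: the positive half-axis already preserves gradient magnitude.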