Activation Function

An activation function is a mathematical function that determines the output of each node (neuron) in a neural network by applying a non-linear transformation to the node's input signal. Without non-linearity, any stack of layers collapses into a single linear transformation. With it, a neural network can learn complex patterns and non-linear relationships that a simple linear model cannot capture.

The main purposes and effects of activation functions:
• Introducing non-linearity into the network
• Making it easier to emphasize and extract features from the data
• Influencing training behavior: the choice of function affects how well gradients propagate (the vanishing gradient problem) and how fast the network learns

Representative activation functions and their characteristics:
• ReLU (Rectified Linear Unit): Outputs 0 for negative inputs and passes positive inputs through unchanged. It is simple, cheap to compute, typically trains fast, and is used as the default in many models.
• Sigmoid: An S-shaped curve that outputs values in the range (0, 1). It suits the output layer of a binary classifier, but it is susceptible to the vanishing gradient problem because its gradient approaches 0 for inputs of large magnitude.
• Tanh (Hyperbolic Tangent): Outputs values in the range (-1, 1). Because its output is zero-centered, it often trains more stably than Sigmoid, although it also saturates for inputs of large magnitude.
• Leaky ReLU, ELU, Swish, GELU, and others: Variants of, and smooth alternatives to, ReLU, proposed to address its limitations (such as units that get stuck outputting 0) and chosen based on the task and model.

The choice of activation function significantly affects the performance and training efficiency of a neural network, so it is important to select a function that suits the model type (CNN, RNN, Transformer, etc.) and the task (classification, regression, etc.). In recent years, functions such as GELU (Gaussian Error Linear Unit) and Swish have increasingly been used in Transformer-based models, which shows that advances in activation functions go hand in hand with advances in AI models.
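The functions described above can be sketched in a few lines of Python. This is a minimal scalar illustration using only the standard library, not how any particular framework implements them. The function names, the `negative_slope` default of 0.01, and the helper `sigmoid_grad` are assumptions chosen for this example. GELU is written in its exact form, x times the standard Gaussian CDF, and Swish is shown with its scaling parameter fixed at 1.

```python
import math


def relu(x: float) -> float:
    """Return 0 for negative inputs; pass other inputs through unchanged."""
    return max(0.0, x)


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Like ReLU, but keep a small slope for negative inputs.

    The default slope of 0.01 is an illustrative assumption.
    """
    return x if x > 0 else negative_slope * x


def sigmoid(x: float) -> float:
    """S-shaped curve; branches on sign so math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Zero-centered S-shaped curve with outputs between -1 and 1."""
    return math.tanh(x)


def swish(x: float) -> float:
    """Swish with its scaling parameter fixed at 1: x * sigmoid(x)."""
    return x * sigmoid(x)


def gelu(x: float) -> float:
    """Exact GELU: x multiplied by the standard Gaussian CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigmoid_grad(x: float) -> float:
    """Derivative of sigmoid; it shrinks toward 0 as |x| grows."""
    s = sigmoid(x)
    return s * (1.0 - s)


if __name__ == "__main__":
    # Compare the functions on a few sample inputs.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(
            f"x={x:+.1f}  relu={relu(x):.4f}  leaky={leaky_relu(x):.4f}  "
            f"sigmoid={sigmoid(x):.4f}  tanh={tanh(x):.4f}  "
            f"swish={swish(x):.4f}  gelu={gelu(x):.4f}"
        )
    # The sigmoid gradient peaks at x = 0 and decays quickly away from it,
    # which is the mechanism behind the vanishing gradient problem.
    for x in (0.0, 5.0, 10.0):
        print(f"sigmoid_grad({x}) = {sigmoid_grad(x):.6f}")
```

The `sigmoid_grad` helper illustrates why Sigmoid is prone to vanishing gradients: the derivative is at most 0.25 and becomes very small for inputs far from 0, whereas ReLU has a constant slope of 1 for positive inputs.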