Why Backpropagation Falls Short of Its True Purpose
Let’s uncover how backpropagation is drifting away from truly recreating the brain’s learning process.

As we advance in the field of Machine Learning (ML) and continue using backpropagation to train and tune our neural networks (NN), it is crucial to recognize that this algorithm has strayed from its original purpose of mimicking the brain’s learning process. Although backpropagation has driven countless empirical breakthroughs in ML, it falls short of being a biologically plausible solution. But wait, why do we need a machine and its learning process to be biologically plausible?
One of the primary inspirations behind artificial neural networks (ANNs) was the desire to emulate the intricate workings of the human brain. Machine Learning as a field has long sought to replicate human intelligence, with neural networks designed to mirror the brain’s neural pathways. A crucial component in this endeavor is backpropagation, a process intended to mathematically simulate how humans learn from their mistakes. By adjusting the network’s weights based on errors, backpropagation refines the system’s performance much as human cognition improves through experience and adaptation: it helps the network pinpoint which components in the information-processing pipeline are responsible for an error in the output.
The role of backpropagation in an ANN is to perform credit assignment: determining which parts of the network, down to individual nodes, weights, and biases, are responsible for a given output. Credit assignment is essential because it guides the adjustments made during training that minimize errors and improve performance. However, backpropagation falls short in two significant ways: it is not as robust as the human brain, and it is too slow and energy-hungry to be neurologically plausible. To achieve the broader goals of developing robust, human-like capabilities in intelligent machines and of constructing efficient training and inference procedures, we need biologically plausible credit assignment methods.
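To make credit assignment concrete, here is a minimal sketch in pure Python (no ML library; names like `w1`, `w2`, and `lr` are purely illustrative) of backpropagation on a one-hidden-unit network:

```python
# Credit assignment via backprop on a tiny network: y = w2 * relu(w1 * x).

def forward(x, w1, w2):
    h = max(0.0, w1 * x)          # hidden activation (ReLU)
    y = w2 * h                    # network output
    return h, y

def backward(x, h, y, target, w2):
    # Squared-error loss L = 0.5 * (y - target)^2
    dy = y - target                          # dL/dy
    dw2 = dy * h                             # credit assigned to w2
    dh = dy * w2                             # error sent backward through w2
    dw1 = dh * (1.0 if h > 0 else 0.0) * x   # credit assigned to w1
    return dw1, dw2

w1, w2, lr = 0.5, 0.5, 0.1
x, target = 1.0, 1.0
for _ in range(100):
    h, y = forward(x, w1, w2)
    dw1, dw2 = backward(x, h, y, target, w2)
    w1 -= lr * dw1                # each weight is adjusted in proportion
    w2 -= lr * dw2                # to its share of the blame for the error

h, y = forward(x, w1, w2)
print(round(y, 3))  # output converges toward the target of 1.0
```

Each weight receives an update proportional to its contribution to the error; that proportional blame-sharing is the credit-assignment step the rest of this post examines.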
(For this post, we will not get into neuro-inspired ML algorithms that can be made biologically plausible. But, I do plan on writing about them in a future post!)
Now that we’ve explored why backpropagation fails to achieve its goal of being biologically plausible, let’s delve into the specific reasons behind this shortfall.
1. Dynamic Changes of Weights across the Network
In backpropagation, weights in artificial neural networks are continually adjusted based on error gradients calculated during the backward pass, which allows the network to learn and improve its performance over time. In our brains, however, synaptic connections do not change this way: synaptic strengths remain relatively stable unless modified through slower processes like long-term potentiation or depression, and there is no known mechanism for feeding precise error signals back through them. The concept of continuously transporting and updating weights across the network therefore diverges significantly from the biological model, highlighting a fundamental difference in how learning and adaptation occur in artificial versus biological systems.
This difference is significant because the static nature of biological synapses ensures robustness and efficiency in information processing, maintaining stable neural circuits that underpin consistent behaviors and cognitive functions. In contrast, the dynamic adjustment of weights in artificial networks may lead to faster learning but also raises challenges in terms of computational efficiency and the fidelity of neural representations over time. Therefore, understanding and reconciling these differences is crucial for advancing both artificial intelligence and our understanding of biological cognition.
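One way to see the “transporting” issue described above is that backprop’s backward pass reuses the forward weights in transposed form. This small pure-Python sketch (2×2 matrices as nested lists; all values are illustrative) makes that dependence explicit:

```python
# In backprop, the error signal sent to an earlier layer is computed with
# the *same* weights used on the forward path, just transposed.

def matvec(W, v):
    return [sum(W[i][j] * v[j] for j in range(len(v))) for i in range(len(W))]

def transpose(W):
    return [[W[j][i] for j in range(len(W))] for i in range(len(W))]

W = [[0.2, -0.5],
     [0.7,  0.1]]

x = [1.0, 2.0]
h = matvec(W, x)                   # forward pass: h = W x

err = [0.3, -0.4]                  # error signal arriving at h (illustrative)
back = matvec(transpose(W), err)   # backward pass: delta = W^T err

# Biologically, a synapse only knows its own strength; here the backward
# path needs an exact copy of every forward weight, which real neurons
# have no known mechanism to obtain.
print(back)
```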
2. Two-sided Traffic of Computations
In the context of backpropagation, I am referring to a two-sided traffic of computations in the neural network that encompasses both Forward Locking (FL) and Backward Locking (BL).
Forward Locking (FL) refers to the requirement that the activities in one layer cannot be computed until the activities of all preceding layers have been computed. This sequential dependency creates bottlenecks in information processing, diverging from the parallel and distributed nature of computation in biological neural networks. Moreover, FL necessitates storing every neuron’s activation in memory, which complicates efficient implementation on local, parallel neuromorphic hardware, where such stored values can be lost when computations are parallelized. This dependency is thus one of the reasons backpropagation deviates from being biologically plausible.
Similarly, Backward Locking (BL) entails delaying the computation of teaching signals and synaptic updates for a layer until the teaching signals for subsequent layers are computed. This dependency contradicts the local, parallel processing observed in biological neural systems, where information flows more dynamically and concurrently.
Overall, these constraints highlight how current neural network architectures deviate from the efficiency and parallelism found in biological brains. They create a “traffic” jam of information and computations as signals try to flow forward or backward through the network, which poses challenges for making these models more biologically similar to our brains.
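The two locking constraints can be sketched as a strict ordering of layer computations. In this illustrative pure-Python example (the layer “computations” are placeholders), no forward step can start before the previous one finishes, and no backward step can start before the error signal from the layer above arrives:

```python
# Forward and backward locking in a 3-layer chain: nothing runs in parallel.

log = []

def forward_layer(i, x):
    log.append(f"fwd{i}")
    return 2 * x              # placeholder layer computation

def backward_layer(i, grad):
    log.append(f"bwd{i}")
    return 2 * grad           # placeholder gradient computation

x = 1
activations = []
for i in range(3):            # forward locking: strict layer order
    x = forward_layer(i, x)
    activations.append(x)     # every activation must be kept in memory

grad = 1
for i in reversed(range(3)):  # backward locking: strict reverse order
    grad = backward_layer(i, grad)

print(log)  # ['fwd0', 'fwd1', 'fwd2', 'bwd2', 'bwd1', 'bwd0']
```

The log shows the single serialized path every signal must take, in contrast to the concurrent, local processing observed in biological circuits.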
3. Structural Shortcomings of Forward and Backward Passes
In backpropagation, the forward and backward passes serve very distinct purposes and use very different computations. During the forward pass, information is transmitted through the network to generate predictions or outputs. In contrast, the backward pass calculates gradients to adjust weights based on prediction errors. This divergence in computation between the two passes is considered biologically implausible and contrasts sharply with the localized, time-constrained plasticity of real synaptic connections.
In our brains, synaptic connections are localized, which allows for precise and adaptive learning: adjustments in neural connections reflect specific patterns of activity and experience. To put it simply, the connections formed in our brain are local to the part of the brain responsible for a given task, which helps us remember and apply patterns and experiences. Additionally, biological synapses do change over time, but these changes are constrained by physiological processes. For instance, long-term potentiation (LTP) and long-term depression (LTD) produce lasting changes in synaptic efficacy, but they unfold over longer periods and are regulated by biochemical mechanisms. In contrast, backpropagation in artificial neural networks updates weights instantaneously based on error gradients computed during the backward pass, failing to capture the gradual, biologically constrained nature of synaptic change.
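As a crude, purely illustrative contrast (not a model of LTP/LTD; all names and constants are made up), compare backprop’s instantaneous full-gradient step with a rate-limited update that bounds how much a “synapse” can change per step:

```python
# Instantaneous vs. bounded weight updates.

def backprop_update(w, grad, lr=0.5):
    return w - lr * grad                        # instantaneous, unbounded step

def rate_limited_update(w, grad, lr=0.5, max_step=0.05):
    step = lr * grad
    step = max(-max_step, min(max_step, step))  # bound the per-step change
    return w - step

w_bp, w_rl = 1.0, 1.0
grad = 2.0
w_bp = backprop_update(w_bp, grad)        # jumps by 1.0 in a single step
w_rl = rate_limited_update(w_rl, grad)    # moves by at most 0.05 per step
print(w_bp, w_rl)
```

The point of the contrast is qualitative: biological plasticity operates under physical rate constraints, whereas the gradient step has none.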
In summary, the separation of roles in backpropagation — where the forward pass predicts and the backward pass adjusts based on errors — does not fully capture the localized, time-bound plasticity observed in real synaptic connections. This divergence underscores the challenge of achieving true biological plausibility in artificial neural networks and highlights the need for more biologically inspired learning algorithms in the field of machine learning.
For these reasons and more, researchers are now shifting towards creating algorithms that more closely resemble the brain’s internal workings, both algorithmically and in hardware. If you are interested in learning more about the current state and future of this technology, I highly recommend reading the research article linked below. It delves into the shortcomings of our current ANNs and explores the future of neuro-inspired machine learning.
Thank you for reading :)
This article is inspired by: https://arxiv.org/pdf/2403.18929v1