| title | authors | fieldsOfStudy | meta_key | numCitedBy | reading_status | ref_count | tags | urls | venue | year |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Long Short-Term Memory | Sepp Hochreiter, Jürgen Schmidhuber |  | 1997-long-short-term-memory | 52035 | TBD | 68 |  |  | Neural Computation | 1997 |
Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, backpropagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.
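The abstract hinges on two mechanisms: a constant error carousel (an additively updated internal cell state, so error signals are not repeatedly rescaled as they flow back through time) and multiplicative input/output gates that learn when to write to and read from that state. The snippet below is a minimal NumPy sketch of one forward step under those assumptions, following the original 1997 formulation without a forget gate; the names `lstm_step`, `W_i`, `W_o`, `W_c` and the tanh squashing functions are illustrative choices, not the paper's exact notation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W_i, W_o, W_c):
    """One forward step of a 1997-style LSTM memory cell (illustrative sketch, no forget gate).

    The cell state c is the constant error carousel: it is updated purely
    additively, so gradients passing through it are not repeatedly scaled,
    which is what lets errors survive long time lags. The multiplicative
    gates control write (input gate) and read (output gate) access to it.
    """
    z = np.concatenate([x, h_prev])   # external input plus recurrent cell outputs
    i = sigmoid(W_i @ z)              # input gate: opens/closes write access
    o = sigmoid(W_o @ z)              # output gate: opens/closes read access
    g = np.tanh(W_c @ z)              # squashed candidate cell input
    c = c_prev + i * g                # additive carousel update (no forgetting)
    h = o * np.tanh(c)                # gated, squashed cell output
    return h, c

# Toy usage: 4-dimensional inputs, 3 memory cells, random weights.
rng = np.random.default_rng(0)
n_in, n_cell = 4, 3
W_i, W_o, W_c = (rng.normal(scale=0.1, size=(n_cell, n_in + n_cell)) for _ in range(3))
h, c = np.zeros(n_cell), np.zeros(n_cell)
for t in range(1000):                 # long sequence: c carries state across all steps
    x_t = rng.normal(size=n_in)
    h, c = lstm_step(x_t, h, c, W_i, W_o, W_c)
```

Because the carousel update is additive and the gates are the only multiplicative interactions, the per-step, per-weight cost of the forward pass stays constant, which is the O(1) complexity claim in the abstract.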
- Learning long-term dependencies in NARX recurrent neural networks
- Learning Unambiguous Reduced Sequence Descriptions
- Bridging Long Time Lags by Weight Guessing and "Long Short-Term Memory"
- Induction of Multiscale Temporal Structure
- Learning Sequential Structure with the Real-Time Recurrent Learning Algorithm
- A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks
- Continuous history compression
- Learning long-term dependencies with gradient descent is difficult
- Learning Complex, Extended Sequences Using the Principle of History Compression
- Generalization of backpropagation with application to a recurrent gas market model
- Gradient calculations for dynamic recurrent neural networks - a survey
- Finding Structure in Time
- Credit Assignment through Time - Alternatives to Backpropagation
- Finite State Automata and Simple Recurrent Networks
- Learning Sequential Tasks by Incrementally Adding Higher Orders
- Learning State Space Trajectories in Recurrent Neural Networks
- An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories
- Language Induction by Phase Transition in Dynamical Recognizers
- Adaptive neural oscillator using continuous-time back-propagation learning
- Experimental Comparison of the Effect of Order in Recurrent Neural Networks
- The Recurrent Cascade-Correlation Architecture
- LSTM can Solve Hard Long Time Lag Problems
- Contrastive Learning and Neural Oscillations
- Dynamics and architecture for neural computation
- Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks
- A time-delay neural network architecture for isolated word recognition
- Generalization of back-propagation to recurrent neural networks
- Holographic Recurrent Networks
- A Theory for Neural Networks with Time Delays
- Induction of Finite-State Languages Using Second-Order Recurrent Networks
- Guessing can Outperform Many Long Time Lag Algorithms
- Bifurcations in the learning of recurrent neural networks
- Untersuchungen zu dynamischen neuronalen Netzen [German: Investigations of dynamic neural networks]
- A learning rule for asynchronous perceptrons with feedback in a combinatorial environment
- Gradient-Based Learning Algorithms for Recurrent Networks
- Gradient-based learning algorithms for recurrent networks and their computational complexity
- A Fixed Size Storage O(n³) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks
- Netzwerkarchitekturen, Zielfunktionen und Kettenregel [German: Network architectures, objective functions, and the chain rule]
- Learning long-term dependencies is not as difficult with NARX recurrent neural networks
- A time delay neural network architecture for speech recognition
- Time Warping Invariant Neural Networks