Meta Learning Backpropagation And Improving It

Many concepts have been proposed for meta learning with neural networks (NNs), e.g., NNs that learn to reprogram fast weights, Hebbian plasticity, learned learning rules, and meta recurrent NNs. Our Variable Shared Meta Learning (VSML) unifies the above and demonstrates that simple weight-sharing and sparsity in an NN are sufficient to express powerful learning algorithms (LAs) in a reusable fashion. A simple implementation of VSML, where the weights of a neural network are replaced by tiny LSTMs, allows for implementing the backpropagation LA solely by running the network in forward-mode. It can even meta learn new LAs that differ from online backpropagation and generalize to datasets outside the meta training distribution without explicit gradient calculation. Introspection reveals that our meta learned LAs learn through fast association in a way that is qualitatively different from gradient descent.

Is it possible to implement modifiable versions of backpropagation or related algorithms as part of the end-to-end differentiable activation dynamics of a neural net (NN), instead of inserting them as separate fixed routines? Here we propose the Variable Shared Meta Learning (VSML) principle for this purpose. It introduces a novel way of using sparsity and weight-sharing in NNs for meta learning. We build on the arguably simplest neural meta learner, the meta recurrent neural network (Meta RNN) [16,10,56], by replacing the weights of a neural network with tiny LSTMs. The resulting system can be viewed as many RNNs passing messages to each other, or as one big RNN with a sparse shared weight matrix, or as a system learning each neuron’s functionality and its LA. VSML generalizes the principle behind end-to-end differentiable fast weight programmers [45,46,3,41], hypernetworks [14], learned learning rules [4,13,33], and Hebbian-like synaptic plasticity [44,46,25,26,30]. Our mechanism, VSML, can implement backpropagation solely in the forward dynamics of an RNN. Consequently, it enables meta-optimization of backprop-like algorithms. We envision a future where novel methods of credit assignment can be meta learned while still generalizing across vastly different tasks.
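
To make this concrete, here is a minimal sketch of my reading of the VSML construction: every scalar weight of a dense layer becomes a tiny LSTM, all of those LSTMs share a single set of parameters, and the per-connection LSTM state plays the role of the weight value. The class name, state size, and message scheme are my own illustrative choices rather than the authors’ code, and this sketch only passes messages forward; the paper also passes messages backward so that backprop-like credit assignment can happen inside the forward dynamics.

```python
import torch
import torch.nn as nn

class VSMLLayer(nn.Module):
    """One dense layer where each scalar weight is a tiny shared LSTM."""

    def __init__(self, n_in, n_out, state_size=8):
        super().__init__()
        # A single LSTM cell reused for every (i, j) connection.
        self.cell = nn.LSTMCell(input_size=1, hidden_size=state_size)
        # Maps a connection's state to the message it emits.
        self.readout = nn.Linear(state_size, 1)
        # Per-connection states: these act like the layer's "weights"
        # and are what changes while the layer is learning a task.
        self.h = torch.zeros(n_in * n_out, state_size)
        self.c = torch.zeros(n_in * n_out, state_size)
        self.n_in, self.n_out = n_in, n_out

    def forward(self, x):
        # Send each input activation to all of its outgoing connections.
        msg_in = x.repeat_interleave(self.n_out).unsqueeze(1)
        self.h, self.c = self.cell(msg_in, (self.h, self.c))
        msg_out = self.readout(self.h).view(self.n_in, self.n_out)
        # Each output unit sums the messages on its incoming connections.
        return msg_out.sum(dim=0)
```

Meta training would then optimize the shared LSTM cell parameters across many tasks, while the per-connection states are reset and updated within each task; the weight sharing is what keeps the learned algorithm small and reusable across architectures.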

I think these types of meta learning strategies, where you learn something analogous to backpropagation, are going to become more common. However, I still haven’t seen many examples of them being practical for large neural networks.

What is Going on Inside Recurrent Meta Reinforcement Learning Agents?

Recurrent meta reinforcement learning (meta-RL) agents are agents that employ a recurrent neural network (RNN) for the purpose of “learning a learning algorithm”. After being trained on a pre-specified task distribution, the learned weights of the agent’s RNN are said to implement an efficient learning algorithm through their activity dynamics, which allows the agent to quickly solve new tasks sampled from the same distribution. However, due to the black-box nature of these agents, the way in which they work is not yet fully understood. In this study, we shed light on the internal working mechanisms of these agents by reformulating the meta-RL problem using the Partially Observable Markov Decision Process (POMDP) framework. We hypothesize that the learned activity dynamics act as belief states for such agents. Several illustrative experiments suggest that this hypothesis is true, and that recurrent meta-RL agents can be viewed as agents that learn to act optimally in partially observable environments consisting of multiple related tasks. This view helps in understanding their failure cases and some interesting model-based results reported in the literature.
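
A toy example makes the belief-state hypothesis easier to see. Below, an agent faces one of two bandit tasks and maintains an explicit Bayesian posterior over which task it is in; the claim, as I read it, is that a trained meta-RL agent’s RNN activity encodes something functionally similar to this posterior. The task setup, the belief_update function, and the probability tables are all my own illustration, not code from the paper.

```python
import numpy as np

def belief_update(b, action, reward, reward_prob):
    """One Bayesian belief update over which task the agent is in.

    b:           (K,) prior probability of each task
    reward_prob: reward_prob[k, a, r] = P(reward r | action a, task k)
    """
    likelihood = reward_prob[:, action, reward]
    posterior = b * likelihood
    return posterior / posterior.sum()

# Two two-armed bandit tasks: arm 0 pays off in task 0, arm 1 in task 1.
reward_prob = np.array([
    [[0.1, 0.9], [0.9, 0.1]],  # task 0: P(r=0), P(r=1) for arms 0 and 1
    [[0.9, 0.1], [0.1, 0.9]],  # task 1
])
b = np.array([0.5, 0.5])
b = belief_update(b, action=0, reward=1, reward_prob=reward_prob)
print(b)  # ~[0.9, 0.1]: belief shifts toward task 0 after arm 0 pays out
```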

My understanding of what’s being done here is that they train an LSTM neural network using standard backpropagation. They then run the network on tasks similar to the ones it was trained on, and the network can learn to do each specific task better using the information stored in the LSTM memory.
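
Here is a rough sketch of that setup as I understand it, in the style of RL²-type agents: the LSTM receives the current observation along with the previous action and reward, its hidden state persists across steps (and episodes) within a task, and only the weights are trained with backpropagation. The class name and shapes below are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class MetaRNNAgent(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden_size=128):
        super().__init__()
        # Input = observation + one-hot previous action + previous reward,
        # so the hidden state can accumulate task-identifying evidence.
        self.lstm = nn.LSTMCell(obs_dim + n_actions + 1, hidden_size)
        self.policy = nn.Linear(hidden_size, n_actions)
        self.n_actions = n_actions

    def forward(self, obs, prev_action, prev_reward, state=None):
        a_onehot = nn.functional.one_hot(prev_action, self.n_actions).float()
        x = torch.cat([obs, a_onehot, prev_reward.unsqueeze(-1)], dim=-1)
        h, c = self.lstm(x, state)
        return torch.distributions.Categorical(logits=self.policy(h)), (h, c)

# Meta training: sample a task, reset (h, c), run several episodes while
# keeping the LSTM state across them, and update the weights with an RL
# objective via ordinary backpropagation. At test time the weights are
# frozen; "learning" a new task happens only through the evolving (h, c).
```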

The idea of “getting AIs to program themselves” is a good one, even if it’s not very specific. I think, on current trends, the way this is going to happen is by having language models write Python and C++ code to improve themselves and other AI programs.

I’ve been using GitHub Copilot to write Argos Translate unit tests and will hopefully do more with it as its capabilities improve.