I was listening to Noam Brown on the Lex Fridman Podcast this week, and he said something that has stuck with me since.¹
Neural Networks (NNs) need a lot of time to learn how to complete a new task. Noam uses AlphaGo by DeepMind as an example.² AlphaGo is a large NN with hundreds of thousands of neurons, all aimed towards beating human players at Go. It took multiple months of training, and most likely millions of games, to get to its current level. Expert human players, on the other hand, have probably played closer to 10-100k games.
Big numbers are a little hard to understand though, so let's scale it down. When a NN learns to play Super Mario World, it has to play in the range of 50-100 games to get a basic understanding.³ A human only needs to play a few times to get a grasp of the fundamental mechanics.
This is because when a NN learns to do something new, it is learning to do so from scratch. It doesn't know anything before learning the new activity.
For example, when humans learn to play chess, they most likely already have a baseline understanding of many concepts. They know what a piece is, they know what a board is, they know what a game is, they might have played a similar game in the past. A NN has no background knowledge whatsoever. It is starting from square one. Chess is everything it will ever experience, and at the start, it has yet to experience anything! And believe it or not, having past experiences is very helpful when learning something new.
Afterwards, I watched Jeff Dean's TED talk about the potential of AI. He presents the idea that we should stop building isolated models, and instead build one big model to do multiple tasks. This means re-using old neurons.
I have a strong feeling that if NNs started with an understanding of something, they could learn new tasks quicker. Eventually, similarly to our brains, they could pull from past experiences and guess the goal of a new task without it ever being explained. I agree with this idea, but instead of building the model, I think we should evolve it.
Evolving AGI
There seems to be an assumption that for AGI to be created, it would need to understand how to edit its own source code and upgrade its architecture. I disagree. I believe there are a limited number of conditions which actually need to be met for an algorithm to have the potential of becoming general.
The number of inputs must be mutable during training
The topology of the network must be mutable during training
The number of inputs must be mutable because if we want truly general AI, then it needs to accept all forms of input: images of any resolution, sound (though often turned into spectrograms), and general numbers.
A NN that has an evolving topology could evolve its structure for a certain task. Then if a new (but similar) task is presented, it could re-use the majority of its pre-existing neurons and only evolve a few new ones. Or if the task is drastically different, it could re-use a small portion of the old neurons and create a multitude of new ones.
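To make the two conditions concrete, here is a minimal sketch of NEAT-style structural mutation. All of the names and the genome layout are my own illustrative assumptions, not the actual representation from Stanley and Miikkulainen's paper: a genome is just a set of nodes plus weighted connections, and mutation can grow both the topology and the input layer.

```python
import random

class Genome:
    """Toy genome: node ids plus a dict of weighted connections."""

    def __init__(self, num_inputs, num_outputs):
        self.inputs = list(range(num_inputs))
        self.outputs = list(range(num_inputs, num_inputs + num_outputs))
        self.hidden = []
        self.next_id = num_inputs + num_outputs
        # Start fully connected, input -> output, with random weights.
        self.connections = {(i, o): random.uniform(-1, 1)
                            for i in self.inputs for o in self.outputs}

    def add_node(self):
        """Condition 2 (mutable topology): split a random existing
        connection by inserting a new hidden node in the middle."""
        (src, dst), w = random.choice(list(self.connections.items()))
        new = self.next_id
        self.next_id += 1
        self.hidden.append(new)
        del self.connections[(src, dst)]
        self.connections[(src, new)] = 1.0  # 1.0 * w == w, behavior preserved
        self.connections[(new, dst)] = w

    def add_input(self):
        """Condition 1 (mutable number of inputs): grow the input layer
        and wire the new sensor in with small initial weights."""
        new = self.next_id
        self.next_id += 1
        self.inputs.append(new)
        for o in self.outputs:
            self.connections[(new, o)] = random.uniform(-0.1, 0.1)

g = Genome(num_inputs=2, num_outputs=1)
g.add_node()   # topology grows
g.add_input()  # input count grows
```

The point of splitting a connection (rather than wiring a node in arbitrarily) is that the network's behavior is roughly preserved at the moment of mutation, so existing skills survive while new structure becomes available for the new task.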
My hypothesis is that when learning a task similar to something it has learned before, it will learn much faster. To test this hypothesis, next week I will train a NN using the NEAT algorithm developed by Kenneth O. Stanley and Risto Miikkulainen⁵ to play a platforming game with the goal of getting as high up as possible.
Then, I will duplicate this network, and put it into a new environment. One where the goal is to go as far right as possible, whilst collecting coins. The pre-developed AIs will compete against blank slates. My assumption is that those who have developed basic platforming techniques will outperform those which have not.
Afterwards, I will test long-term effectiveness by transferring the new NNs back to their original task. I'm assuming that the new modifications will prevent them from completing it instantaneously. But if unused connections are only disabled instead of deleted, it shouldn't take long for a network to remember its old processes.
We could then repeat this process for different tasks, over and over, until it reaches a substantial knowledge base. Practically making the NN go through school 🎓
I believe AGI is hidden behind evolution. It will be a matter of developing an architecture which allows a NN to evolve its topology in order to complete the presented task, without completely forgetting how to accomplish the old one.