, who states that the brain's purpose for memory is not to recall an accurate record of the past, but to predict the future and to reconstruct the past from the scenes and events we experienced, using the same stored information and processes we use to look into the future, to predict what will happen, or to plan what to do. The underlying storage of human memories must therefore be structured as an abstracted representation, so that memories can be reconstructed for the purpose at hand, whether that is reconstructing the past, predicting the future, planning, or imagining stories and narratives – all hallmarks of human intelligence.
Replicating all of the brain’s capabilities – image recognition, vision, speech, natural language understanding, written composition, solving mazes, playing games, planning, problem solving, creativity, imagination – seems daunting when seen through the tools of deep learning, because deep learning is built from single-purpose components that cannot generalize. Each DNN/RNN tool is a one-off, specialized for a specific task, and there is no way we can specialize and combine them all to accomplish all of these tasks.
But the human brain is simpler and more elegant, using fewer, more powerful, general-purpose building blocks – biological neurons – and connecting them using the instructions of a mere 8,000 genes. Through a billion years of evolution, nature has arrived at an elegant, easy-to-specify architecture for the brain and its neural network structures, one able to solve all the problems we encountered along the way. We are going to start by copying as much of the human brain’s functionality as we can, then use evolution to solve the harder design problems.
So now we know more about the human brain, how the neurons and neural networks in it differ completely from the DNNs that deep learning uses, and how much more sophisticated our simulated neurons, neural networks, and cortices would have to be to even begin attempting to build something on par with, or superior to, the human brain.
Here is the video about neuroscience and AGI that I submitted to the NVIDIA GTC 2021 conference:
How can we build an Artificial General Intelligence?
To build an AGI, we need better neural networks, with more powerful processing in both space and time and the ability to form complex circuits with them, including feedback. We will pick spiking neural networks, which have signals that travel between neurons, gated by synapses.
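To make the spiking idea concrete, here is a minimal sketch of a leaky integrate-and-fire neuron, one common way to model neurons whose signals travel between cells and are gated by weighted synapses. All the constants (time constant, threshold, weights, drive currents) are illustrative assumptions, not values from this design.

```python
# Minimal leaky integrate-and-fire (LIF) neuron: integrates input over time,
# fires a spike when its membrane potential crosses a threshold, then resets.

class LIFNeuron:
    def __init__(self, tau=20.0, threshold=1.0, reset=0.0):
        self.tau = tau              # membrane time constant (decay rate)
        self.threshold = threshold  # firing threshold
        self.reset = reset          # potential after a spike
        self.v = 0.0                # current membrane potential

    def step(self, input_current, dt=1.0):
        """Integrate input for one time step; return True if the neuron spikes."""
        # Leaky integration: potential decays toward rest, driven by input.
        self.v += dt * (-self.v / self.tau + input_current)
        if self.v >= self.threshold:
            self.v = self.reset     # fire and reset
            return True
        return False

# A synapse gates the signal between neurons: a presynaptic spike injects a
# weighted current into the postsynaptic neuron.
pre, post = LIFNeuron(), LIFNeuron()
spikes = 0
for t in range(100):
    if pre.step(input_current=0.2):   # constant drive to the first neuron
        post.step(input_current=0.8)  # weighted, gated signal to the second
        spikes += 1
print("presynaptic spikes in 100 steps:", spikes)
```

Because the neuron carries state over time, circuits built from these units naturally process information in both space and time, including feedback loops.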
With these, we can build bidirectional neural network autoencoders that take sensory input data and encode it into compact engrams holding the unique input data, while the common data stays in the autoencoder itself. This lets us process all the sensory inputs – vision, speech, and many others – into consolidated, usable chunks of data called engrams, stored in short-term memory.
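A toy illustration of the engram idea, using a linear autoencoder (which the SVD solves exactly): the structure shared across inputs lives in the learned components – the "common data" kept in the autoencoder – while each input's compact code (its "engram") holds what is unique to that input. The dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))   # 200 sensory inputs, 64 features each

# Fit a linear autoencoder via SVD: the top components capture the common
# structure of the data; they stay inside the codec.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
W = Vt[:8]                       # encoder weights: keep 8 principal directions

def encode(x):
    """Engram: 8 numbers instead of 64 – the part unique to this input."""
    return W @ (x - mean)

def decode(z):
    """Bidirectional direction: reconstruct the input from its engram."""
    return W.T @ z + mean

engram = encode(X[0])
recon = decode(engram)
print("engram size:", engram.shape[0])
print("reconstruction error:", float(np.linalg.norm(recon - X[0])))
```

The same engram can be pushed back through the decoder at any time, which is what makes the memory reconstructive rather than a verbatim recording.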
To store them in long-term memory, we process a set of input engrams into a multi-layered, hierarchical, fragmented long-term memory. First we sort the engrams into clusters along the most important information axis, then autoencode those clusters with further bidirectional networks to create engrams that highlight the next most important information, and so on. At each layer the bidirectional autoencoder acts like a sieve, straining out the data or features common to the cluster and leaving the unique identifying information in each engram, so they can then be sorted on the next most important identifying information. Our AI essentially divides the world it perceives by distinguishing features, getting more specific at each level down, with the lowest-level engram containing the key for reconstructing the full engram from the features stored in the hierarchy. The result is a differential, non-local, distributed Hierarchical Fragmented Memory (HFM), containing an abstracted model of the world, similar to how human memory is thought to work.
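One layer of the alternating cluster-then-autoencode process might be sketched like this, again with linear (PCA-style) autoencoders standing in for the bidirectional networks. The split rule (sign of the dominant component) and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
engrams = rng.normal(size=(100, 16))   # short-term engrams entering the HFM

def hfm_layer(E, k):
    """One layer of the sieve: cluster on the most important information axis,
    then within each cluster strain out the shared part and keep the residue."""
    mean = E.mean(axis=0)
    _, _, Vt = np.linalg.svd(E - mean, full_matrices=False)
    axis = (E - mean) @ Vt[0]          # projection on the dominant axis
    clusters = [E[axis >= 0], E[axis < 0]]
    out = []
    for C in clusters:
        c_mean = C.mean(axis=0)        # the "common data" of this cluster
        _, _, cVt = np.linalg.svd(C - c_mean, full_matrices=False)
        W = cVt[:k]                    # this cluster's autoencoder weights
        codes = (C - c_mean) @ W.T     # unique identifying info, k dims each
        out.append((c_mean, W, codes)) # store fragment + keys for this level
    return out

level1 = hfm_layer(engrams, k=8)
print("clusters:", len(level1), "| code dims per engram:", level1[0][2].shape[1])
```

Each level stores the cluster's shared fragment (`c_mean`, `W`) once, while every member keeps only its small code – the differential, non-local, distributed storage the text describes. Deeper levels would repeat `hfm_layer` on the codes.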
An example of our encoding process is faces. We encode pictures of faces as above, then apply alternating layers of autoencoding and clustering that keep sorting and encoding the faces by implicit features – perhaps eye color, hair style, hair color, or nose shape (determined implicitly by the layers of autoencoding, with bins for different classes of features overlapping). The result is a facial recognition system that, just by looking at people, autoencodes each face and its features and can attach to that face the name that was heard when the person was introduced. Later, when we meet a new person, the memory structure and autoencoders are already in place to encode them quickly and compactly.
It also encodes language (spoken and written) along with the input information, turning language into a skeleton embedded in the HFM engrams – a handle used to reference and mold the data – while the HFM gives structure and meaning to the language.
When our AI wants to reconstruct a memory (or create a prediction), it works from the bottom up, using language or other keys to select the elements it wants to propagate upwards, re-creating scenes, events, and people, or creating imagined events and people from the fragments by controlling how it traverses upwards. This is the foundation the rest of our design rests on: once we can re-create past events and imagine new ones, we can predict the future and plan possible scenarios – the core of cognition and problem solving.
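The bottom-up traversal can be sketched on a two-level toy memory: each stored engram is a small key selecting shared fragments held higher in the hierarchy, recall re-adds those fragments, and blending two keys yields an "imagined" event that was never stored. This is purely illustrative; the fragment matrix and the blending rule are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
fragments = rng.normal(size=(4, 16))   # shared features held higher in the HFM

def store(x):
    """Bottom-level engram: the key needed to rebuild x from the fragments."""
    key, *_ = np.linalg.lstsq(fragments.T, x, rcond=None)
    return key                          # 4 numbers instead of 16

def recall(key):
    """Traverse upward: recombine the shared fragments selected by the key."""
    return fragments.T @ key

memory_a = store(rng.normal(size=16))
memory_b = store(rng.normal(size=16))
recalled = recall(memory_a)                           # reconstructed past event
imagined = recall(0.5 * memory_a + 0.5 * memory_b)    # blended, imagined event
print("recalled shape:", recalled.shape, "| imagined shape:", imagined.shape)
```

Controlling how the keys are combined during the upward traversal is what distinguishes faithful recall from imagination and prediction.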
We may be able to build a very simple brain that demonstrates these principles, but to scale we need Charles Darwin: evolutionary genetic algorithms. Basically, we define every neuron, synapse, and neural network parameter – and how they are organized into layers, cortices, and brains – by a genome.
The human brain is specified by only 8,000 genes, decoded by the growth process during fetal development. We will do the same, because we can’t run genetic algorithms directly on 100 billion neurons, but we can run them efficiently on a few thousand genes, then expand the cross-bred genomes into 100-billion-neuron brains.
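The genome-then-growth loop can be sketched as: evolve a small genome, expand each genome into a much larger set of network parameters through a deterministic "growth" rule, and score fitness on the grown network. The genome size, growth rule, and toy fitness function below are all illustrative assumptions.

```python
import random

random.seed(0)
GENES = 8        # compact genome the GA operates on
NEURONS = 200    # the grown "brain" is much larger than the genome

def grow(genome):
    """Development: deterministically expand the genes into many weights."""
    g_rng = random.Random(hash(tuple(genome)))   # repeatable growth per genome
    return [g * g_rng.uniform(0.9, 1.1)
            for g in genome for _ in range(NEURONS // GENES)]

def fitness(genome):
    """Toy objective standing in for real performance: grown weights sum to 100."""
    return -abs(sum(grow(genome)) - 100.0)

def evolve(pop_size=30, generations=40):
    pop = [[random.uniform(0, 10) for _ in range(GENES)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # selection: keep the best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, GENES)     # crossover on the genome,
            child = a[:cut] + b[cut:]            # never on the grown brain
            child[random.randrange(GENES)] += random.gauss(0, 0.5)  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print("best fitness:", round(fitness(best), 3))
```

The key point is that crossover and mutation touch only the eight genes; the 200-weight network is regrown from each genome, just as a brain is regrown from cross-bred genes rather than spliced together neuron by neuron.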
So as we breed generations of ever more sophisticated artificial brains, with more efficient neural networks specialized for specific purposes, we want to steer them toward being human-like, or at least able to act and think like a human. For one, we could apply the same cognitive tests we give children, starting from age 5 and up, to develop them like a human child. Seems logical.
Then, as the AGI starts to grow up, we can borrow a trick from the film VFX and animation community: do a motion/performance capture of a person – recording their motion, facial expressions, and speech as they go through everyday routines – then train our artificial brains on that dataset, keeping the best performers each generation until we get a human mimic. It will not be AGI, nor human-level intelligence, but it is the best we can do until we make these things think more.
To take that all the way to AGI, I would create multiple such AI mimics and put them to work in different professions, writing some specialty code and evolving a specific AI for each profession, so they have a broad but shallow conversational layer and deep skills in their profession.
Now, if we have a network of hundreds of different professions, serving millions of clients at once, all with the same brain architecture and common language and interaction capabilities, how do we make an AGI? Maybe we just network them, and that becomes an AGI?
At the very least, we have a framework and input and output on which to train and evolve an AGI, so that all the specialty skills of each vocation are assumed by a more generalized AGI brain, and in the process, that AGI brain becomes better at all the skills humans excel at.
From there, we just keep going till we have a superintelligence, and beyond. Here is the diagram we started with. Deep neural nets don't scale past a certain point: no matter how many more layers of neurons we add, how much labelled data we train with, or how much compute we throw at the problem, the underlying network model is too crude and too approximate, and gains no further cognitive or problem-solving capability from these increases. It plateaus on simple problems and is incapable of solving more complex ones.
In our AGI design, we have mapped a very powerful, general-purpose, analog spiking neural network computer onto powerful NVIDIA GPU digital computers. That flexible design not only scales with compute power, but also scales as the neural networks evolve to become larger, broader, more functional, and more efficient over time, giving an overall exponential increase in capability greater than the Moore's-law increase in the underlying processor power.