
General Evolvable Brains

Human brains today can do a remarkably wide range of tasks. Our mental capacities seem much more “general” than those of all the artificial systems we’ve ever created. Those who are trying to improve such systems have long wondered: what is the secret of human general intelligence? In this post I want to consider what we can learn about this from the fact that the brain evolved. How would an evolved brain be general?

A key problem faced by single-celled organisms is how to make all of their materials and processes out of the available sources of energy and materials. They do this mostly via metabolism, which is mostly a set of enzymes that encourage particular reactions converting some materials into others, together with cell-wall containers that keep those enzymes close to each other. Some organisms are more general than others, in that they can do this key task in a wider range of environments.

Most single-celled organisms use an especially evolvable metabolism design space. That is, their basic overall metabolism system seems especially well-suited to finding innovations and adaptations mostly via blind random search, in a way that avoids getting stuck in local maxima. As I explained in a recent post, natural metabolisms are evolvable in part because they have genotypes that are highly redundant relative to phenotypes: many sets of enzymes can map any given set of inputs into any given set of outputs. And this redundancy requires a substantial overcapacity; the metabolism needs to contain many more enzymes than are strictly needed to create any given mapping.

The main way that such organisms are general is that they have metabolisms with a large library of enzymes. Not just a large library of genes that could code for enzymes if turned on, but an actual large set of enzymes usually created. They make many more enzymes than they actually need in each particular environment where they find themselves. This comes at a great cost; making all those enzymes and driving their reactions doesn’t come cheap.

A relevant analogous toy problem is that of logic gates mapping input signals onto output signals:

[In] a computer logic gate toy problem, … there are four input lines, four output lines, and sixteen binary logic gates between. The genotype specifies the type of each gate and the set of wires connecting all these things, while the phenotype is the mapping between input and output gates. … All mappings between four inputs and four outputs can be produced using only four internal gates; sixteen gates is a factor of four more than needed. But in the case of four gates the set of genotypes is not big enough compared to the set of phenotypes to allow easy evolution. For [evolvable] innovation, sixteen gates is enough, but four gates is not. (more)

Note that evolution doesn’t always use such highly evolvable design spaces. For example, our skeletal structure doesn’t have lots of extra bones sitting around ready to be swapped into new roles in new environments. In such cases, evolution chose not to pay large extra costs for generality and evolvability, because the environment seemed predictable enough to stay close to a good enough design. As a result, innovation and adaptation of skeletal structure is much slower and more painful, and could fail badly in novel enough environments.

Now let’s consider brains. It may be that for some tasks, evolution found such an effective structure that it chose to commit to that structure, betting that its solution was stable and reliable enough across future environments to let it forgo the big extra costs of more general and evolvable designs. But if we are looking to explain a surprising generality, flexibility, and rapid evolution in human brains, it makes sense to consider the possibility that human brain design took a different path, one more like that of single-celled metabolism.

That is, one straightforward way to design a general evolvable brain is to use an extra large toolbox of mental modules that can be connected together in many different ways. While each tool might be a carefully constructed jewel, the whole set of tools would have less of an overall structure. Like a pile of logic gates that can be connected many ways, or metabolism sub-networks that can be connected together into many networks. In this case, the secret to general evolvable intelligence would be less in the particular tools and more in having an extra large set of tools, plus some simple general ways to search the space of tool combinations. A tool set so large that the brain can do most tasks in a great many different ways.

Much of the search for brain innovations and adaptations would then be a search in the space of ways to connect these tools together. Some aspects of this search could happen over evolutionary timescales, some could happen over the lifetime of particular brains, and some could happen on the timescale of cultural evolution, once that got started.

On the timescale of an individual brain lifetime, a search for tool combinations would start with brains that are highly connected, and then prune long term connections as particular desired paths between tools are found. As one learned how to do a task better, one would activate smaller brain volumes. When some brain parts were damaged, brains would often be able to find other combinations of the remaining tools to achieve similar functions. Even losing a whole half of a brain might not greatly reduce performance. And these are all in fact common patterns for human brains.

Yes, something important happened early in human history. Some key event changed the growth rate of human abilities, though not immediate ability levels, and it did this without much changing brain modules and structures, which remain quite close to those of other primates. Plausibly, we had finally collected enough hard-wired tools, or refined them well enough, to let us start to reliably copy each other’s behaviors. And that allowed cultural evolution, a much-faster-than-evolutionary search in the space of practices. Such practices included choices of which combinations of brain modules to activate in which contexts.

What can this view say about the future of brains? On ems, it suggests that human brains have a lot of extra capacity. We can probably go far in taking an em that can do a job task and throwing away brain modules not needed for that task. At some point cutting hurts performance too much, but for many job tasks you might cut 50% to 90% before then.

Regarding other artificial intelligence, it suggests that if we still have a lot to learn via substantially random search, with no grand theory to integrate it all, then we’ll have to focus on collecting more and better tools. Machines would gradually get better as we collect more tools. There may be thresholds where you need enough tools to do certain jobs well, and while most tools would make only small contributions, perhaps there are a few bigger tools that matter more. So key thresholds would come from the existence of key jobs, and from the lumpiness of tools. We should expect progress to be relatively continuous, except perhaps due to the discovery of especially lumpy tools, or to passing thresholds that enable key jobs to be done.


How Does Evolution Escape Local Maxima?

I’ve spent most of my intellectual life as a theorist, but alas it has been a while since I’ve taken the time to learn a powerful new math-based theory. But in the last few days I’ve enjoyed studying Andreas Wagner’s theories of evolutionary innovation and robustness. While Wagner has some well-publicized and reviewed books, such as Arrival of the Fittest (2014) and Robustness and Evolvability in Living Systems (2005), the best description of his key results seems to be found in a largely ignored 2011 book, The Origins of Evolutionary Innovations, which is based on many academic journal articles.

In one standard conception, evolution does hill-climbing within a network of genotypes (e.g., a DNA sequence), rising according to a “fitness” value associated with the phenotype (e.g., tooth length) that results from each genotype. In this conception, a big problem is local maxima: hill-climbing stops once all the neighbors of a genotype have a lower fitness value. There isn’t a way to get to a higher peak if one first must travel through a lower valley to reach it. Maybe random noise could let the process slip through a narrow shallow valley, but what about valleys that are wide and deep? (This is a familiar problem in computer-based optimization search.)
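This failure mode is easy to see in code. Below is a minimal sketch, entirely my own toy construction rather than anything from Wagner: an 8-bit genotype, a fitness landscape with a “decoy” peak slightly below the global optimum, and a greedy hill-climber that moves to its best neighbor until none improves. The names (OPTIMUM, DECOY) and the landscape are invented for illustration.

```python
# Toy fitness landscape on 8-bit genotypes (invented for illustration):
# fitness counts bits matching a global OPTIMUM, but a nearby DECOY
# pattern scores almost as well, creating a local maximum.

OPTIMUM = [1, 1, 1, 1, 1, 1, 1, 1]
DECOY = [1, 1, 1, 1, 0, 0, 0, 0]

def fitness(g):
    match_opt = sum(a == b for a, b in zip(g, OPTIMUM))
    match_decoy = sum(a == b for a, b in zip(g, DECOY))
    return max(match_opt, match_decoy - 0.5)  # decoy peak is slightly lower

def hill_climb(g):
    # Greedy search: move to the best one-bit-flip neighbor until
    # no neighbor is strictly better.
    while True:
        flips = [g[:i] + [1 - g[i]] + g[i + 1:] for i in range(len(g))]
        best = max(flips, key=fitness)
        if fitness(best) <= fitness(g):
            return g  # local maximum reached
        g = best

start = [0, 0, 0, 1, 0, 0, 0, 0]
print(hill_climb(start))  # climbs to DECOY (fitness 7.5), not OPTIMUM (8)
```

From this start the climber ends at DECOY: every one-bit step toward OPTIMUM first drops fitness, and greedy search never accepts a drop. Widening the valley (more bits of disagreement between the peaks) makes noise-based escapes ever less likely.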

Wagner’s core model looks at the relation between genotypes and phenotypes for metabolism in an organism like E. coli. In this context, Wagner defines a genotype as the set of chemical reactions which the enzymes of an organism can catalyze, and he defines a phenotype as the set of carbon-source molecules from which an organism could create all the other molecules it needs, assuming that this source was its only place to get carbon (but allowing many sources of other needed molecules). Wagner defines the neighbors of a genotype as those that differ by just one reaction.

There are of course far more types of reactions between molecules than there are types of molecules. So using Wagner’s definitions, the set of genotypes is vastly larger than the set of phenotypes. Thus a great many genotypes result in exactly the same phenotype, and in fact each genotype has many neighboring genotypes with that same exact phenotype. And if we lump all the connected genotypes that have the same phenotype together into a unit (a unit Wagner calls a “genotype network”), and then look at the network of one-neighbor connections between such units, we will find that this network is highly connected.

That is, if one presumes that evolution (using a large population of variants) finds it easy to make “neutral” moves between genotypes with exactly the same phenotype, and hence the same fitness, then large networks connecting genotypes with the same phenotype imply that it only takes a few non-neutral moves between neighbors to get to most other phenotypes. There are no wide deep valleys to cross. Evolution can search large spaces of big possible changes, and doesn’t have a problem finding innovations with big differences.
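As a toy illustration of why such neutrality matters (my own construction, not Wagner’s model), we can compare a strict hill-climber with a walker that also accepts equal-fitness moves, on a random many-to-one genotype-to-phenotype map. All parameters here are arbitrary choices made for tractability.

```python
import random

rng = random.Random(42)

L = 12          # genotype: a bit-string of length L
N_PHENO = 16    # phenotypes: 16 labels; fitness is simply the label value

# Random many-to-one genotype -> phenotype map: many genotypes share each
# phenotype, so equal-fitness ("neutral") neighbor moves are common.
pheno = {g: rng.randrange(N_PHENO) for g in range(2 ** L)}
BEST = max(pheno.values())

def neighbors(g):
    return [g ^ (1 << i) for i in range(L)]  # all one-bit flips

def strict_climb(g):
    # Accept only strictly fitter neighbors; stop at a local maximum.
    while True:
        best = max(neighbors(g), key=lambda n: pheno[n])
        if pheno[best] <= pheno[g]:
            return pheno[g]
        g = best

def neutral_drift(g, steps=1000):
    # Also accept equal-fitness moves: drift along the neutral network,
    # which keeps exposing fresh neighborhoods to search.
    for _ in range(steps):
        n = rng.choice(neighbors(g))
        if pheno[n] >= pheno[g]:
            g = n
    return pheno[g]

starts = [rng.randrange(2 ** L) for _ in range(200)]
strict_wins = sum(strict_climb(g) == BEST for g in starts)
drift_wins = sum(neutral_drift(g) == BEST for g in starts)
print(strict_wins, drift_wins)  # drift reaches the top phenotype far more often
```

The drifting walker wins far more often because neutral moves carry it across its genotype network to genotypes whose neighborhoods contain new, fitter phenotypes: the “no wide deep valleys” claim in miniature.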

Wagner argues that there are also far more genotypes than phenotypes for two other cases: the evolution of DNA sequences that set the regulatory interactions among regulatory proteins, and for the sequences of ribonucleotides or amino acids that determine the structure and chemical activity of molecules.

In addition, Wagner also shows the same applies to a computer logic gate toy problem. In this problem, there are four input lines, four output lines, and sixteen binary logic gates between. The genotype specifies the type of each gate and the set of wires connecting all these things, while the phenotype is the mapping between input and output gates. Again, there are far more genotypes than phenotypes. However, the observant reader will notice that all mappings between four inputs and four outputs can be produced using only four internal gates; sixteen gates is a factor of four more than needed. But in the case of four gates the set of genotypes is not big enough compared to the set of phenotypes to allow easy evolution. For easy innovation, sixteen gates is enough, but four gates is not.
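The genotype-phenotype redundancy here can be checked by simulation. Below is a down-scaled sketch of the gate model: 2 inputs, 2 outputs, and 4 gates rather than 4/4/16, with an assumed gate alphabet of AND/OR/NAND/NOR. These details are my own simplifications for tractability, not Wagner’s exact setup; random sampling still shows far more distinct genotypes than phenotypes.

```python
import itertools
import random

# Each gate type maps two binary inputs to one binary output.
GATE_TYPES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR":  lambda a, b: 1 - (a | b),
}

N_INPUTS, N_GATES = 2, 4

def random_genotype(rng):
    # Genotype: for each gate, a type plus two input sources drawn from
    # the circuit inputs and earlier gates (keeps the circuit feed-forward).
    geno = []
    for i in range(N_GATES):
        sources = list(range(N_INPUTS + i))
        geno.append((rng.choice(list(GATE_TYPES)),
                     rng.choice(sources), rng.choice(sources)))
    return tuple(geno)

def phenotype(geno):
    # Phenotype: the truth table from the inputs to the last two gates.
    table = []
    for bits in itertools.product([0, 1], repeat=N_INPUTS):
        signals = list(bits)
        for gtype, a, b in geno:
            signals.append(GATE_TYPES[gtype](signals[a], signals[b]))
        table.append(tuple(signals[-2:]))  # outputs: last two gates
    return tuple(table)

rng = random.Random(0)
samples = [random_genotype(rng) for _ in range(20000)]
genotypes = set(samples)
phenotypes = {phenotype(g) for g in samples}
print(len(genotypes), len(phenotypes))  # many genotypes, few phenotypes
```

Even in this tiny version, the sampled genotypes collapse onto a small set of truth tables (at most 256 are possible), so each phenotype is realized by a large crowd of genotypes, with many one-gate-change neutral neighbors.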

If we used a larger space of genotypes within which the number of logic gates could vary, and if the fitness function had a penalty for using more logic gates, then we’d have a problem. No matter where the genotype started, evolution might quickly cut the number of gates down to the minimum needed to implement its current input-output mapping, and then after that too few neutral changes would be possible to make evolution easy. The same problem seems possible in Wagner’s core model of metabolism; if the fitness function has a penalty for the number of enzymes used, evolution might throw away enzymes not needed to produce the current phenotype, after which too few neutral changes might be possible to allow easy evolution.

Wagner seems to suggest a solution: larger, more complex systems are needed for robustness to varying environments:

Based on our current knowledge, the metabolic reaction networks of E. coli and yeast comprise more than 900 chemical reactions. However in a glucose minimal environment, more than 60 percent of these reactions are silent. … Overall, in E. coli, the fraction of reactions that would not reduce bio-mass growth when eliminated exceeds 70 percent. This is … a general property of viable networks that have similar complexity. … As a metabolic generalist, the E. coli metabolic network can synthesize its biomass from more than 80 alternative carbon sources. … All these observations indicate that the large metabolic networks of free-living organisms are much more complex than necessary to sustain life in any one environment. Their complexity arises from their viability in multiple environments. A consequence is that these networks appear highly robust to reaction removal in any one environment, where every metabolic network has multiple natural neighbors. This neutrality, however, is conditional on the environment. (pp.153-154)

I’m not sure this solves the problem, however. In the logic gate toy problem, even if phenotype fitness is given by a weighted average over environments, we’ll still have the same temptation to increase fitness by dropping gates not needed to implement the current best bit mapping. In the case of enzymes for metabolism, fitness given by a weighted average of environments may also promote an insufficient complexity of enzymes. It seems we need a model that can represent the value of holding gate or enzyme complexity in reserve against the possibility of future changes.

I worry that this more realistic model, whatever it may be, may contain a much larger set of phenotypes, so that the set of genotypes is no longer much larger, and so no longer guarantees many neutral changes to genotypes. Perhaps a “near neutrality” will apply, so that many genotype neighbors have only small fitness differences. But it may require a much more complex analysis to show that outcome; mere counting may not be enough. I still find it hard to believe that for realistic organisms, the set of possible phenotypes is much less than the set of genotypes. Though perhaps I could believe that many pairs of genotypes produce the same distribution over phenotypes, as environments vary.

Added 10am: Another way to say this: somehow the parameter that sets how much complexity to keep around has to change a lot slower than do most other parameters encoded in the genome. In this way it could notice the long term evolvability benefits of complexity.
