How Does Evolution Escape Local Maxima?
I’ve spent most of my intellectual life as a theorist, but alas it has been a while since I’ve taken the time to learn a new powerful math-based theory. But in the last few days I’ve enjoyed studying Andreas Wagner’s theories of evolutionary innovation and robustness. While Wagner has some well-publicized and reviewed books, such as Arrival of the Fittest (2014) and Robustness and Evolvability in Living Systems (2005), the best description of his key results seems to be found in a largely ignored 2011 book, The Origins of Evolutionary Innovations, which is based on many academic journal articles.
In one standard conception, evolution does hill-climbing within a network of genotypes (e.g., DNA sequences), rising according to a “fitness” value associated with the phenotype (e.g., tooth length) that results from each genotype. In this conception, a big problem is local maxima: hill-climbing stops once all the neighbors of a genotype have a lower fitness value. There isn’t a way to get to a higher peak if one first must travel through a lower valley to reach it. Maybe random noise could let the process slip through a narrow shallow valley, but what about valleys that are wide and deep? (This is a familiar problem in computer-based optimization search.)
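To make the local-maximum problem concrete, here is a minimal Python sketch. The one-dimensional landscape, its fitness values, and the name hill_climb are all made up for illustration; real genotype spaces have vastly more neighbors per genotype.

```python
def hill_climb(fitness, start):
    """Greedy hill-climbing over integer genotypes 0..len(fitness)-1.

    The neighbors of genotype g are g-1 and g+1. The climb stops as soon
    as no neighbor has strictly higher fitness than the current genotype.
    """
    g = start
    while True:
        neighbors = [n for n in (g - 1, g + 1) if 0 <= n < len(fitness)]
        best = max(neighbors, key=lambda n: fitness[n])
        if fitness[best] <= fitness[g]:
            return g
        g = best


# Hypothetical landscape: a low peak at index 2, a valley, a higher peak at index 7.
FITNESS = [1, 2, 3, 2, 1, 0, 4, 9, 4]

stuck = hill_climb(FITNESS, start=0)   # climbs the low peak and stops there
lucky = hill_climb(FITNESS, start=5)   # starts in the valley next to the high peak
print(stuck, lucky)
```

Starting at index 0, the climb halts on the low peak because every step toward the higher peak first loses fitness.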
Wagner’s core model looks at the relation between genotypes and phenotypes for metabolism in an organism like E. coli. In this context, Wagner defines a genotype as the set of chemical reactions which the enzymes of an organism can catalyze, and he defines a phenotype as the set of carbon-source molecules from which an organism could create all the other molecules it needs, assuming that this source was its only place to get carbon (but allowing many sources of other needed molecules). Wagner defines the neighbors of a genotype as those that differ by just one reaction.
There are of course far more types of reactions between molecules than there are types of molecules. So using Wagner’s definitions, the set of genotypes is vastly larger than the set of phenotypes. Thus a great many genotypes result in exactly the same phenotype, and in fact each genotype has many neighboring genotypes with that same exact phenotype. And if we lump all the connected genotypes that have the same phenotype together into a unit (a unit Wagner calls a “genotype network”), and then look at the network of one-neighbor connections between such units, we will find that this network is highly connected.
That is, if one presumes that evolution (using a large population of variants) finds it easy to make “neutral” moves between genotypes with exactly the same phenotype, and hence the same fitness, then large networks connecting genotypes with the same phenotype imply that it only takes a few non-neutral moves between neighbors to get to most other phenotypes. There are no wide deep valleys to cross. Evolution can search large spaces of big possible changes, and doesn’t have a problem finding innovations with big differences.
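A toy version of this argument can be sketched in Python. The six "reactions", the two carbon sources A and B, and the pathway rules below are invented for illustration and are not Wagner's metabolic model; the sketch only shows how a genotype network is computed and why it can border phenotypes that no single genotype in it borders directly.

```python
from collections import deque


def phenotype(g):
    """Toy map: which of two hypothetical carbon sources (A, B) are usable."""
    uses_a = (g[0] and g[1]) or g[2]   # two alternative pathways for A
    uses_b = g[3] or (g[4] and g[5])   # two alternative pathways for B
    return (bool(uses_a), bool(uses_b))


def neighbors(g):
    """Genotypes that differ from g by exactly one reaction."""
    for i in range(len(g)):
        yield g[:i] + (1 - g[i],) + g[i + 1:]


def genotype_network(start):
    """All genotypes reachable from start via phenotype-preserving steps."""
    target = phenotype(start)
    seen = {start}
    queue = deque([start])
    while queue:
        g = queue.popleft()
        for n in neighbors(g):
            if n not in seen and phenotype(n) == target:
                seen.add(n)
                queue.append(n)
    return seen


def novel_phenotypes(genotypes, own):
    """Phenotypes other than `own` one mutation away from any given genotype."""
    return {phenotype(n) for g in genotypes for n in neighbors(g)} - {own}


start = (1, 1, 1, 1, 1, 1)            # every reaction present
own = phenotype(start)
network = genotype_network(start)
direct = novel_phenotypes([start], own)        # reachable in one step from start
via_network = novel_phenotypes(network, own)   # reachable after neutral drift
print(len(network), direct, via_network)
```

In this toy the starting genotype has no neighbor with a new phenotype, yet its genotype network of 25 genotypes is one mutation away from two other phenotypes.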
Wagner argues that there are also far more genotypes than phenotypes for two other cases: the evolution of DNA sequences that set the regulatory interactions among regulatory proteins, and for the sequences of ribonucleotides or amino acids that determine the structure and chemical activity of molecules.
Wagner also shows that the same applies to a computer logic gate toy problem. In this problem, there are four input lines, four output lines, and sixteen binary logic gates between. The genotype specifies the type of each gate and the set of wires connecting all these things, while the phenotype is the mapping between input and output lines. Again, there are far more genotypes than phenotypes. However, the observant reader will notice that all mappings between four inputs and four outputs can be produced using only four internal gates; sixteen gates is a factor of four more than needed. But in the case of four gates the set of genotypes is not big enough compared to the set of phenotypes to allow easy evolution. For easy innovation, sixteen gates is enough, but four gates is not.
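The counting behind this comparison can be sketched roughly in Python. The wiring encoding here is my own assumption, not Wagner's: each gate may be any of the 16 two-input Boolean functions and draws its two inputs from the input lines or from earlier gates, and each output line is wired to any input line or gate. Different encodings give different totals, so treat the numbers as orders of magnitude only.

```python
from math import prod

N_INPUTS = 4
N_OUTPUTS = 4
N_GATE_TYPES = 16  # assumption: any two-input Boolean function is allowed


def count_genotypes(n_gates):
    """Count feed-forward wirings under the assumed encoding.

    Gate i picks a type and two sources among the inputs and earlier gates;
    each output line then picks one source among the inputs and all gates.
    """
    gate_choices = prod(
        N_GATE_TYPES * (N_INPUTS + i) ** 2 for i in range(n_gates)
    )
    output_choices = (N_INPUTS + n_gates) ** N_OUTPUTS
    return gate_choices * output_choices


# A phenotype maps each of the 2**4 input patterns to one of 2**4 output patterns.
MAX_PHENOTYPES = (2 ** N_OUTPUTS) ** (2 ** N_INPUTS)  # 16**16 == 2**64

for n in (4, 16):
    ratio = count_genotypes(n) / MAX_PHENOTYPES
    print(f"{n:2d} gates: genotypes / conceivable mappings = {ratio:.3g}")
```

Under these assumptions the sixteen-gate genotype space is far larger than the number of conceivable input-output mappings, while the four-gate space is far smaller.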
If we used a larger space of genotypes within which the number of logic gates could vary, and if the fitness function had a penalty for using more logic gates, then we’d have a problem. No matter where the genotype started, evolution might quickly cut the number of gates down to the minimum needed to implement its current input-output mapping, and after that too few neutral changes would be possible to make evolution easy. The same problem seems possible in Wagner’s core model of metabolism; if the fitness function has a penalty for the number of enzymes used, evolution might throw away enzymes not needed to produce the current phenotype, after which too few neutral changes might be possible to allow easy evolution.
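Here is a small Python sketch of that worry, using an invented six-reaction toy rather than Wagner's model. The pathway rules, the 100-point payoff per usable carbon source, and the per-reaction cost are all hypothetical; the point is only to show what a complexity penalty does to the supply of non-harmful moves.

```python
def phenotype(g):
    """Toy map: which of two hypothetical carbon sources (A, B) are usable."""
    uses_a = (g[0] and g[1]) or g[2]
    uses_b = g[3] or (g[4] and g[5])
    return (bool(uses_a), bool(uses_b))


def fitness(g, cost=1):
    """100 points per usable source, minus `cost` per reaction carried."""
    return 100 * sum(phenotype(g)) - cost * sum(g)


def neighbors(g):
    """Genotypes that differ from g by exactly one reaction."""
    return [g[:i] + (1 - g[i],) + g[i + 1:] for i in range(len(g))]


def climb(g, cost=1):
    """Greedy hill-climbing on fitness; stops when no neighbor is strictly better."""
    while True:
        best = max(neighbors(g), key=lambda n: fitness(n, cost))
        if fitness(best, cost) <= fitness(g, cost):
            return g
        g = best


def non_deleterious(g, cost=1):
    """Neighbors whose fitness is at least that of g."""
    return [n for n in neighbors(g) if fitness(n, cost) >= fitness(g, cost)]


start = (1, 1, 1, 1, 1, 1)   # redundant genotype: every reaction present
pruned = climb(start)        # what a per-reaction penalty leaves behind

print(len(non_deleterious(start, cost=0)))  # free moves when complexity is free
print(pruned, non_deleterious(pruned))      # free moves once the penalty has pruned
```

With no penalty, all six one-step changes from the redundant genotype are neutral. With the penalty, the climb sheds reactions until every neighbor is strictly worse, and in this run it even stops at three reactions although a two-reaction genotype with the same phenotype exists.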
Wagner seems to suggest a solution: larger, more complex systems are needed for robustness to varying environments.
Based on our current knowledge, the metabolic reaction networks of E. coli and yeast comprise more than 900 chemical reactions. However in a glucose minimal environment, more than 60 percent of these reactions are silent. … Overall, in E. coli, the fraction of reactions that would not reduce bio-mass growth when eliminated exceeds 70 percent. This is … a general property of viable networks that have similar complexity. … As a metabolic generalist, the E. coli metabolic network can synthesize its biomass from more than 80 alternative carbon sources. … All these observations indicate that the large metabolic networks of free-living organisms are much more complex than necessary to sustain life in any one environment. Their complexity arises from their viability in multiple environments. A consequence is that these networks appear highly robust to reaction removal in any one environment, where every metabolic networks has multiple natural neighbors. This neutrality, however, is conditional on the environment. (pp.153-154)
I’m not sure this solves the problem, however. In the logic gate toy problem, even if phenotype fitness is given by a weighted average over environments, we’ll still have the same temptation to increase fitness by dropping gates not needed to implement the current best bit mapping. In the case of enzymes for metabolism, fitness given by a weighted average of environments may also promote an insufficient complexity of enzymes. It seems we need a model that can represent the value of holding gate or enzyme complexity in reserve against the possibility of future changes.
I worry that this more realistic model, whatever it may be, may contain a much larger set of phenotypes, so that the set of genotypes is no longer much larger, and so no longer guarantees many neutral changes to genotypes. Perhaps a “near neutrality” will apply, so that many genotype neighbors have only small fitness differences. But it may require a much more complex analysis to show that outcome; mere counting may not be enough. I still find it hard to believe that for realistic organisms, the set of possible phenotypes is much smaller than the set of genotypes. Though perhaps I could believe that many pairs of genotypes produce the same distribution over phenotypes, as environments vary.
Added 10am: Another way to say this: somehow the parameter that sets how much complexity to keep around has to change a lot slower than do most other parameters encoded in the genome. In this way it could notice the long term evolvability benefits of complexity.