How Good 99% Brains?

Software systems are divided into parts, and we have two main ways to measure the fraction of a system that each part represents: lines of code, and resources used. Lines (or bits) of code is a rough measure of the amount of understanding that a part embodies, i.e., how hard it is to create, modify, test, and maintain. For example, a system that is more robust or has a wider range of capacities typically has more lines of code. Resources used include processors, memory, and communication between these items. Resources measure how much it costs to use each part of the system. Systems that do very narrow tasks that are still very hard typically take more resources.

Human brains can be seen as software systems composed of many parts. Each brain occupies a spatial volume, and we can measure the fraction of each brain part via the volume it takes up. People sometimes talk about measuring our understanding of the brain in terms of the fraction of brain volume that is occupied by systems we understand. For example, if we understand parts that take up a big fraction of brain volume, some are tempted to say we are a big fraction of the way toward understanding the brain.

However, using the software analogy, brain volume seems usually to correspond more closely to resources used than to lines of code. For example, different brain regions seem to show roughly similar levels of activity per unit volume, which isn’t what we’d expect if volume corresponded more to lines of code than to resources used.

Consider two ways that we might shrink a software system: we might cut 1% of the lines of code, or 1% of the resources used. If we cut 1% of the resources used via cutting the lines of code that use the fewest resources, we will likely severely limit the range of abilities of a broadly capable system. On the other hand, if we cut the most modular 1% of the lines of code, that system’s effectiveness and range of abilities will probably not fall by remotely as much.
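The asymmetry is easy to see in a toy model. The numbers below are invented purely for illustration, not taken from any real system: one small module (an inner loop) consumes most of the resources, while the bulk of the lines live in low-resource modules that supply the system’s breadth.

```python
# Hypothetical module table for a toy "broadly capable system":
# (name, lines of code, share of resources used, capability provided)
modules = [
    ("inner_loop",  200, 0.90, "core task"),
    ("io",          800, 0.05, "input/output"),
    ("planner",    2000, 0.03, "planning"),
    ("edge_cases", 7000, 0.02, "rare inputs"),
]
total_lines = sum(lines for _, lines, _, _ in modules)  # 10000

# Cutting resources by dropping the module that uses the fewest of them
# ("edge_cases") saves only 2% of resources, yet discards 70% of the
# lines and an entire capability.
cheapest = min(modules, key=lambda m: m[2])
print(f"Cut {cheapest[2]:.0%} of resources, lose "
      f"{cheapest[1] / total_lines:.0%} of the lines ({cheapest[3]}).")

# Cutting 1% of the lines instead (say, 100 of the most modular lines)
# leaves every module, and every capability, in place.
```

The specific proportions are made up, but the shape matches the post’s claim: resource share and line share need not track each other at all.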

So there can be a huge variation in the effective lines of code corresponding to each brain region, and the easiest parts to understand are probably those with the fewest lines of code. So understanding the quarter of brain volume that is easiest to understand might correspond to understanding only 1% or less of lines of code. And continuing along that path we might understand 99% of brain volume and still be a very long way from being able to create a system that is as productive or useful as a full human brain.

This is why I’m not very optimistic about creating human level AI before brain emulations. Yes, when we have nearly the ability to emulate a whole brain, we will have better data and simulations to help us understand brain parts. But the more brain parts there are to understand, the harder it will be to understand them all before brain emulation is feasible.

Those who expect AI-before-emulations tend to think that there just aren’t that many brain parts, i.e., that the brain doesn’t really embody very many lines of code. Even though the range of capacities of a human brain, even a baby brain, seems large compared to most known software systems, these people think that this analogy is misleading. They guess that in fact there is a concise powerful theory of intelligence that will allow huge performance gains once we understand it. In contrast, I see the analogy to familiar software as more relevant; the vast capacity of human brains suggests they embody the equivalent of a great many lines of code. Content matters more than architecture.

  • I suspect you are making an implied argument here, but am not sure. Maybe you could clarify.

    As you know, there are projects to model simpler organisms than humans, such as roundworms, which even turn up surprise neurons every now and then.

    So one approach to brain emulation is to start with simple organisms and work your way up to more complex brains. Now the question is how far into complexity (worms, insects, fish, mammals, primates) you have to go before you capture the essential brain functions needed for human-level intelligence. This corresponds to the critical 1% of lines of code you talk about in your post: even if we come to know the easiest 25% of the brain, we may be only a very small way toward understanding it. And of course the prime suspect here would be the neocortex, or some complex aspects of how it works with the rest of the brain.

    So does your argument imply that there is some discontinuous break in modeling simple to complex organisms where if we don’t cross that break we can’t get at the essentials of what makes human brains work so well?

    My priors are that mouse brains would be good enough to figure out how human brains work their magic. Likely frogs, or even simpler, perhaps even insects. Hence I believe that animal ems will lead to breakthroughs in AGI, which will happen prior to human ems being practical. But from this post I suspect you are also saying that the threshold for getting to really good AGI is much above that level, and that human brains have evolved tricks which are built on some other complexity that is not in simpler animals. Evolution is a tinkerer, so my priors are that humans have just a turbo version built on the same tricks, not something totally new. But where would that threshold lie? Which animal em would need to be completed and played with before you think we could tease apart how AGI could be created? Humans? Primates? Birds? Mice? Frogs? Insects?

    More to the point, why wouldn’t modeling animal ems lead to AGI breakthroughs (figuring out that secret 1% of the software code in human brains) before we can do human ems?

    • I expect animal brains also require the equivalent of a great many lines of code. There probably are simpler animals, but I don’t see that it is very valuable to emulate those, relative to humans.

      • OK, thanks. I believe you are then saying cognition in human brains has some fundamental difference from animals, not just an emergent difference resulting from the same basic process (lines of code/algorithm) running at greater scale. Hence you say it’s not very valuable to emulate simpler animals. If this is NOT true, then taking a mouse and giving its em an artificially 100x-larger neocortex (assuming enough trial and error to avoid killing it or making it insane) could potentially make it human-smart or smarter. And tinkering might reveal underlying processes for a straight-up AGI. Which to me seems more likely. But of course there is no way to be certain right now. Either could be true. Corvids are pretty smart without a neocortex, which increases the odds that the fundamental process for cognition is latent in very basic vertebrate brain function. But who knows.

      • The key difference between humans and other animals is sufficient support for cultural evolution. This may end up being only a small fraction of the total lines of code, but it is a crucial fraction that makes all the difference.

      • Ah! This gets to the root of it. Capacity for cultural evolution. Thanks.

        I just finished reading The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter by Joseph Henrich (from 2015). That is his thesis on what kicked off the evolutionary ratchet for hominins. One of my favorite reads this year. Though it’s very hedgehog, with a single explanatory theme driving the entire book (that’s fine, but it was sometimes overdone).

        Side note: not Henrich’s fault of course, but a really awful book cover, especially the font (Monotype Grotesque?). In contrast I really like the cover/subtitle for your new book. First-rate artwork and subtitle! Looking forward to it.

  • Tim Brownawell

    There’s also the possibility to run the exact same code on smaller hardware with less cores or memory.

  • free_agent

    Of course, there’s a bound on the amount of information that the brain contains per se: the number of genes that build the brain. There’s only a certain number of lines of code.

    In regard to resources, I’ve read that IQs are fairly strongly correlated with nerve conduction speed. So the number of lines of code might not be the critical factor in human intelligence. Certainly the human brain is infamous for high resource consumption, to the point that evolutionary biologists speculate that the human diet and digestive tract are optimized specifically to deliver energy to keep the brain going.

    • Yes, the number of genes bounds the lines of code.

      • truth_machine

        Both of you are completely clueless. There are one-line programs that can produce infinite outputs, and those outputs can in turn be lines of code.

      • The amount of information (which I am distinguishing from “output”, because an infinite repeating output contains limited information) that can be generated is limited by the size of the code. That can grow very quickly with size (see Busy Beaver numbers), but it’s still bounded.
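        A quick way to see the bounded-information point (a sketch added here, not part of the original exchange): a tiny “program”, here just a short string and a repeat count, can emit a huge output, but compression recovers the redundancy, leaving close to the information content of the program itself.

```python
import zlib

# A few characters of "code" (a short string plus a repeat count)
# generate two million bytes of output...
output = b"ab" * 1_000_000
compressed = zlib.compress(output)

# ...but nearly all of it is redundancy: the compressed size stays
# within a few kilobytes, close to the size of the generating program.
print(len(output), len(compressed))
assert len(compressed) < 10_000
```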

      • truth_machine

        More cluelessness. First, repeated output has no more information than a single instance … except for the number of repetitions, and with no bound on that number, there is no bound on the amount of extra information … the number of repetitions could be a Gödel numbering of any of an infinite number of messages. So even on that score you’re wrong. Second, the amount of information in the code only limits the amount of information in the output if the code is *self-contained*. But code has *inputs* that are independent of it. A small piece of code that simply copies its input to its output will produce as much information in the output as there is in the input … duh. That’s why it is so absurdly inept for ignorant dolts like Hanson to blather about “the number of genes” limiting the information in the brain. Brains *learn* … they incorporate information from external sources. Sheesh.

      • The amount of information (which I am distinguishing from “output”, because an infinite repeating output contains limited information) that can be generated is limited by the size of the code.

        Isn’t this contradicted by natural alphabetical languages like English, which can convey unlimited information with a finite code?

      • truth_machine

        I don’t know about you, but my English is not infinitely repeating.

      • Programs are ultimately written in binary, which is only two characters, but as Quine noted is sufficient to express anything that can be written in English. However, in order to express all those things, one would need a certain number of those two characters. Are genes like characters or sentences (which are used in lines of English pseudo-code)? I’d say the base-pairs used to write genes are more analogous to characters. There are different conceptions of what a “gene” is, Dawkins’ is an information-theoretic one closer to a sentence or line of code.

      • Would someone spell out why this is the case?

      • David Condon

        1st law of thermodynamics

        Everything has to have at some point come from outside the person. A person cannot create or destroy anything because that would break thermodynamics.

        People are by necessity a product of their initial starting point (their genes) and their environmental influences as they develop (the external stimuli); the brain then must be a product of those two things. If we specify the environmental influences as reinforcement learning algorithms occurring while the program is running (while the person is learning from their environment), then that leaves just genes which would correspond to the lines of code; the initial starting point which created the organism.

        Now, it may be very difficult to solve the problem in that way. We may need to specify a much more elaborate system than genes use if we don’t fully understand how those environmental influences produce the product from the initial starting point, but a human-level intelligence program that has been perfectly optimized to take up as few lines of code as possible must be less than the size of the human genome.

    • truth_machine

      “there’s a bound on the amount of information that the brain contains per se: the number of genes that build the brain. ”

      Nonsense. The brain contains information gathered through the senses. The number of genes is irrelevant … there’s a universal Turing machine with only two states and three symbols.

  • Brian Slesinsky

    The problem with these crude measures is that they don’t take into account differing incentives for improving efficiency. There is strong evolutionary pressure on birds to reduce weight. Not so for whales. Can we compare brain sizes between birds and whales? Probably not. I would expect that a crow or parrot’s brain is much more efficient than a whale’s brain.

    Similarly in software. Researchers often write poor-quality code quickly, since it will be abandoned once they publish their paper. A developer targeting mobile phones in Africa has strong incentives to reduce network bandwidth and unnecessary CPU usage. Developers targeting iPhones in the US optimize for user experience by including high-resolution graphics files and “unnecessary” animations. Web page developers optimize for download speed much more than developers of mobile phone apps that are downloaded in advance. Teams targeting desktops or server-side applications have little incentive to reduce code size – often code is kept around just because nobody has gotten around to deleting it yet, or it’s too time-consuming or risky to figure out whether it’s still used. Sometimes the only real limit is that developer productivity goes down when maintaining large, difficult-to-understand systems.

    It’s even worse in business software where larger software packages often win sales by having more “checkbox features” that help salespeople win contracts. Sometimes these features are very low quality code written in a hurry.

    We can only compare resource usage between similar systems designed under similar incentives and constraints. Comparing humans with AI software is particularly suspect.

    • truth_machine

      Hanson seems to have at best a rudimentary understanding of software … about the level of being able to count lines of code. And his understanding of the brain doesn’t seem to be any better.

  • You are probably right, that our technology will be able to emulate a brain, before it can completely understand a human brain’s algorithms. But fully understanding a human brain is not the only path to AGI. Airplanes are not copies of birds, Deep Blue doesn’t play chess the same way Kasparov did, and AlphaGo also plays differently than Lee Sedol. Perhaps there is a path to AGI-level capabilities that sidesteps fully understanding actual human brains.

    • I agree there are other possible paths. Even so, the chance of this particular path affects the chance of the overall move.

  • Lord

    This seems to assume emulations are possible without understanding those lines of code. While I agree volume/resources are of lesser importance (lesser animals are quite intelligent, after all, and the difference between them and us is smaller than we flatter ourselves), I see most of emulation as understanding the fine structure of memory storage, retrieval, organization, and manipulation, which do involve vast amounts of code; emulation won’t be possible without that. I am not even convinced these are actually different strategies.

    • No, it is conceivable that you could emulate merely by reproducing the functionality of an individual neuron, then mapping all the specific neurons in a specific human brain, then emulating the entire structure — without having any understanding at all about how it works (beyond the individual neuron).

      • Lord

        That is one immense ‘merely’, and a larger task than all the rest put together. The question is whether we will ever be able to create true intelligence without understanding that. I think not.

      • You think understanding how a single neuron works is much more difficult than understanding the software architecture of the 100 billion neurons that make up general intelligence in whole human brains?

      • Lord

        To a large extent we won’t be able to understand how a neuron works without understanding how it is interacting with all the others, at least in a simplified general way as we will find neurons adapted to different uses, and any attempt to replace the neuron with a set of inputs and outputs would just be the equivalent of assuming a homunculus.

      • truth_machine

        What you wrote was ” merely by reproducing the functionality of an individual neuron, then mapping all the specific neurons in a specific human brain, then emulating the entire structure ” — that’s not just understanding how a single neuron works.

      • truth_machine

        That something is conceivable is of no relevance.

        Go build your emulation. Start it up. It just sits there, or produces wrong results in any of a million different ways … now what?

        Daniel Dennett’s “Intentional Stance Theory” suggests three levels at which to understand something: physical, design, and intentional. All you have is the physical level, which is not enough.

  • Sigivald

    Consider two ways that we might shrink a software system: we might cut 1% of the lines of code, or 1% of the resources used. If we cut 1% of the resources used via cutting the lines of code that use the fewest resources, we will likely severely limit the range of abilities of a broadly capable system. On the other hand, if we cut the most modular 1% of the lines of code, that system’s effectiveness and range of abilities will probably not fall by remotely as much.

    … say what?

    Why would one assume that the lines of code that use the least resources are the ones that would severely limit the abilities of the system?

    (Or was there some translation error, or miscommunication, or misreading on my part?)

    From the programmer’s perspective, it’s the code that runs all the time that does all the work, definitionally; the most re-used code is the stuff used everywhere, in general software terms.

    (To shrink the “entire system”, “lines of code” is a terrible metric; at the very least use cyclomatic complexity to get a little closer.

    Resource usage isn’t necessarily terrible, but …

    I can see the same possible awkwardness with consciousness emulation with random software – we might well use most of our “resources” on stuff that isn’t the core; whether it’s “autonomic processes keeping the body working” or “keeping the UI responsive with pretty animations”.

    The interesting bits (consciousness, business/scientific logic) may or may not be the complex or resource-intensive ones … and you can’t tell without understanding the code in the first place.)

    • truth_machine

      “Why would one assume that the lines of code that use the least resources are the ones that would severely limit the abilities of the system?”

      Imagine removing the math library.

    • If a system spends 90% of its time in some inner loop, then cutting the least used 10% of the system takes away the entire rest of the system except for that inner loop.

  • truth_machine

    This is wrong in so many different ways … AI does not have to do exactly what brains do, nor do it how brains do it. For the brain, architecture is far more important than content, other than the content of DNA that encodes the architecture. Content comes from experience, encoding outside signals into the fabric of the brain. In computers, architecture is the basic framework into which both data content and software content are loaded. For most data, content is almost everything, but for other data, the relationships are essential. For software, architecture and content are closely intertwined … some lines of code (declarative) define structure, others (imperative) perform operations on the structure.

    “the vast capacity of human brains suggests they embody the equivalent of a great many lines of code”

    That utterly confused analogy is a startling example of “not even wrong”. Here’s one to consider instead: DNA as a macro or code generating system. Counting the number of lines of output of generated code is something that a bean counter with no understanding of the system architecture might do.

  • David Condon

    The human genome is only about 3 giga base pairs, which at two bits per base pair is roughly 750 megabytes of code. At least some of it is likely unrelated to human reasoning, so the relevant size is smaller still. That’s a lot of code, but not exactly unfathomable. Modern operating systems have around a hundred million lines of code. The brain has a much larger capacity, but the rest of it is conditioning; learning through practice. It’s not clear that the first AI program will be nearly as efficient as human DNA, but it doesn’t have to be. Computer software is already in the ballpark scale-wise.
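    For reference, the back-of-envelope conversion (a sketch added here; two bits per base pair is just the information content of choosing among A, C, G, T):

```python
# 3 billion base pairs, 2 bits each (four possible bases), 8 bits per byte
base_pairs = 3_000_000_000
megabytes = base_pairs * 2 / 8 / 1_000_000
print(megabytes)  # 750.0
```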

    • Yes, in the ballpark. But my point is that for familiar large software systems it isn’t remotely enough to know the code where 99% of the runtime is spent.

      • David Condon

        Well, I do think you’re right that we’ll have a brain emulation before we understand the human brain (although depending on where technological growth is at when that happens, the time between the first milestone and the second milestone might be very short). But I’d also say we know very little about the human brain at this point, which increases the likelihood that another avenue will produce human-like intelligence.

    • Gunnar

      But the genetic code has a different quality than human code that is written for e.g. maintenance and simplicity. This is well illustrated here:

  • Joshua Brulé

    I find this reasoning pretty convincing, but I can still think of a lot of cases where the solution evolution produced is much more difficult for us to copy than a solution humans invented. Flapping wings, instead of fixed wing aircraft. Photosynthesis instead of solar panels. Fermentation or aerobic respiration and ATP instead of an internal combustion engine.

    Maybe I’m cherry-picking my examples – I would think that I’m more likely to remember concise, powerful theories because they’re so useful. For the problem at hand, I don’t know whether to expect a concise, powerful theory of intelligence or not.


  • efalken

    There are cases where hydrocephalus reduces people’s brain size by 75% and their IQs are still ‘normal’.


  • Elliot Olds

    So the key insight from the triumph of machine learning over “Good Old Fashioned AI” is that intelligence arises from data, not code (at least, for the kind of intelligence we can create now). In some sense the structure of the machine-learning models that arise from training on lots of data is “code”, but that complicated code falls out of a pretty general training method combined with data.

    The complicated content in our brains is the result of a pretty dumb process (natural selection) involving lots and lots of data and some simple criteria for when one organism is better than another and how organisms can mutate.

    The main advantage of evolution over our machine learning techniques is that it has had billions of years over which to operate, combined with an “environment” (reality) which is very rich and detailed (hence dealing with it benefits from lots of content).

    If we could quickly simulate detailed environments on computers, we could reproduce this process. Computers can simulate simple environments and generate lots of data from them very quickly. As environments get more complex simulating them gets more expensive. So a key question is: what is the level of environment detail/structure that offers the best [ease/quickness of simulation] / [rich enough to yield capable intelligence] tradeoff. It could be far below the level of detail of reality.

    For instance, it seems clear that if somehow humans had evolved in an environment where Newtonian physics was true and quantum / relativistic effects didn’t have to be calculated, our resulting intuitions about physics would not be significantly different. So we can cut out a lot of computational effort without losing practical benefit by not simulating quantum effects in the environments we use to create AIs.

    You’ve probably heard of DeepMind creating an AI that can play any of ~50 Atari games extremely well, using a single general learning algorithm. Here the environments are simple so lots of data can be generated quickly, but the resulting skills don’t carry over into other more complex environments. What happens as the game environments get more complex and more similar to our world, or as we specifically construct “games” to reward more general reasoning skills?

    IMO, training AIs on data from rapid simulations of increasingly complex environments is a pretty plausible road to general AI which doesn’t depend on humans needing to manually feed complex content to the AI (via writing code) or understand brain architecture. This approach works regardless of whether you or Yudkowsky are right about the importance of content vs. architecture.
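    As one minimal, hedged illustration of the “general learning algorithm plus simulated environment” recipe (a toy tabular Q-learner on an invented corridor world, nothing like DeepMind’s actual networks):

```python
import random

# Toy environment: a corridor of 6 states; the agent starts at state 0
# and receives a reward of 1 on reaching state 5. Actions move left (-1)
# or right (+1), clamped to the corridor ends.
N = 6
ACTIONS = (-1, +1)
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1
random.seed(0)

for episode in range(500):
    s = 0
    while s != N - 1:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[(s, a)])
        s2 = min(max(s + a, 0), N - 1)
        r = 1.0 if s2 == N - 1 else 0.0
        # standard tabular Q-learning update
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned policy walks right from every non-terminal state,
# extracted purely from simulated-environment data.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N - 1)}
assert all(a == +1 for a in policy.values())
```

    The environment here is trivially cheap to simulate, which is the point above: the richer the simulated environment, the more it costs to generate each unit of training data.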

    • zarzuelazen

      Yup, the end (true AGI) is quite near; anyone who thinks otherwise just hasn’t done their homework.

      As I stated in previous thread:

      “Full-reflective reasoning is entirely captured in a hierarchy consisting of a mere 3-levels of abstraction. AGI will indeed just be a generalized version of the 3-level architecture already on display in AlphaGo.

      Level 1: Evaluation network – evaluates ‘position’ of the world

      Level 2: Policy network – selects ‘moves’ (actions) in the world

      Level 3: Planning – world model (simulation) of possibilities”

      The key insights are that infinite recursion is always at most 3-levels deep, and that you can combine model-free methods (levels 1 and 2, which are machine learning pattern recognition) with high-level symbolic reasoning (level 3, which consists of intelligent simulations of the world based on combinations of concepts) to achieve full-blown AGI.

      I strongly urge skeptics to read the paper I link to below, which provides a long detailed summary of the state-of-the-art of cognitive science and offers clear approaches to the final missing links for AGI (high-level modelling and concepts).

      AGI cannot possibly take longer than 20 years, and will likely be done within 5-10 years. I repeat: anyone who thinks otherwise just hasn’t done their homework. Please carefully read the paper below and follow up all the references therein.

      Anyone who wants to have a shot at AGI needs to stop sitting around doing philosophy and start coding now.


      Building Machines That Learn and Think Like Humans

      • jhertzli
      • zarzuelazen

        Nope, any recursion of more than 3 levels can always be collapsed to 3 levels. So the extra levels are redundant.
        This is because a full representational system can always be obtained on the 3rd level (any extra levels of recursion are not expressing anything that cannot be entirely captured by an equivalent system of only 3 levels).

    • Joe

      Two problems I can see with this argument:

      – It seems like it should apply to everything we might want to build, not just brains. Most research should be moving towards setting up an initial simulated environment and letting a genetic algorithm find the best design to solve the problem we’re wanting to address. I’m sure you can point me to some particular cases where this kind of approach has been introduced and used successfully – I can think of some examples myself – but for this to be the new dominant paradigm, rather than just another tool in our toolbox, will require more evidence than that. Maybe you think that we will in fact see this taking over all design work, but just not for a while yet?

      – Paging Robin on the handful of discrete growth modes of humans and prehuman life: it seems like an industrial society is vastly more powerful than evolution. And this seems to hold up to simple observation too, when you look at how damn much we’ve accomplished over the last few hundred years compared to what evolution took billions of years to produce. While I do think there are probably benefits to the ‘evolutionary approach’, i.e. progressing via many small advancements which must each be justified in their own right, I’m doubtful that using it to the exclusion of everything else is going to turn out to be the most effective approach to building systems when historically the opposite seems to have been true.

      • Not sure what made you think I’m big on an evolutionary approach relative to an industrial one.

      • Joe

        Nothing did, I just used the wrong word. Oops.

      • Elliot

        It isn’t worth it to use this approach for every type of problem because there is lots of overhead involved. In some cases it will be worth it, in many cases it won’t be.

        Most computer systems people build are very simple in comparison to brains. If you’re writing a program to calculate the area of a geometrical shape, it’s overkill to try to train a neural net to understand the general concepts of shape and area, since you already understand them (thanks to evolution and to all the learning your brain has absorbed since you came into existence) and the knowledge is already represented very compactly in a way that you can transfer to a computer in a few minutes.

        Think of a brain or an AI system as a sort of cache that represents some fraction of the intelligence embedded in lots of raw data. Once this ‘intelligence cache’ exists, using it for problems similar to those it’s good at solving can be way more efficient than creating a new cache. When you start encountering sufficiently hard or different problems, a new cache may be warranted.

        I think AI will eventually take over all design work (either ems, as Robin writes about, or machine-learned systems, as I think is more likely), but we won’t create a new ML system for each design problem, just as animals don’t generate new brains for each problem they face.

        I agree, our society is way more powerful than evolution. I am not saying evolution is the most efficient way of creating intelligent systems. We can use some principles from evolution and combine them with our own techniques to create better systems. Deep learning isn’t just virtual evolution. What they share is the creation of a very complex system by starting with a simple learning architecture, feeding it lots of raw data, and allowing it to adapt to capture the intelligence embedded in this data.

        Maybe there’s a smarter “industrial” way to create general AI, but I don’t see it, and I think the path I described in my original post will arrive sooner than ems.

      • Elliot

        My earlier reply to this seems to have gotten eaten. This is a test comment; if it goes through I’ll delete this and try again.
