29 Comments

You write, "Since horizontal transfer probably doesn't work so well in multi-celled organisms".

That's true, but it's by design -- the cells of a multi-celled organism can cooperate only because they "know" that their neighbor cells are genetically identical.

The times are in the 20th century for the most part. We have no clue what genomes bacteria had a billion years ago.

So, 10 billion years ago, we had one-base-pair lifeforms.

This is quite ridiculous. Clearly at some point it's not life any more, and it's not governed by evolution, it's just chemicals in a chemical equilibrium (which implies there will be fairly complicated molecules at low concentrations). You start with quite a bit of 'complexity' already - the molecules arrange themselves into some auto-catalytic form by mere chance, and then you acquire extra complexity very rapidly as the proto-organism is replicating into the sterile goop with no competition.

This post reminded me that it would be fun to see you do one of those "totally conventional views that I hold" posts that some of your colleagues have been doing.

I am not sure it is at all relevant. Genome size varies a lot more than indicated. I understand pine trees, Paris japonica, and lungfish make mammalian genomes look trivial.

Also what the hell is the genome size of "worms"?

That paper has the same author, so it is not an independent source. The reviewers tore it to shreds, and rightly so.

There is no good methodology given for picking the sizes or times of even the 'eukaryote' and 'prokaryote' datapoints. Thousands of such points are available, and they form massive spreads. And which prokaryote and eukaryote are they using? The nearly 1 terabase amoeba? The 12 megabase yeast? Some ferns are known to have GIGANTIC genomes in the tens of gigabases, and that lineage has been around for 350 megayears. More to the point, where do you draw the boundary between a large clade and the small subclade you then include as a small datapoint further to the right? And how do you pick at what time to put the point?

Bacteria are still around and constantly speciating. Why not include those speciation events rather than speciation events within the vertebrates on the right?

Within any of those large-genome clades, genome sizes random-walk up and down depending on the exact history of duplication events, transposon resistances, and DNA repair mechanisms in any given lineage. Within land plants you have genomes ranging over three orders of magnitude in size and gene numbers ranging over a factor of four, off the top of my head. It is similar in the metazoa. Where do you pick your cutoff size?

If you want a more rigorous treatment of genome size and its evolution, I recommend a great book by a person I very nearly worked for: "The Origins of Genome Architecture" by Dr. Michael Lynch of Indiana University. It does the actual math of population genetics and lays out very interesting ideas about how, in the long run, genome size is controlled much more by long-term population size than by organismal complexity. Incredibly common organisms like bacteria experience direct selection towards small genome size, and the ability to select against lots of DNA goes through a few discontinuities as population numbers drop, first with larger eukaryotic cells and then with gargantuan multicellular creatures with piddlingly tiny population sizes. Particulars of DNA replication and DNA repair come into play as well, as do the histories of each lineage: some got virulent transposons or large deletion events and some did not.

The people who wrote these papers you link to, to use a phrase from physics, aren't even wrong and their papers don't deserve much respect.

Correction: The article mentioned in the post does not have reviewers. I was referring to http://www.ncbi.nlm.nih.gov... mentioned below.

Maybe relevant is this: Lithopanspermia in Star Forming Clusters http://arxiv.org/abs/astro-... which argues that panspermia would have been a lot easier at the very beginning of the solar system. If the origin of prokaryote life was really difficult and took a long time, on another planet, then there might be patches here and there of planets seeded with bacteria, because their parent molecular cloud was "infected," while many more planets might have missed out on this head start.

Yup, it's not like the approximate size of the genomes of those species was unknown two years ago. This paper just does not represent the latest knowledge of two years ago, let alone that of today. The "we used all the data that was available at the time" defense does not hold here.

Aside from the methodological criticisms below, this paper is based on some really bad assumptions. Nucleic acid polymers are extremely thermodynamically unfavorable, and you've got to have something "living" to even create the building blocks of current life. We really have no idea what "pre-life" life was like, and certainly no reason to think it had similar evolutionary processes and rates to current life.

Further, current prokaryotes are sometimes rather "minimal" in that they have basically the minimum number of genes for a free-living organism to exist. IMO they're probably actually simplified versions of more complex (but less efficient) ancestors. In any case, there's no good reason to think modern prokaryotes, the product of literally billions of years of selection for efficiency, rapid reproduction, and mutation resistance, are in any way representative of early Earth life.

"does not excuse one from drawing what inferences one can"

I understand this is one of your credos, but it doesn't apply here. The authors of the target work make no claim that they sampled data based on availability, and the criticism is that they must have deliberately (or at least negligently) chosen to use a biased data set.

[Or do you contend that even deliberately biased "science" is better than nothing?]

The fact that with more effort one could collect more data and draw stronger inferences does not excuse one from drawing what inferences one can from what data one has at hand.

Robin, don't be such a bully. Exclusion of known organisms with huge genomes is a valid criticism, and it is not diminished when a commenter here doesn't spend weeks trawling genetic databases and figuring out the data format and hidden assumptions/conventions of academic genetics and this one particular paper (it only takes one flaw to bust a theory). Regression doesn't take different priors into account; it will just treat a lungfish as a fluke/noise to be averaged out, even though we know our data about the lungfish is reliable and that panspermia has a low prior probability (especially from a first-generation solar system).
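To make the point about regression concrete, here is a minimal Python sketch of how a straight-line fit of log genome size against time, extrapolated back to a single base pair, reacts to one extra large-genome datapoint. Every number in it is invented purely for illustration; none is a measured genome size or divergence date, and the function names are my own, not anything from the paper.

```python
# Sketch: sensitivity of an extrapolated "origin of life" date to the choice
# of datapoints. ALL DATA BELOW ARE HYPOTHETICAL, made up for illustration.

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolated_origin(points):
    """Return how long ago (in billions of years) the fitted line reaches
    log10(genome size) = 0, i.e. a one-base-pair genome.

    points: list of (billions of years before present, log10 genome size in bp)
    """
    xs = [-age for age, _ in points]      # time axis with the present at x = 0
    ys = [log_size for _, log_size in points]
    slope, intercept = fit_line(xs, ys)
    # The line hits y = 0 at x = -intercept / slope, so the age is intercept / slope.
    return intercept / slope


# Hypothetical "hand-picked" points: (age in Gyr, log10 genome size).
BASELINE = [(3.5, 5.5), (2.0, 6.5), (1.0, 8.0), (0.5, 8.5), (0.1, 9.5)]

# Same points plus one hypothetical large-genome lineage.
WITH_LARGE_GENOME = BASELINE + [(0.4, 11.0)]

print("origin, baseline points:        ", extrapolated_origin(BASELINE))
print("origin, plus one large genome:  ", extrapolated_origin(WITH_LARGE_GENOME))
```

With only a handful of points, the single added lineage moves the extrapolated origin noticeably, which is why the rule for including or excluding points matters more than the fit itself.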

Obviously not, or you would have just posted the alternative data set.

One should also keep in mind that all life forms that we currently have on Earth have evolved for exactly the same amount of time. Bacteria have not stayed the same for a billion years, nor small animals for half a billion, etc. Yet they have different molecular complexities. That must be for reasons other than time, then.
