Gene families and evolution

QuarkHead

The following thoughts are probably not original, although I have never seen them put down in this way.

It is a fact that, in general, more complex organisms have more DNA than simpler ones. It is also a fact that more complex forms have more genes than simpler ones. Let's assume that these two facts are connected, as seems reasonable.
Let us also assume that there is some sort of connection between complexity and the numbers of different genes. Again, that seems fair.

Finally let us accept that complex forms have evolved from simpler ones. Not controversial, I hope.

Now let us ask what is the origin of the additional DNA that results in additional genes and increased complexity. Let's restrict ourselves to nuclear DNA. The additional DNA can only be the result of some sort of amplification process during the course of evolution.
Let us assume that this amplification applied equally across the whole genome. The result is, of course, that in the early stages of this process, there will exist multiple copies of identical genes. Let's call these gene families.
Now the progenitor organism managed along quite nicely with one (two if you like) copy of each gene, so its evolutionary descendant has many copies surplus to requirements.

It seems reasonable then to suggest that these "spare" copies are free to undergo any sort of change we care to imagine, without presenting themselves as a negative selection target. But these evolutionary "experiments" can be expected, from time to time, to confer selective advantage.

Moreover, it is known that gene families are capable of undergoing much more dramatic and sudden sequence alteration than are singletons. For example, single genes mostly suffer point mutation, deletion or insertion. Gene families, on the other hand, readily exchange blocks of sequence, and this is a relatively error-prone process.
So the suggestion is that a mechanism of this sort could quite easily result in sudden and dramatic evolutionary change with no evidence of intermediate forms.
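
The "spare copy" idea can be put as a toy model. What follows is only a minimal Python sketch under loud assumptions: one copy is treated as essential, so every mutation in it is assumed to be removed by selection, while the duplicated copy is assumed to be completely free of selection and keeps every point mutation. The function names (point_mutate, hamming, diverge) and the example sequence are made up for illustration, and the sketch does not model block exchange between family members.

```python
import random

BASES = "ACGT"


def point_mutate(seq, rng):
    """Return a copy of seq with one randomly chosen base replaced by a different base."""
    pos = rng.randrange(len(seq))
    new_base = rng.choice([b for b in BASES if b != seq[pos]])
    return seq[:pos] + new_base + seq[pos + 1:]


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def diverge(ancestor, generations, seed=0):
    """Duplicate a gene, then let only the spare copy accumulate point mutations."""
    rng = random.Random(seed)
    essential = ancestor  # copy still doing the original job
    spare = ancestor      # duplicated copy, surplus to requirements
    for _ in range(generations):
        # Assumption: any change to the essential copy is harmful and is
        # removed by selection, so it is never kept here.
        # Assumption: the spare copy is invisible to selection, so every
        # point mutation it receives is kept.
        spare = point_mutate(spare, rng)
    return essential, spare


ancestor = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"  # hypothetical sequence
essential, spare = diverge(ancestor, generations=20)
print(hamming(ancestor, essential))  # 0 by construction
print(hamming(ancestor, spare))      # how far the spare copy has drifted
```

The point of the sketch is only that the spare copy drifts while the working copy stays put. Whether any of that drift ever becomes useful is exactly the open question.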
 
Humans have fewer genes than rice, so that part of your assumption is not so certain. DNA may also be incorporated from other life forms, such as when unnucleated cells merged with bacteria to form nucleated cells. One fact to remember is that the complexity of the genome does not correlate with the complexity of the organism, since the genome does not define every individual structure. Rather, it calls up certain predetermined routines, and defines how often or for how long they should grow, so the difference between a compact animal and a centipede might just be the number of times the central segment is repeated. It certainly seems possible that there are repeated segments of the genome to "experiment" with, and that there is a correction mechanism that could fix it in later generations if it doesn't work. But, the changes must fit in with the previous body, environment, and lifestyle in such a way that it really is an advantage, and that seems highly unlikely unless it is accomplished in small steps.
 
QuarkHead said:
It is a fact that, in general, more complex organisms have more DNA than simpler ones.
This seems a generalisation too far, as Imperfectionist has noted. As such it seems to invalidate the basis of your later speculations.
 
Animals can also incorporate the genes of plants and vice versa. Gene swapping is common in nature; maybe that can account for sudden increases in the length of DNA sequences.
 
Imperfectionist said:
Animals can also incorporate the genes of plants and vice versa.

They can? I don't think that's correct. Can you give some specific examples?

Imperfectionist said:
Gene swapping is common in nature

It is? I don't think that's correct either, except for bacteria. 'Horizontal transfer' of genes is a mechanism that has been out of favor for a while now.

Imperfectionist said:
maybe that can account for sudden increases in the length of DNA sequences.

I don't think so. Increases in genome size are attributed to partial/whole genome duplications and mobile elements (such as transposons and retroviruses) that copy portions of the genome in the process of jumping around. Duplications of chromosomal segments, whole chromosomes or whole genomes create extra copies of genes, which can then undergo neo- or sub-functionalization to produce the related families of genes referred to in the original post.
 
That more complex organisms have more DNA is to some extent true but is, as noted above, very much an overgeneralization. In fact, this is called the "C-value paradox", where, at least for eukaryotic genomes, the size of the genome has little relationship with the evolutionary complexity of the organism. For the most part, this is due to repetitive elements that tend to replicate within a host's genome (transposons, etc.), or the tendency of an organism toward polyploidy (more than one genome - mostly applies to plants).
Gene swapping between species is not so common in eukaryotes. The vast majority of increases in the amount of DNA in eukaryotes (plants and animals) comes from mistakes during genome replication (one example would be Down's Syndrome). But "duplication and divergence" of gene families is a major player in evolutionary advancement - the biggest player in plant evolution. In fact, plants are forming thousands of new species every second through the magic of polyploidy or genome duplication.
 
Imperfectionist said:
Humans have fewer genes than rice, so that part of your assumption is not so certain.
Yes, admittedly it was a rash generalization. I was thinking about long range vertical comparison (in the sense that rice is not an ancestral type to Man).
One fact to remember is that the complexity of the genome does not correlate with the complexity of the organism,
That isn't what I said. It is true that much of eukaryotic DNA is of low sequence complexity - that is why I couched my assumptions in the way that I did.
It certainly seems possible that there are repeated segments of the genome to "experiment" with, and that there is a correction mechanism that could fix it in later generations if it doesn't work.
But that's my whole point! If "it doesn't work" as you put it (and mostly it won't) there is no harm done and therefore no need to "fix" it.
But, the changes must fit in with the previous body, environment, and lifestyle in such a way that it really is an advantage, and that seems highly unlikely unless it is accomplished in small steps.
Again, my whole point was that sequence divergence within superfluous copies of genes would only be a target for positive selection, and so large steps may well represent the accumulation of small sequence changes, but these small changes would be phenotypically silent, unless dominant (which is rather unlikely).
 
Ophiolite said:
This seems a generalisation too far, as Imperfectionist has noted. As such it seems to invalidate the basis of your later speculations.
Well no - my point is not critically dependent on the absolute amount of DNA, only on the number of genes.
 
zyncod said:
...or the tendency of an organism toward polyploidy
which of course is a form of amplification!
The vast majority of increases in the amount of DNA in eukaryotes (plants and animals) comes from mistakes during genome replication (one example would be Down's Syndrome).
I don't think this is true, certainly if by "replication" you mean DNA replication like the rest of us. Down's is a constellation of chromosomal aberrations, some of which certainly are the result of disjunctional failure.
But "duplication and divergence" of gene families is a major player in evolutionary advancement - the biggest player in plant evolution. In fact, plants are forming thousands of new species every second through the magic of polyploidy or genome duplication.
Well, there you go!
 
spuriousmonkey said:
Some complex animals actually have streamlined their genome, i.e. it is smaller than expected for their complexity.


Yes indeed. Takifugu rubripes (puffer fish) is an example. This is the precise reason for it being one of the first vertebrates whose genome has been sequenced completely (31,059 genes coding for 33,609 proteins, Aparicio et al. 2002). Its compact genome and lack of highly repetitive DNA make the puffer fish genome a valuable comparison point when looking at syntenic relationships between vertebrate genomes.
 
Actually, the cause of Down's syndrome in most patients is trisomy of chromosome 21 (three copies instead of two). I should have said aneuploidy instead of polyploidy but I was trying to simplify. And nondisjunction IS a replicational mistake.

As far as genome simplification, you might want to look at nested genes within viruses. Essentially, taking a genome as a "book" that you might read, for many viruses the "book" would actually make sense if you read it forward or backward, which I find quite cool. The reason eukaryotes have not tended toward this simplification is that the energy requirements for multicellular organisms are so high that the additional energy required to synthesize a few thousand, million, or even billion base pairs is negligible when compared to the total energy requirements.
 
zyncod said:
Actually, the cause of Down's syndrome in most patients is trisomy of chromosome 21 (three copies instead of two). I should have said aneuploidy instead of polyploidy but I was trying to simplify. And nondisjunction IS a replicational mistake.
Non-disjunction has nothing to do with replication; it is an error of separation of chromosomes during meiosis. Replication, as most people use the word, refers to DNA replication. Nothing to do with chromosome disjunction.

As far as genome simplification, you might want to look at nested genes within viruses.
Actually I know all about these. But I was heading in the opposite direction, i.e. towards larger genomes encoding more complex life forms.
 
Yes indeed. Takifugu rubripes (puffer fish) is an example. This is the precise reason for it being one of the first vertebrates whose genome has been sequenced completely (31,059 genes coding for 33,609 proteins, Aparicio et al. 2002). Its compact genome and lack of highly repetitive DNA make the puffer fish genome a valuable comparison point when looking at syntenic relationships between vertebrate genomes.

Interesting. So the amount of alternative splicing in the pufferfish is almost nil then?

Humans, for example, have between 20 and 25 thousand genes at the latest estimates, which code for 90,000 or so proteins.
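
Taking the figures quoted in this thread at face value, the proteins-per-gene ratio can be worked out in a few lines. This is a rough Python sketch, not a measurement: the pufferfish numbers are the ones quoted from Aparicio et al. 2002 above, the human numbers are the estimates in this post, and the function name proteins_per_gene is just an illustrative label.

```python
def proteins_per_gene(proteins, genes):
    """Average number of distinct proteins per gene, from two raw counts."""
    return proteins / genes


# Pufferfish figures as quoted in the thread (Aparicio et al. 2002).
pufferfish = proteins_per_gene(33_609, 31_059)

# Human figures as estimated in this post: 90,000 proteins from
# 20,000 to 25,000 genes. These are thread estimates, not settled values.
human_low = proteins_per_gene(90_000, 25_000)
human_high = proteins_per_gene(90_000, 20_000)

print(f"pufferfish: {pufferfish:.2f} proteins per gene")
print(f"human: {human_low:.1f} to {human_high:.1f} proteins per gene")
```

The ratio only says something about alternative splicing if the protein counts really include splice variants, which is an assumption here.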

It seems to me that complexity might very well be inversely proportional to the number of genes. More and more research is leading towards non-gene functions in cellular development. Epigenetic factors. And also the tons of 'junk' DNA that have turned out to not always be so junky after all.

It is the number of different ways in which any one gene can code for a variety of proteins that seems to me to speak of complexity in the host. I have no figures on prokaryotes and the average number of alternative splicings for their genes, but I do know that introns are practically negligible in most single-celled organisms. It is only when the animals start to gain complexity and live as multicelled beings that the 'junk' DNA starts to build. Doesn't this scream significance?

Junk is not junk at all. It is one of the factors of complexity. One of the means of complexity. And genes are not the whole story except in simple organisms.
 
Imperfectionist said:
Animals can also incorporate the genes of plants and vice versa. Gene swapping is common in nature; maybe that can account for sudden increases in the length of DNA sequences.
Hercules Rockefeller said:
It is? I don't think that's correct either, except for bacteria. 'Horizontal transfer' of genes is a mechanism that has been out of favor for a while now.
It is. One example is that viruses will inject their own DNA into yours, after which you carry the code necessary to release new copies of the virus. It may stay latent for years or even never show up in you, but it might show up in your descendants.
 
Maddad said:

No, it isn’t.

The original statement was “…animals also can incorporate the genes of plants and vice versa…” and I was responding to this statement. Plants and animals do not freely incorporate each other’s genes.

Maddad said:
One example is that viruses will inject their own DNA into yours, after which you carry the code necessary to release new copies of the virus.

Do you think that all viral infections lead to integration of the viral DNA into the genome of the infected cells? Retroviruses are the only type of virus that does this, and they represent only a tiny fraction of all the human viruses that exist. (Besides, retroviruses have RNA genomes, not DNA.) So yes, I suppose that retroviruses are one example of how an exogenous gene can insert into the human genome, but it’s a very limited example and doesn’t pertain to the original statement from Imperfectionist.

Maddad said:
It may stay latent for years or even never show up in you, but it might show up in your descendants.

Descendants? The only way for that to happen is for a retrovirus to infect germline cells rather than just somatic cells. I am not aware of any such retroviruses. Do you have any examples? Yes, I know that HIV-positive mothers can transmit it to their babies, but that’s due to plain old infection and not because HIV has integrated into their germline (i.e. the genome of their oogonia). HIV infects T cells (i.e. somatic cells). So any retrovirus that integrates into somatic cell genomes is of no relevance to evolution, which is the whole thrust of this thread.
 
invert_nexus said:
So the amount of alternative splicing in the pufferfish is almost nil then?

I don’t know about that. Simply knowing a genome sequence does not indicate the extent of alternative splicing. This has to be determined empirically by assaying the transcriptome and proteome of an organism. I would suggest that the protein number indicated does not take alternative splicing into account.

There is an accepted correlation between organism complexity and the extent of alternative splicing. Yeast barely does any, whereas the human genome exhibits at least 75%. I remember one seminar speaker (whose research was on alternative splicing) who claimed that as many as 90-95% of all human genes are alternatively spliced!

invert_nexus said:
Humans, for example, have between 20 and 25 thousand genes at the latest estimates, which code for 90,000 or so proteins.

Whoa, hold on there! There was ONE report that suggested that there may be as few as 20,000 genes in the human genome, and this was based solely on computational analysis of the genome sequence and not any experimental evidence. The fact is, you can come up with any figure you want simply by tweaking the gene-finding algorithms that you use. My understanding is that the majority opinion on the number of human genes is currently around 60,000 genes, but probably more in reality.

If there’s one thing that discussions like these indicate, it’s that the issue of “organism complexity” can be a very subjective thing.
 
60,000 was the consensus before the human genome project.

The number went down quickly after that. 25,000 to 30,000 sounds reasonable.
 