Intron size, abundance, and distribution within untranslated regions of genes

Xin Hong, Douglas G. Scofield, Michael Lynch

Research output: Contribution to journal › Article › peer-review

131 Scopus citations


Research on the evolution of introns has largely considered introns within coding sequences (CDSs), without regard for introns located within untranslated regions (UTRs) of genes. Here, we directly determined intron size, abundance, and distribution in UTRs of genes using full-length cDNA libraries and complete genome sequences for four species: Arabidopsis thaliana, Drosophila melanogaster, human, and mouse. Overall intron occupancy (introns/exon kbp) is lower in 5′ UTRs than in CDSs, but intron density (intron occupancy in regions containing introns) tends to be higher in 5′ UTRs than in CDSs. Introns in 5′ UTRs are roughly twice as large as introns in CDSs, and there is a sharp drop in intron size at the 5′ UTR-CDS boundary. We propose a mechanistic explanation for the existence of selection for larger intron size in 5′ UTRs, and outline several implications of this hypothesis. We found introns to be randomly distributed within 5′ UTRs, so long as a minimum required exon size was assumed. Introns in 3′ UTRs were much less abundant than in 5′ UTRs. Though this was expected for human and mouse, which have intron-dependent nonsense-mediated decay (NMD) pathways that discourage the presence of introns within the 3′ UTR, it was also true for A. thaliana and D. melanogaster, which may lack intron-dependent NMD. Our findings have several implications for theories of intron evolution and genome evolution in general.
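The two abundance measures defined in the abstract can be illustrated with a minimal Python sketch. It assumes one plausible reading of those definitions: occupancy is the total intron count divided by total exonic length in kbp, and density is the same ratio restricted to regions that contain at least one intron. The `Region` class, the function names, and the example values are hypothetical, and are not taken from the paper's methods or data.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """One annotated region (e.g. a 5' UTR or a CDS) of a single gene."""
    exon_bp: int    # exonic length of the region, in base pairs
    n_introns: int  # number of introns interrupting the region


def intron_occupancy(regions):
    """Introns per kbp of exon sequence, pooled over all regions."""
    total_bp = sum(r.exon_bp for r in regions)
    if total_bp == 0:
        return 0.0
    return sum(r.n_introns for r in regions) / (total_bp / 1000)


def intron_density(regions):
    """Occupancy computed only over regions that contain introns."""
    return intron_occupancy([r for r in regions if r.n_introns > 0])


# Hypothetical 5' UTRs: only one of the three contains an intron.
utr5 = [Region(200, 0), Region(300, 1), Region(500, 0)]

occupancy = intron_occupancy(utr5)  # 1 intron over 1.0 kbp of exon
density = intron_density(utr5)      # 1 intron over 0.3 kbp of exon
print(occupancy, density)
```

Under this reading, a class of regions in which few members contain introns can show low occupancy but high density, which is the pattern the abstract reports for 5′ UTRs relative to CDSs.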

Original language: English (US)
Pages (from-to): 2392-2404
Number of pages: 13
Journal: Molecular Biology and Evolution
Issue number: 12
State: Published - Dec 2006
Externally published: Yes


  • Genome evolution
  • Intron
  • Untranslated region

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics

