Genetic Code and Amino Acid Translation

Table 1 shows the genetic code of messenger ribonucleic acid (mRNA), i.e. all 64 possible codons, each composed of three nucleotide bases (a tri-nucleotide unit), together with the amino acid or stop signal that each codon specifies during protein assembly.

Each codon of deoxyribonucleic acid (DNA) specifies a single amino acid. Each nucleotide unit consists of a phosphate, a deoxyribose sugar and one of the 4 nitrogenous bases: adenine (A), guanine (G), cytosine (C) and thymine (T). The bases are paired and joined by hydrogen bonds in the DNA double helix. The mRNA corresponds to the DNA coding strand (i.e. the sequence of nucleotides is the same in both chains), except that in RNA thymine (T) is replaced by uracil (U) and deoxyribose is replaced by ribose.
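The correspondence between DNA and mRNA described above can be sketched in a few lines of Python. This is only an illustration, assuming the input is the DNA strand whose sequence matches the mRNA; the function name coding_strand_to_mrna is hypothetical and not part of the original text.

```python
def coding_strand_to_mrna(dna: str) -> str:
    """Return the mRNA sequence matching a DNA strand (T is replaced by U).

    Assumption: `dna` is written 5' to 3' and is the strand whose
    sequence equals that of the mRNA.
    """
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected bases in DNA: {sorted(unexpected)}")
    return dna.replace("T", "U")


print(coding_strand_to_mrna("ATGTTTTAA"))  # AUGUUUUAA
```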

Translating genetic information into a protein first requires mRNA, which is read 5' to 3' (exactly as DNA), and then transfer ribonucleic acid (tRNA), which is read 3' to 5'. tRNA is the "taxi" that carries amino acids to the ribosome, where the information on the mRNA is translated into an amino acid chain (polypeptide).

For mRNA there are 4^3 = 64 different nucleotide combinations possible with a triplet codon of three nucleotides. All 64 possible combinations are shown in Table 1. However, the 64 codons do not each specify a different amino acid during translation. The reason is that in humans only 20 standard amino acids (not counting selenocysteine) are involved in translation. Therefore, one amino acid can be encoded by more than one mRNA codon triplet. Arginine, leucine and serine are encoded by 6 triplets, isoleucine by 3, methionine and tryptophan by 1, and all other amino acids by 4 or 2 codons. The redundant codons typically differ at the 3rd base. Table 2 shows the inverse codon assignment, i.e. which codons specify each of the 20 standard amino acids involved in translation.

Table 1. Genetic code: mRNA codon -> amino acid

1st base  2nd base: U          2nd base: C  2nd base: A  2nd base: G  3rd base
U         Phenylalanine        Serine       Tyrosine     Cysteine     U
U         Phenylalanine        Serine       Tyrosine     Cysteine     C
U         Leucine              Serine       Stop         Stop         A
U         Leucine              Serine       Stop         Tryptophan   G
C         Leucine              Proline      Histidine    Arginine     U
C         Leucine              Proline      Histidine    Arginine     C
C         Leucine              Proline      Glutamine    Arginine     A
C         Leucine              Proline      Glutamine    Arginine     G
A         Isoleucine           Threonine    Asparagine   Serine       U
A         Isoleucine           Threonine    Asparagine   Serine       C
A         Isoleucine           Threonine    Lysine       Arginine     A
A         Methionine (Start)1  Threonine    Lysine       Arginine     G
G         Valine               Alanine      Aspartate    Glycine      U
G         Valine               Alanine      Aspartate    Glycine      C
G         Valine               Alanine      Glutamate    Glycine      A
G         Valine               Alanine      Glutamate    Glycine      G

Table 2. Reverse codon table: amino acid -> mRNA codon

Amino acid mRNA codons Amino acid mRNA codons
Ala/A GCU, GCC, GCA, GCG Leu/L UUA, UUG, CUU, CUC, CUA, CUG
Arg/R CGU, CGC, CGA, CGG, AGA, AGG Lys/K AAA, AAG
Asn/N AAU, AAC Met/M AUG
Asp/D GAU, GAC Phe/F UUU, UUC
Cys/C UGU, UGC Pro/P CCU, CCC, CCA, CCG
Gln/Q CAA, CAG Ser/S UCU, UCC, UCA, UCG, AGU, AGC
Glu/E GAA, GAG Thr/T ACU, ACC, ACA, ACG
Gly/G GGU, GGC, GGA, GGG Trp/W UGG
His/H CAU, CAC Tyr/Y UAU, UAC
Ile/I AUU, AUC, AUA Val/V GUU, GUC, GUA, GUG
START AUG STOP UAG, UGA, UAA
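The codon counts behind Tables 1 and 2 can be reproduced with a short Python sketch. It enumerates all 4^3 codons and counts how many codons specify each amino acid. The one-letter string AMINO_ACIDS (with "*" for Stop) is a compact transcription of Table 1 in the base order U, C, A, G; the variable names are illustrative assumptions rather than anything defined in the text.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acids in Table 1 order (1st base, then 2nd, then 3rd); "*" = Stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # 1st base U
    "LLLLPPPPHHQQRRRR"  # 1st base C
    "IIIMTTTTNNKKSSRR"  # 1st base A
    "VVVVAAAADDEEGGGG"  # 1st base G
)

codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(codons, AMINO_ACIDS))

# Number of codons per amino acid, ignoring the stop codons.
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(codons))                                                       # 64
print(len(codons_per_amino_acid))                                        # 20
print(sorted(aa for aa, n in codons_per_amino_acid.items() if n == 6))   # ['L', 'R', 'S']
```

The last line lists leucine (L), arginine (R) and serine (S) as the amino acids with six codons, in agreement with Table 2.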

mRNA is read 5' to 3'. tRNA (read 3' to 5') has an anticodon complementary to the codon in mRNA and can be "charged" covalently with an amino acid at its 3' terminus. According to Crick's wobble hypothesis, strict pairing between the mRNA codon and the tRNA anticodon is required only at the 1st and 2nd base. The binding at the 3rd base (i.e. at the 5' end of the tRNA anticodon) is weaker and can result in different pairs. For the binding between codon and anticodon to occur, the bases must wobble out of their positions at the ribosome. Therefore, such base-pairs are sometimes called wobble-pairs.

Table 3 shows the possible wobble-pairs at the 1st, 2nd and 3rd base. The possible pair combinations at the 1st and 2nd base are identical. At the 3rd base (i.e. at the 3' end of the mRNA codon and the 5' end of the tRNA anticodon) the possible pair combinations are less restrictive, which corresponds to the redundancy in mRNA. The deamination (removal of the amino group NH2) of adenosine (not to be confused with adenine) produces the nucleoside inosine (I) on tRNA, which forms non-standard wobble-pairs with U, C or A (but not with G) on mRNA. Inosine may occur at the 3rd place of the tRNA anticodon.

Table 3. Base-pairs: mRNA codon -> tRNA anticodon

1st (i.e. 5' end) and 2nd place, mRNA codon    1st (i.e. 3' end) and 2nd place, tRNA anticodon
A                                              U
U                                              A
C                                              G
G                                              C

3rd place (i.e. 3' end), mRNA codon            3rd place (i.e. 5' end), tRNA anticodon
A or G                                         U
U                                              A
U or C                                         G
G                                              C
U, C or A                                      I

Table 3 is read in the following way: for the 1st and 2nd base the pairing is unique, i.e. U on tRNA always pairs with A on mRNA, A on tRNA always pairs with U on mRNA, etc. For the 3rd base the pairing is redundant: U on tRNA can pair with A or G on mRNA, G on tRNA can pair with U or C on mRNA, and I on tRNA can pair with U, C or A on mRNA. Only A and C at the 3rd place on tRNA are unambiguously assigned to U and G at the 3rd place on mRNA, respectively.
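The reading rules of Table 3 can be written down as two small lookup tables. The following Python sketch is a minimal illustration: the names STRICT_PAIR, WOBBLE_PAIR and codons_read_by are hypothetical, and the anticodon is assumed to be written 3' to 5', so that its letters line up with the 1st, 2nd and 3rd base of the mRNA codon. The sketch applies the pairing rules only; it does not check whether all resulting codons specify the same amino acid.

```python
# tRNA base -> mRNA base at the 1st and 2nd place (unique pairing).
STRICT_PAIR = {"U": "A", "A": "U", "G": "C", "C": "G"}

# tRNA base at the 3rd place (5' end of the anticodon) -> possible mRNA bases.
WOBBLE_PAIR = {"U": "AG", "A": "U", "G": "UC", "C": "G", "I": "UCA"}


def codons_read_by(anticodon_3_to_5: str) -> list[str]:
    """Return the mRNA codons (5' to 3') that an anticodon written 3' to 5' can pair with."""
    if len(anticodon_3_to_5) != 3:
        raise ValueError("an anticodon has exactly three bases")
    first, second, wobble = anticodon_3_to_5.upper()
    prefix = STRICT_PAIR[first] + STRICT_PAIR[second]
    return [prefix + third for third in WOBBLE_PAIR[wobble]]


print(codons_read_by("AGI"))  # ['UCU', 'UCC', 'UCA']
print(codons_read_by("AAU"))  # ['UUA', 'UUG']
print(codons_read_by("ACC"))  # ['UGG']
```

For example, an anticodon with inosine at the 3rd place pairs with three codons, while one with C at the 3rd place pairs with exactly one.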

Due to this combination structure a tRNA can bind to different mRNA codons where synonymous or redundant mRNA codons differ at the 3rd base (i.e. at the 5' end of tRNA and the 3' end of mRNA). By this logic the minimum number of tRNA anticodons necessary to encode all amino acids reduces to 31 (excluding the 2 anticodons AUU and ACU listed for the STOP codons, see Table 5). This means that any tRNA anticodon can pair with one or more different mRNA codons (Table 4). However, there are more than 31 tRNA anticodons possible for the translation of all 64 mRNA codons. For example, serine has a fourfold degenerate site at the 3rd position (UCU, UCC, UCA, UCG), which can be translated by AGI (for UCU, UCC and UCA) and AGC on tRNA (for UCG), but also by AGG and AGU. This means, in turn, that any mRNA codon can also be translated by one or more tRNA anticodons (see Table 5).

The occurrence of different wobble-pairs encoding the same amino acid may reflect a compromise between velocity and safety in protein synthesis. The redundancy of mRNA codons buffers against mistakes caused by mutations or variations at the 3rd position, but also at other positions. For example, the first position of the leucine codons (UUA, UUG, CUU, CUC, CUA, CUG) can be a twofold degenerate site (UUA/CUA and UUG/CUG), while the second position is unambiguous (not redundant). Another example is serine with mRNA codons UCA, UCG, UCC, UCU, AGU, AGC. Its UC codons are fourfold degenerate at the third position, and in addition its codons fall into two families (UC and AG) that differ at both the first and the second position. Table 4 shows the assignment of mRNA codons to any possible tRNA anticodon in eukaryotes for the 20 standard amino acids involved in translation. It is the reverse codon assignment.
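The leucine example in the paragraph above can be checked with a small Python sketch. It assumes a common working definition of site degeneracy that the text does not state explicitly: the number of bases at one codon position that leave the encoded amino acid unchanged. The names AMINO_ACIDS, CODON_TABLE and site_degeneracy are illustrative, and the one-letter string is a transcription of Table 1 with "*" for Stop.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids in Table 1 order; "*" = Stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(zip(("".join(t) for t in product(BASES, repeat=3)), AMINO_ACIDS))


def site_degeneracy(codon: str, position: int) -> int:
    """Count the bases at `position` (0, 1 or 2) that keep the amino acid unchanged."""
    target = CODON_TABLE[codon]
    variants = (codon[:position] + base + codon[position + 1:] for base in BASES)
    return sum(CODON_TABLE[variant] == target for variant in variants)


# Leucine codon CUA: twofold at the 1st, unambiguous at the 2nd, fourfold at the 3rd position.
print([site_degeneracy("CUA", p) for p in range(3)])  # [2, 1, 4]
```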

Table 4. Reverse amino acid encoding: amino acid -> tRNA anticodon -> mRNA codon

Amino acid tRNA anticodon mRNA codon
Phenylalanine 3'-AAG-5' 5'-UUU-3', 5'-UUC-3'
3'-AAA-5' 5'-UUU-3'
Leucine 3'-AAU-5' 5'-UUA-3', 5'-UUG-3'
3'-AAC-5' 5'-UUG-3'
3'-GAI-5' 5'-CUU-3', 5'-CUC-3', 5'-CUA-3'
3'-GAG-5' 5'-CUU-3', 5'-CUC-3'
3'-GAU-5' 5'-CUA-3', 5'-CUG-3'
3'-GAA-5' 5'-CUU-3'
3'-GAC-5' 5'-CUG-3'
Serine 3'-AGI-5' 5'-UCU-3', 5'-UCC-3', 5'-UCA-3'
3'-AGG-5' 5'-UCU-3', 5'-UCC-3'
3'-AGU-5' 5'-UCA-3', 5'-UCG-3'
3'-AGA-5' 5'-UCU-3'
3'-AGC-5' 5'-UCG-3'
3'-UCG-5' 5'-AGU-3', 5'-AGC-3'
3'-UCA-5' 5'-AGU-3'
Tyrosine 3'-AUG-5' 5'-UAU-3', 5'-UAC-3'
3'-AUA-5' 5'-UAU-3'
Cysteine 3'-ACG-5' 5'-UGU-3', 5'-UGC-3'
3'-ACA-5' 5'-UGU-3'
Tryptophan 3'-ACC-5' 5'-UGG-3'
Proline 3'-GGI-5' 5'-CCU-3', 5'-CCC-3', 5'-CCA-3'
3'-GGG-5' 5'-CCU-3', 5'-CCC-3'
3'-GGU-5' 5'-CCA-3', 5'-CCG-3'
3'-GGA-5' 5'-CCU-3'
3'-GGC-5' 5'-CCG-3'
Histidine 3'-GUG-5' 5'-CAU-3', 5'-CAC-3'
3'-GUA-5' 5'-CAU-3'
Glutamine 3'-GUU-5' 5'-CAA-3', 5'-CAG-3'
3'-GUC-5' 5'-CAG-3'
Arginine 3'-GCI-5' 5'-CGU-3', 5'-CGC-3', 5'-CGA-3'
  3'-GCG-5' 5'-CGU-3', 5'-CGC-3'
  3'-GCU-5' 5'-CGA-3', 5'-CGG-3'
  3'-GCA-5' 5'-CGU-3'
  3'-GCC-5' 5'-CGG-3'
  3'-UCU-5' 5'-AGA-3', 5'-AGG-3'
  3'-UCC-5' 5'-AGG-3'
Isoleucine 3'-UAI-5' 5'-AUU-3', 5'-AUC-3', 5'-AUA-3'
3'-UAG-5' 5'-AUU-3', 5'-AUC-3'
3'-UAA-5' 5'-AUU-3'
3'-UAU-5' 5'-AUA-3'
Methionine 3'-UAC-5' 5'-AUG-3'
Threonine 3'-UGI-5' 5'-ACU-3', 5'-ACC-3', 5'-ACA-3'
3'-UGG-5' 5'-ACU-3', 5'-ACC-3'
3'-UGU-5' 5'-ACA-3', 5'-ACG-3'
3'-UGA-5' 5'-ACU-3'
3'-UGC-5' 5'-ACG-3'
Asparagine 3'-UUG-5' 5'-AAU-3', 5'-AAC-3'
  3'-UUA-5' 5'-AAU-3'
Lysine 3'-UUU-5' 5'-AAA-3', 5'-AAG-3'
3'-UUC-5' 5'-AAG-3'
Valine 3'-CAI-5' 5'-GUU-3', 5'-GUC-3', 5'-GUA-3'
3'-CAG-5' 5'-GUU-3', 5'-GUC-3'
3'-CAU-5' 5'-GUA-3', 5'-GUG-3'
3'-CAA-5' 5'-GUU-3'
3'-CAC-5' 5'-GUG-3'
Alanine 3'-CGI-5' 5'-GCU-3', 5'-GCC-3', 5'-GCA-3'
  3'-CGG-5' 5'-GCU-3', 5'-GCC-3'
  3'-CGU-5' 5'-GCA-3', 5'-GCG-3'
  3'-CGA-5' 5'-GCU-3'
  3'-CGC-5' 5'-GCG-3'
Aspartate 3'-CUG-5' 5'-GAU-3', 5'-GAC-3'
3'-CUA-5' 5'-GAU-3'
Glutamate 3'-CUU-5' 5'-GAA-3', 5'-GAG-3'
3'-CUC-5' 5'-GAG-3'
Glycine 3'-CCI-5' 5'-GGU-3', 5'-GGC-3', 5'-GGA-3'
3'-CCG-5' 5'-GGU-3', 5'-GGC-3'
3'-CCU-5' 5'-GGA-3', 5'-GGG-3'
3'-CCA-5' 5'-GGU-3'
3'-CCC-5' 5'-GGG-3'

While it is not possible to predict a specific DNA codon from an amino acid, DNA codons can be decoded unambiguously into amino acids. The reason is that there are 61 different DNA (and mRNA) codons specifying only 20 amino acids. Note that there are 3 additional codons for chain termination, i.e. there are 64 DNA (and thus 64 different mRNA) codons, but only 61 of them specify amino acids.
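The asymmetry described above (decoding is unambiguous, encoding is not) can be illustrated with a short Python sketch. It translates a DNA sequence codon by codon and counts how many different DNA sequences would encode the same peptide. The function names translate and count_encodings are hypothetical, the input is assumed to be the DNA strand that matches the mRNA and to start in frame, and the one-letter string is a transcription of Table 1 with "*" for Stop.

```python
from collections import Counter
from itertools import product
from math import prod

DNA_BASES = "TCAG"
# One-letter amino acids in Table 1 order (with T in place of U); "*" = Stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
DNA_CODON_TABLE = dict(zip(("".join(t) for t in product(DNA_BASES, repeat=3)), AMINO_ACIDS))
CODONS_PER_AMINO_ACID = Counter(DNA_CODON_TABLE.values())


def translate(dna: str) -> str:
    """Decode a DNA sequence into one-letter amino acids, stopping at the first stop codon."""
    peptide = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = DNA_CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences (without stop codon) that encode `peptide`."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide)


peptide = translate("ATGTTTTCTTGGTAA")
print(peptide)                   # MFSW
print(count_encodings(peptide))  # 12
```

Each DNA sequence yields exactly one peptide, but the four-residue peptide in the example could have come from 1 x 2 x 6 x 1 = 12 different DNA sequences.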

Table 5 shows the genetic code for the translation of all 64 DNA codons, from DNA via mRNA and tRNA to amino acid. The last column lists the different tRNA anticodons minimally necessary to translate all DNA codons into amino acids, and the final row sums up their number. It reveals that the minimum number of tRNA anticodons needed to translate all DNA codons is 31 (plus 2 for the STOP codons). The maximum number of tRNA anticodons that can occur in translation is 70 (plus 3 for the STOP codons).

Table 5. Genetic code: DNA -> mRNA codon -> tRNA anticodon -> amino acid

Obs. DNA mRNA tRNA Amino acid Different AA Diff. tRNA anticodons to encode all AA
1 TTT UUU AAA, AAG Phe Phenylalanine AAG
2 TTC UUC AAG Phe
3 TTA UUA AAU Leu Leucine AAU
4 TTG UUG AAU, AAC Leu
5 TCT UCU AGI, AGG, AGA Ser Serine AGI
6 TCC UCC AGI, AGG Ser
7 TCA UCA AGI, AGU Ser
8 TCG UCG AGC, AGU Ser AGC (or AGU)
9 TAT UAU AUA, AUG Tyr Tyrosine AUG
10 TAC UAC AUG Tyr
11 TAA UAA AUU STOP AUU
12 TAG UAG AUC, AUU STOP
13 TGT UGU ACA, ACG Cys Cysteine ACG
14 TGC UGC ACG Cys
15 TGA UGA ACU STOP ACU
16 TGG UGG ACC Trp Tryptophan ACC
17 CTT CUU GAI, GAG, GAA Leu GAI
18 CTC CUC GAI, GAG Leu
19 CTA CUA GAI, GAU Leu
20 CTG CUG GAC, GAU Leu GAC (or GAU)
21 CCT CCU GGI, GGG, GGA Pro Proline GGI
22 CCC CCC GGI, GGG Pro
23 CCA CCA GGI, GGU Pro
24 CCG CCG GGC, GGU Pro GGC (or GGU)
25 CAT CAU GUA, GUG His Histidine GUG
26 CAC CAC GUG His
27 CAA CAA GUU Gln Glutamine GUU
28 CAG CAG GUC, GUU Gln
29 CGT CGU GCI, GCG, GCA Arg Arginine GCI
30 CGC CGC GCI, GCG Arg
31 CGA CGA GCI, GCU Arg
32 CGG CGG GCC, GCU Arg GCC (or GCU)
33 ATT AUU UAI, UAG, UAA Ile Isoleucine UAI
34 ATC AUC UAI, UAG Ile
35 ATA AUA UAI, UAU Ile
36 ATG AUG UAC Met Methionine UAC
37 ACT ACU UGI, UGG, UGA Thr Threonine UGI
38 ACC ACC UGI, UGG Thr
39 ACA ACA UGI, UGU Thr
40 ACG ACG UGC, UGU Thr UGC (or UGU)
41 AAT AAU UUA, UUG Asn Asparagine UUG
42 AAC AAC UUG Asn
43 AAA AAA UUU Lys Lysine UUU
44 AAG AAG UUC, UUU Lys
45 AGT AGU UCA, UCG Ser UCG
46 AGC AGC UCG Ser
47 AGA AGA UCU Arg UCU
48 AGG AGG UCC, UCU Arg
49 GTT GUU CAI, CAG, CAA Val Valine CAI
50 GTC GUC CAI, CAG Val
51 GTA GUA CAI, CAU Val
52 GTG GUG CAC, CAU Val CAC (or CAU)
53 GCT GCU CGI, CGG, CGA Ala Alanine CGI
54 GCC GCC CGI, CGG Ala
55 GCA GCA CGI, CGU Ala
56 GCG GCG CGC, CGU Ala CGC (or CGU)
57 GAT GAU CUG, CUA Asp Aspartate CUG
58 GAC GAC CUG Asp
59 GAA GAA CUU Glu Glutamate CUU
60 GAG GAG CUU, CUC Glu
61 GGT GGU CCI, CCG, CCA Gly Glycine CCI
62 GGC GGC CCI, CCG Gly
63 GGA GGA CCI, CCU Gly
64 GGG GGG CCC, CCU Gly   CCC (or CCU)
Total: 64 DNA codons, 64 mRNA codons, 20 different amino acids, 33 different tRNA anticodons (31 plus 2 for STOP)

Note:
1 The codon AUG both codes for methionine and serves as an initiation site: the first AUG in an mRNA's coding region is where translation into protein begins.