Memorizing Amino acids & Genetic Code Dictionary

Introduction:

Memorizing the twenty standard amino acids and the genetic code dictionary is a great advantage in competitive exams, and it also speeds up ordinary learning of related topics. Together, these two interdependent skills help solve many questions, such as predicting features of wild-type and mutated peptides (or proteins) and the effects of various mutations on peptides.

After completing this session, we will be able to:

A. Memorize all 20 standard amino acids [names, 3-letter codes, and 1-letter codes].

B. Memorize the whole genetic code dictionary. [Thereafter, you will be able to name the amino acid encoded by any triplet codon and write all the triplet codons for any specified amino acid.]

C. Correlate mutations with their subsequent effects on peptides, etc.

Section 1: Assigning three-letter (3-L) code to Amino acids

Note the Following Points-

1. The 3-L code of every amino acid is its first three letters, except for the acid amides (Asparagine and Glutamine), Isoleucine, and Tryptophan.

2. Assigning 3-L code to the acid amides (Asparagine and Glutamine)

I. Carboxylic acids have higher group priority than their respective amides. So, the 3-L code is assigned first to the “ACID” and then to the respective “AMIDE”, irrespective of their position in alphabetical order.

II. So, aspartic acid is assigned its 3-L code (Asp) first, and then comes the turn of asparagine. The same holds for glutamic acid and glutamine.

III. The 3-L code for the acid amide is generated by replacing the last letter of the 3-L code of its respective “Acid” with “n”: Asp becomes Asn, and Glu becomes Gln.

[Figure: Three-letter codes of the acid amides]

3. Isoleucine: A special case for an isomer: Isoleucine (an isomer of leucine) takes “I” (indicating isomer) followed by the first two letters of Leucine, giving “Ile” as its 3-L code.

4. Tryptophan: The only exception to the above rules: its 3-L code is “Trp”. The 3- and 1-letter codes for amino acids were formally assigned in 1983 by the recommendations of the IUPAC‐IUB Joint Commission on Biochemical Nomenclature (JCBN) in “Nomenclature and Symbolism for Amino Acids and Peptides” (https://febs.onlinelibrary.wiley.com/doi/abs/10.1111/j.1432-1033.1984.tb07877.x).

Possible Explanation [A personal view]: The tryptophan operon was discovered in 1953 by Jacques Monod and his colleagues. Since then, the term “Trp” has been in use for representing tryptophan. Changing its code in 1983 would have caused confusion about the term. So, “Trp” might have been left untouched and kept as it had been used for about three decades.

[Figure: Three-letter codes for the amino acids]
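The rules of Section 1 can be sketched in a few lines of Python. This is only an illustration of the rules as stated above, not an official algorithm; the names `AMINO_ACIDS`, `AMIDE_TO_ACID`, and `three_letter_code` are made up for this example.

```python
# Hypothetical helper names; the rules themselves come from Section 1.
AMINO_ACIDS = [
    "Alanine", "Arginine", "Asparagine", "Aspartic acid", "Cysteine",
    "Glutamic acid", "Glutamine", "Glycine", "Histidine", "Isoleucine",
    "Leucine", "Lysine", "Methionine", "Phenylalanine", "Proline",
    "Serine", "Threonine", "Tryptophan", "Tyrosine", "Valine",
]

# Each acid amide is paired with the acid it is derived from.
AMIDE_TO_ACID = {"Asparagine": "Aspartic acid", "Glutamine": "Glutamic acid"}


def three_letter_code(name):
    """Return the 3-letter code of a standard amino acid by the rules above."""
    if name in AMIDE_TO_ACID:
        # Rule 2: take the acid's code and replace its last letter with "n".
        acid_code = three_letter_code(AMIDE_TO_ACID[name])
        return acid_code[:2] + "n"
    if name == "Isoleucine":
        # Rule 3: "I" for isomer plus the first two letters of Leucine.
        return "I" + "Leucine"[:2].lower()
    if name == "Tryptophan":
        # Rule 4: the single exception, memorized as is.
        return "Trp"
    # Rule 1: the first three letters of the name.
    return name[:3]


for name in AMINO_ACIDS:
    print(f"{name:15s} {three_letter_code(name)}")
```

Running the loop prints all 20 names next to their codes, which makes a quick self-check after memorizing the four rules.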

Section 2: Assigning one-letter Code to Amino acids

Nomenclature and Symbolism for Amino Acids and Peptides

IUPAC‐IUB Joint Commission on Biochemical Nomenclature (JCBN), Recommendations 1983

Key Points:

I. The unambiguous amino acids (those that are the only amino acid starting with a given letter) are given that letter as the 1-letter code.

         Count = 6; Cysteine (C), Histidine (H), Isoleucine (I), Methionine (M), Serine (S), Valine (V)

II. For amino acids starting with the same letter, the simplest (smallest R-group) is assigned that letter as its 1-L code.

         Count = 5; Alanine (A), Glycine (G), Leucine (L), Proline (P), Threonine (T)

III. Based on phonetic association, Arginine is assigned “R”, and Phenylalanine is assigned “F”.

IV. The bulky R-group of Tryptophan fetches it the bulky letter “W”.

V. N and Q are assigned to Asparagine and Glutamine, respectively.

VI. D and E are assigned to Aspartic acid and Glutamic acid, respectively.

VII. K and Y are assigned to Lysine and Tyrosine, respectively, as the nearest available letters.

VIII. U is omitted because of its similarity to hand-written “V”, and to avoid subsequent confusion.

IX. O is omitted because of its similarity to C, D, G and Q in imperfect printouts, and to eliminate the chance of subsequent confusion.

X. J is omitted because of its absence from some languages.

XI. B is used to represent either of Asp and Asn. Z is used for either of Glu and Gln.

XII. X is used to represent an unknown amino acid in the peptide sequence.

[Figure: One-letter codes for the amino acids]
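Key points I and II rest on grouping the amino acids by their initial letter. A minimal Python sketch of that grouping follows; the helper name `group_by_initial` is hypothetical. It only separates the unambiguous initials from the shared ones; deciding which member of a shared group is the “simplest” still needs the R-group structures, which are not modeled here.

```python
from collections import defaultdict

AMINO_ACIDS = [
    "Alanine", "Arginine", "Asparagine", "Aspartic acid", "Cysteine",
    "Glutamic acid", "Glutamine", "Glycine", "Histidine", "Isoleucine",
    "Leucine", "Lysine", "Methionine", "Phenylalanine", "Proline",
    "Serine", "Threonine", "Tryptophan", "Tyrosine", "Valine",
]


def group_by_initial(names):
    """Map each initial letter to the amino acids that start with it."""
    groups = defaultdict(list)
    for name in names:
        groups[name[0]].append(name)
    return dict(groups)


groups = group_by_initial(AMINO_ACIDS)

# Key point I: initials used by exactly one amino acid.
unambiguous = {k: v[0] for k, v in groups.items() if len(v) == 1}
# Key point II: initials shared by two or more amino acids.
shared = {k: v for k, v in groups.items() if len(v) > 1}

print(sorted(unambiguous))  # ['C', 'H', 'I', 'M', 'S', 'V']
for letter in sorted(shared):
    print(letter, shared[letter])
```

The six unambiguous initials match the count under key point I, and the five shared initials (A, G, L, P, T) are the letter groups handled by key point II.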

Section 3: Assigning one-letter (1-L) code to Amino acids

(A New Perspective, Personal Views)

The one-letter codes for amino acids are derived by successively sorting them into the following three groups:

Group 1: The simplest amino acid (smallest R-group) of each letter group is assigned that letter as its 1-L code. [Count = 11]

[Figure: One-letter codes for amino acids, Group 1]

[Figure: Amino acid letter groups]

Group 2: AA that are not the simplest in their respective letter group. [4]

# The 1-L code is the first unassigned letter from the amino acid's own name. Note that 11 letters [A, C, G, H, I, L, M, P, S, T, V] have already been assigned to group 1 amino acids.

# Note: Once the amino acids are assigned starting from the first end “A”, the sequence breaks at Glutamine, i.e. no free letter is available for it. To give the remaining free letters an equal chance of being assigned, now start assigning letters from the other end “Z”. So, “Y” is assigned to Tyrosine.

[Figure: One-letter codes for amino acids, Group 2]

Group 3: All AA not eligible for groups 1 and 2. [5]

# The one-letter code of each group 3 amino acid is the nearest available letter. [Exception: F for Phenylalanine]

How to find the nearest available letter: First assign the 1-L code to any AA that has a free letter ±1 from its initial; then to any AA that has a free letter ±2 from its initial, and so on.

Available letters:  E F K Q W

I. Phenylalanine: It is assigned “F” as its 1-L code because of the phonetic similarity of “Ph” with “F”.

II. Lysine: The free letter nearest to the “L” of Lysine is “K” (-1). So, it’s assigned “K” as the 1-L code.

III. Glutamic acid: Its nearest free letter is “E” (-2). So, it’s assigned “E” as the 1-L code.

IV. Tryptophan: Its nearest free letter is “W” (+3). So, it’s assigned “W” as the 1-L code.

V. Glutamine: Its nearest free letter is “Q” (+10). So, it’s assigned “Q” as the 1-L code.
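The “nearest available letter” search for Group 3 can be sketched as a small greedy loop in Python. The function name `assign_nearest` is hypothetical, and two tie-breaks are assumptions of this sketch rather than rules stated above: amino acids are tried in the listed order (so Glutamic acid gets “E” before Glutamine, in line with the acid-first priority of Section 1), and the letter at +d is tried before the letter at -d (needed because both “Q” and “W” lie 3 letters from “T”). “F” is left out of the free letters because it already goes to Phenylalanine on phonetic grounds.

```python
def assign_nearest(initials, free_letters):
    """Greedily give each amino acid the free letter nearest to its initial.

    initials: dict of amino acid name -> initial letter, in priority order.
    free_letters: letters not yet used as a one-letter code.
    """
    free = set(free_letters)
    codes = {}
    # Widen the search one step at a time: +/-1, then +/-2, and so on.
    for distance in range(1, 26):
        for name, initial in initials.items():
            if name in codes:
                continue
            position = ord(initial)
            # Assumed tie-break: try the later letter (+d) before the earlier (-d).
            for candidate in (chr(position + distance), chr(position - distance)):
                if candidate in free:
                    codes[name] = candidate
                    free.remove(candidate)
                    break
    return codes


group3 = {
    "Lysine": "L",
    "Glutamic acid": "G",
    "Tryptophan": "T",
    "Glutamine": "G",
}
print(assign_nearest(group3, "EKQW"))
# {'Lysine': 'K', 'Glutamic acid': 'E', 'Tryptophan': 'W', 'Glutamine': 'Q'}
```

With these assumed tie-breaks, the sketch reproduces the four assignments listed above at distances 1, 2, 3, and 10.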

The one-letter codes for amino acids are summarized below:

[Figure: Summary of three- and one-letter codes for amino acids]

Section 4: Amino acid Classification based on the Polarity of R-groups

[Figure: Amino acid classification based on the polarity of R-groups]

Once we have memorized the 1-L codes of the AA, memorizing all 20 amino acids is relatively simple with the following phrases:

1. Non-polar Amino acids: Phrase “VIMW Pro FLAG”

Read “VIMW Pro FLAG” as “BMW Pro FLAG”

It gives: V = Valine, I = Isoleucine, M = Methionine, W = Tryptophan, Pro = Proline, F = Phenylalanine, L = Leucine, A = Alanine, G = Glycine

2. Polar, Uncharged Amino acids: Phrase “NYC QueST”

Read “NYC QueST” as “New York City QueST”

It gives: N = Asparagine, Y = Tyrosine, C = Cysteine, Q = Glutamine, S = Serine, T = Threonine

3. Polar, Charged Amino acids: Phrase “DHERK”, a fictional Marvel character

https://marvel.fandom.com/wiki/Dherk_(Earth-616)

Read “DHERK” as “DHERK”

It gives: D = Aspartic acid, H = Histidine, E = Glutamic acid, R = Arginine, K = Lysine
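A quick way to check that the three phrases cover all 20 amino acids exactly once is to expand them programmatically. The Python sketch below assumes that only the capital letters of each phrase carry meaning (so “Pro” contributes P and “QueST” contributes Q, S, T); the names `ONE_LETTER`, `PHRASES`, and `expand` are made up for this example.

```python
ONE_LETTER = {
    "A": "Alanine", "R": "Arginine", "N": "Asparagine", "D": "Aspartic acid",
    "C": "Cysteine", "E": "Glutamic acid", "Q": "Glutamine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "L": "Leucine", "K": "Lysine",
    "M": "Methionine", "F": "Phenylalanine", "P": "Proline", "S": "Serine",
    "T": "Threonine", "W": "Tryptophan", "Y": "Tyrosine", "V": "Valine",
}

PHRASES = {
    "non-polar": "VIMW Pro FLAG",
    "polar, uncharged": "NYC QueST",
    "polar, charged": "DHERK",
}


def expand(phrase):
    """Turn a mnemonic phrase into amino acid names via its capital letters."""
    return [ONE_LETTER[ch] for ch in phrase if ch.isupper()]


classes = {label: expand(phrase) for label, phrase in PHRASES.items()}
covered = [name for members in classes.values() for name in members]

for label, members in classes.items():
    print(f"{label} ({len(members)}): {', '.join(members)}")
print(len(covered), len(set(covered)))  # 20 20
```

The class sizes come out as 9, 6, and 5, and no amino acid appears in two classes.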

Section 5: Memorizing the Genetic Code Dictionary

A Personal Note: I first memorized the genetic code dictionary at the age of about 17, when I was studying in the Intermediate of Science (equivalent to 10+2 in the CBSE curriculum in India, or a Pre-University Course). Once the one-letter codes of the amino acids are memorized, the whole genetic code dictionary can be memorized easily. The phrases derived here are the ones I find easy; you can derive similar phrases of your own if needed.

Shown below are two forms of the genetic code dictionary: one shows the triplet codons, whereas the other shows the 3-letter codes of the amino acids encoded by the respective triplet codons.

[Figure: Genetic code dictionary, the triplet codons]

Now, replace the 3-letter codes of the amino acids with the 1-letter codes. Moreover, we draw a box when the same amino acid is encoded by two codons differing only in their third letter.

[Figure: Genetic code dictionary, one-letter codes of AA]

Now, in the above picture (and the two forms at top), note that:

I. Some boxes [each representing the four codons that share the same first and second letters] encode the same amino acid irrespective of the third letter of the codon. Examples: CU_, GU_, UC_, CC_, AC_, GC_, CG_, and GG_, where (_) can be any letter at the third position of the codon.

II. In some boxes, the amino acid depends on whether the third letter of the codon is a pyrimidine (U or C) or a purine (A or G).
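The two observations above can be checked against the standard genetic code with a short Python sketch. The table is built from the usual 64-letter string in U, C, A, G order, with “*” marking a STOP codon; the names `CODON_TABLE` and `classify_boxes` are chosen for this example only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify_boxes(table):
    """Split the 16 boxes (first two letters) into uniform and mixed ones."""
    fourfold, split = {}, {}
    for first, second in product(BASES, repeat=2):
        prefix = first + second
        # Amino acids for third letter U, C, A, G, in that order.
        residues = "".join(table[prefix + third] for third in BASES)
        if len(set(residues)) == 1:
            fourfold[prefix] = residues[0]
        else:
            split[prefix] = residues
    return fourfold, split


fourfold, split = classify_boxes(CODON_TABLE)
print(sorted(fourfold))  # ['AC', 'CC', 'CG', 'CU', 'GC', 'GG', 'GU', 'UC']
for prefix, residues in split.items():
    print(prefix + "_", residues)
```

The eight uniform boxes match the list in point I. Among the mixed boxes, AU_ (I, I, I, M) and UG_ (C, C, STOP, W) do not follow the simple (U, C) versus (A, G) split, so they are worth memorizing separately.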

Now, in the final step, we write a condensed form of the genetic code dictionary. Each column is represented with a phrase to memorize the one-letter codes of the amino acids in that column of the condensed form.

[Figure: Genetic code dictionary, condensed form]

You can develop your own phrases/words to memorize the condensed form of the genetic code dictionary. With this, the whole genetic code dictionary can be memorized within a few days.

A note on success:

  1. Memorize the 1-letter codes of the 20 standard amino acids.
  2. Memorize the phrases [FulL LIVe, SePTA, Y0 HQ NiK DEy, and C0W aRe Sir G]. You can develop your own as suited.
  3. Memorize and then practice by writing the genetic code dictionary on paper a few times a day [it would take only a few minutes each time].
  4. Remember the positions of “Met” and “0” and their significance. Met is always encoded by AUG, and “0” marks a STOP codon. Locating the AUG codon in the genetic code dictionary eliminates the chance of error if you ever forget the position of Met.
  5. Now, try to write all the codons for each amino acid without looking at the genetic code dictionary. You should get 20 amino acids, 61 amino acid-coding codons, and 3 STOP codons if you remember it correctly.
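To check a hand-written dictionary from step 5, the expected answers can be generated with a short Python sketch. It assumes the standard genetic code written as the usual 64-letter string in U, C, A, G order, with “*” for STOP; the names `CODON_TABLE` and `codons_for` are hypothetical.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Reverse lookup: one-letter code -> all codons that encode it.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

stop_codons = codons_for.pop("*")
coding_codons = sum(len(codons) for codons in codons_for.values())

print(len(codons_for), coding_codons, len(stop_codons))  # 20 61 3
print("STOP:", stop_codons)
for aa in sorted(codons_for):
    print(aa, codons_for[aa])
```

Comparing your own list with this output line by line shows at once which amino acids still need practice; Met (M) and Trp (W) should each have exactly one codon.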

All your suggestions, criticism, and corrections (if any are needed) are welcome!
