DETERMINING THE AMINO ACID SEQUENCES OF PROTEINS

The three-dimensional shape and hence the functional properties of a protein are determined by its amino acid sequence. While we do not yet know how to predict these properties fully from a protein’s sequence, we can use sequence data to assign proteins to functional families, to identify specialized domains within proteins, and to determine likely effects of mutations that alter a protein’s sequence.

Edman Degradation

Enormous effort has gone into developing technologies for sequencing proteins, much of it now rendered obsolete by the sequencing of the whole human genome and the development of fairly reliable computational techniques to identify protein-coding genes in the genomic DNA sequence. Direct protein sequencing remains important for a number of purposes, however, and it is useful to review one approach, the Edman degradation.

The chemical reactions of the Edman degradation are outlined in the diagram. The key features of the procedure are:

  1. Phenyl isothiocyanate couples to the free amino terminus of a polypeptide chain in a reaction that can be driven to completion.
  2. A second reaction, which can also be driven to completion, cleaves the coupled amino acid from the rest of the polypeptide as a phenylthiohydantoin derivative.
  3. This derivative can be separated from the polypeptide chain, which is now one residue shorter. Analytical techniques can identify all 20 possible derivatives, and the shortened polypeptide can then be subjected to another round of coupling and cleavage.
  4. In this way, the sequence of a polypeptide can be read off, one residue at a time.
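
The cyclic logic of these four steps can be illustrated with a short Python sketch. It is only a toy model under stated assumptions: the polypeptide is represented as a string of one-letter amino acid codes, each round is treated as going fully to completion, and the names `edman_degradation` and `read_sequence` are hypothetical helpers invented for this illustration, not part of any real library. No actual chemistry is modeled.

```python
# Toy model of the Edman degradation cycle (illustrative assumptions only).
# One-letter codes for the 20 standard amino acids.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def edman_degradation(polypeptide):
    """Return a list of (cycle, released_residue, remaining_chain) tuples.

    Each cycle stands for one round of coupling and cleavage, assumed
    here to go to completion.
    """
    rounds = []
    remaining = polypeptide
    cycle = 0
    while remaining:
        cycle += 1
        # Steps 1 and 2: couple at the free amino terminus, then cleave
        # the terminal residue from the rest of the chain.
        released, remaining = remaining[0], remaining[1:]
        # Step 3: identify the released derivative; only the 20 standard
        # residues are recognized in this sketch.
        if released not in STANDARD_AMINO_ACIDS:
            raise ValueError(f"cycle {cycle}: unidentified residue {released!r}")
        rounds.append((cycle, released, remaining))
    return rounds


def read_sequence(polypeptide):
    """Step 4: read the sequence off, one residue per cycle."""
    return "".join(residue for _, residue, _ in edman_degradation(polypeptide))


if __name__ == "__main__":
    for cycle, residue, rest in edman_degradation("MKTAY"):
        print(f"cycle {cycle}: released {residue}, remaining chain {rest or '(none)'}")
```

Calling `read_sequence` on a string of valid one-letter codes returns that same string, which mirrors the point of step 4: the residues are recovered in order from the amino terminus, and the chain shrinks by one residue per round.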