Protein Sequencing

Amino Acid Sequence Determination:
Determination of the amino acid sequence of a protein involves the following steps:
1. Identification of the N- and C-terminal amino acid residues
2. Cleavage of any disulfide bonds present
3. Limited cleavage of the peptide into overlapping smaller fragments
4. Purification of the fragments
5. Stepwise cleavage of each fragment into individual amino acid residues.

Identification of the N-Terminal Residue:
  • Determination of the N-terminal residue is carried out by labeling the free, unprotonated α-amino group.
  • Three alternative labeling reagents are used: 2,4-dinitrofluorobenzene (DNFB; Sanger's reagent), dansyl chloride (1-dimethylaminonaphthalene-5-sulfonyl chloride), and phenylisothiocyanate (PITC; Edman's reagent).
  • Under basic conditions, DNFB and dansyl chloride react with free amino groups. 
  • The labeled peptide is hydrolyzed with acid to yield the labeled N-terminal residue and other free amino acids. 
  • The 2,4-dinitrophenyl (DNP) amino acid derivatives are yellow; they are separated by chromatographic methods and identified by comparison with reference DNP-amino acids.
  • DNFB reacts with the ε-amino groups of lysyl residues to yield ε-DNP-lysine after hydrolysis. 
  • N-Terminal lysine produces α,ε-di(DNP)-lysine, whereas an internal lysine produces a derivative with only one dinitrophenyl group.
  • The dansyl amino acid is isolated and identified by chromatographic methods. The dansyl procedure is about 100 times more sensitive than the DNFB method because the dansyl amino acids are fluorescent and therefore detectable in minute quantities.
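The lysine rule above (two DNP groups on an N-terminal lysine, one on an internal lysine) can be sketched in code. This is only an illustration under stated assumptions: residues are written as one-letter codes, the function name `dnp_products` is hypothetical, and the model considers only the α-amino group and lysine ε-amino groups, ignoring any other side-chain reactions.

```python
from typing import List


def dnp_products(peptide: str) -> List[str]:
    """Predict DNP-labeled products after DNFB labeling and acid hydrolysis.

    Simplified model (an assumption of this sketch): only the N-terminal
    alpha-amino group and lysine epsilon-amino groups are labeled.
    """
    if not peptide:
        return []
    products = []
    n_terminal = peptide[0]
    if n_terminal == "K":
        # An N-terminal lysine is labeled on both of its amino groups.
        products.append("alpha,epsilon-di(DNP)-Lys")
    else:
        products.append(f"alpha-DNP-{n_terminal}")
    # Each internal lysine carries a single DNP group on its epsilon-amino group.
    internal_lysines = peptide[1:].count("K")
    products.extend(["epsilon-DNP-Lys"] * internal_lysines)
    return products


print(dnp_products("KGAK"))
print(dnp_products("AGEK"))
```

The first call models a peptide with lysine at the N terminus, the second a peptide whose only lysine is internal.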

Figure: Identification of the N-terminal amino acid using 2,4-dinitrofluorobenzene (Sanger's reagent)
Figure: Identification of the N-terminal amino acid using dansyl chloride
Edman's Reagent:
  • In the Edman procedure, PITC reacts under basic conditions with the free α-amino group to form a phenylthiocarbamoyl peptide.
  • Treatment with anhydrous acid cleaves off the labeled N-terminal residue and leaves the rest of the peptide intact.
  • In this process, the terminal amino acid is cyclized to the corresponding phenylthiohydantoin (PTH) derivative.
  • A major advantage of the Edman procedure is that, on removal of the N-terminal residue, the remaining peptide is left intact and its newly exposed N-terminal amino group is available for another cycle of the procedure.
  • This procedure can thus be used in a stepwise manner to establish the sequence of amino acids in a peptide, starting from the N terminus.
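The cyclic nature of the Edman procedure can be illustrated with a small simulation. This is a sketch, not a model of the chemistry: it assumes one-letter residue codes and perfect yield in every cycle, and the function name `edman_cycles` is hypothetical. Each phenylthiohydantoin derivative is written here as "PTH-X".

```python
from typing import List, Optional


def edman_cycles(peptide: str, cycles: Optional[int] = None) -> List[str]:
    """Return the PTH-amino acids released, in order, one per cycle."""
    remaining = peptide
    released = []
    n_cycles = len(peptide) if cycles is None else min(cycles, len(peptide))
    for _ in range(n_cycles):
        # PITC couples to the free alpha-amino group; anhydrous acid then
        # removes the N-terminal residue as its PTH derivative.
        released.append(f"PTH-{remaining[0]}")
        # The shortened peptide is left intact and enters the next cycle.
        remaining = remaining[1:]
    return released


print(edman_cycles("AGEK"))
print(edman_cycles("AGEK", cycles=2))
```

Reading the released derivatives in order gives the sequence from the N terminus.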
Figure: Determination of the N-terminal residue by the Edman procedure
Identification of the C-Terminal Residue:
  • The C-terminal residue is determined by using either a chemical reagent or the enzyme carboxypeptidase.
  • The chemical reagent hydrazine cleaves the peptide bonds and converts every residue except the C-terminal one into an aminoacyl hydrazide; the C-terminal residue is released as the free amino acid.
  • The C-terminal residue is thus readily identified by chromatographic procedures.
  • The disadvantage of hydrazinolysis is that the whole sample is used to identify just one residue.
  • Carboxypeptidase is an exopeptidase that specifically hydrolyzes the C-terminal peptide bond and releases the C-terminal amino acid.
  • Two problems are linked with its use: the substrate specificity of the enzyme and the continuous action of the enzyme. 
  • The continuous enzyme action may yield the second, third, and additional residues from some chains even before the terminal residues on every chain are quantitatively released. 
  • Thus, it might be difficult to identify which residue is the C terminus. 
  • However, monitoring the order in which amino acids are released over time can often reveal the sequence of several residues at the C terminus.
  • Because of specificity, carboxypeptidase A releases all C-terminal residues except Lys, Arg, and Pro; carboxypeptidase B cleaves C-terminal Arg and Lys residues; and carboxypeptidase C hydrolyzes C-terminal Pro residues. Thus, more than one method may be needed to determine the C-terminal amino acid.
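The specificities listed above can be turned into a small lookup that predicts which residues an enzyme would release from the C terminus. The sketch below reads the rules literally (A: everything except Lys, Arg, and Pro; B: Lys and Arg only; C: Pro only), which is an assumption made for illustration. It also assumes one-letter residue codes and that digestion simply stops at the first residue the enzyme cannot remove. The names `SPECIFICITY` and `released_residues` are hypothetical.

```python
from typing import List

# Residues each carboxypeptidase can remove, read literally from the text.
SPECIFICITY = {
    "A": lambda residue: residue not in "KRP",
    "B": lambda residue: residue in "KR",
    "C": lambda residue: residue == "P",
}


def released_residues(peptide: str, enzyme: str) -> List[str]:
    """List residues released from the C terminus, in order of release."""
    can_cleave = SPECIFICITY[enzyme]
    released = []
    # Work inward from the C terminus until a resistant residue is reached.
    for residue in reversed(peptide):
        if not can_cleave(residue):
            break
        released.append(residue)
    return released


print(released_residues("AGEKGAMRIVF", "A"))
print(released_residues("AGEK", "A"))
print(released_residues("AGEK", "B"))
```

In this model a peptide ending in lysine releases nothing with carboxypeptidase A but releases Lys with carboxypeptidase B, which illustrates why more than one method may be needed.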
Figure: Determination of the C-terminal amino acid residue using hydrazine
Selective Hydrolysis Methods:
  • Disulfide bonds are cleaved before the protein is hydrolyzed into peptides.
  • Disulfide bonds may be cleaved oxidatively, or they may be reduced and alkylated. 
  • Treatment of the native protein with performic acid, an oxidizing agent, breaks disulfide bonds and converts cystine residues to cysteic acid. 
  • Reduction of the disulfide linkage by thiols, such as β-mercaptoethanol, yields reactive sulfhydryl groups.
  • These groups may be stabilized by alkylation with iodoacetate or ethyleneimine to yield the carboxymethyl or aminoethyl derivative, respectively. 
  • Hydrolysis of a protein into peptides can be done by group-specific chemical and enzymatic reagents. 
  • N-Bromosuccinimide and cyanogen bromide cleave proteins at tryptophan and methionine residues, respectively.
  • Trypsin hydrolyzes the peptide linkage on the C-terminal side of lysine and arginine residues.
  • Purification of the hydrolysis products is often the most challenging aspect of sequence determination.
  • Ion exchange chromatography, paper chromatography, and electrophoresis are useful, and reverse-phase high-pressure liquid chromatography is used frequently because of its speed and sensitivity.
  • The purified peptides are analyzed for amino acid composition and terminal residues. Small peptides may be sequenced directly, but large peptides must be further hydrolyzed.
  • Proteases such as chymotrypsin, pepsin, and papain, which are much less specific than trypsin, can be used to cleave the peptides formed on tryptic digestion.
  • The amino acid sequences of the purified peptides may be determined by the sequential Edman procedure. 
  • Another technique is indirect: the amino acid sequence is deduced from the nucleotide sequence of the DNA or RNA fragment that corresponds to the protein.
  • The universal genetic code provides the information for translating a nucleic acid sequence into an amino acid sequence.
  • This method will not correctly identify the amino acid sequences of proteins that undergo post-translational modification, or of proteins derived from eukaryotic genes with intervening sequences that are not translated.
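The indirect approach amounts to reading a coding sequence three bases at a time and looking up each codon in the genetic code. The sketch below builds the standard code table from its conventional compact form (bases ordered T, C, A, G). It assumes the input is already an uninterrupted coding sequence in the correct reading frame, so it shares the limitations noted above: it knows nothing about introns or post-translational modification. The names `CODON_TABLE` and `translate` are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA or RNA coding sequence into one-letter amino acids.

    Translation starts at the first base and stops at the first stop codon
    or at the end of the sequence; a trailing partial codon is ignored.
    """
    sequence = coding_sequence.upper().replace("U", "T")
    protein = []
    for position in range(0, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE[sequence[position:position + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Illustrative sequence encoding Ala-Gly-Glu-Lys followed by a stop codon.
print(translate("GCTGGTGAAAAATAA"))
```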
Figure: Cleavage of a peptide chain at a methionine residue by cyanogen bromide. The methionine residue is converted to a carboxy-terminal homoserine lactone residue.
Table: Some reagents useful for protein hydrolysis and their sites of cleavage.

Cleavage Reagent               Cleavage Site
Trypsin                        Lys or Arg, carboxyl side
Chymotrypsin                   Phe, Trp, or Tyr, carboxyl side
Thermolysin                    Leu, Ile, or Val, amino side
Cyanogen bromide               Met, carboxyl side
2-Nitro-5-thiocyanobenzoate    Cys, amino side
N-Bromosuccinimide             Trp or Tyr, carboxyl side
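The cleavage rules in the table can be applied to a sequence in code. The following is a minimal sketch, assuming one-letter residue codes and ideal, complete cleavage at every listed site. The function name `cleave` and its parameters are hypothetical, and the conversion of methionine to homoserine lactone by cyanogen bromide is not modeled.

```python
from typing import List


def cleave(sequence: str, residues: str, side: str = "carboxyl") -> List[str]:
    """Split a sequence at the given residues.

    side="carboxyl" cuts after each target residue (for example trypsin);
    side="amino" cuts before each target residue (for example thermolysin).
    """
    fragments = []
    start = 0
    for index, residue in enumerate(sequence):
        if residue not in residues:
            continue
        cut = index + 1 if side == "carboxyl" else index
        # Skip cuts that would produce an empty fragment at either end.
        if start < cut < len(sequence):
            fragments.append(sequence[start:cut])
            start = cut
    fragments.append(sequence[start:])
    return fragments


peptide = "AGEKGAMRIVF"  # Ala-Gly-Glu-Lys-Gly-Ala-Met-Arg-Ile-Val-Phe

print(cleave(peptide, "KR"))                 # trypsin: ['AGEK', 'GAMR', 'IVF']
print(cleave(peptide, "M"))                  # cyanogen bromide: ['AGEKGAM', 'RIVF']
print(cleave(peptide, "LIV", side="amino"))  # thermolysin
```

Because the two reagents cut at different positions, the fragments overlap, which is what makes it possible to order them later.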

Peptide Sequence Confirmation:
  • Once the individual peptides have been sequenced, their proper order in the protein is established by identifying the overlapping sequences between peptides obtained by different cleavage procedures.
  • The ultimate confirmation of sequence determination is protein synthesis. Chemical synthesis of peptides and proteins of known amino acid sequence can be accomplished by an automated, solid-phase procedure developed by Merrifield et al.
  • Synthesis begins with the C-terminal amino acid, and each successive residue is added in a stepwise manner.
  • The C-terminal amino acid is covalently bound to the solid phase by a reaction between its carboxyl group and a chloromethyl group linked to a phenyl group of the polystyrene resin.
  • Since the amino group of an amino acid is reactive, it must be protected by a blocking group, such as the tert-butyloxycarbonyl (t-BOC) group, so that it does not react with the chloromethyl groups.
  • The t-BOC group is later removed by acid, enabling the amino group to participate in peptide bond formation.
  • The degradation products of t-BOC, isobutylene and carbon dioxide, are removed as gases.
  • The carboxyl group of a second amino acid, added as a t-BOC derivative, reacts with the amino group of the anchored amino acid in the presence of the condensing agent dicyclohexylcarbodiimide (DCC). DCC, which removes H2O from the two functional groups that form the peptide bond, is converted to dicyclohexylurea.
  • In the next cycle, the t-BOC group of the second amino acid of the solid-phase dipeptide is similarly removed, and a third t-BOC-amino acid is added along with DCC. The stepwise process of peptide synthesis is continued until the desired peptide is formed.
  • The finished peptide can be easily cleaved from the polystyrene resin without affecting the peptide linkages. 
  • Advantages of the solid-phase method are amenability to automation, the almost 100% yield of product for each reaction, the ease of removal of excess reagents and waste products by washing and filtration of resin particles, the lack of need for purification of intermediates, and speed. 
  • Peptides or proteins that have been synthesized by the solid-phase method include ribonuclease, bradykinin, oxytocin, vasopressin, somatostatin, insulin, and the β-chain of hemoglobin. 
  • The sequence analyses for these substances were confirmed by demonstrating that the synthetic products, constructed on the basis of sequence data, had the same biological activities as those of the corresponding natural substances.
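A short calculation shows why a near-quantitative yield at every step matters for stepwise synthesis. The sketch below assumes, as a simplification, that every coupling cycle proceeds with the same yield and that a chain of n residues needs n - 1 couplings after the first residue is anchored. The function name `overall_yield` and the example numbers are hypothetical.

```python
def overall_yield(step_yield: float, n_residues: int) -> float:
    """Fraction of chains carrying the full sequence after stepwise synthesis.

    Assumes n_residues - 1 coupling cycles, each with the same step_yield.
    """
    if n_residues < 1:
        raise ValueError("a peptide needs at least one residue")
    if not 0.0 <= step_yield <= 1.0:
        raise ValueError("step_yield must be a fraction between 0 and 1")
    return step_yield ** (n_residues - 1)


# Compare two hypothetical per-cycle yields for short and long chains.
for step_yield in (0.99, 0.999):
    for n_residues in (10, 100):
        fraction = overall_yield(step_yield, n_residues)
        print(f"step yield {step_yield:.3f}, {n_residues} residues: "
              f"{fraction:.1%} full-length product")
```

Under these assumptions, a small drop in per-cycle yield has little effect on a short peptide but compounds into a large loss for a long chain.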
Peptides obtained from trypsin cleavage:
Ala-Gly-Glu-Lys
Gly-Ala-Met-Arg
Ile-Val-Phe
Peptides obtained from cyanogen bromide cleavage:
Ala-Gly-Glu-Lys-Gly-Ala-homoserine lactone
Arg-Ile-Val-Phe
The complete sequence deduced from the above overlapping peptides:
Ala-Gly-Glu-Lys-Gly-Ala-Met-Arg-Ile-Val-Phe
Figure: Determination of a peptide sequence from overlapping sequences
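The overlap reasoning in this example can be sketched as a brute-force search: try every ordering of the tryptic peptides and keep the orderings whose joined sequence contains every cyanogen bromide fragment. This is an illustration only. It assumes one-letter residue codes, writes the homoserine lactone back as methionine (M), and is practical only for a handful of peptides because the number of orderings grows factorially. The function name `assemble` is hypothetical.

```python
from itertools import permutations
from typing import List


def assemble(tryptic: List[str], cnbr: List[str]) -> List[str]:
    """Return each full sequence consistent with both sets of fragments."""
    solutions = []
    for order in permutations(tryptic):
        candidate = "".join(order)
        # Every cyanogen bromide fragment must appear as a contiguous stretch.
        if all(fragment in candidate for fragment in cnbr):
            if candidate not in solutions:
                solutions.append(candidate)
    return solutions


tryptic_peptides = ["AGEK", "GAMR", "IVF"]
cnbr_peptides = ["AGEKGAM", "RIVF"]  # homoserine lactone written as M

print(assemble(tryptic_peptides, cnbr_peptides))  # ['AGEKGAMRIVF']
```

Only one ordering satisfies both fragment sets, matching the sequence deduced above.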

Fmoc Solid Phase Peptide Synthesis:
  • Solid-phase polypeptide synthesis that uses the t-BOC group to protect the Nα-amino group is a major procedure used by protein chemists. 
  • However, another solid phase procedure is frequently used for the chemical synthesis of peptides and proteins of interest; this is the 9-fluorenylmethoxycarbonyl (Fmoc) procedure. 
  • The Fmoc method for solid-phase peptide synthesis uses the Fmoc group for protecting the Nα-amino group. In contrast to the t-BOC method, which requires acid for removal of the Nα-amino protecting group, the Fmoc group can be removed by a mild base.
  • A piperidine solution in N,N-dimethylformamide or N-methylpyrrolidone is generally used for the removal of the Fmoc group; anhydrous trifluoroacetic acid is used to remove the t-BOC group. 
  • This Fmoc procedure is technically simpler and chemically less complex than the t-BOC procedure. Because the Fmoc protecting group can be removed by base, the linkage of the peptide to the resin support does not have to be stable under acidic conditions as in the t-BOC procedure. 
  • The Fmoc procedure offers more flexible reaction conditions and more reagent options. Because of the milder conditions of peptide synthesis, the Fmoc procedure is widely used for the synthesis of modified polypeptides that are phosphorylated, sulfated, or glycosylated. 
  • Since the Fmoc procedure uses a base-labile group for protection of the Nα-amino group, acid-labile compounds are used to protect the side chains.
  • Side-chain protecting groups are generally based on a t-butyl moiety: t-butyl ethers for Ser, Thr, and Tyr; t-butyl esters for Asp and Glu; and the t-BOC group for His and Lys. In addition, the trityl group is used to protect Cys, Asn, and Gln, and the 4-methoxy-2,3,6-trimethylbenzenesulfonyl or the 2,2,5,7,8-pentamethylchroman-6-sulfonyl group has been used to protect the guanidino group of Arg.
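The side-chain protection scheme listed above is a lookup from residue to protecting group, which can be written as a small table. The sketch below encodes only the assignments given in the text and reports no protecting group for residues the text does not mention. It assumes one-letter residue codes, and the names `SIDE_CHAIN_PROTECTION` and `protection_plan` are hypothetical.

```python
from typing import List, Optional, Tuple

# Acid-labile side-chain protecting groups, as listed in the text.
SIDE_CHAIN_PROTECTION = {
    "S": "t-butyl ether",
    "T": "t-butyl ether",
    "Y": "t-butyl ether",
    "D": "t-butyl ester",
    "E": "t-butyl ester",
    "H": "t-BOC",
    "K": "t-BOC",
    "C": "trityl",
    "N": "trityl",
    "Q": "trityl",
    "R": "4-methoxy-2,3,6-trimethylbenzenesulfonyl or "
         "2,2,5,7,8-pentamethylchroman-6-sulfonyl",
}


def protection_plan(peptide: str) -> List[Tuple[str, Optional[str]]]:
    """Pair each residue with its side-chain protecting group, if any."""
    return [(residue, SIDE_CHAIN_PROTECTION.get(residue)) for residue in peptide]


for residue, group in protection_plan("AGEKGAMRIVF"):
    print(residue, group or "no side-chain protection listed")
```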

Figure: Schematic overview of Fmoc solid-phase peptide synthesis (SPPS), including related impurity formation. Adapted from N.L. Benoiton, Chemistry of Peptide Synthesis. Taylor & Francis, Boca Raton, 2006.
