A fast method to predict protein interaction sites from sequences

J Mol Biol. 2000 Sep 29;302(4):917-26. doi: 10.1006/jmbi.2000.4092.

Abstract

A simple method for predicting residues involved in protein interaction sites is proposed. In the absence of any structural information, the procedure identifies linear stretches of sequence as "receptor-binding domains" (RBDs) by analysing their hydrophobicity distribution. The sequences of two databases of non-homologous interaction sites eliciting various biological activities were tested; 59-80% were detected as RBDs. A statistical analysis of amino acid frequencies was carried out in known interaction sites and in predicted RBDs, the latter predicted from the 80,000 sequences of the Swissprot database. In both cases, arginine is the most frequently occurring residue. The RBD procedure can also detect residues involved in specific interaction sites such as DNA-binding domains (95% detected) and Ca-binding domains (83% detected). We report two recent analyses that go from the prediction of RBDs in sequences to the experimental demonstration of their functional activities. The examples concern a retroviral Gag protein and a penicillin-binding protein. We propose that this method is a quick way to predict protein interaction sites from sequences and is helpful for guiding experiments such as site-specific mutagenesis, two-hybrid systems or the synthesis of inhibitors.
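
The abstract does not state the hydrophobicity scale, window length or selection criterion used to call an RBD, so the following is only a minimal sketch of the general idea (scan a sequence, flag linear stretches by a hydrophobicity measure, then count residue frequencies in the flagged stretches). It is not the authors' procedure. The Kyte-Doolittle scale, the window of 11 residues, the band limits `low` and `high`, and all function names are assumptions chosen for illustration.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values. Assumption: the abstract names no scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(seq, window=11):
    """Mean hydropathy of every full-length window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def flag_stretches(seq, window=11, low=-2.5, high=-0.5):
    """Return half-open (start, end) ranges of merged windows whose mean
    hydropathy lies in [low, high]. The band limits are hypothetical."""
    stretches = []
    for i, mean in enumerate(window_means(seq, window)):
        if low <= mean <= high:
            start, end = i, i + window
            if stretches and start <= stretches[-1][1]:
                # Overlaps or touches the previous stretch: extend it.
                stretches[-1] = (stretches[-1][0], end)
            else:
                stretches.append((start, end))
    return stretches


def residue_frequencies(seq, stretches):
    """Relative frequency of each residue inside the flagged stretches."""
    counts = Counter()
    for start, end in stretches:
        counts.update(seq[start:end])
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()} if total else {}
```

Calling `flag_stretches(sequence)` on a one-letter amino acid string and passing the result to `residue_frequencies` gives a per-residue composition of the flagged regions, which is the kind of tally behind the abstract's observation about arginine. Any real use would need the criterion described in the full paper rather than the placeholder band used here.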

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Amino Acid Substitution / genetics
  • Animals
  • Apolipoproteins E / chemistry
  • Apolipoproteins E / genetics
  • Apolipoproteins E / metabolism
  • Arginine / analysis
  • Arginine / metabolism
  • Bacterial Proteins*
  • Binding Sites
  • Calcium-Binding Proteins / chemistry
  • Calcium-Binding Proteins / genetics
  • Calcium-Binding Proteins / metabolism
  • Carrier Proteins / chemistry
  • Carrier Proteins / genetics
  • Carrier Proteins / metabolism
  • Computational Biology / methods*
  • Databases, Factual
  • Drug Design
  • Gene Products, gag / chemistry
  • Gene Products, gag / genetics
  • Gene Products, gag / metabolism
  • Hexosyltransferases*
  • Humans
  • Models, Molecular
  • Molecular Sequence Data
  • Muramoylpentapeptide Carboxypeptidase / chemistry
  • Muramoylpentapeptide Carboxypeptidase / genetics
  • Muramoylpentapeptide Carboxypeptidase / metabolism
  • Mutation / genetics
  • Penicillin-Binding Proteins
  • Peptidyl Transferases*
  • Protein Binding
  • Protein Structure, Tertiary
  • Proteins / antagonists & inhibitors
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism*
  • Sensitivity and Specificity
  • Time Factors
  • Viral Fusion Proteins / chemistry
  • Viral Fusion Proteins / genetics
  • Viral Fusion Proteins / metabolism

Substances

  • Apolipoproteins E
  • Bacterial Proteins
  • Calcium-Binding Proteins
  • Carrier Proteins
  • Gene Products, gag
  • Penicillin-Binding Proteins
  • Proteins
  • Viral Fusion Proteins
  • Arginine
  • Peptidyl Transferases
  • Hexosyltransferases
  • Muramoylpentapeptide Carboxypeptidase