RNABindR:  software for prediction of RNA binding residues in proteins

 

Michael Terribilini, Jeffry Sander, Jae-Hyung Lee, Peter Zaback, Robert L. Jernigan, Vasant Honavar and Drena Dobbs

Interdepartmental Graduate Program in Bioinformatics and Computational Biology, Department of Genetics, Development, and Cell Biology, and Department of Computer Science, Iowa State University, Ames, IA 50011

 

Our group is interested in understanding the interactions of proteins with other macromolecules. RNABindR is a web-based server for analyzing and predicting RNA binding sites in proteins.

 

RNABindR provides two different functions:

 

1. Calculating RNA binding residues in a protein-RNA complex from the PDB.

2. Predicting RNA binding residues in a protein sequence that is not available in the PDB.

 

Calculating RNA Binding Residues

RNABindR identifies RNA binding residues in protein-RNA complexes using a distance cutoff. If any atom of an amino acid lies within the cutoff distance of any atom in the RNA, that amino acid is classified as RNA binding. To calculate RNA binding residues, users enter the PDB ID of the protein-RNA complex and specify the distance cutoff. By default, RNABindR uses a cutoff of 5 Å. The RNA binding residues are computed from the structure, and the output displays the primary sequence of the protein with a + marking each RNA binding residue and a - marking each non-binding residue. A Jmol viewer displays the structure of the protein-RNA complex, with the protein in blue spacefill, the RNA in green wireframe, and the RNA binding residues in red spacefill. Users can manipulate the image using the Jmol interface.
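The distance-cutoff rule described above can be sketched in a few lines of Python. This is a minimal illustration, not RNABindR's own code: the function names, the atom layout (a residue index paired with x, y, z coordinates), the inclusive comparison at the cutoff, and the toy coordinates are all assumptions made for the example.

```python
import math

def binding_residue_indices(protein_atoms, rna_atoms, cutoff=5.0):
    """Return indices of residues with any atom within `cutoff` Å of any RNA atom.

    protein_atoms: list of (residue_index, (x, y, z)) tuples (hypothetical layout).
    rna_atoms: list of (x, y, z) tuples.
    """
    binding = set()
    for residue_index, protein_xyz in protein_atoms:
        if residue_index in binding:
            continue  # this residue was already flagged by one of its other atoms
        for rna_xyz in rna_atoms:
            # Assumption: "within the cutoff" is treated as inclusive (<=).
            if math.dist(protein_xyz, rna_xyz) <= cutoff:
                binding.add(residue_index)
                break
    return binding

def annotate(sequence, binding):
    """Build the +/- string: '+' for RNA binding residues, '-' otherwise."""
    return "".join("+" if i in binding else "-" for i in range(len(sequence)))

# Toy three-residue "protein" and one-atom "RNA"; coordinates are made up.
sequence = "MKR"
protein_atoms = [
    (0, (0.0, 0.0, 0.0)), (0, (1.0, 0.0, 0.0)),
    (1, (10.0, 0.0, 0.0)),
    (2, (20.0, 0.0, 0.0)), (2, (14.0, 0.0, 0.0)),
]
rna_atoms = [(13.0, 0.0, 0.0)]

binding = binding_residue_indices(protein_atoms, rna_atoms, cutoff=5.0)
print(sequence)
print(annotate(sequence, binding))
```

Comparing every protein atom against every RNA atom is quadratic in the number of atoms; for large complexes a spatial index would be the usual refinement, but the brute-force loop keeps the rule itself easy to read.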

 

Predicting RNA Binding Residues

RNABindR uses a Naive Bayes classifier for all predictions, as described in Terribilini et al., 2006 (see full reference below). RNABindR predictions are based on observed interactions from structures of protein-RNA complexes in the PDB. Although structural information is used to determine the actual interacting residues in a protein for training RNABindR, only the protein sequence is used as input for generating predictions. The reliability of a prediction depends on how well the query protein shares the features captured by the classifier during training, so prediction performance for any particular query sequence cannot be guaranteed.
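To make the idea of a sequence-only Naive Bayes predictor concrete, here is a small sketch. It assumes that each residue is described by a window of neighbouring amino acids; the window size, padding symbol, add-one smoothing, function names, and toy training data are illustrative assumptions and are not taken from RNABindR. See Terribilini et al., 2006 for the actual method.

```python
import math
from collections import Counter

PAD = "*"  # hypothetical symbol for window positions beyond the sequence ends
ALPHABET = "ACDEFGHIKLMNPQRSTVWY" + PAD

def windows(sequence, half_width):
    """One window of 2*half_width+1 symbols centred on each residue."""
    padded = PAD * half_width + sequence + PAD * half_width
    return [padded[i:i + 2 * half_width + 1] for i in range(len(sequence))]

def train(examples, half_width):
    """examples: list of (sequence, labels); labels is a '+'/'-' string."""
    class_counts = Counter()
    feature_counts = {"+": Counter(), "-": Counter()}
    for sequence, labels in examples:
        for window, label in zip(windows(sequence, half_width), labels):
            class_counts[label] += 1
            for position, residue in enumerate(window):
                feature_counts[label][(position, residue)] += 1
    return class_counts, feature_counts

def log_odds(window, model):
    """Log of P(+ | window) / P(- | window) under the naive independence assumption."""
    class_counts, feature_counts = model
    n_pos, n_neg = class_counts["+"], class_counts["-"]
    score = math.log((n_pos + 1) / (n_neg + 1))  # smoothed class prior ratio
    for position, residue in enumerate(window):
        # Add-one smoothing keeps unseen (position, residue) pairs from zeroing a term.
        p_pos = (feature_counts["+"][(position, residue)] + 1) / (n_pos + len(ALPHABET))
        p_neg = (feature_counts["-"][(position, residue)] + 1) / (n_neg + len(ALPHABET))
        score += math.log(p_pos / p_neg)
    return score

def predict(sequence, model, half_width, threshold=0.0):
    """Label a residue '+' when its log-odds score exceeds the threshold."""
    return "".join(
        "+" if log_odds(window, model) > threshold else "-"
        for window in windows(sequence, half_width)
    )

# Toy training set; the labels are made up for illustration only.
HALF_WIDTH = 1
examples = [("KRKAAA", "+++---"), ("AAAKRK", "---+++")]
model = train(examples, HALF_WIDTH)
print(predict("AAKRKAA", model, HALF_WIDTH))
```

Raising or lowering `threshold` changes how many residues are labelled as binding, which is the knob a classifier of this kind exposes for trading confidence against coverage.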

 

The only input required for predictions is a protein sequence. The sequence can be in any format, although FASTA format is preferred. If an exact match to the query sequence is found in a protein-RNA complex from the PDB, the prediction method is not run; instead, RNABindR returns the actual RNA binding residues and a Jmol viewer highlighting the RNA binding sites in the protein-RNA complex. If an exact match is not found, RNABindR returns three sets of predictions: the optimal prediction, the high specificity prediction, and the high sensitivity prediction. The optimal prediction uses the classification threshold that maximized the correlation coefficient in leave-one-out cross-validation experiments on our training set. The high specificity prediction returns fewer predicted RNA binding residues, with higher confidence in each prediction. The high sensitivity prediction captures more of the actual RNA binding residues at the expense of more false positives.
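Choosing a classification threshold that maximizes a correlation coefficient can be sketched as follows. The sketch assumes that the correlation coefficient is the Matthews correlation coefficient and that per-residue scores from held-out predictions are already available; the function names, candidate thresholds, and toy scores are hypothetical.

```python
import math

def confusion(scores, labels, threshold):
    """Count true/false positives and negatives when predicting score > threshold."""
    tp = fp = tn = fn = 0
    for score, is_binding in zip(scores, labels):
        predicted = score > threshold
        if predicted and is_binding:
            tp += 1
        elif predicted and not is_binding:
            fp += 1
        elif not predicted and is_binding:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0

def best_threshold(scores, labels, candidates):
    """Return the candidate threshold with the highest MCC."""
    return max(candidates, key=lambda t: mcc(*confusion(scores, labels, t)))

# Toy held-out scores and true labels (True = RNA binding); values are made up.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]
candidates = [0.2, 0.5, 0.7]

optimal = best_threshold(scores, labels, candidates)
print(optimal)
```

On this toy data, the lowest candidate threshold recovers every true binding residue but admits two false positives, while the highest candidate makes fewer predictions with no false positives, mirroring the trade-off between the high sensitivity and high specificity settings.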

 

RNABindR continues to evolve, and we encourage you to check back as the software improves. Once you have tried RNABindR, we welcome your feedback on how to improve the program and its ease of use. For questions or suggestions about this website and RNABindR, please contact: terrible@iastate.edu

 

Please cite the following in any work that uses RNABindR:

 

Terribilini M, Lee JH, Yan C, Jernigan RL, Honavar V, Dobbs D. Prediction of RNA-binding sites in proteins from amino acid sequence. RNA 2006 12:1450-1462


 

Terribilini M, Sander JD, Lee JH, Zaback P, Jernigan RL, Honavar V, Dobbs D. RNABindR: a server for analyzing and predicting RNA-binding sites in proteins. Nucleic Acids Research, Advance Access published May 5, 2007. doi:10.1093/nar/gkm294
