RNABindR: software for prediction of RNA binding residues in proteins
Interdepartmental Graduate Program in Bioinformatics and Computational Biology,
Department of Genetics, Development, and Cell Biology, and Department of
Computer Science, Iowa State University, Ames, IA 50011
Our group is interested in understanding
the interactions of proteins with other macromolecules. RNABindR is a web-based
server for analyzing and predicting RNA binding sites in proteins.
RNABindR provides two functions:
1. Calculating RNA binding residues in a protein-RNA complex from the PDB.
2. Predicting RNA binding residues in a protein sequence that is not available in the PDB.
Calculating RNA Binding Residues
RNABindR calculates RNA binding residues
in protein-RNA complexes based on a distance cutoff. If any atom in an amino acid is within the
distance cutoff of any atom in the RNA, that amino acid is determined to be RNA
binding. To calculate RNA binding
residues, users enter the PDB ID of the protein-RNA complex and specify
the distance cutoff to use. By default,
RNABindR uses a distance cutoff of 5 Ångstroms. The RNA binding residues are computed from
the structure, and the output displays the primary sequence of the protein with
a “+” indicating RNA binding residues and a “-” indicating non-binding
residues. A Jmol viewer is used to display the structure of
the protein-RNA complex, with the protein displayed in blue spacefill, the RNA
in green wireframe, and the RNA binding residues in the protein displayed in
red spacefill. Users can manipulate the
image using the Jmol interface.
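The distance-cutoff rule described above can be illustrated with a minimal Python sketch. This is not RNABindR's own code: the function names `binding_residues` and `annotate`, the input layout (residue id plus coordinate tuples), and the choice to treat a distance exactly equal to the cutoff as "within" it are all assumptions made for illustration.

```python
import math

def binding_residues(protein_atoms, rna_atoms, cutoff=5.0):
    """Return the set of residue ids that have any atom within `cutoff`
    Angstroms of any RNA atom.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples (assumed layout)
    rna_atoms:     iterable of (x, y, z) tuples (assumed layout)
    """
    rna_atoms = list(rna_atoms)
    binding = set()
    for residue_id, coord in protein_atoms:
        if residue_id in binding:
            continue  # one close atom is enough to mark the residue
        # Assumption: "within the cutoff" includes a distance equal to it.
        if any(math.dist(coord, rna) <= cutoff for rna in rna_atoms):
            binding.add(residue_id)
    return binding

def annotate(residue_ids, binding):
    """Build the '+'/'-' string shown under the primary sequence."""
    return "".join("+" if r in binding else "-" for r in residue_ids)

# Toy coordinates, not taken from any PDB entry.
protein = [(1, (0, 0, 0)), (1, (1, 0, 0)), (2, (10, 0, 0)), (3, (20, 0, 0))]
rna = [(13, 0, 0)]
labels = annotate([1, 2, 3], binding_residues(protein, rna))
```

In this toy input only residue 2 has an atom within 5 Ångstroms of the RNA atom, so it is the only residue marked “+”. A real implementation would first parse atom records from the PDB file, which is omitted here.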
Predicting RNA Binding Residues
RNABindR uses a Naive Bayes classifier
for all predictions as described in Terribilini et al., 2006 (see full
reference below). RNABindR predictions
are based on observed interactions from structures of protein-RNA complexes in
the PDB. Although structural information is used to determine the actual
interacting residues in a protein for training RNABindR, only the protein
sequence is used as input by RNABindR for generating predictions. Because the
reliability of RNABindR predictions for any particular protein depends on the
extent to which the query protein shares features that are "captured"
by the classifier during training, prediction performance for any particular
query sequence cannot be guaranteed.
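To make the idea of a sequence-only Naive Bayes predictor concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the classifier from Terribilini et al., 2006: the sliding-window features, the window half-width, the padding symbol `X`, the add-one smoothing, and the class name `WindowNaiveBayes` are all hypothetical choices. Training labels use the same “+”/“-” strings that the structure-based calculation produces.

```python
import math
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"  # hypothetical symbol for window positions beyond the sequence ends

def windows(sequence, half_width):
    """One window of 2*half_width + 1 residues centred on each position."""
    padded = PAD * half_width + sequence + PAD * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(sequence))]

class WindowNaiveBayes:
    """Naive Bayes over (window position, amino acid) features."""

    def __init__(self, half_width=2):
        self.half_width = half_width
        self.class_counts = {"+": 0, "-": 0}
        self.feature_counts = {"+": defaultdict(int), "-": defaultdict(int)}

    def fit(self, sequences, label_strings):
        # label_strings hold one '+' or '-' per residue, aligned to sequences.
        for seq, labels in zip(sequences, label_strings):
            for window, y in zip(windows(seq, self.half_width), labels):
                self.class_counts[y] += 1
                for pos, aa in enumerate(window):
                    self.feature_counts[y][(pos, aa)] += 1

    def log_odds(self, window):
        """log P(+ | window) - log P(- | window), with add-one smoothing."""
        total = sum(self.class_counts.values())
        alphabet = len(AMINO_ACIDS) + 1  # 20 amino acids plus the pad symbol
        score = 0.0
        for sign, y in ((1, "+"), (-1, "-")):
            n = self.class_counts[y]
            log_p = math.log((n + 1) / (total + 2))
            for pos, aa in enumerate(window):
                count = self.feature_counts[y].get((pos, aa), 0)
                log_p += math.log((count + 1) / (n + alphabet))
            score += sign * log_p
        return score

    def predict(self, sequence, threshold=0.0):
        """Label a residue '+' when its log-odds score exceeds the threshold."""
        return "".join(
            "+" if self.log_odds(w) > threshold else "-"
            for w in windows(sequence, self.half_width)
        )

# Toy training data, invented for illustration only.
model = WindowNaiveBayes(half_width=0)
model.fit(["KKKAAA"], ["+++---"])
toy_prediction = model.predict("KAK")
```

Because the model only sees sequence windows, a query whose residues and contexts were rare in training gets scores driven mostly by smoothing, which is one way to picture why performance on a particular query cannot be guaranteed.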
The only input required for predictions
is a protein sequence. The sequence can
be in any format, although FASTA format is preferred. If an exact match to the query sequence is
found in a protein-RNA complex from the PDB, the prediction method is not run; instead, RNABindR returns the
actual RNA binding residues and a Jmol viewer highlighting the RNA binding
sites in the protein-RNA complex. If an
exact match is not found, RNABindR returns three sets of predictions: the “optimal” prediction, the “high
specificity” prediction, and the “high sensitivity” prediction. The “optimal” prediction uses
the classification threshold that maximized the correlation coefficient in
leave-one-out cross-validation experiments on our training set. The “high specificity” prediction returns fewer predicted RNA binding residues with higher
confidence in the predictions. The “high sensitivity” prediction captures more of the actual
RNA binding residues at the expense of predicting more false positives.
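The relationship between the classification threshold and the three prediction sets can be sketched in Python. The sketch assumes that "correlation coefficient" means the Matthews correlation coefficient for binary predictions, and that each residue receives a numeric score that is compared against a threshold; the function names, candidate thresholds, and toy scores are hypothetical and are not RNABindR's actual values.

```python
import math

def confusion(scores, labels, threshold):
    """Count (tp, fp, tn, fn) when residues scoring above threshold are '+'."""
    tp = fp = tn = fn = 0
    for score, is_binding in zip(scores, labels):
        predicted = score > threshold
        if predicted and is_binding:
            tp += 1
        elif predicted:
            fp += 1
        elif is_binding:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def best_threshold(scores, labels, candidates):
    """Pick the candidate threshold with the highest MCC (first one on ties)."""
    return max(candidates, key=lambda t: mcc(*confusion(scores, labels, t)))

# Toy held-out scores and true labels, invented for illustration.
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False]
optimal = best_threshold(scores, labels, [0.2, 0.5, 0.7])
```

On this toy data, raising the threshold from 0.2 to 0.5 removes the one false positive but also misses one true binding residue, which mirrors the trade-off between the “high sensitivity” and “high specificity” predictions. In a leave-one-out setting the scores for each protein would come from a model trained without that protein; that loop is omitted here.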
RNABindR continues to change, and we
encourage you to check back as the software improves. We also
welcome feedback on the program and its ease of use. For questions or suggestions pertaining to
this website and RNABindR, please contact:
terrible@iastate.edu
Please cite the following in any work that uses RNABindR:
Terribilini M, Lee JH, Yan C, Jernigan RL, Honavar V, Dobbs D. Prediction
of RNA-binding sites in proteins from amino acid sequence. RNA 2006 12:1450-1462
Terribilini M, Sander JD, Lee JH, Zaback P, Jernigan RL, Honavar V, Dobbs D. RNABindR: a
server for analyzing and predicting RNA-binding sites in proteins. Nucleic
Acids Research Advance Access published on