Instructions for using ZiFiT (V3.0)
OPEN (Oligomerized Pool Engineering)
- Input
- Input
- Scoring
A zinc finger (ZF) is a protein motif consisting of two beta
strands and an alpha helix, coordinated by a zinc ion via cysteine and
histidine residues. Residues -1 to 6 of the alpha helix recognize
specific DNA triplet sequences, primarily by forming base-specific contacts in
the major groove of the double-stranded target DNA. ZFs
are often referred to by the "recognition" residues in the
alpha helix, listed in the N- to C-terminal direction; the other residues in the
module are referred to as the backbone. Several module sets have been developed; we
refer to these using the name of the laboratory or company
in which they were developed: Barbas Modules, Sangamo Modules, and ToolGen Modules
(discussed below).
As illustrated below, ZFs bind target DNA sites with the amino acids of the
recognition alpha helix (shown in the top line from the amino (N) to the
carboxyl (C) terminus) contacting consecutive nucleotides in the DNA in the 3'
to 5' direction. This can be confusing because the DNA target site is always
written in the 5' to 3' direction, whereas amino acid sequences are written
from the N to the C terminus. Therefore, in the ZF shown below, N-QSSNLVR-C,
the C-terminal R actually recognizes the G at the 5' end of the triplet
5'-GAA-3'.
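The opposite reading directions can be made concrete with a short sketch. It assumes the commonly described canonical contact pattern, in which helix positions -1, 3, and 6 contact the 3', middle, and 5' bases of the triplet, respectively; this pattern is an assumption that the text above does not spell out, and the function name helix_contacts is hypothetical.

```python
def helix_contacts(helix, triplet):
    """Pair key recognition-helix residues with the bases of a DNA triplet.

    helix   : 7 residues, written N to C (helix positions -1, 1, 2, 3, 4, 5, 6)
    triplet : 3 bases, written 5' to 3'

    Assumption (canonical model, not taken from the ZiFiT text):
    position -1 contacts the 3' base, position 3 the middle base,
    and position 6 the 5' base.
    """
    if len(helix) != 7 or len(triplet) != 3:
        raise ValueError("expected a 7-residue helix and a 3-base triplet")
    positions = [-1, 1, 2, 3, 4, 5, 6]
    residue_at = dict(zip(positions, helix))
    # The triplet index runs 5'->3', so the helix (N->C) reads it backwards.
    return {
        -1: (residue_at[-1], triplet[2]),
        3: (residue_at[3], triplet[1]),
        6: (residue_at[6], triplet[0]),
    }


contacts = helix_contacts("QSSNLVR", "GAA")
for position, (residue, base) in contacts.items():
    print(f"helix position {position:>2}: {residue} -> {base}")
```

For the helix and triplet used in the text, the sketch pairs the C-terminal R (position 6) with the 5' G, as described above.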

Multiple ZFs can be linked together into a multi-finger array that recognizes
a longer, and therefore more likely unique, sequence in double-stranded
genomic DNA. Multi-finger arrays can be fused to various protein domains in
order to target transcriptional activation domains, repressors, or nucleases to
specific locations in the genome.

Zinc finger nucleases (ZFNs)
consist of two zinc finger arrays, each fused to a single monomer of a dsDNA
nuclease. The nuclease is active only as
a dimer, and dimerization occurs when both zinc finger arrays bind their target
sequences. Because dimerization is required,
ZFNs cleave double-stranded DNA in a sequence-specific manner. The two
arrays are typically designed to recognize and bind sites that are separated by
a Spacer of several base pairs (usually 5 or 6 bp). We use Right Array to
refer to the array that binds the sequence at the 3' end of the DNA target
sequence (top strand in diagram shown below).
Left Array refers to the array that binds the reverse complement of
this sequence (bottom strand below).
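The layout of a ZFN target can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ZiFiT's own code: it assumes that the Left Array's half-site lies at the 5' end of the top strand and is read as its reverse complement, that the Right Array's half-site lies at the 3' end of the top strand, and that the input is upper-case A/C/G/T. The function names and parameters are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def zfn_half_sites(top, start, left_fingers=3, right_fingers=3, spacer=6):
    """Split a top-strand window into left half-site, spacer, and right half-site.

    Each finger binds three nucleotides, so a half-site is 3 * fingers long.
    Assumed layout on the top strand (5'->3'): [left][spacer][right].
    The left half-site is returned as the bottom-strand sequence (5'->3').
    """
    left_len = 3 * left_fingers
    right_len = 3 * right_fingers
    right_start = start + left_len + spacer
    end = right_start + right_len
    if start < 0 or end > len(top):
        raise ValueError("window does not fit inside the sequence")
    return {
        "left": reverse_complement(top[start:start + left_len]),
        "spacer": top[start + left_len:right_start],
        "right": top[right_start:end],
    }


sites = zfn_half_sites("TTCTCATCCATTTTCGCTGTTGAC", start=0)
print(sites)
```

The input here is the top strand from the GNN score example later in this document, without its lower-case flanking bases. Under the assumed layout it splits into a 9 bp left half-site, a 6 bp spacer, and a 9 bp right half-site.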

The OPEN
approach to ZFP production uses pools of zinc fingers, each of which consists
of numerous unique helix solutions that recognize a particular DNA triplet at
one of three positions in a three-finger protein. Pools are recombined to
generate hundreds of thousands of unique solutions for each target site. Optimal
solutions are identified through genetic selections in which binding of the
protein to the target site upstream of a pair of selectable marker genes
activates their expression and confers a growth advantage.
Sequence: The DNA sequence in which a user wishes to
identify target sites for ZF arrays is pasted into the Sequence Window.
Sequences should be in FASTA format. White space and numbers will be
ignored.
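As an illustration of what "white space and numbers will be ignored" means for pasted input, the sketch below cleans a FASTA-style entry before any site search. It is not ZiFiT's parser; the function name clean_sequence is hypothetical, and converting to upper case is an added assumption.

```python
def clean_sequence(text):
    """Reduce pasted FASTA-style text to a bare nucleotide string.

    Header lines (starting with '>') are dropped, then every whitespace
    character and digit is removed. Upper-casing is an assumption made
    here for convenience; it is not described in the ZiFiT instructions.
    """
    body = "".join(
        line for line in text.splitlines() if not line.startswith(">")
    )
    kept = (ch for ch in body if not (ch.isspace() or ch.isdigit()))
    return "".join(kept).upper()


pasted = ">example\n  1 gttc tcat\n 11 ccat\n"
print(clean_sequence(pasted))
```

Position numbers and spacing copied from a sequence viewer therefore do not need to be removed by hand.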
Spacer size: This parameter is only available when
designing zinc finger nucleases. The user specifies the number of
nucleotides between the ZF arrays. The appropriate distance is determined
by the length of the amino acid linker between the ZF array and the associated
nuclease domain. For standard linkers, the spacer is five or six bp,
which provides the proper spacing between the zinc finger nuclease monomers so
that they can interact to create a functional enzyme.
Triplets
(Position 1, Position 2, Position 3): These parameters allow the user to select which
module pools to consider for target site identification. By default, all pools
currently available from the Zinc Finger Consortium are included.
Advanced Options
Triplet Composition: The user can specify
the composition of nucleotide triplets desired in target DNA sequences.
For example, if ANN (i.e., A followed by any 2 nucleotides) and GNN triplets are
desired, the user can choose to exclude CNN and TNN triplets from consideration
by setting the max value for these
triplets to zero. As another example, the user can specify search
parameters that require at least 3 GNN triplets in the target.
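The effect of the Triplet Composition settings can be sketched as a simple filter on candidate target sites. The function names and the min_counts/max_counts parameters are hypothetical and only mirror the minimum and maximum values described above; ZiFiT's actual implementation is not shown in this document.

```python
from collections import Counter


def triplet_classes(site):
    """Count the triplets of a site by their first base (ANN, CNN, GNN, TNN)."""
    if len(site) % 3 != 0:
        raise ValueError("site length must be a multiple of 3")
    return Counter(site[i] + "NN" for i in range(0, len(site), 3))


def passes_composition(site, min_counts=None, max_counts=None):
    """Return True if the site satisfies every minimum and maximum count."""
    counts = triplet_classes(site)
    for triplet_class, minimum in (min_counts or {}).items():
        if counts[triplet_class] < minimum:
            return False
    for triplet_class, maximum in (max_counts or {}).items():
        if counts[triplet_class] > maximum:
            return False
    return True


# Exclude CNN and TNN triplets by setting their maximum to zero.
only_ann_gnn = {"CNN": 0, "TNN": 0}
print(passes_composition("GGAAGAGAA", max_counts=only_ann_gnn))
print(passes_composition("GGATGAGAA", max_counts=only_ann_gnn))
# Require at least three GNN triplets.
print(passes_composition("GCTGTTGAC", min_counts={"GNN": 3}))
```

The second site is rejected because its middle triplet, TGA, belongs to the excluded TNN class.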
This ZFP design approach involves
joining together individual zinc finger modules with pre-characterized
specificities to generate a single zinc finger protein for a target site.
Although modular assembly is simple to perform, its success rates are lower
than those of OPEN.
Sequence: The DNA sequence in which a user wishes to
identify target sites for ZF arrays is pasted into the Sequence Window.
Sequences should be in FASTA format. White space and numbers will be
ignored.
Array sizes: If designing zinc finger nucleases, the user can
choose the number of ZFs (or "fingers") in
the left and right arrays. Note that each ZF binds three DNA
nucleotides. When designing ZF arrays, the user can choose from 3 to 8 ZFs.
Spacer size: This parameter is only available when
designing zinc finger nucleases. The user specifies the number of
nucleotides between the ZF arrays. The appropriate distance is determined
by the length of the amino acid linker between the ZF array and the associated
nuclease domain. For standard linkers, the spacer is five or six bp,
which provides the proper spacing between the zinc finger nuclease monomers so
that they can interact to create a functional enzyme.
Module
sets (Modular Design):
Three different
module sets are available, and users may choose ZFs
from any combination of module sets. Some ZFs,
such as those from Sangamo, were designed for specific positions within a
three-finger array. Those modules
designed for specific positions will be labeled in the output. The
backbone for each module is listed below.
Barbas Modules: Developed via phage display by the Barbas lab
at the Scripps Research Institute (Segal, 1999), Barbas modules are available
for all GNN triplets, most ANN and CNN triplets, and several TNN
triplets. All Barbas modules were developed and tested in the middle position of the 3-finger mouse
transcription factor Zif268. Barbas modules are assumed to have complete
positional independence (i.e. context
independence). The Barbas
modules available from the ZF Consortium are in the SP1C backbone.
Sangamo Modules: Sangamo modules were developed at Sangamo BioSciences Inc. and are currently available for GNN
triplets. Sangamo modules were developed considering the position of each
module within a 3-finger array. Each of the three positions typically has
a distinct (although often similar) finger developed for a given triplet at
that position. For this reason, when the user chooses to use only Sangamo
modules, ZiFiT restricts the user to three-finger arrays. The Sangamo
modules available from the ZF Consortium are in the SP1 backbone.
ToolGen Modules: ToolGen modules were selected and tested by
ToolGen, Inc. and are available for a variety of nucleotide triplets.
ToolGen modules take the view that nature did it best: they are based on ZFs encoded in the human genome. As a result, the
ToolGen modules available from the ZF Consortium exist in diverse (native)
backbones.
Using
multiple module sets: Choosing this option allows selection of ZFs from those available in all 3 sets: Barbas, Sangamo,
and ToolGen Modules. Once again, in the
case of the Sangamo modules, those modules designed for specific positions will
be labeled.
Advanced Options
Triplet Composition:
The
user can specify the composition of nucleotide triplets desired in target DNA
sequences. For example, if ANN (i.e., A followed by any 2 nucleotides)
and GNN triplets are desired, the user can choose to exclude CNN and TNN
triplets from consideration by setting the max
value for these triplets to zero. As another example, the user can
specify search parameters that require at least 3 GNN triplets in the target.
Ignore aspartate overlap: Aspartate (Asp, D) in position +2 of the alpha helix contacts a C or A
preceding and adjacent to the triplet on the opposite strand. However, in
combination with an arginine (Arg, R) at position -1,
the aspartate at +2 provides an extremely stable
interaction for binding G in the third position of the triplet. The default option (unchecked) filters
out all sequences in which interstrand interactions
cannot occur, i.e., there is no appropriate binding partner on the opposite
strand. (Note: The user should exercise caution when designing ZF
arrays at the ends of the input sequence. By default, end-sequences requiring
an adjacent base for interstrand interactions are
thrown out.)
GNN scoring (Modular Assembly): GNN scoring is based on an extensive study by the
Zinc Finger Consortium in which 168 three-module ZF arrays were modularly
assembled and tested for ‘in vivo’ functionality (Ramirez et al. 2008, Nature
Methods, 5:374). The 168 three-module proteins were designed to bind 104
diverse target sites varying in GNN, ANN, CNN, and TNN subsite composition
(alternate ZF modules for a given triplet were tested when available).
Success of individual arrays was heavily dependent upon GNN
composition. Arrays with three GNNs had a
59% success rate, which declined as the number of GNNs declined (two GNNs: 29%;
one GNN: 12%; no GNNs: 0%). ZiFiT has been redesigned to provide users with a
score that estimates the success rate for a modularly designed three-finger
array based on these results. In the case of a zinc finger nuclease,
ZiFiT scores the arrays individually. The probability that the two arrays
will function as a ZFN can be calculated by multiplying the scores obtained
from the individual zinc finger arrays.
Example GNN score:
257 gTTCTCATCCATTTTCGCTGTTGACg 283    Right Score: 0.59
257 cAAGAGTAGGTAAAAGCGATAAGTGc 283    Left Score: 0.29
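The scoring rule described above can be sketched as a lookup on the number of GNN triplets, followed by a product for a ZFN pair. The success rates come from the text; the function names are hypothetical, and the two half-sites are read off the top strand of the example under the assumption that it consists of a 9 bp left half-site, a 6 bp spacer, and a 9 bp right half-site between the lower-case flanking bases.

```python
# Success rates for three-finger modular arrays, keyed by number of GNN
# triplets (values quoted in the text from Ramirez et al. 2008).
GNN_SUCCESS_RATE = {3: 0.59, 2: 0.29, 1: 0.12, 0: 0.0}


def gnn_score(site):
    """Score a 9 bp three-finger site by counting triplets that start with G."""
    if len(site) != 9:
        raise ValueError("expected a 9 bp site for a three-finger array")
    gnn_count = sum(1 for i in range(0, 9, 3) if site[i] == "G")
    return GNN_SUCCESS_RATE[gnn_count]


def zfn_score(left_site, right_site):
    """Estimate ZFN success as the product of the two array scores."""
    return gnn_score(left_site) * gnn_score(right_site)


# Assumed half-sites from the example: the right site is taken directly from
# the top strand, the left site is the reverse complement of TTCTCATCC.
right_site = "GCTGTTGAC"  # GCT GTT GAC: three GNN triplets
left_site = "GGATGAGAA"   # GGA TGA GAA: two GNN triplets

print(gnn_score(right_site), gnn_score(left_site))
print(round(zfn_score(left_site, right_site), 4))
```

Under these assumptions the sketch gives the same Right and Left scores as the example (0.59 and 0.29), and their product, about 0.17, is the estimated probability that the pair functions as a ZFN.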
Interface with ZiFDB
To enhance its utility, ZiFiT is interfaced directly
with ZiFDB, a web-accessible database of zinc fingers and
engineered zinc finger arrays. Target
Site hyperlinks within the ZiFiT output directly query ZiFDB
to determine whether any previously constructed arrays exist that bind to
completely or partially matched target sequences.
In addition, ZiFiT users can query ZiFDB for
finger information for a specific triplet subsite by clicking on the
triplet. Thus, ZiFiT and ZiFDB work together to aid in the design of zinc
finger arrays (ZFAs).