Instructions for using ZiFiT (V3.0)

 

 

Zinc Fingers

 

OPEN (Oligomerized Pool Engineering)

        - Input

 

Modular Assembly

        - Input

        - Scoring

 

 

 

Zinc Fingers (ZFs):

A ZF is a protein motif consisting of two beta strands and an alpha helix, coordinated by a zinc ion via cysteine and histidine residues.  Residues -1 to 6 of the alpha helix recognize specific DNA triplet sequences, primarily by forming base-specific contacts in the major groove of the double-stranded target DNA.  ZFs are often referred to by the "recognition" residues in the alpha helix, listed in the N- to C-terminal direction; the other residues in the module are referred to as the backbone.  Several module sets have been developed; we refer to these using the name of the laboratory or company in which they were developed: Barbas Modules, Sangamo Modules, and ToolGen Modules (discussed below).

 

As illustrated below, ZFs bind target DNA sites with amino acids of the recognition alpha helix (shown in the top line from the amino (N) to carboxyl (C) terminus) contacting consecutive nucleotides in DNA in the 3' to 5' direction.  This can be confusing because the DNA target site is always written in the 5' to 3' direction, whereas amino acid sequences are written from N to C terminus.  Therefore, in the ZF shown below, N-QSSNLVR-C, the R (the C-terminal residue of the helix) actually recognizes the G at the 5' end of the triplet 5'-GAA-3'.
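The antiparallel reading described above can be sketched in a few lines of Python.  This is an illustration only, not ZiFiT code: it assumes the simplified canonical contact model in which helix positions -1, 3 and 6 face the 3', middle and 5' bases of the triplet, and the function name helix_contacts is hypothetical.

```python
# Helix positions of the seven recognition residues, N- to C-terminal.
HELIX_POSITIONS = (-1, 1, 2, 3, 4, 5, 6)


def helix_contacts(helix, triplet):
    """Pair each base of a 5'->3' triplet with the helix residue facing it.

    Assumes the simplified canonical model: position 6 faces the 5' base,
    position 3 the middle base, and position -1 the 3' base.
    """
    if len(helix) != 7 or len(triplet) != 3:
        raise ValueError("expected a 7-residue helix and a 3-nt triplet")
    residue_at = dict(zip(HELIX_POSITIONS, helix))
    # The triplet is written 5'->3', but the helix reads it 3'->5',
    # so the last helix residue pairs with the first base.
    return {
        "five_prime": (triplet[0], residue_at[6]),
        "middle": (triplet[1], residue_at[3]),
        "three_prime": (triplet[2], residue_at[-1]),
    }


contacts = helix_contacts("QSSNLVR", "GAA")
for site, (base, residue) in contacts.items():
    print(site, base, residue)
```

For N-QSSNLVR-C and 5'-GAA-3', the sketch pairs R with the 5' G, as stated in the text.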

 

[Figure: recognition helix N-QSSNLVR-C aligned with its target triplet 5'-GAA-3']

 

 

Zinc Finger Arrays:

Multiple ZFs can be linked together to recognize a specific and preferably unique sequence in double-stranded genomic DNA.  The resulting multi-finger array recognizes a longer (and therefore more likely unique) target sequence than any single finger.  Multi-finger arrays can be fused to various protein domains and used to target transcriptional activation domains, repressors, or nucleases to specific locations in the genome.

 

 

Zinc Finger Nucleases (ZFNs):

ZFNs consist of two zinc finger arrays, each fused to a monomer of a double-stranded DNA nuclease.  The nuclease is active only as a dimer, and dimerization occurs when both zinc finger arrays bind their target sequences.  Because dimerization is required, ZFNs cleave double-stranded DNA in a sequence-specific manner.  The two multi-finger arrays are typically designed to bind sites that are separated by a Spacer of several base pairs (usually 5 or 6 bp).  We use Right Array to refer to the array that binds the sequence at the 3' end of the DNA target sequence (top strand in the diagram shown below).  Left Array refers to the array that binds the reverse complement of the sequence at the 5' end of the target (bottom strand below).
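The Left Array / Spacer / Right Array layout can be made concrete with a short Python sketch.  It is not ZiFiT's implementation: the function names are hypothetical, the input is assumed to be an upper-case top strand containing only A, C, G and T, and the defaults (three fingers per array, 6 bp spacer) are just one common configuration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def zfn_half_sites(top_strand, left_fingers=3, right_fingers=3, spacer=6):
    """List (start, left_site, right_site) for every window on the top strand.

    Each finger binds 3 nt.  The right site is read directly from the top
    strand; the left site is the reverse complement of the 5' portion of the
    window, i.e. the sequence the Left Array sees on the bottom strand.
    """
    left_len, right_len = 3 * left_fingers, 3 * right_fingers
    site_len = left_len + spacer + right_len
    sites = []
    for start in range(len(top_strand) - site_len + 1):
        window = top_strand[start:start + site_len]
        left_site = reverse_complement(window[:left_len])
        right_site = window[left_len + spacer:]
        sites.append((start, left_site, right_site))
    return sites


sites = zfn_half_sites("TTCTCATCCATTTTCGCTGTTGAC")
print(sites)
```

Both half-sites are reported 5' to 3' as seen by their respective arrays, so each can be split into triplets in the same way.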

 

 

  

 

 

OPEN (Oligomerized Pool ENgineering): 

This approach to zinc finger protein (ZFP) production uses pools of zinc fingers, each of which consists of numerous unique helix solutions that recognize a particular DNA triplet at one of three positions in a three-finger protein.  Pools are recombined to generate hundreds of thousands of unique solutions for each target site.  Optimal solutions are identified through genetic selections in which binding of the protein to the target site upstream of a pair of selectable marker genes activates their expression and confers a growth advantage.

 

Input

 

Sequence:  The DNA sequence in which a user wishes to identify target sites for ZF arrays is pasted into the Sequence Window.  Sequences should be in FASTA format.  White space and numbers will be ignored.
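As a rough illustration of what "white space and numbers will be ignored" means for pasted input, the following Python sketch normalizes a FASTA-style record.  The function name clean_sequence is hypothetical, and converting to upper case is an added assumption rather than documented ZiFiT behavior.

```python
def clean_sequence(text):
    """Strip FASTA header lines, whitespace and digits from pasted input.

    Upper-casing the result is an assumption made for convenience here.
    """
    body = "".join(
        line for line in text.splitlines() if not line.startswith(">")
    )
    kept = (ch for ch in body if not ch.isspace() and not ch.isdigit())
    return "".join(kept).upper()


pasted = ">example\n   1 acgtacgtac gtacgtacgt\n  21 GGCC\n"
print(clean_sequence(pasted))
```

Position numbers and spacing copied from sequence databases are dropped, leaving only the nucleotides.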

 

Spacer size:  This parameter is only available when designing zinc finger nucleases.  The user specifies the number of nucleotides between the ZF arrays.  The appropriate distance is determined by the length of the amino acid linker between the ZF array and the associated nuclease domain.  For standard linkers, the spacer is five or six bp, which provides the proper spacing between the zinc finger nuclease monomers so that they can interact to create a functional enzyme.

 

Triplets (Position 1, Position 2, Position 3):  This parameter allows the user to select which module pools to consider for target site identification.  Default parameters include all pools currently available from the Zinc Finger Consortium.

 

 

Advanced Options

 

Triplet Composition:  The user can specify the composition of nucleotide triplets desired in target DNA sequences.  For example, if ANN (i.e., A followed by any 2 nucleotides) and GNN triplets are desired, the user can choose to exclude CNN and TNN triplets from consideration by setting the max value for these triplets to zero.  As another example, the user can specify search parameters that require at least 3 GNN triplets in the target.
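The two examples above (excluding CNN and TNN triplets, or requiring at least 3 GNN triplets) can be expressed as a small filter.  This Python sketch is an assumption about how such a filter might work, not ZiFiT's code; the names triplet_classes and passes_composition and the min/max dictionaries are hypothetical.

```python
from collections import Counter


def triplet_classes(site):
    """Count ANN, CNN, GNN and TNN triplets in a site read 5'->3'."""
    if len(site) % 3 != 0:
        raise ValueError("site length must be a multiple of 3")
    triplets = [site[i:i + 3] for i in range(0, len(site), 3)]
    return Counter(t[0] + "NN" for t in triplets)


def passes_composition(site, min_counts=None, max_counts=None):
    """Check a site against per-class minimum and maximum triplet counts."""
    counts = triplet_classes(site)
    # A Counter returns 0 for classes that do not occur in the site.
    ok_min = all(counts[c] >= n for c, n in (min_counts or {}).items())
    ok_max = all(counts[c] <= n for c, n in (max_counts or {}).items())
    return ok_min and ok_max


no_c_or_t = {"CNN": 0, "TNN": 0}
print(passes_composition("GCTGTTGAC", max_counts=no_c_or_t))
print(passes_composition("GGATGAGAA", max_counts=no_c_or_t))
print(passes_composition("GGATGAGAA", min_counts={"GNN": 3}))
```

Setting a maximum of zero excludes a triplet class entirely, while a minimum enforces a required count.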

 

 

 

Modular Assembly (Modular Design):

This zinc finger protein (ZFP) design approach involves joining together individual zinc finger modules with pre-characterized specificities to generate a single zinc finger protein for a target site.  Although modular assembly is simple to perform, its success rates are lower than those of OPEN.

 

Input

 

Sequence:  The DNA sequence in which a user wishes to identify target sites for ZF arrays is pasted into the Sequence Window.  Sequences should be in FASTA format.  White space and numbers will be ignored.

 

Array sizes:  If designing zinc finger nucleases, the user can choose the number of ZFs (or "fingers") in the left and right arrays.  Note that each ZF binds three DNA nucleotides.  When designing ZF arrays, the user can choose from 3 to 8 ZFs.

 

Spacer size:  This parameter is only available when designing zinc finger nucleases.  The user specifies the number of nucleotides between the ZF arrays.  The appropriate distance is determined by the length of the amino acid linker between the ZF array and the associated nuclease domain.  For standard linkers, the spacer is five or six bp, which provides the proper spacing between the zinc finger nuclease monomers so that they can interact to create a functional enzyme.

 

 

Module sets (Modular Design):

Three different module sets are available, and users may choose ZFs from any combination of module sets.  Some ZFs, such as those from Sangamo, were designed for specific positions within a three-finger array.  Those modules designed for specific positions will be labeled in the output.  The backbone for each module is listed below.

 

Barbas Modules:  Developed via phage display by the Barbas lab at the Scripps Research Institute (Segal, 1999), Barbas modules are available for all GNN triplets, most ANN and CNN triplets, and several TNN triplets.  All Barbas modules were developed and tested in the middle position of the 3-finger mouse transcription factor Zif268.  Barbas modules are assumed to have complete positional independence (i.e. context independence).  The Barbas modules available from the ZF Consortium are in the SP1C backbone.

 

Sangamo Modules:  Sangamo modules were developed at Sangamo BioSciences Inc. and are currently available for GNN triplets.  Sangamo modules were developed considering the position of each module within a 3-finger array.  Each of the three positions typically has a distinct (although often similar) finger developed for a given triplet at that position.  For this reason, when only Sangamo modules are selected, ZiFiT restricts the user to three-finger arrays.  The Sangamo modules available from the ZF Consortium are in the SP1 backbone.

 

ToolGen Modules:  ToolGen modules were selected and tested by ToolGen, Inc. and are available for a variety of nucleotide triplets.  ToolGen modules take the view that nature did it best: they are based on ZFs encoded in the human genome.  As a result, the ToolGen modules available from the ZF Consortium exist in diverse (native) backbones.

 

Using multiple module sets:  Choosing this option allows selection of ZFs from those available in all 3 sets: Barbas, Sangamo, and ToolGen Modules.  Once again, in the case of the Sangamo modules, those modules designed for specific positions will be labeled.

 

 

Advanced Options

 

Triplet Composition:  The user can specify the composition of nucleotide triplets desired in target DNA sequences.  For example, if ANN (i.e., A followed by any 2 nucleotides) and GNN triplets are desired, the user can choose to exclude CNN and TNN triplets from consideration by setting the max value for these triplets to zero.  As another example, the user can specify search parameters that require at least 3 GNN triplets in the target.

 

Ignore aspartate overlap: Aspartate (Asp, D) in position +2 of the alpha helix contacts a C or A preceding and adjacent to the triplet on the opposite strand.  However, in combination with an arginine (Arg, R) at position -1, the aspartate at +2 provides an extremely stable interaction for binding G in the third position of the triplet.  The default option (unchecked) filters out all sequences in which interstrand interactions cannot occur, i.e., there is no appropriate binding partner on the opposite strand.  (Note:  The user should exercise caution when designing ZF arrays at the ends of the input sequence. By default, end-sequences requiring an adjacent base for interstrand interactions are thrown out.)

 

Scoring

 

GNN scoring (Modular Assembly):  GNN scoring is based on an extensive study by the Zinc Finger Consortium in which 168 three-module ZF arrays were modularly assembled and tested for in vivo functionality (Ramirez et al. 2008, Nature Methods, 5:374).  The 168 three-module proteins were designed to bind 104 diverse target sites varying in GNN, ANN, CNN, and TNN subsite composition (alternate ZF modules for a given triplet were tested when available).  Success of individual arrays was heavily dependent upon GNN composition.  Arrays with three GNNs had a 59% success rate, which declined as the number of GNNs declined (2 GNNs = 29%, 1 GNN = 12%, 0 GNNs = 0%).  ZiFiT provides users with a score that estimates the success rate for a modularly designed three-finger array based on these results.  In the case of a zinc finger nuclease, ZiFiT scores the two arrays individually.  The probability that the two arrays will function together as a ZFN can be estimated by multiplying the scores obtained for the individual zinc finger arrays.

 

Example GNN score:

257 gTTCTCATCCATTTTCGCTGTTGACg 283 Right Score: 0.59
257 cAAGAGTAGGTAAAAGCGATAAGTGc 283 Left Score: 0.29
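The scoring rule described above amounts to a lookup on the number of GNN triplets followed by a product for the pair.  The Python sketch below is an illustration under stated assumptions, not ZiFiT's code: the function names are hypothetical, and the two half-sites are read from the top strand of the example assuming 9 bp half-sites separated by a 6 bp spacer (the left site being the reverse complement of TTCTCATCC).

```python
# Success rates by number of GNN triplets, as reported in the text.
GNN_SUCCESS_RATE = {3: 0.59, 2: 0.29, 1: 0.12, 0: 0.0}


def gnn_score(site):
    """Estimated success rate for a three-finger (9 nt) site read 5'->3'."""
    if len(site) != 9:
        raise ValueError("GNN scoring here covers three-finger arrays only")
    n_gnn = sum(1 for i in range(0, 9, 3) if site[i].upper() == "G")
    return GNN_SUCCESS_RATE[n_gnn]


def zfn_score(left_site, right_site):
    """Estimated probability that both arrays of a ZFN pair function."""
    return gnn_score(left_site) * gnn_score(right_site)


right_site = "GCTGTTGAC"  # GCT GTT GAC: three GNN triplets
left_site = "GGATGAGAA"   # GGA TGA GAA: two GNN triplets
print(gnn_score(right_site), gnn_score(left_site))
print(round(zfn_score(left_site, right_site), 4))
```

Under these assumptions the individual scores agree with the Right and Left scores in the example, and their product (about 0.17) is the estimated success rate for the pair.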

 

 

Interface with ZiFDB

To enhance its utility, ZiFiT is interfaced directly with ZiFDB, a web-accessible database of zinc fingers and engineered zinc finger arrays.  Target Site hyperlinks within the ZiFiT output directly query ZiFDB to determine whether any previously constructed arrays bind to completely or partially matched target sequences.  In addition, ZiFiT users can query ZiFDB for finger information on a specific triplet subsite by clicking on the triplet.  Thus, ZiFiT and ZiFDB work together to aid in zinc finger array (ZFA) design.