Recombinant DNA (DNA cloning)
Means of amplifying DNA sequence of interest.
Use to determine DNA sequence
Overproduce gene product
Manipulate gene product (e.g. make mutations for study)
Make reagents for further study (determine fate of gene product in cell or analyze it in other individuals i.e. forensics)
Need some kind of vector to amplify your DNA:
In vivo:
Plasmids (typically)
Virus (bacteriophage, eukaryotic)
Hybrids
artificial chromosomes (YACs)
In vitro:
PCR primers
Usually trying to clone into E. coli plasmid vector
Key to cloning - the restriction-modification system of bacteria
Most bacteria can bind and absorb DNA from the environment - transformation
Probably an adaptation to allow individuals to acquire variant genes from others of their species.
But if DNA from another species, could wreak havoc with regulation of gene expression.
So bacteria developed a biochemical system that allows them to acquire DNA of their own species, but not from others: the restriction-modification system.
Each species (or strain) of bacteria has a distinct pair of enzymes:
A Site-specific DNA methylase
and an endonuclease which cuts DNA that is not methylated at that site.
So if you transform E. coli strain R with DNA from strain R and select for expression of a particular gene --> colonies
But if you use DNA from strain K --> no colonies
The DNA from strain R is methylated by the EcoRI methylase
this keeps the EcoRI endonuclease from cutting
the DNA from strain K is methylated at different sites
so EcoRI can cut --> less efficient transformation
Restriction endonucleases have been purified from 100s of different species
Most recognize palindromic sequences (sequences that read the same 5'->3' on either strand)
E.g.
H. influenzae   HindIII   5'AAGCTT3'   cuts 3' of 1st A
                          3'TTCGAA5'   leaves 5' overhang
H. aegyptius    HaeII     5'RGCGCY3'   cuts 3' of last C
                          3'YCGCGR5'   leaves 3' overhang
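The palindrome property and the cut positions in the two examples above can be checked with a short Python sketch. The function names and the test sequence are made up for illustration; only the two recognition sites and their cut positions come from the notes. R (purine) and Y (pyrimidine) are the standard IUPAC ambiguity codes.

```python
# IUPAC codes needed for the two example sites (R = A or G, Y = C or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "R": "Y", "Y": "R"}


def reverse_complement(seq):
    """Return the opposite strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(site):
    """A site is palindromic if both strands read the same 5'->3'."""
    return site == reverse_complement(site)


def find_sites(dna, site):
    """Return 0-based start positions where `site` matches the top strand."""
    hits = []
    for i in range(len(dna) - len(site) + 1):
        window = dna[i:i + len(site)]
        if all(base in IUPAC[code] for base, code in zip(window, site)):
            hits.append(i)
    return hits


def digest(dna, site, cut_offset):
    """Cut a linear top strand `cut_offset` bases into each site."""
    cuts = [i + cut_offset for i in find_sites(dna, site)]
    edges = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(edges, edges[1:])]


# Hypothetical test sequence containing one HindIII and one HaeII site.
dna = "GGAAGCTTCCAGCGCTGG"
hindiii_fragments = digest(dna, "AAGCTT", 1)  # cuts 3' of the 1st A
haeii_fragments = digest(dna, "RGCGCY", 5)    # cuts 3' of the last C
```

Only the top strand is modeled here; the overhang arises because the same cut position, read on the bottom strand, falls on the other side of the site's center.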
Used in many ways in DNA cloning, especially to cut vector DNA and gene of interest for splicing gene of interest into vector:
"sticky ends" from EcoRI cut of one fragment can base pair with those from another
isolate fragment with insert from gel, anneal sticky ends to plasmid DNA, and ligate with DNA ligase.
DNA ligase
DNA repair enzyme which repairs breaks in DNA backbone. Requires high energy cofactor. Usually use enzyme from T4 phage, which can ligate blunt or sticky ends. E. coli enzyme requires NAD+ instead of ATP and only repairs sticky ends.
Plasmids
autonomously replicating circular minichromosomes
most derived from drug-resistant bacteria
have origin of replication for amplifying DNA and marker gene to select for cells carrying it
Modern plasmids have variety of features to make life easier.
Usually Ampicillin resistance gene, beta lactamase, as selectable marker.
Polylinker sequence - synthetic sequence with many restriction endonuclease cut sites, so you can always find one that will work for you
Beta galactosidase gene fragment- polylinker is attached in frame to a portion of the lacZ gene, rest is supplied by host cells.
Plasmids without inserts produce beta-gal and appear blue on X-gal plates. Insert DNA disrupts open reading frame -> no beta gal -> white colonies
Necessary because plasmid vector will recircularize efficiently if cut with a single restriction endo.
Alternatively, one can use alkaline phosphatase to remove 5' PO4 from vector (but not the insert), or vector can be cut with two different rest. endos. Both of these will inhibit recircularization. Excess of insert is desirable also.
Many plasmids have promoter sequences for bacteriophage RNA polymerases flanking the polylinker. Can be used to make strand-specific probes of insert from either direction
Many include M13 bacteriophage origin of replication. Allows production of single strand DNA using helper phage. Useful for DNA sequencing or making site-directed mutants.
Plasmids introduced into cell by transformation:
Cells permeabilized by osmotic shock or electrical pulse (electroporation).
Circular DNA transforms efficiently
linear does not (due to cellular exonucleases)
Larger DNAs not as efficiently taken up as smaller ones for chemically permeabilized cells.
Stability of plasmid often inversely proportional to size.
Host cell usually defective for restriction modification system, so exogenous DNA not digested.
Often is mutant for several genes required for genetic recombination; this helps in maintaining insert DNAs which might contain repeated sequences.
Other vectors:
Bacteriophage:
Lambda phage has 48.5 kb genome, packages one headful per virus particle. Can remove middle third of genome to allow insertion of up to 16 kb of insert
Convenient for construction of Genomic and cDNA libraries
Cosmids- hybrid of lambda and plasmid, contains only packaging signal from lambda plus usual plasmid replication and marker genes. Can insert up to 45 kb insert using lambda packaging system to efficiently introduce large plasmid into cell. Replicates as plasmid thereafter.
M13 phage; filamentous bacteriophage with ssDNA genome. Has dsDNA intermediate form of DNA in cell. Can isolate dsDNA from infected cells for manipulation with rest. endos. Isolate ssDNA from particles released into culture media.
P1 phage: allows replication of up to 150 kb of DNA. Great for genomic libraries.
Shuttle vector
includes plasmid features necessary for replication in E. coli, for ease of manipulation, but also origin of replication and selectable markers for replication in other hosts, such as yeast, plants or mammalian cells
Expression vectors
Include promoter sequences to allow expression of mRNA and protein from insert DNA. Often are expressed as fusion proteins, your gene is fused in frame with coding sequence for another protein. Can make more stable protein than foreign gene by itself. Also provides a handle for purification
ex.
Fusion product               Affinity resin
Maltose binding protein      starch
glutathione-S-transferase    glutathione
polyHis                      Ni++
protein A                    IgG
beta-galactosidase           APTG
For bacterial expression, common promoter systems are lac and T7 phage
But many eukaryotic proteins not properly expressed and modified (e. g. glycosylation) in bacteria, so eukaryotic expression systems are used. Common promoters are SV40 and CMV in mammalian cells. The baculovirus expression system in insect cells has been used to overproduce a variety of eukaryotic proteins in high yield. The promoter in this system is for the viral coat protein.
Reporter plasmids
Clone control region of your gene (promoter) upstream of gene encoding enzyme that is easy to monitor. Can look at effects of changes in cell environment on expression of reporter gene to infer effects on the intact gene of interest
Common reporters:
Chloramphenicol acetyltransferase (CAT)   modifies chloramphenicol
Luciferase                                produces light (ATP-dependent)
Beta-galactosidase                        hydrolyzes lactose analogs
The latter two can be monitored in situ as well as in vitro
Cloning gene of interest:
Usually have to isolate gene of interest from a library of clones.
Library - representative collection of all (hopefully) DNA encoded in genome or cDNA of all mRNAs expressed in cells.
Genomic DNA - usually partially digested with restriction endo or mechanically sheared and ligated into vector. --> overlapping clone sequences which span the genome, only a few of which contain your gene.
cDNA - copy of mRNA sequence made by reverse transcription of mRNA using retroviral reverse transcriptase. --> only exon sequences, much more convenient than genomic sequence for most genes (but may lose important regulatory sequences). Can make library of expressed genes for specific tissue.
For either type of library, you need a probe to detect the gene you want:
homologous DNA from related organism
cDNA for obtaining genomic clone (and vice versa)
antibody (need to use expression library)
oligonucleotide probe based on protein sequence
PCR product obtained from oligos based on shorter regions of homology
RFLP marker that maps near the gene you are interested in.
Need to know how many colonies (or phage plaques, which are usually easier to screen) are necessary to screen and be reasonably sure to get your clone.
to screen genomic DNA
P = 1 - (1 - f)^N   or   N = ln(1 - P) / ln(1 - f)
where f is (size of average insert)/(size of genome), P is the probability that a given sequence is represented in the library, and N is the number of clones to screen
For a 10 kb insert you would need about 2200 clones (for P = 0.99) to screen the E. coli genome of 4720 kb. This could be done on a single small petri plate
You would need about 1.4 million clones with inserts of this size to screen the human genome
this would require nearly 30 large petri plates.
(expression libraries require even more: gene has to be in proper orientation and reading frame for detection with antibody)
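The clone-number arithmetic above can be reproduced with a minimal Python sketch of N = ln(1 - P)/ln(1 - f). The function names are illustrative, and the human genome size of 3,000,000 kb is an assumption, since the notes do not state it.

```python
import math


def clones_needed(insert_kb, genome_kb, probability=0.99):
    """Number of clones N such that a given sequence is present with probability P."""
    f = insert_kb / genome_kb  # fraction of the genome in one average insert
    return math.ceil(math.log(1 - probability) / math.log(1 - f))


def probability_of_coverage(insert_kb, genome_kb, n_clones):
    """P = 1 - (1 - f)^N for a library of n_clones."""
    f = insert_kb / genome_kb
    return 1 - (1 - f) ** n_clones


# E. coli genome size is taken from the notes; the human value is an assumed round figure.
ecoli_clones = clones_needed(10, 4720)
human_clones = clones_needed(10, 3_000_000)
```

With P = 0.99 these come out close to the 2200 and 1.4 million quoted in the notes. Raising P or shrinking the insert size increases N quickly.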
The library is usually plated out and the phage are grown and harvested to amplify the library when it is first made. That way you or your colleagues can probe the same library many times. Or you can buy it from Stratagene or Clontech. But the number of plaques that were initially plated is critical, since each unamplified phage represents a unique clone.
The library is plated out and after plaques are seen, a nitrocellulose or nylon filter is overlaid onto the plate and marked to allow realignment of filter and plate later on. Filter is removed and treated with NaOH to lyse phage and denature DNA. The filter is washed with buffer and then pretreated with a hybridization buffer that contains carrier DNA and protein which bind nonspecifically to the filter (pre-hybridization). Radiolabeled DNA or RNA probe is then added and incubated at specific temperature and salt concentration to allow the probe to anneal specifically to homologous phage DNA in the library.
Typically one uses a radiolabeled DNA or RNA probe (more on that below). The temperature and buffer conditions used for hybridizing and washing the filters have dramatic effects on results.
Factors favoring hybridization:
Low temperature
High [salt]
Low [denaturant]
Longer probe length
Longer hybridization time
Higher %GC content of probe
Tm = 81 + 16.6 log[Na+] + 0.4[%(G+C)] - 0.6(% formamide) - 600/n - 1.5(% mismatch)
where n is length of probe in bases
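As a rough illustration of how the hybridization factors listed above move the melting temperature, here is a Python sketch of the Tm formula. The constants are taken from the notes as written. The signs are an assumption based on the listed factors: salt and GC content raise Tm, while formamide, short probes and mismatches lower it. The function name and the example values are made up.

```python
import math


def melting_temperature(na_molar, percent_gc, percent_formamide,
                        probe_length, percent_mismatch=0.0):
    """Approximate Tm (degrees C) of a probe-target hybrid."""
    return (81
            + 16.6 * math.log10(na_molar)   # higher salt stabilizes the hybrid
            + 0.4 * percent_gc              # G:C pairs are more stable than A:T
            - 0.6 * percent_formamide       # denaturant lowers Tm
            - 600 / probe_length            # short probes melt at lower temperature
            - 1.5 * percent_mismatch)       # each 1% mismatch costs about 1.5 degrees


# Example: 1 M Na+, 50% GC, no formamide, 600-base probe, perfect match.
tm_perfect = melting_temperature(1.0, 50, 0, 600)
# Same probe against a target with 10% mismatch (e.g. a cross-species probe).
tm_mismatched = melting_temperature(1.0, 50, 0, 600, percent_mismatch=10)
```

Hybridizing and washing well below the calculated Tm corresponds to low stringency; working close to it corresponds to high stringency.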
It is possible to favor hybridization too much and cause high background from hybridization to plaques with little homology to the probe. These hybe conditions are called "low stringency"
If stringency is too high, real signal could be washed off. This is particularly likely if the probe is not an exact match (e.g. probing a human library with a mouse DNA probe). In such cases the hybe conditions may have to be determined empirically, usually using a Southern blot (see below).
Once the filters are probed, washed and dried, they are exposed to film. Black spots indicate putative clones of interest. The film is aligned with the plate, the area around the spot is cut out, and the phage are harvested and rescreened. Since the phage diffuse out on the plate during the hybe procedure, a couple of rounds of purification are needed to isolate pure phage cultures.
Once isolated, clones must be tested to determine whether they represent the gene of interest. One common test is the Northern blot to determine whether the DNA is homologous to a mRNA in the cell type of interest.
If you know you need to isolate more flanking genomic sequence, must do "Chromosome walking".
Make probe from the far end of your genomic clone, rescreen library.
Clones that hybridized to this probe but not the original one represent DNA further away from the original probe. Can go through multiple "steps". The larger the size of inserts, the fewer rounds you need to cover a given distance, hence the demand for cosmids, P1 vectors, and YACs
Southern blot used to analyze (genomic) DNA sequences
Total genomic DNA is purified and digested with restriction enzymes
This digest is electrophoresed on an agarose gel to separate the DNA fragments according to size. If you stain the DNA in the gel with ethidium bromide, a smear is usually seen since the digest may contain thousands of different sized fragments.
The gel is treated with acid, then NaOH. This randomly nicks the DNA, then denatures it.
The gel is then placed on moist blotting paper and a nitrocellulose filter is placed on top, followed by a large stack of dry paper towels (or a disposable diaper!).
The dry paper sucks buffer through the gel, carrying the DNA with it. Nicking the DNA (acid treatment above) allows the large fragments to transfer as efficiently as the small ones
The blot is then probed as for library screening. After autoradiography, a dark band should be seen which corresponds to a DNA restriction fragment homologous to the probe DNA. The size of the fragment can be determined by comparison with known DNA standards run on the same gel.
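Sizing a band against the standards can be sketched in Python as below. This assumes the common approximation that migration distance is roughly linear in log10(fragment size) between neighboring standards. That approximation, the function name and the example distances are not from the notes.

```python
import math


def estimate_size(distance_mm, standards):
    """Interpolate fragment size (bp) from migration distance.

    `standards` is a list of (distance_mm, size_bp) pairs for the markers
    run on the same gel. Interpolation is linear in log10(size).
    """
    points = sorted(standards)
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= distance_mm <= d2:
            t = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + t * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("band lies outside the range of the standards")


# Hypothetical marker lane: larger fragments migrate shorter distances.
markers = [(10, 10000), (30, 1000), (50, 100)]
band_size = estimate_size(20, markers)
```

The estimate is only meaningful within the range spanned by the standards, which is why the sketch refuses to extrapolate.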
Southern blots are useful for optimizing probe conditions for library screening, for mapping genomic sequences flanking cloned regions and in forensic analysis.
Northern blot used to analyze mRNA expression
Instead of DNA, total RNA or polyA+ mRNA from tissues of interest is electrophoresed on a denaturing gel (agarose-formaldehyde or acrylamide-urea), separating RNAs by size. The gel is blotted and probed as for a Southern blot. Very useful for determining whether your DNA sequence is expressed as mRNA and how it is expressed.
Used to monitor regulation of mRNA levels:
Can isolate mRNA from different tissues
different times in development
cells treated differently in culture.
Making cDNA
Since many genes have numerous large introns, cDNA clones are often more convenient to use. Made from mature mRNA, so only exons present.
Usually begin by isolating polyA+ mRNA on oligo-dT column
To begin synthesis, polyA+ mRNA is annealed to oligo-dT primer
This serves as primer for 1st strand cDNA synthesis using retroviral reverse transcriptase (an RNA-dependent DNA polymerase) in the presence of dNTPs --> mRNA-cDNA heteroduplex, the template for 2nd strand DNA synthesis.
A mixture of enzymes is added: RNase H, DNA polI and dNTPs.
The RNase H is an endonuclease specific for RNA in a DNA-RNA heteroduplex. The RNA is nicked by RNase H and the small RNA left is annealed to the DNA to serve as primer for second strand synthesis by DNApol.
Nicks that remain are repaired by DNA ligase. Linker oligonucleotides, encoding restriction enzyme sites, are ligated on and digested allowing cloning of all the cDNAs into the vector. The library is screened as for genomic screening.
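The strand relationships in cDNA synthesis can be made concrete with a small Python sketch: the first strand is the DNA reverse complement of the mRNA, and the second strand restores the mRNA sequence with T in place of U. The function names and the toy mRNA are illustrative only; priming, RNase H nicking and ligation are not modeled.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Reverse transcriptase product, written 5'->3' (starts with the oligo-dT end)."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand_cdna(first_strand):
    """DNA polymerase product copied from the first strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(first_strand))


# Toy mRNA: a short coding stretch followed by a polyA tail.
mrna = "AUGGCCAAAAAA"
first = first_strand_cdna(mrna)
second = second_strand_cdna(first)
```

The first strand begins with a run of T, reflecting the oligo-dT primer annealed to the polyA tail.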
DNA sequencing:
Once cloned, DNA can be sequenced by either chemical or enzymatic methods. Chemical method is still used for footprinting type experiments, but virtually all sequencing is performed using DNA polymerase and chain terminating nucleotides.
To begin the sequencing reaction ssDNA or denatured dsDNA is annealed to a specific oligonucleotide primer (this may be flanking vector sequence near the polylinker site, or it might be known sequence within the insert DNA). The oligonucleotide serves as primer for synthesis of DNA by DNA polymerase, using the plasmid as template.
Usually an unlabelled oligonucleotide is used; label is incorporated in an initial reaction containing 32P or 35S labeled dNTP and small amounts of unlabeled dNTPs. After a couple of minutes of reaction, to allow incorporation, the reaction is split into 4 tubes and added to higher concentrations of cold dNTPs. Each tube also contains a different 2',3'-dideoxynucleotide triphosphate (ddNTP). These analogs are incorporated by the polymerase, but cause synthesis of the chain to stop, since they have no free 3'-OH to serve as acceptor of the 5' PO4 in the polymerization reaction. Thus, a reaction that contains ddATP plus all four dNTPs will stop whenever a ddA residue is incorporated opposite a T on the template strand. By performing 4 separate reactions, each containing a different ddNTP, and electrophoresing the samples on a high resolution acrylamide gel, which can separate DNA chains differing in length by a single nucleotide, one can read the DNA sequence from the pattern of bands on the autoradiograph of the gel.
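The logic of reading a sequence from the four dideoxy lanes can be sketched in Python. This is an idealized model in which every possible termination product is present. The function names and the toy template are illustrative, and primer length is a hypothetical parameter.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def dideoxy_lanes(template_3to5, primer_length=0):
    """Return band lengths per ddNTP lane.

    `template_3to5` is the template downstream of the primer, read 3'->5',
    so the newly made strand is its base-by-base complement, read 5'->3'.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        # A chain ending in ddN at this position has this total length.
        lanes[base].append(primer_length + position)
    return lanes


def read_gel(lanes):
    """Read bands from shortest to longest, calling the lane each band is in."""
    bands = sorted((length, base)
                   for base, lengths in lanes.items()
                   for length in lengths)
    return "".join(base for _, base in bands)


lanes = dideoxy_lanes("TACGGT", primer_length=17)
sequence = read_gel(lanes)
```

The sequence read from the gel is that of the newly synthesized strand, i.e. the complement of the template.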
The choice of polymerase is important. Originally the Klenow fragment of DNA polI was used. It was difficult to use because it had a different Km for each dNTP and each ddNTP. The affinity for ddNTPs was particularly low, so each different reaction had to be carefully titrated when making up sequencing mixes. The polymerase was also distributive, rather than processive: it would incorporate a few nucleotides on one template, then dissociate and reassociate with another template. Some of the templates would not get extended completely to the ddNTP incorporation, causing background stops on the gel.
A modified version of T7 DNA polymerase, called Sequenase, is commonly used now. It has a similar Km for all dNTPs and ddNTPs, so making sequencing reaction mixes is easier (though most people buy them premade). The polymerase is much more processive, so once extension starts, it is likely to proceed until the ddNTP is incorporated --> much lower background and easier to read gels.
Automated sequencing. Usually uses similar polymerases and acrylamide sequencing gels, but the label is usually a dye molecule that can be detected readily by laser. As the samples run off the bottom of the sequencing gel, they are detected by the laser and automatically recorded.
polymerase: some use sequenase, others use thermostable polymerases and "cycle sequencing" similar to PCR.
Sequencing chips. An alternative method of sequencing is to hybridize your DNA to a DNA chip which contains an array of all 65,536 (4^8) possible octamer sequences. The complete DNA sequence can be inferred by aligning the sequences of all the oligonucleotides on the chip which hybridized to your DNA.
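The idea of inferring a sequence from the set of oligonucleotides that hybridize can be sketched in Python. This is a simplified model that assumes perfect hybridization and that every (k-1)-base overlap occurs only once in the target, so the walk is unambiguous. The function names are illustrative, and the example uses 4-mers on a toy sequence rather than octamers to keep it short.

```python
from itertools import product


def all_oligos(k):
    """Every possible oligonucleotide of length k (4**k sequences)."""
    return ["".join(bases) for bases in product("ACGT", repeat=k)]


def spectrum(dna, k):
    """The set of k-mers in `dna`, i.e. the chip spots that would light up."""
    return {dna[i:i + k] for i in range(len(dna) - k + 1)}


def reconstruct(kmers):
    """Chain k-mers by their (k-1)-base overlaps; fails if the walk is ambiguous."""
    kmers = set(kmers)
    suffixes = {kmer[1:] for kmer in kmers}
    starts = [kmer for kmer in kmers if kmer[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("cannot identify a unique starting oligo")
    sequence = starts[0]
    remaining = kmers - {sequence}
    while remaining:
        overlap = sequence[-(len(starts[0]) - 1):]
        candidates = [kmer for kmer in remaining if kmer[:-1] == overlap]
        if len(candidates) != 1:
            raise ValueError("ambiguous or broken overlap")
        sequence += candidates[0][-1]
        remaining.remove(candidates[0])
    return sequence


target = "ATGGCGTACCTGA"
rebuilt = reconstruct(spectrum(target, 4))
```

Repeated sequences longer than k-1 bases break the uniqueness assumption, which is one reason longer oligos give more reliable reconstruction.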
Automated sequencing of genomes of humans and other model organisms is revolutionizing the way molecular biology is done. The other player in this revolution is the Polymerase Chain Reaction.
Polymerase Chain Reaction (PCR)
PCR is the amplification of a specific DNA fragment in vitro, directed by a pair of oligonucleotides that prime DNA synthesis of the region between the primers by DNA polymerase. Successive rounds of DNA synthesis are accomplished by first heating the template DNA and primers to 94 °C to denature the template, cooling to approximately 50 °C to allow the primers to anneal to the template, followed by a polymerization reaction using DNA pol. The cycle of denaturation, annealing and polymerization is repeated many times. Each time the reaction is repeated, the products of the previous cycle are used as template, as well as the original template, causing an exponential synthesis of the fragment of DNA between the primers. This process was greatly facilitated by the use of thermostable DNA polymerases and automatic temperature cyclers.
Using this procedure, a single DNA molecule can be specifically amplified and detected. Used in a myriad of ways in molecular biology. Need not know sequence of DNA between the primers, so can amplify DNA of related organism based on primers of regions known to be conserved. Sensitive method to detect small amounts of contaminating bacteria, mutant cells, etc.
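The exponential growth described above can be put into numbers with a minimal Python sketch. It assumes the idealized case in which each cycle multiplies the number of template copies by (1 + efficiency), with efficiency = 1 meaning perfect doubling. The function name and the efficiency parameter are illustrative and not from the notes.

```python
def copies_after(cycles, starting_copies=1, efficiency=1.0):
    """Idealized number of copies after a given number of PCR cycles."""
    return starting_copies * (1 + efficiency) ** cycles


# A single template molecule after 30 perfect cycles: about a billion copies.
ideal_yield = copies_after(30)
# The same reaction at 80% per-cycle efficiency yields far less.
realistic_yield = copies_after(30, efficiency=0.8)
```

In practice the reaction plateaus as primers and dNTPs are consumed and the polymerase loses activity, so this only describes the early cycles.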
Can be used in combination with reverse transcriptase to amplify specific mRNA sequences.
Choice of polymerase is again important. Thermus aquaticus (Taq) polymerase is most commonly used, but it is error-prone, lacks a proofreading 3'->5' exonuclease, and loses activity after about 35 cycles at high temperature. Other organisms produce polymerases that are more thermostable, less error-prone and have a proofreading function, allowing more faithful replication of longer genes (up to 40 kb).
Pyrococcus furiosus polymerase: has 3'->5' proofreading exo. Higher fidelity, but requires somewhat longer primers since exo may remove some bases before pol gets going.
Site directed mutagenesis
Many different strategies, all require:
synthetic oligonucleotide(s) with desired mutation
extending oligonucleotide by DNA polymerase to reconstruct entire sequence
strategy for eliminating starting "wild-type" template
e.g. "quick change" strategy:
A Gallery of Enzymes used in Molecular Biology
DNA polymerase:
DNA sequencing
DNA labeling (filling in restriction fragment ends, nick translation, random hexamer primed synthesis)
Formation of blunt ends from sticky ends for cloning (either filling in 5' overhang or removing 3' overhang)
PCR
cDNA synthesis by reverse transcriptase
Primer extension mapping of mRNA ends by reverse transcriptase
Site specific mutagenesis
Restriction endonucleases
Site-specific digestion of DNA for cloning DNA fragments or physical mapping of DNA
Phosphatase
Removal of 5' PO4 to reduce background in ligation
or for subsequent labeling with [gamma-32P]ATP and kinase
Exonuclease
blunt ending rest. frags. (polymerase)
generation of random deletions
DNase I
nonspecific DNA endonuclease:
-> single strand breaks with Mg++
-> double strand breaks with Mn++
RNaseH
digest RNA of RNA/DNA hybrid
used to prime 2nd strand cDNA synthesis
site-specific RNA cleavage when used with specific oligonucleotide DNA
RNAseA
cleaves RNA 3' to pyrimidine residue
at low salt cleaves ss and dsRNA
above 0.3 M salt, cleaves ssRNA only; used in RNase protection mapping experiments
used to remove RNA contaminants from DNA (minipreps)
very difficult to inactivate
Polynucleotide kinase
transfers gamma-PO4 from ATP to 5'-OH of DNA or RNA
used for 5' end labeling and phosphorylating synthetic oligonucleotides for subsequent ligations
DNA ligase
joins ds DNA (or DNA/RNA) molecules. E. coli enzyme requires NAD cofactor and joins only cohesive ends. T4 enzyme requires ATP and joins cohesive or blunt ends.
RNA ligase
joins ssDNA or RNA, useful for 3'-end labeling RNA
S1 nuclease, mung bean nuclease
ss specific nuclease
used to map exon boundaries using labeled DNA probe
also used to blunt 5' overhang restriction fragments
DNA methylase
transfers CH3 to A or C residues
protects DNA from subsequent restriction endonuclease cleavage
DNA terminal transferase
polymerizes addition of nucleotides to 3' end of DNA without template
used to add homopolymeric tail to cDNA during library construction
also used to extend premature termination products in DNA sequencing reactions
T7, SP6, T3 RNA polymerases
DNA dependent RNA polymerases. Have 17 bp promoter sequence.
Transcribe RNA with high efficiency downstream of promoter.
Used for making substrates for RNA processing reactions, labeled probes of plasmid inserts, and in vivo expression systems