Title:
Expression of proteins from lactose-inducible vectors in E. coli.
Techniques to Master:
Learning Objectives:
Background:
A. Protein Source. Just as the old recipe for rabbit stew began with the instruction “first, catch a rabbit,” so we start this investigation of protein function with this: first, catch a protein. In the old days of biochemistry, this meant obtaining a large amount of whatever tissue your protein of interest could be found in - beef liver; chopped spinach; maybe, indeed, rabbit. The protein of interest might constitute a very small fraction of the total protein contained in the tissue of origin. Nowadays, the time and cost involved in obtaining workable amounts of protein for study can be greatly reduced if the protein can be overexpressed in an inducible bacterial expression system. This requires obtaining the gene for the protein of interest, but that is not usually a difficult process using modern molecular biology techniques. In this way, rather than hunting your desired protein “prey” amongst a forest of other creatures, you cultivate a bacterial population to churn out many copies of your target.
B. Getting bacteria to make the protein you want. The bacterium used for producing protein is a common laboratory strain of Escherichia coli in which the normal lactose operon system has been modified so that, when the system is induced, the bacterium’s protein-producing apparatus is directed to making one’s protein of interest. This technology takes advantage of the bacterium’s naturally occurring ability to respond to its environment through gene regulation. To better understand how this method works, we need to review a bit of bacterial molecular biology.
As in many microorganisms, metabolism in E. coli is most efficient when glucose is available as the preferred fuel molecule. Once glucose has been taken up into E. coli cells, it is used much as it is in humans, entering glycolysis as you have learned in your biochemistry and biology coursework. Other sugars, including monosaccharides such as fructose and galactose, and disaccharides such as lactose and sucrose, require additional enzymatic processing before they can be used in glycolysis. Metabolism of lactose, for example, requires hydrolysis of the glycosidic bond joining the galactose and glucose monomer units, followed by the action of several enzymes specific to galactose metabolism to convert it into a glycolysis intermediate.
Because most of the enzymes required to utilize lactose are specific only to lactose and its catabolic products, it can be to an organism’s advantage to produce them only when lactose is around to be metabolized. Most humans, like other mammals, produce lactase (a β-galactosidase) in the gut at birth, which allows lactose from milk to be broken down into sugars that can be taken up into the bloodstream and used, but cease to make this protein in adulthood. If you find yourself increasingly unable to digest lactose from milk products, you are phenotypically “normal” in being lactose intolerant as an adult human! On the other hand, if you eat and drink milk products without discomfort, you have inherited a genetic disposition, likely derived from ancestral populations which have historically raised dairy animals, to continue producing the gut lactase enzyme into adulthood.
What has evolved in humans over many generations, turning off production of enzymes required for lactose utilization, can occur in bacteria over the course of minutes. This occurs in E. coli by repression of the lac operon when glucose is present. The lac operon consists of the bacterial gene for β-galactosidase (called lacZ in bacteria) and several other genes required for uptake and use of lactose. When there is no lactose present, the lac repressor protein LacI binds to the operator, a DNA site overlapping the promoter for these genes, and blocks RNA polymerase, preventing transcription. The availability of glucose also serves to keep lac operon expression turned off. On the other hand, if glucose becomes unavailable and lactose is present, an isomer of lactose (allolactose) binds to the LacI protein, causing it to dissociate from the DNA. When this happens, RNA polymerase is able to access the promoter and transcribe the metabolically required genes.
In this way, the lac operon system can be regarded as an off/on switch, turning off (repressing) production of proteins required for the cell to utilize lactose when glucose, the preferred carbohydrate, is present, but rapidly turning on (inducing) production of those proteins when glucose is absent and lactose is available.
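The off/on behaviour summarized above can be written out as a small truth table. The following is only a sketch of the simplified logic given in this section, not a quantitative model; the function name `lac_operon_on` is hypothetical and chosen for illustration.

```python
def lac_operon_on(glucose_present: bool, lactose_present: bool) -> bool:
    """Simplified on/off model of the lac operon as described in the text."""
    # LacI stays bound to the DNA unless lactose (via its isomer) releases it.
    repressor_bound = not lactose_present
    # Glucose, the preferred fuel, also keeps expression turned off.
    return (not repressor_bound) and (not glucose_present)


# Print all four combinations of sugar availability.
for glucose in (True, False):
    for lactose in (True, False):
        state = "ON" if lac_operon_on(glucose, lactose) else "OFF"
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> lac operon {state}")
```

Only one of the four combinations, glucose absent and lactose present, switches the operon on in this simplified picture.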
C. The pET expression technology. The ability to switch protein expression from off to on has been exploited, to the benefit of protein chemists, in the pET-based expression system. The pET vector system was invented by Moffatt and Studier in 1986 [1]. The key to this method is the promoter-RNA polymerase pair of the T7 virus, a binding interaction that is much tighter than the analogous pair in E. coli. pET vectors have cloning sites preceded by the T7 promoter, and are used in conjunction with the E. coli strain BL21(DE3), which has the gene for T7 RNA polymerase inserted into its chromosome under control of the lac promoter. The researcher introduces a plasmid containing a copy of the gene for the protein to be produced and grows the cells to mid-log-phase density in a liquid medium that contains both glucose and lactose. Under these growth conditions, glucose is utilized first and the lac repressor (LacI) remains bound to DNA, blocking transcription of the T7 polymerase gene as well as the gene for the protein one has inserted on the expression plasmid. As the bacteria grow, they use up the glucose and must switch to utilization of lactose. This switch in metabolism derepresses the lac promoters and unleashes transcription of the T7 polymerase gene. The polymerase then binds to the T7 promoter on the expression plasmid and drives a very high rate of transcription of the inserted gene.
Thus, the pET expression strategy is a two-step process for induction. After growing the cells, the lac operon is induced by addition of the lactose analog IPTG or by carefully balancing the concentrations of glucose and lactose in the medium so that glucose becomes depleted during growth. Induction results in the expression of the T7 RNA polymerase in this specially constructed strain of cells - that’s the first step. In the second step, the polymerase binds to the T7 promoter in the pET vector and initiates a very high level of transcription of the inserted gene.
The plasmids used for expression are tailored to provide a high level of expression when induced and, additionally, offer many options for extra amino acids, called tags, added to the N-terminal or C-terminal end of the protein coding sequence to aid in expression, stability, and purification. These tags may be as small as a few amino acids or as large as several hundred. Because the added codons specify amino acids that are not present in the naturally occurring protein, the protein made is known as a “chimeric” protein. A chimera, in mythology, is a beast with body parts of more than one animal, such as a lion, a goat, and a serpent. In biology, the term is a little less dramatic, meaning either an organism containing cells of more than one genotype or, as in this context, a protein made of sequences from different proteins found in nature whose genes have been spliced together into one continuous coding sequence. For example, in the figure at right, the gene for a hypothetical native protein has been modified by insertion of 18 nucleotides (shown in red) immediately following the ATG start codon. Thus, when introduced into a bacterial expression vector, this chimeric gene directs production of the expected protein with an extra snippet attached - a “fusion protein.” In this example, the so-called “his tag” is inserted adjacent to the normal start codon, so this would be an N-terminal tag. If the affinity tag is placed at the end of the protein coding sequence, just prior to the stop codon, it would be a C-terminal tag. In some cases, a short sequence of amino acids that is recognized by a specific protease may be introduced between the tag and the native protein, so that the tag can be cleaved off and removed after purification.
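The N-terminal tag insertion described above can be illustrated with a short script. This is a sketch under stated assumptions: the native gene sequence is made up for illustration, the particular histidine codons (CAT/CAC) are one possible choice, and the helper names `translate` and `add_n_terminal_tag` are hypothetical. The codon table is the standard genetic code.

```python
# Standard genetic code, written compactly with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Six histidine codons = 18 nucleotides (assumed codon choice).
HIS_TAG = "CATCACCATCACCATCAC"


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def add_n_terminal_tag(gene: str, tag: str) -> str:
    """Insert the tag codons immediately after the ATG start codon."""
    if not gene.startswith("ATG"):
        raise ValueError("coding sequence must begin with the ATG start codon")
    return gene[:3] + tag + gene[3:]


# Hypothetical native gene, invented for this example: Met-Ala-Lys-Gly-stop.
native_gene = "ATGGCTAAAGGTTAA"
tagged_gene = add_n_terminal_tag(native_gene, HIS_TAG)

print(translate(native_gene))  # MAKG
print(translate(tagged_gene))  # MHHHHHHAKG
```

Because the inserted stretch is a multiple of three nucleotides long, the reading frame of the native protein downstream of the tag is preserved.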
The polyhistidine tag in this case allows for easy purification of the expressed fusion protein due to the affinity of the multiple histidine side chains, known chemically as imidazole groups, for metals. As you may have learned in an inorganic chemistry course, electron pairs on atoms such as oxygen and nitrogen can interact with d-orbitals on transition metals, forming coordination complexes. When a single molecule can make multiple contacts with a metal in this way, the result is called chelation. EDTA (ethylenediaminetetraacetate), a chelator you may have used in lab, can make up to six coordinate bonds to a single metal cation. Some chelators can occupy some, but not all, coordination positions on a metal, leaving other positions open for reversible binding by other ligands. Heme, the prosthetic group found in hemoglobin, for example, chelates iron cations at four planar positions; a fifth position is bound by a histidine side chain, and the sixth becomes the binding site for oxygen.
Chelated metal cations bound to a polysaccharide matrix such as agarose have been developed to provide immobilized binding sites for polyhistidine-tagged proteins. The chelator most commonly used is nitrilotriacetate (NTA). When covalently attached to agarose beads, the result is NTA agarose, as shown in Figure XX. The three carboxylate groups of the NTA moiety, together with its amine nitrogen, bind tightly to several transition metal ions, particularly nickel and cobalt. This interaction holds the metal tightly while still leaving coordination positions open for a second chelating agent to bind. Stretches of repeated histidine side chains fit nicely onto the bound metal. In this way, if a crude bacterial extract containing a his-tagged expressed protein is passed over a column of NTA agarose charged with Ni2+ or Co2+, the his-tagged product will be retained on the column while other bacterial proteins are washed off.
Of course, one issue with affinity chromatography of this type is getting your protein to bind to the column resin; the next is getting your protein back off the column and into solution so you can use it. There are a few approaches to doing this. First, if the tag added to the protein includes a specific protease cleavage site, the protease can be added to the column, thus freeing the protein of interest from the affinity tag as well as releasing the protein from the column. Secondly, if your protein is relatively stable in acidic conditions, lowering the pH to about 5 will cause the histidine tag side chains to become protonated and reverse the coordinate bonds to the metal. Thirdly, a relative excess of a small molecule which competes for binding to the column can be added to elute the intact tagged protein. For a histidine-tagged protein, imidazole at a concentration of 200 mM or greater will displace the histidine side chains from the metal. The imidazole may be removed from the resulting purified protein by dialysis.
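To see why lowering the pH to about 5 releases the tag, the fraction of protonated histidine side chains can be estimated with the Henderson-Hasselbalch relationship. This is a rough sketch: the pKa of 6.0 is an assumed textbook value for the free imidazole side chain, the actual pKa of a histidine within a protein can differ, and the function name `fraction_protonated` is hypothetical.

```python
# Assumed typical pKa of the histidine (imidazole) side chain; varies in real proteins.
HISTIDINE_PKA = 6.0


def fraction_protonated(ph: float, pka: float = HISTIDINE_PKA) -> float:
    """Henderson-Hasselbalch: fraction of side chains in the protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Compare a typical near-neutral binding pH with the acidic elution pH.
for ph in (7.5, 6.0, 5.0):
    print(f"pH {ph}: {fraction_protonated(ph):.1%} of histidine side chains protonated")
```

Under this assumption only a few percent of the side chains are protonated at pH 7.5, while roughly nine in ten are protonated at pH 5, and a protonated imidazole nitrogen no longer has a free electron pair to donate to the metal.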
Purpose: The purpose of this section is to provide a method for the researcher to obtain an amount of unknown protein suitable for subsequent study. Most proteins will be obtained via overexpression in bacteria using pET or pET-derived vectors; the emphasis herein is therefore on how to grow E. coli expressing foreign protein from such plasmids.
Experimental Design Considerations:
How would you get your protein cloned/inserted into a “vector”? OR What vector is your plasmid cloned in?
What features of that vector are important to protein expression and purification?
What antibiotic must be present (and at what concentration) as you grow bacteria containing your plasmid?
What affinity tag (if any) is present on your vector?
What will you use as a negative control - i.e., bacterial lysate that does not contain your protein?
How will you collect your bacteria after the growth/induction?
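For the antibiotic question, the volume of stock solution to add to a culture follows from a simple dilution calculation. The sketch below uses the working concentrations given in the supply list (ampicillin at 100 ug/mL, kanamycin at 50 ug/mL) and a 200 mL culture; the stock concentrations (100 mg/mL and 50 mg/mL) are assumptions, so check the label on your own stock. The function name `stock_volume_uL` is hypothetical.

```python
def stock_volume_uL(final_ug_per_mL: float, culture_mL: float, stock_mg_per_mL: float) -> float:
    """Volume of antibiotic stock (in microliters) needed for a culture.

    Total antibiotic needed = final_ug_per_mL * culture_mL   (micrograms).
    Dividing micrograms by a stock in mg/mL gives thousandths of a mL,
    which is microliters, so no extra conversion factor is needed.
    The small volume added is ignored when computing the final volume.
    """
    return final_ug_per_mL * culture_mL / stock_mg_per_mL


# Assumed stock concentrations; the working concentrations come from the supply list.
print(stock_volume_uL(100, 200, 100), "uL of ampicillin stock (assumed 100 mg/mL)")
print(stock_volume_uL(50, 200, 50), "uL of kanamycin stock (assumed 50 mg/mL)")
```

With a stock that is 1000 times the working concentration, the volume to add is simply one thousandth of the culture volume.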
Supplies:
Equipment:
Small (15 mL) culture tubes
1 L culture flasks (suitable for growing 200 mL cultures)
Centrifuge with bottles capable of centrifuging 200 mL volumes
LB-agar plates with appropriate antibiotic (ampicillin, 100 µg/mL, or kanamycin, 50 µg/mL)
Reagents & Solutions:
One or more pET or pMCSG based inducible vectors containing protein of interest
LB broth
Glycerol
Other:
Overnight Express TB (Novagen)
Competent BL21(DE3) E. coli
Procedure:
References:
Additional Information:
Much more background on the pET system, including theory, selection of appropriate expression vector, and methods, can be found in the pET system manual, https://drive.google.com/open?id=0B3TL2rx4Zfv1cnVxVUNMNFUxX28
Overnight Express TB (Novagen), order from EMD-Millipore, cat # 71491-3 (enough for 1L) or cat # 71491-4 (for 5L)
Competent BL21(DE3) E. coli, e.g. New England Biolabs cat #C2527I
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.