Cloning of eukaryotic genes in single-strand phage vectors: the human interferon genes


Gene, 27 (1984) 81-99
Elsevier

GENE 940

(Recombinant DNA; bacteriophage f1; leukocyte; fibroblast; lacUV5 promoter; Escherichia coli host; cDNA; DNA sequencing)

Donald W. Bowden, Jen-i Mao, Tina Gill, Kathy Hsiao, Jay S. Lillquist, Douglas Testa * and Gerald F. Vovis

Collaborative Research, Inc., 128 Spring St., Lexington, MA 02173 (U.S.A.) Tel. (617) 861-9700, and * Interferon Sciences, Inc., 783 Jersey Avenue, New Brunswick, NJ 08901 (U.S.A.) Tel. (201) 249-3232

(Received July 19th, 1983)
(Revision received October 10th, 1983)
(Accepted October 11th, 1983)

SUMMARY

Using oligonucleotide probes with defined sequences, we have selected clones from a human lymphocyte cDNA library which represent human leukocyte (HuIFN-α) and fibroblast (HuIFN-β) interferon gene sequences. Double-stranded f1 phage DNA was used as the vector for initial cloning of cDNA. Clones carrying interferon gene sequences were identified by hybridization with the oligonucleotide probes. The same oligonucleotide probes were used as primers for dideoxy chain termination sequencing of the clones. One HuIFN-α clone, 201, has a nucleotide sequence different from published HuIFN-α sequences. Under control of the lacUV5 promoter, the 201 gene has been used to express biologically active HuIFN-α in Escherichia coli.

INTRODUCTION

A wide variety of methods have been used to clone eukaryotic genes into prokaryotic systems. In this article we describe a method for cDNA cloning directly into the bacteriophage f1. Previously, the single-stranded phage cloning systems have been used primarily as second-stage cloning systems, i.e., for subcloning after the initial cDNA cloning in a double-stranded vector (Messing, 1981). The cloning system described here has a number of novel features which aid in the identification, DNA sequencing, storage and manipulation of cloned material.

As an example of eukaryotic genes cloned into f1, we have chosen the genes for human leukocyte interferon (HuIFN-α) and human fibroblast interferon (HuIFN-β). The interferons are a family of proteins which have the ability to confer a virus-resistant state upon their target cells (Isaacs and Lindenmann, 1970; Stewart, 1979). Several groups have cloned cDNAs for the HuIFN-α (Nagata et al., 1980a,b; Goeddel et al., 1980a) and HuIFN-β genes (Taniguchi et al., 1980; Goeddel et al., 1980b; Derynck et al., 1980). Analysis of the cDNAs for HuIFN-α revealed a family of 20 or more distinct, but related genes (Goeddel et al., 1981; Mantei et al., 1980; Streuli et al., 1980; Lawn et al., 1981b; Nagata et al., 1980b). In contrast, there is apparently only one gene coding for HuIFN-β (Ohno and Taniguchi, 1981; Lawn et al., 1981c; Degrave et al., 1981). We have found representatives of both gene types in a cDNA library generated from Sendai virus-induced lymphocyte mRNA.

Abbreviations: AMV, avian myeloblastosis virus; bp, base pairs; cDNA, DNA complementary to mRNA; CPE, cytopathic effect; DTT, dithiothreitol; HuIFN, human interferon (α, leukocyte; β, fibroblast); p, plasmid; p_lac, lacZpUV5 promoter; RF, replicative form of phage f1 DNA; SDS, sodium dodecyl sulfate; TY broth, see MATERIALS AND METHODS, section a; [ ], indicates plasmid carrier state.

0378-1119/84/$03.00 © 1984 Elsevier Science Publishers

MATERIALS AND METHODS

(a) Strains and media

E. coli strain JM101 [(F' traD36 proAB lacIZΔM15) in a Δ(lac-pro) supE thi background] was used as the bacteriophage f1 host. Transfections were carried out using E. coli strain BNN45 (hsdR- hsdM+ supE44 supF met thi) of Davis et al. (1982). Strain CGE43 [F' Δ(lac-pro) X111] (Davis et al., 1982) was used as a host for expression experiments. Plasmid pGL101 (Guarente et al., 1980) carries the E. coli lac promoter (UV5 allele) on a derivative of pBR322. This p_lac promoter on pGL101 is constitutively expressed. Rich medium for growth of bacteria was TY broth (10 g Bacto-tryptone, 8 g NaCl, 1 g yeast extract, 1 g glucose, and 60 mg NaOH per liter). Minimal medium was M9 as described by Miller (1972) with the addition of 0.5% casamino acids. Human peripheral blood leukocytes were induced with Sendai virus and the cells were harvested 4 h after induction.

(b) Enzymes, synthetic oligonucleotides, and special chemicals

T4 DNA ligase and T4 polynucleotide kinase were products of Collaborative Research, Inc. DNA polymerase I Klenow fragment was purchased from Boehringer-Mannheim. Restriction enzymes were purchased from New England Biolabs and used as recommended by the supplier. AMV reverse transcriptase was purchased from Life Sciences, Inc., St. Petersburg, FL. HindIII linkers (CCAAGCTTGG), oligo(dT)-cellulose, oligo(dT)12-18 and synthetic oligonucleotides were products of Collaborative Research, Inc.

(c) Isolation of poly(A)RNA

Total RNA (4 mg) was isolated from 3.6 g Sendai virus-induced lymphocytes by the method of Foster et al. (1980). In this method RNA is separated from a guanidinium salt cell homogenate by centrifugation through a cushion of 5.7 M CsCl. Poly(A)RNA (40 µg) was obtained by chromatography on a 1-ml oligo(dT)-cellulose column as described by Desrosiers et al. (1975).

(d) Synthesis and cloning of double-stranded cDNA

Double-stranded cDNA (about 0.1 µg) was synthesized from 25 µg poly(A)RNA, purified, and inserted via HindIII synthetic linkers into the DNA of filamentous phage f1 derivative CGF4 using the procedure described by Moir et al. (1982) to generate a cDNA library from calf stomach mRNA. These procedures generated a population of replicative form CGF4 DNA containing cDNA copies of lymphocyte mRNA. The cDNA was inserted at the engineered HindIII site (Moir et al., 1982) in the intergenic space of CGF4. This mixture of recombinant DNA was used to transform BNN45 to produce a library of phage plaques. The plaques produced by transformation were stored on TY plates.

(e) Plaque hybridization and sizing of cDNA inserts

Recombinant phage plaques were transferred to nitrocellulose filters (Schleicher and Schuell, Inc.) using the method of Benton and Davis (1977). Plaques were hybridized with [5'-32P]oligonucleotides by the method described by Wallace et al. (1981) (see Fig. 2). The size of insert DNA present in the recombinant phage was measured by electrophoresis of whole phage in agarose gels (Moir et al., 1982). Alternatively, double-stranded RFI phage DNA was produced using a modification of the isobutanol boiling method (Holmes and Quigley, 1981). The length of inserts in such DNA could subsequently be measured by standard restriction enzyme analysis.

(f) DNA isolation and DNA sequencing

Double-stranded RFI DNA was isolated from bacteriophage f1-infected bacteria by the method of Model and Zinder (1974). Single-stranded circular f1 phage DNA was prepared from plate stocks of f1 using the method described by Zinder and Boeke (1982). Phage f1 cDNA clones were screened by DNA sequencing using the dideoxy chain termination method of Sanger et al. (1977). Single-stranded circular phage DNAs carrying cDNA inserts which hybridized to interferon probes were used as templates for priming with synthetic interferon oligonucleotides in the dideoxy chain termination sequencing. The complete nucleotide sequence of clone 201 was obtained by using the sequencing method of Maxam and Gilbert (1980). Plasmid pBR322 and derivatives were purified by standard procedures.

(g) Construction of a 5'-coding sequence for the mature HuIFN-α type 201 gene

Before it was expressed in E. coli, the cloned 201 HuIFN-α gene was modified to delete the preinterferon leader coding sequence with the concomitant addition of an ATG initiation codon. The ATG was placed 5' to the trinucleotide sequence TGT (nucleotides 70-73 of Fig. 5) which codes for the first amino acid of mature interferon (Mantei et al., 1980; Nagata et al., 1980; Lawn et al., 1981a; Goeddel et al., 1981). Based on the fact that Sau3A cuts at the 3'-side of the TGT codon, a self-complementary oligonucleotide (ACACATCGATGTGT), which is recognized by ClaI and contains an ATG-TGT sequence, was synthesized.

Initially the 1237-bp cDNA insert was purified from phage 201 RFI DNA by agarose gel electrophoresis following HindIII digestion. A Sau3A fragment which contains the coding sequence for amino acid residues 2 to 61 (nucleotides 73 through 252, Figs. 4 and 5) was purified by digesting 30 µg of the 1237-bp fragment with 10 units Sau3A (4 h at 37°C, 50-µl volume). The appropriate Sau3A fragment was purified after polyacrylamide gel electrophoresis by electroelution, followed by phenol extraction and ethanol precipitation. The DNA pellet was suspended in H2O, and the cohesive ends were filled in by treating the DNA with DNA polymerase I Klenow fragment (30 min, room temperature; in 0.1 mM each nucleoside triphosphate, 66 mM Tris·HCl pH 7.5, 10 mM MgCl2 and 6.6 mM DTT). The oligonucleotide containing the ClaI site was ligated onto this blunt-ended Sau3A fragment by adding 5 µg of 32P-end-labelled oligonucleotide, ATP to 1 mM, and T4 DNA ligase to the previous reaction (Goodman and MacDonald, 1979), followed by overnight incubation at 17°C. This ligation restored the first codon of mature interferon, TGT, and placed an ATG initiation codon at its 5'-end.

This ligated DNA preparation was digested with ClaI. The DNA fragment containing the original Sau3A site and now with ClaI termini was purified by agarose gel electrophoresis and ligated into the ClaI site of pBR322 to produce pCGE32 (see Fig. 7). The resulting DNA was used to transform CGE43. The plasmid DNA in ampicillin-resistant colonies was isolated and characterized by restriction enzyme digestion. One transformant with a DNA fragment containing the interferon coding sequence for amino acids 1 to 61 (nucleotides 70-252, Figs. 4 and 5) in the ClaI site of pBR322 was designated CGE79, and the resident plasmid pCGE32. The structure of this plasmid is shown in Fig. 7.

(h) Construction of an expression vector for mature HuIFN-α type 201 under control of the lacUV5 promoter

A plasmid designed to express mature HuIFN-α type 201 under control of the E. coli lacUV5 promoter (p_lac) was constructed by the four-molecule ligation outlined in Fig. 7. The plasmid pGL101 was digested with PvuII and PstI, and the DNA fragment containing p_lac was purified using agarose gel electrophoresis. The plasmid pBR322 was digested with PstI and HindIII and the large DNA fragment containing the plasmid origin of replication (ori) was purified using agarose gel electrophoresis. The plasmid pCGE32 was first digested with HindIII, then treated with DNA polymerase I Klenow fragment as described in section g above to fill in the cohesive ends and produce blunt-ended termini. The resulting DNA was digested with EcoRI and the short HindIII-EcoRI DNA fragment containing the 5'-coding sequence for mature interferon (ATG TGT...; bp 70-181 of Figs. 4 and 5) was purified using agarose gel electrophoresis. The 1237-bp fragment containing the entire 201 cDNA insert was purified after HindIII digestion of the RFI DNA. This DNA fragment was subjected to a partial digest with EcoRI and the EcoRI-HindIII fragment containing the 3'-coding region of interferon (bp 182-1202 of Figs. 4 and 5) was purified by agarose gel electrophoresis.

The four different fragments were pooled and ligated with T4 DNA ligase in the manner described above (MATERIALS AND METHODS, section g). After ligation the DNA was used to transform CGE43. Transformants were selected by growth on ampicillin. The plasmid DNA of ampicillin-resistant colonies was isolated and analyzed by restriction enzyme digestion. The nucleotide sequence of three different plasmids was determined in the junction region of the promoter and mature interferon coding sequence to ensure that the ATG for initiation of translation was in the correct position. The three plasmids were designated pCGE35, pCGE36, and pCGE37 (Fig. 7) and the strains in which they were grown CGE88, CGE89, and CGE90, respectively.

(i) Preparation of extracts and assay for interferon activity in E. coli extracts

Bacteria were grown in 50-ml cultures of TY broth with 20 µg/ml ampicillin to a cell density corresponding to 100 as measured by a Klett-Summerson colorimeter. The cells were collected by centrifugation, washed once with 10 ml M9 medium and resuspended in 0.6 ml of a buffer containing 0.01 M Tris·HCl pH 7.5, 0.05 M NaCl, 0.5 mM EDTA, 5% (v/v) glycerol. The resuspended cells were frozen at -20°C overnight. The cells were thawed by brief exposure to room temperature and 0.05 ml egg white lysozyme (20 mg/ml) was added to the mixture. The mixture was incubated at 4°C for 1 h, followed by a brief sonication to reduce the viscosity of the lysed bacterial suspension. The mixture was clarified by centrifugation, and the supernatant assayed for IFN-α activity. Interferon titers in bacterial extracts were determined by comparison with the NIH HuIFN-α standards using the CPE assay. The indicator cells, WISH, were challenged with vesicular stomatitis virus.
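The properties claimed for the two synthetic linkers in sections b and g can be checked mechanically. The following is a minimal sketch, not part of the original methods: the helper names are hypothetical, and the HindIII (AAGCTT) and ClaI (ATCGAT) recognition sequences are the standard ones rather than values stated in this paper.

```python
# Sketch: sanity-check the synthetic linkers named in MATERIALS AND METHODS.
# Helper names are illustrative; recognition sites are the standard ones.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_self_complementary(seq: str) -> bool:
    """True if the sequence is identical to its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


HINDIII_LINKER = "CCAAGCTTGG"    # section b
CLAI_ADAPTER = "ACACATCGATGTGT"  # section g
HINDIII_SITE = "AAGCTT"          # standard HindIII recognition sequence
CLAI_SITE = "ATCGAT"             # standard ClaI recognition sequence
START_PLUS_CYS = "ATGTGT"        # ATG initiation codon followed by TGT (Cys)

if __name__ == "__main__":
    for name, oligo, site in (
        ("HindIII linker", HINDIII_LINKER, HINDIII_SITE),
        ("ClaI adapter", CLAI_ADAPTER, CLAI_SITE),
    ):
        print(name, oligo,
              "self-complementary:", is_self_complementary(oligo),
              "contains site:", site in oligo)
    print("ClaI adapter carries ATG-TGT:", START_PLUS_CYS in CLAI_ADAPTER)
```

Because both oligonucleotides equal their own reverse complement, each anneals to itself to form a blunt-ended duplex, which is what allows it to be ligated onto blunt-ended or filled-in DNA.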

RESULTS

(a) Construction of an f1-lymphocyte cDNA library

Poly(A)RNA was isolated from Sendai virus-induced lymphocytes (MATERIALS AND METHODS, sections a and c) and used as a substrate for the synthesis of double-stranded cDNA (MATERIALS AND METHODS, section d). Following addition of HindIII linkers (MATERIALS AND METHODS, section d), this cDNA was ligated to phage CGF4 double-stranded DNA which had been cut with HindIII and treated with calf alkaline phosphatase. A portion of the ligated DNA was used to transfect E. coli strain BNN45 (MATERIALS AND METHODS, section d). A library of approx. 8500 phage plaques was generated on TY plates. Control experiments indicated that 90% of the plaques carried cDNA inserts.

(b) Identification of cDNA clones carrying HuIFN-α and HuIFN-β DNA sequences

The CGF4-cDNA library was screened by hybridization, using 32P-labeled oligonucleotide probes (MATERIALS AND METHODS, section e). The nucleotide sequences of the probes are shown in Fig. 1. The sequence for the 18-base HuIFN-α gene probe, 18A, was chosen to be complementary to a sequence conserved in the HuIFN-α gene family (Goeddel et al., 1981). The sequence for the HuIFN-β probe, 18B, was chosen to be complementary to a region of similar sequence in the HuIFN-β gene (Taniguchi et al., 1980). These probes were hybridized to nitrocellulose filter replicas of the CGF4-cDNA library (MATERIALS AND METHODS, section e). These filters were used to expose X-ray film. Representative results are shown in Fig. 2. Two replica filters were made from each plate. One was probed with the HuIFN-α probe 18A and the other filter was probed with the HuIFN-β probe 18B. The two probes hybridized with different plaques on the replica filters, even though the two probes differed in only four of eighteen nucleotides. Approx. 450 cDNA-containing plaques were present on each plate, but only a small number of plaques were selected by the probes. This result indicated the method was very selective. About 0.6% of the plaques hybridized with the 18A probe and 0.1% of the plaques hybridized to the 18B probe, suggesting both HuIFN-α and HuIFN-β gene sequences were present in the cDNA library.

Fig. 1. Sequences of oligonucleotide probes. DNA sequences of HuIFN-α probe 18A and HuIFN-β probe 18B. The boxes mark nucleotides which differ.

Fig. 2. Hybridization of cDNA recombinant plaques with oligonucleotide probes. Plaque transfers and DNA denaturation were performed as described in MATERIALS AND METHODS, section e. Nitrocellulose filters were hybridized with 5'-32P-labeled oligonucleotide probes (approx. 10⁸ cpm/µg, 10⁷ cpm/filter) overnight at 50°C. Hybridization buffer was 0.05 M sodium phosphate, 0.9 M NaCl, 5 mM Na2·EDTA, 0.1% SDS, 0.1% pyrophosphate, 0.02% bovine serum albumin, 0.02% polyvinylpyrrolidone, 0.02% Ficoll, pH 7.0. Filters were washed with 2 × SSPE buffer (0.36 M NaCl, 0.02 M NaPO4, 2 mM Na2·EDTA, pH 7.0), air dried, and placed against X-ray film for autoradiography. Autoradiographic replicas of the filters are shown. Filters 15 and 16 were probed with 18A (left) or 18B (right). The numbers classify plaques which hybridized with the labeled probes (200-299 selected by the HuIFN-α probe; 300-399 selected by the HuIFN-β probe) and consequently exposed the X-ray film.

(c) Characterization of clones by DNA sequencing

DNA from plaques selected by the oligonucleotide probes was purified and partially sequenced (MATERIALS AND METHODS, section f). Using the oligonucleotide probes as primers, the single-stranded DNAs were used as templates for sequencing by the dideoxy chain termination method of Sanger et al. (1977). Initially, four plaques selected by the HuIFN-α probe 18A (designated 201, 202, 203 and 204) were chosen for DNA sequencing. Short stretches of sequence determined from these experiments are shown in Fig. 3A. The sequences derived from clones 203 and 204 were identical. The sequences from 201, 202 and 203 corresponded to three published nucleotide sequences designated HuIFN-α type B, A and D respectively by Goeddel et al. (1981). Using the same technique, a clone selected by the 18B probe was shown to carry a cDNA insert corresponding to the HuIFN-β gene. Figure 3B shows a portion of the nucleotide sequence determined from cDNA clone 304 using the 18B probe as a primer. Comparison of the 304 sequence with the published sequence for HuIFN-β (Taniguchi et al., 1980) indicated the clone 304 is a HuIFN-β sequence.

The oligonucleotide probes 18A or 18B can only select half of the phage plaques which carry interferon-related sequences. Since the HindIII linker-tailed cDNA could have inserted into the original CGF4 vector in either orientation, the mature single-stranded phage DNA carries only one strand of the cDNA insert. Interferon clones inserted in only one orientation can be selected by the probes because only the phage which contained single-stranded DNA complementary to the probes will hybridize. Interferon clones carrying DNA inserts in the opposite orientation would not carry sequences complementary to the probes, and consequently would not hybridize. To find this other class of clones, nick-translated, 32P-labeled clone 201 RFI (double-stranded) DNA was used to rescreen the nitrocellulose filters by the plaque-filter method of Benton and Davis (1977). This second screening of the filters identified additional plaques not found with the oligonucleotide probe 18A since both strands of interferon sequence of clone 201 were available for hybridization. As expected, about half of the plaques selected by the nick-translated probe were previously identified by the 18A probe. After plaque purification and rescreening with the radiolabeled probes, 50 plaques carrying HuIFN-α interferon-related sequences were obtained. Ten plaques carrying HuIFN-β interferon sequences were isolated.

[Fig. 3 sequence display: (A) HuIFN-α clones 201, 202 and 203, primed with 18A; (B) HuIFN-β clone 304, primed with 18B.]

Fig. 3. DNA sequences of recombinant plaques. Recombinant f1 plaques were selected and sequenced using the dideoxy termination method as described in the text. (A) The limited sequences could be used to assign HuIFN-α types using the boxed nucleotides as indicators. (B) The determined sequence of 304 is identical to the corresponding region of the published HuIFN-β sequence (Taniguchi et al., 1980).

(d) Characterization of the interferon clones

Phage selected by the 18A or 18B probes were sized by measuring their electrophoretic mobility (MATERIALS AND METHODS, section e) and DNA insert lengths were determined by agarose gel electrophoretic analysis of HindIII restriction endonuclease-digested RFI DNA molecules (MATERIALS AND METHODS, section e). Twelve of the 50 HuIFN-α clones carried cDNA inserts of over 800 bp in length. This observation suggested that these recombinants might contain full-length HuIFN-α genes. All of the HuIFN-β clones carried inserts of less than 500 bp. This observation suggested that these recombinant phage do not contain full-length HuIFN-β genes. At this time we have not further characterized the HuIFN-β clones. One HuIFN-α clone, phage 201, was selected for further study.

(e) Characterization of clone 201

The initial sequence information obtained by the dideoxy chain termination method suggested that the clone 201 was identical to the HuIFN-α type B clone of Goeddel et al. (1981). More detailed analysis of the insert DNA in 201 revealed significant differences between the 201 sequence and the HuIFN-α type B. Initially it was observed that the 201 insert DNA was approx. 1200 bp long, whereas the published B sequence is 1042 bp long (Goeddel et al., 1981). A detailed restriction enzyme map, shown in Fig. 4, was generated for the 201 insert DNA. The 201 insert contains four Sau3A restriction enzyme sites, while the published B sequence contains five Sau3A sites. The "missing" site is in the region of 201 corresponding to the protein coding sequence at nucleotide 371 of the published B sequence (Goeddel et al., 1981) (nucleotide 369 of the 201 sequence shown in Fig. 5). In addition an AccI site found in the 3'-noncoding region of the B sequence is not present in the 201 sequence. Finally, two sites for HgiAI were found in the 3'-noncoding region of 201, while only one HgiAI site is present in the published B sequence. These differences in the restriction enzyme map suggested the 201 clone might represent a different

Fig. 4. A map of restriction enzyme cleavage sites in the cDNA insert of recombinant phage 201. Nucleotides are numbered beginning with the A of the preinterferon ATG initiation codon. The direction is 5' to 3', left to right. The restriction enzyme site shown in parentheses is present at additional, unmarked sites in the cDNA insert. The marked site was used for DNA sequencing. The horizontal arrows mark the length of DNA sequence determined from a restriction fragment labeled at the site marked by a vertical line associated with each arrow. The dotted horizontal arrow indicates the sequence determined by the dideoxy chain termination method (see MATERIALS AND METHODS, section f). The solid and dashed horizontal arrows denote the sequences determined by the method of Maxam and Gilbert (1980). Solid arrows denote 5'-end labeling with T4 polynucleotide kinase and dashed arrows denote 3'-end labeling with cordycepin.

[Fig. 5 sequence display: coding region of the 201 insert, numbered from the A of the preinterferon ATG; the flanking 5'- and 3'-noncoding sequence is not reproduced.]

   1  ATG GCC TTG ACT TTT TAT TTA CTG GTC GCC CTA GTG GTG
      met ala leu thr phe tyr leu leu val ala leu val val

  40  CTC AGC TAC AAG TCA TTC AGC TCT CTG GGC TGT GAT CTG CCT CAG ACT CAC AGC CTG GGT AAC AGG AGG GCC TTG ATA CTC
      leu ser tyr lys ser phe ser ser leu gly cys asp leu pro gln thr his ser leu gly asn arg arg ala leu ile leu

 121  CTG GCA CAA ATG CGA AGA ATC TCT CCT TTC TCC TGC CTG AAG GAC AGA CAT GAC TTT GAA TTC CCC CAG GAG GAG TTT GAT
      leu ala gln met arg arg ile ser pro phe ser cys leu lys asp arg his asp phe glu phe pro gln glu glu phe asp

 202  GAT AAA CAG TTC CAG AAG GCT CAA GCC ATC TCT GTC CTC CAT GAG ATG ATC CAG CAG ACC TTC AAC CTC TTC AGC ACA AAG
      asp lys gln phe gln lys ala gln ala ile ser val leu his glu met ile gln gln thr phe asn leu phe ser thr lys

 283  GAC TCA TCT GCT GCT TTG GAT GAG ACC CTT CTA GAT GAA TTC TAC ATC GAA CTT GAC CAG CAG CTG AAT GAC CTG GAG TCC
      asp ser ser ala ala leu asp glu thr leu leu asp glu phe tyr ile glu leu asp gln gln leu asn asp leu glu ser

 364  TGT GTG ATG CAG GAA GTG GGG GTG ATA GAG TCT CCC CTG ATG TAC GAG GAC TCC ATC CTG GCT GTG AGG AAA TAC TTC CAA
      cys val met gln glu val gly val ile glu ser pro leu met tyr glu asp ser ile leu ala val arg lys tyr phe gln

 445  AGA ATC ACT CTA TAT CTG ACA GAG AAG AAA TAC AGC TCT TGT GCC TGG GAG GTT GTC AGA GCA GAA ATC ATG AGA TCC TTC
      arg ile thr leu tyr leu thr glu lys lys tyr ser ser cys ala trp glu val val arg ala glu ile met arg ser phe

 526  TCT TTA TCA ATC AAC TTG CAA AAA AGA TTG AAG AGT AAG GAA TGA
      ser leu ser ile asn leu gln lys arg leu lys ser lys glu

Fig. 5. The nucleotide and derived amino acid sequence of preinterferon encoded by the cDNA insert of phage 201. The amino acids are numbered consecutively from the initial methionine of the signal sequence. Nucleotides are numbered as in Fig. 4. Points where the coding sequence of 201 and the HuIFN-α type B of Goeddel et al. (1981) differ are noted (see RESULTS, section e). Transitions and transversions are noted by a symbol and are shown as the base which replaces a base found in the 201 sequence. Bases which are deleted in the type B sequence are noted by a symbol, and insertions are noted by a symbol with the inserting base.

HuIFN-α gene rather than an allele of the HuIFN-α type B gene. To verify this possibility the entire 201 insert DNA was sequenced using the strategy outlined in Fig. 4. The sequence, shown in Fig. 5, demonstrated that the 201 clone is a cDNA copy of a HuIFN-α gene distinct from the type B gene of Goeddel et al. (1981). The 201 gene has a 5'-noncoding sequence 12 bp longer, and a 3'-noncoding sequence 177 bp longer, than the published HuIFN-α type B. The total length of the DNA insert is 1237 bp which includes an open reading frame of over 600 nucleotides. This sequence codes for a preinterferon leader peptide 23 amino acids in length and an interferon polypeptide of 166 amino acids. Such a structure is consistent with published interferon structures (Mantei et al., 1980; Nagata et al., 1980; Lawn et al., 1981a; Goeddel et al., 1981). While no poly(A) was found at the 3'-end of the clone, there are two potential poly(A) addition sites (ATTAAA) (Proudfoot and Brownlee, 1976) present. One begins at nucleotide 920, while the other begins at nucleotide 993 of the sequence. The absence of a poly(A) sequence at the 3'-end of the 201 clone is not unusual. In a number of cases cDNA clones derived from oligo(dT)-cellulose purified mRNA have been found to lack 3'-poly(A) sequences (Moir et al., 1982; Streuli et al., 1980; Seeburg et al., 1983; Parnes et al., 1981).

Within the sequence common to both 201 and HuIFN-α B, 201 contains six single-base additions and five single-base deletions, relative to HuIFN-α B. In addition there are thirteen single-base changes within the sequence common to both: four transversions and nine transitions. All but three of these 24 differences are in the noncoding region. Within the coding region the two genes differ from each other by one insertion, one deletion, and one transversion. The 201 clone has a cytosine residue at nucleotide position 22 (Fig. 5) while the B type clone has an adenosine residue. This region of the sequence is thought to code for the signal sequence of the interferon protein. The C residue of 201 is part of a CTG triplet coding for leucine, while the A residue of the type B sequence is part of an ATG triplet coding for methionine.

The most striking difference between the two related HuIFN-α sequences is the one-base deletion and subsequent one-base insertion found in the coding sequence of 201. There is an insertion of a G residue in the 201 sequence (as compared to the B sequence) between nucleotides 371 and 373 of Fig. 5, and deletion of one base following nucleotide 359 (as compared to the B clone). Fig. 6 compares the nucleotide sequences in the B and 201 clones and shows the derived protein sequences coded by these nucleotides. In addition Fig. 6 compares these sequences to another HuIFN-α gene designated D by Goeddel et al. (1981). The one-base deletion in 201 changes a GAA triplet to GAG, but does not alter the amino acid, glutamic acid, at this position. A cysteine is conserved in this region for all three sequences, while the insertion in 201 places its coding sequence back in frame with B and D. The protein sequences immediately before and after the area of the deletion-insertion are the same for B, D and 201. Although the 201 nucleotide sequence codes for different amino acids in the region between the insertion and deletion, the interferon protein coded for by this DNA sequence is biologically active (see section f, below).
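The bookkeeping in the comparison above (six additions, five deletions, four transversions and nine transitions) and the two codon statements can be verified with a few lines of code. This is an illustrative sketch only: the function and variable names are hypothetical, and the codon dictionary is a four-entry excerpt of the standard genetic code covering just the triplets mentioned in the text.

```python
# Sketch: classify a base substitution and tally the 201 vs. type B differences.
# Names are illustrative; counts are those reported in RESULTS, section e.

VALID_BASES = frozenset("ACGT")
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine) for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in VALID_BASES or alt not in VALID_BASES or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Excerpt of the standard genetic code: only the codons discussed in the text.
CODON_EXCERPT = {"CTG": "Leu", "ATG": "Met", "GAA": "Glu", "GAG": "Glu"}

# Differences between 201 and HuIFN-alpha type B, as counted in the text.
DIFFERENCES = {"additions": 6, "deletions": 5, "transversions": 4, "transitions": 9}
CODING_REGION_DIFFERENCES = 3  # one insertion, one deletion, one transversion

if __name__ == "__main__":
    total = sum(DIFFERENCES.values())
    print("total differences:", total)
    print("noncoding differences:", total - CODING_REGION_DIFFERENCES)
    # Position 22: C in clone 201, A in type B.
    print("C vs. A at position 22:", classify_substitution("C", "A"))
    print("CTG ->", CODON_EXCERPT["CTG"], "| ATG ->", CODON_EXCERPT["ATG"])
    print("GAA and GAG synonymous:", CODON_EXCERPT["GAA"] == CODON_EXCERPT["GAG"])
```

The C/A difference at position 22 pairs a pyrimidine with a purine, consistent with the single coding-region transversion reported.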

[Fig. 6, upper drawing: nucleotide sequences of B (Goeddel et al., 1981) and 201 (this paper) from AAT GAC CTG onward, with the deletion and insertion marked.]

[Fig. 6, lower drawing: derived amino acid sequences.]

  B    (Goeddel et al., 1981)   ASN ASP LEU GLU VAL LEU CYS ASP GLN GLU
  D    (Goeddel et al., 1981)   ASN ASP LEU GLU ALA CYS VAL MET GLN GLU
  201  (this paper)             ASN ASP LEU GLU SER CYS VAL MET GLN GLU

Fig. 6. Comparison of the nucleotide and amino acid sequences of 201 and HuIFN-α type B. Upper drawing shows the one-base deletion and insertion. Lower drawing shows the derived amino acid sequences of HuIFN-α types 201, B, and D in the region of the suggested deletion-insertion. The boxes to the left and right indicate the amino acids at the points of the suggested deletion-insertion. The central box marks the conserved cysteine residue.

(f) Expression of active HuIFN-α in E. coli

To see if the HuIFN-α 201 DNA coded for a gene product with interferon activity, plasmids were constructed to express the coding sequence in E. coli under control of p_lac (MATERIALS AND METHODS, sections g and h; Fig. 7). To do this the preinterferon leader sequence was excised and replaced with a methionine initiation codon, ATG, on the 5'-side of the TGT encoding the putative amino-terminal cysteine of mature HuIFN-α interferon (Mantei et al., 1980; Nagata et al., 1980; Lawn et al., 1981a; Goeddel et al., 1981). Plasmids containing the entire coding sequence of mature HuIFN-α type 201 under control of p_lac were constructed by a four-molecule ligation described in MATERIALS AND METHODS, section h and illustrated in Fig. 7. DNA sequence analysis indicated that these plasmids, pCGE35, pCGE36, and pCGE37, had 9, 10, and 8 nucleotides respectively between the ribosome-binding site of p_lac and the ATG initiation codon of the interferon gene (not shown). When the strains carrying these plasmids were tested for their ability to produce active interferon the results shown in Table I were obtained. Strains carrying pCGE35 and pCGE36 both produced measurable levels of interferon, while the control strain, carrying pBR322, and CGE90, carrying pCGE37, produced no detectable interferon activity. At this time we do not know the reason for the precipitous drop in expression when the distance between the ribosome-binding site and the ATG is reduced to 8 nucleotides.
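The nucleotide coordinates quoted for the mature coding sequence follow directly from the 23-codon leader and the numbering convention of Fig. 4 (position 1 is the A of the preinterferon ATG). A minimal sketch of that arithmetic follows; the function name is hypothetical and the code assumes an uninterrupted reading frame.

```python
# Sketch: map a residue of mature interferon to its codon coordinates,
# numbering nucleotides from the A of the preinterferon ATG (Fig. 4).
from typing import Tuple

LEADER_CODONS = 23  # preinterferon leader length reported in RESULTS, section e


def mature_residue_to_nt(residue: int,
                         leader_codons: int = LEADER_CODONS) -> Tuple[int, int]:
    """Return (first, last) nucleotide of the codon for a mature-protein residue.

    Residue 1 is the amino-terminal cysteine of mature interferon.
    """
    if residue < 1:
        raise ValueError("residue numbering starts at 1")
    first = 3 * (leader_codons + residue - 1) + 1
    return first, first + 2


if __name__ == "__main__":
    print("residue 1 (Cys, TGT):", mature_residue_to_nt(1))
    print("residues 2-61 span:",
          mature_residue_to_nt(2)[0], "to", mature_residue_to_nt(61)[1])
    print("residue 166 (last):", mature_residue_to_nt(166))
```

Under these assumptions the TGT codon begins at nucleotide 70 and residues 2 to 61 span nucleotides 73 through 252, matching the coordinates given for the Sau3A fragment in MATERIALS AND METHODS, section g.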

TABLE I

Expression of HuIFN-α type 201 in E. coli hosts

Construction (a)      RBS-ATG (b) (bp)      IFN titer (c) (units/ml)
CGE43[pBR322]         -                     < 10
CGE88[pCGE35]         9                     4700
CGE89[pCGE36]         10                    9400
CGE90[pCGE37]         8                     < 10

(a) Plasmids were constructed as described in MATERIALS AND METHODS, section h, and in Fig. 7. The plasmids were transformed into strain CGE43 to produce the strains tested for HuIFN-α expression.
(b) Number of bp between the ribosome-binding site (Shine and Dalgarno, 1975) and the ATG initiation codon of mature interferon.
(c) Bacteria were grown and extracts prepared as described in MATERIALS AND METHODS, section i. Interferon titers are expressed in units of interferon per ml of bacterial extract.
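As a convenience, the relationship in Table I between RBS-ATG spacing and interferon titer can be tabulated programmatically. This is only a sketch: the values are transcribed from Table I, the strain label CGE88 is a reading of a garbled entry, and the names `TABLE_I`, `titer_by_spacing` and `describe` are hypothetical.

```python
# Rows transcribed from Table I:
# (strain, plasmid, RBS-ATG spacing in bp, IFN titer in units/ml).
# spacing None marks the pBR322 control; titer None marks "< 10".
TABLE_I = [
    ("CGE43", "pBR322", None, None),
    ("CGE88", "pCGE35", 9, 4700),   # strain label assumed from garbled text
    ("CGE89", "pCGE36", 10, 9400),
    ("CGE90", "pCGE37", 8, None),
]

DETECTION_LIMIT = 10  # units/ml, taken from the "< 10" entries


def titer_by_spacing(rows):
    """Map RBS-ATG spacing (bp) to titer, skipping the vector control."""
    return {spacing: titer
            for _, _, spacing, titer in rows
            if spacing is not None}


def describe(titer):
    """Format a titer, reporting undetectable values as '< limit'."""
    return f"< {DETECTION_LIMIT}" if titer is None else str(titer)


by_spacing = titer_by_spacing(TABLE_I)
for spacing in sorted(by_spacing):
    print(f"{spacing:>2} bp: {describe(by_spacing[spacing])} units/ml")

# Ratio of the 10-bp construct to the 9-bp construct.
ratio_10_to_9 = by_spacing[10] / by_spacing[9]
print(f"10 bp / 9 bp titer ratio: {ratio_10_to_9:.1f}")
```

The table has only three spacings, so this supports no more than the qualitative observation in the text: expression is detectable at 9 and 10 bp and undetectable at 8 bp.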

DISCUSSION

We have used engineered strains of bacteriophage f1 as primary vectors for the cloning, identification, and DNA sequencing of human interferon genes. We have previously used such vectors to clone the gene for calf chymosin (Moir et al., 1982). This method has a number of novel features which aid in the identification and manipulation of cloned DNA sequences. With 18-base oligonucleotide probes of defined sequence we were able to efficiently identify recombinants that make up 1% or less of the population of cDNA clones. Two 18-base oligonucleotide probes were used, one specific for HuIFN-α and the other for HuIFN-β. Although the probe sequences differed in only 4 of 18 possible sites, they selected different populations of clones. Wallace et al. (1981) have reported that mixed probes, differing in as little as one base, are able to discriminate between completely and partially complementary sequences. The 18-base oligonucleotide probes were also used as primers for rapidly determining the nucleotide sequences of selected recombinant phage using the dideoxy chain termination method (Sanger et al., 1977). Availability of defined oligonucleotide probes for interferon made screening by sequencing easy and rapid. The advantages of defined oligonucleotide probes can be multiplied by the use of "universal" primers for f1 (Vovis, G.F., unpublished) which make sequencing of any cDNA insert a simple task. Those clones selected by the probes 18A or 18B for which the sequence was investigated carried HuIFN sequences. Clones 201, 202, 203, and 204, which were selected by the 18A probe, carried sequences suggesting they were identical to published HuIFN-α gene sequences. Clone 304, which was selected by the 18B probe, contains a sequence identical to the published HuIFN-β sequence. Further analysis of the 201 clone indicated that it represented a gene whose sequence was related to, but different from, the published HuIFN-α type B (Goeddel et al., 1981).
In addition, at least one of the twelve clones which could carry a full-length HuIFN-α gene contained a nucleotide sequence which appeared to be related to, but was clearly different from, the published HuIFN-α sequences (not shown). The 201 DNA sequence, when compared to the

Fig. 7. Construction of plasmids to express HuIFN-α 201 gene in E. coli (MATERIALS AND METHODS, section h and RESULTS, section f for experimental details). Dashed arrows indicate the 5'-to-3' direction of transcription. The joints in the four-molecule ligation are (clockwise from pLac): PvuII (blunt) to blunt HindIII, EcoRI to EcoRI, HindIII to HindIII, and PstI to PstI. Abbreviations: HIII, HindIII; RI, EcoRI; ApR, ampicillin resistance.

published HuIFN-α type B of Goeddel et al. (1981), has a one-base deletion followed by a distal one-base addition twelve nucleotides away. The deletion-addition has the effect of putting the protein coding sequence out of, and then back into, the type B translational reading frame. Comparison of the 201 nucleotide sequence and published HuIFN-α sequences (Goeddel et al., 1981) suggests an evolutionary path. Goeddel et al. (1981) have suggested that HuIFN-α type B is descended from a parental type, such as HuIFN-α type D, by single-base insertion and deletion. The gene represented by 201 could have descended from HuIFN-α type B by a subsequent, additional, single-base deletion and insertion in the following manner:

D:     CTG GAA GCC TGT GTG ATA CAG
                |            |
             insert T     delete A
B:     CTG GAA GTCCTGT GTG AT  CAG
             |               |
          delete A        insert G
201:   CTG GA  GTCCTGT GTG ATG CAG

An additional point suggesting that the 201 gene evolved from the type B gene is the close homology between the 3'- and 5'-noncoding regions of 201 and B. There are five single-base additions, four single-base deletions, and twelve single-base changes in 502 nucleotides of noncoding sequence which the two genes have in common (comparing 201 to B). In spite of changes in the coding sequence induced by the deletion-insertion in the 201 gene, active HuIFN-α can be expressed in bacteria under control of pLac (Table I).

The advantages which come from using the single-stranded phage system are central to the utility of the method described in this paper. These advantages are summarized by Zinder and Boeke (1982). f1 is an easy phage to grow, manipulate and store. Plaques carrying inserted DNA can be added to broth, heated at 65°C and stored for long periods of time. Mature single-stranded circular phage DNA, a convenient template for the dideoxy sequencing method, is simply prepared from plate stocks of phage. Double-stranded f1 RF1 is prepared by methods analogous to those in use for preparation of plasmids such as pBR322. In addition, f1 RF1 can be manipulated and transfected in a manner analogous to plasmid vectors.

ACKNOWLEDGEMENTS

We thank Robert Breeze and Chris Goff for critically reading the manuscript. This investigation was supported by a contract with Interferon Sciences, Inc.

REFERENCES

Beaudoin, J.: Studies on coat protein mutants and on multiple length particles of coliphage M13. Ph.D. Thesis, Univ. of Wisconsin, Madison, WI, 1970.
Benton, W.D. and Davis, R.W.: Screening λgt recombinant clones by hybridization to single plaques in situ. Science 196 (1977) 180-182.
Davis, R.W., Botstein, D. and Roth, J.R.: Advanced Bacterial Genetics. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1980, p. 7.
Degrave, W., Derynck, R., Tavernier, J., Haegeman, G. and Fiers, W.: Nucleotide sequence of the chromosomal gene for human fibroblast (β1) interferon and of the flanking regions. Gene 14 (1981) 137-143.
Derynck, R., Content, J., De Clercq, E., Volckaert, G., Tavernier, J., Devos, R. and Fiers, W.: Isolation and structure of a human fibroblast interferon gene. Nature 285 (1980) 542-549.
Desrosiers, R.C., Friderici, K.H. and Rottman, F.M.: Characterization of Novikoff hepatoma mRNA methylation and heterogeneity in the methylated 5' terminus. Biochemistry 14 (1975) 4367-4374.
Foster, J.A., Rich, C.B., Fletcher, S., Karr, S.R. and Przybyla, A.: Translation of chick aortic elastin messenger ribonucleic acid. Comparison of elastin synthesis in chick aorta organ culture. Biochemistry 19 (1980) 857-864.
Goeddel, D.V., Yelverton, E., Ullrich, A., Heyneker, H.L., Miozzari, G., Holmes, W., Seeburg, P.H., Dull, T., May, L., Stebbing, N., Crea, R., Maeda, S., McCandliss, R., Sloma, A., Tabor, J.M., Gross, M., Familletti, P.C. and Pestka, S.: Human leukocyte interferon produced by E. coli is biologically active. Nature 287 (1980a) 411-416.
Goeddel, D.V., Shepard, H.M., Yelverton, E., Leung, D. and Crea, R.: Synthesis of human fibroblast interferon in E. coli. Nucl. Acids Res. 8 (1980b) 4057-4074.
Goeddel, D.V., Leung, D.W., Dull, T.J., Gross, M., Lawn, R.M., McCandliss, R., Seeburg, P.H., Ullrich, A., Yelverton, E. and Gray, P.W.: The structure of eight distinct cloned human leukocyte interferon cDNAs. Nature 290 (1981) 20-26.
Goodman, H.M. and MacDonald, R.J.: Cloning of hormone genes from a mixture of cDNA molecules, in Wu, R. (Ed.), Methods in Enzymology, Vol. 68, Academic Press, New York, 1979, pp. 75-90.
Guarente, L., Lauer, G., Roberts, T.M. and Ptashne, M.: Improved methods for maximizing expression of a cloned gene: a bacterium that synthesizes rabbit β-globin. Cell 20 (1980) 543-553.
Holmes, D.S. and Quigley, M.: A rapid boiling method for the preparation of bacterial plasmids. Anal. Biochem. 114 (1981) 193-197.
Isaacs, A. and Lindenmann, J.: Virus interference I. The interferon. Proc. Roy. Soc. B147 (1957) 258-267.
Lawn, R.M., Gross, M., Houck, C.M., Franke, A.E., Gray, P.W. and Goeddel, D.V.: DNA sequence of a major human leukocyte interferon gene. Proc. Natl. Acad. Sci. USA 78 (1981a) 5435-5439.
Lawn, R.M., Adelman, J., Dull, T.J., Gross, M., Goeddel, D. and Ullrich, A.: DNA sequence of two closely linked human leukocyte interferon genes. Science 212 (1981b) 1159-1162.
Lawn, R.M., Adelman, J., Franke, A.E., Houck, C.M., Gross, M., Najarian, R. and Goeddel, D.V.: Human fibroblast interferon gene lacks introns. Nucl. Acids Res. 9 (1981c) 1045-1052.
Mantei, N., Schwarzstein, M., Streuli, M., Panem, S., Nagata, S. and Weissmann, C.: The nucleotide sequence of a cloned human leukocyte interferon cDNA. Gene 10 (1980) 1-10.
Maxam, A. and Gilbert, W.: Sequencing end-labeled DNA with base-specific chemical cleavages, in Grossmann, L. and Moldave, K. (Eds.), Methods in Enzymology, Vol. 65, Academic Press, New York, 1980, pp. 499-560.
Messing, J.: M13mp2 and derivatives: A molecular cloning system for DNA sequencing, strand specific hybridization, and in vitro mutagenesis, in Walton, A.G. (Ed.), Proceedings of the Third Cleveland Symposium on Macromolecules. Elsevier, Amsterdam, 1981, pp. 143-153.
Miller, J.H.: Experiments in Molecular Genetics. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1972, p. 431.
Model, P. and Zinder, N.D.: In vitro synthesis of bacteriophage f1 protein. J. Mol. Biol. 83 (1974) 231-251.
Moir, D., Mao, J., Schumm, J., Vovis, G., Alford, B.L. and Taunton-Rigby, A.: Molecular cloning and characterization of double-stranded cDNA coding for bovine chymosin. Gene 19 (1982) 127-138.
Nagata, S., Taira, H., Hall, A., Johnsrud, L., Streuli, M., Ecsodi, J., Boll, W., Cantell, K. and Weissmann, C.: Synthesis in E. coli of a polypeptide with human leukocyte interferon activity. Nature 284 (1980a) 316-320.
Nagata, S., Mantei, N. and Weissmann, C.: The structure of one of the eight or more distinct chromosomal genes for human interferon-α. Nature 287 (1980b) 401-408.
Ohno, S. and Taniguchi, T.: Structure of a chromosomal gene for human interferon β. Proc. Natl. Acad. Sci. USA 78 (1981) 5305-5309.
Parnes, J.R., Baruch, V., Felsenfeld, A., Ramanathan, L., Ferrini, U., Appella, E. and Seidman, J.G.: Mouse β2-microglobulin cDNA clones: A screening procedure for cDNA clones corresponding to rare mRNAs. Proc. Natl. Acad. Sci. USA 78 (1981) 2253-2257.
Proudfoot, N.J. and Brownlee, G.G.: 3' Non-coding region sequences in eukaryotic messenger RNA. Nature 263 (1976) 211-214.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Seeburg, P.H., Sias, S., Adelman, J., De Boer, H.A., Hayflick, J., Jhurani, P., Goeddel, D.V. and Heyneker, H.L.: Efficient bacterial expression of bovine and porcine growth hormones. DNA 2 (1983) 37-45.
Shine, J. and Dalgarno, L.: Determinant of cistron specificity in bacterial ribosomes. Nature 254 (1975) 34-38.
Stewart II, W.E.: The Interferon System. Springer Verlag, New York, 1979.
Streuli, M., Nagata, S. and Weissmann, C.: At least three human type α interferons: structure of α2. Science 209 (1980) 1343-1347.
Taniguchi, T., Ohno, S., Fujii-Kuriyama, Y. and Muramatsu, M.: The nucleotide sequence of human fibroblast interferon cDNA. Gene 10 (1980) 11-15.
Wallace, R.B., Johnson, M.J., Hirose, T., Miyake, T., Kawashima, E.H. and Itakura, K.: The use of synthetic oligonucleotides as hybridization probes, II. Hybridization of oligonucleotides of mixed sequence to rabbit β-globin DNA. Nucl. Acids Res. 9 (1981) 879-894.
Young, R.A. and Davis, R.W.: Efficient isolation of genes by using antibody probes. Proc. Natl. Acad. Sci. USA 80 (1983) 1194-1198.
Zinder, N.D. and Boeke, J.D.: The filamentous phage (Ff) as vectors for recombinant DNA - a review. Gene 19 (1982) 1-10.
Communicated by J. Carbon.