ELSEVIER
Virus Research 39 (1995) 299-309
Sequence analysis of the fiber genomic region of a porcine adenovirus predicts a novel fiber protein 1
Steven B. Kleiboeker *
Virology Swine Research Unit, National Animal Disease Center, USDA, Agricultural Research Service, P.O. Box 70, 2300 Dayton Avenue, Ames, Iowa 50010, USA
Received 5 May 1995; revised 24 July 1995; accepted 24 July 1995
Abstract
The complete nucleotide sequence of the putative fiber protein of a porcine adenovirus isolate, NADC-1, was determined. The coding sequence for the fiber protein was found to be 2112 nucleotides, predicting a 703 amino acid protein with a calculated molecular mass of 76,681 Da. The coding sequence is located between 86 and 92.5 map units. A polyadenylation signal was found 44 bases downstream of the stop codon. Northern hybridization analysis identified a band at 2.4 kb, which likely includes the primary transcript plus a tripartite leader (typically found with adenovirus transcripts) and a polyA(+) tail. The predicted NADC-1 fiber protein was found to have a tail domain comparable in size and sequence to most adenovirus fiber proteins, a comparatively short shaft region, and a head region that was more than twice as large as that of other adenovirus fiber proteins. Each of the three structural domains (tail, shaft, and head) of the predicted NADC-1 fiber protein was found to be most similar to the corresponding domain of a different mammalian adenovirus. The NADC-1 fiber head contained an RGD sequence, a motif that is found in the penton protein of other adenoviruses. Furthermore, the predicted amino acid sequence of the C-terminal half of the NADC-1 fiber head has significant homology to S-lectin proteins, a characteristic not shared with other adenovirus fiber proteins. Thus the predicted amino acid sequence of the NADC-1 fiber head is unique among adenoviruses for which the sequence of the fiber protein is known.
Keywords: Porcine adenovirus; Fiber protein; Sequence analysis
1 Disclaimer. No endorsements are herein implied. Brand names are necessary to report factually on available data; however, the USDA neither guarantees nor warrants the standards of the products, and the use of the names by the USDA implies no approval of the products to the exclusion of others that may also be suitable.
* Present address: African Swine Fever Virus Research Group, Plum Island Animal Disease Center, USDA, ARS, NAA, P.O. Box 848, Greenport, NY 11944-0848, USA. e-mail: [email protected].
Elsevier Science B.V. SSDI 0168-1702(95)00079-8
Adenoviruses are double-stranded DNA viruses that have been isolated from a large number of animal species, including swine. While the majority of adenovirus infections in swine are thought to be subclinical, porcine adenovirus (PAV) has been associated with encephalitis, pneumonia, kidney lesions, and diarrhea (Derbyshire, 1992). Restriction endonuclease cleavage patterns have been described for PAV serotypes 1, 2, and 3 (Garwes and Xuan, 1989; Reddy et al., 1993; Reddy et al., 1995a) and for PAV serotype 5 (Tuboly et al., 1995). Our laboratory has previously reported the restriction endonuclease site mapping and genomic cloning of NADC-1, a PAV isolate (made by workers at the National Animal Disease Center) that is likely to be a strain of PAV serotype 4 (Kleiboeker et al., 1993). We have also recently reported the location and nucleotide sequence of the early transcription region 3 (E3) (Kleiboeker, 1994) and early transcription region 1 (E1) (Kleiboeker, 1995) in this same PAV. The nucleotide sequence of the pVIII, putative E3 proteins, and fiber protein genes for the reference strain of PAV3 has also been determined (Reddy et al., 1995b). Proteins encoded in the E3 region are not required for adenovirus replication in cultured cells or during acute infections (Morin et al., 1987; Ginsberg et al., 1989), and several authors have reported inserting heterologous sequences into the E3 region and expressing these sequences in vivo (reviewed by Grunhaus and Horwitz, 1992), with the goal of using adenovirus as a vector for vaccination against other viral agents. The adenovirus fiber protein is considered to be the viral protein involved in high affinity binding to the host cell (Devaux et al., 1987). To further understand how adenoviruses interact with the host cell, we have initiated studies of the NADC-1 fiber protein. In the present study, the predicted amino acid sequence of the putative PAV fiber gene is reported.
NADC-1 was propagated in PK-15 (porcine kidney) cells. PK-15 cells were grown and maintained in Eagle's minimal essential medium F-15 supplemented with 0.25% lactalbumin hydrolysate, 10% fetal bovine serum, 1% L-glutamine, 1% sodium pyruvate, and 50 µg/ml gentamicin sulfate at 37°C, 5% CO2 in a humidified incubator. Cell monolayers that were 80% confluent were infected with NADC-1 at a m.o.i. of approximately 1.0. When 75% of cells showed c.p.e. (typically 6 to 8 days post infection), viral DNA was prepared from infected cells according to Hirt (1967), as modified by Shinagawa et al. (1983). Genomic clones were prepared and mapped as previously described (Kleiboeker et al., 1993). Based on the restriction endonuclease site map of NADC-1 (Kleiboeker et al., 1993), plasmid clones representing the Hind III H and I fragments (m.u. 81.3-86.3, and 86.3-88.4, respectively), Sph I B fragment (m.u. 80.2-96.2), Bam HI and Eco RI fragment (m.u. 81.8-87.3), and Kpn I A fragment (m.u. 63.8-90.1) were prepared for sequencing by Qiagen column purification (Qiagen, Inc.). Additional
subclones were generated using exonuclease III (Erase-a-Base system, Promega) and prepared for sequencing. Double-stranded nucleotide sequencing (Tabor and Richardson, 1987) with Taq polymerase and fluorescently labelled dideoxynucleotides (Applied Biosystems International, Prism system) (Sanger et al., 1977) was performed for both strands in duplicate for analysis with an Applied Biosystems 373A automated sequencer. Universal forward and universal reverse primers, along with oligonucleotide primers synthesized from generated sequence, were used in sequencing reactions.
Nucleotide sequence information was analyzed using Genesis software and Intelligenetics GeneWorks version 2.2 software. Comparisons of the nucleotide sequence and the predicted amino acid sequence were made using a nonredundant BLAST search and comparison (Altschul et al., 1990) of the EMBL, GenBank, Brookhaven Protein Data Bank, and SWISS-PROT databases from the National Center for Biotechnology Information BLAST e-mail server. Amino acid sequence alignments were performed using the GeneWorks version 2.2 software. For all amino acid sequence alignments reported (unless otherwise indicated), the cost to open a gap = 5, the cost to lengthen a gap = 25, the minimum diagonal offset = 5, and the maximum diagonal offset = 10.
Total cellular RNA was prepared from 25 cm2 flasks of control cells and from virus-infected cell cultures when the cultures were showing approximately 25, 50 and 100% c.p.e. At the appropriate time point, 3 ml of lysis buffer (5 M guanidine isothiocyanate, 50 mM Tris-HCl, 10 mM EDTA, and 5% (v/v) 2-mercaptoethanol) was added to each flask. After lysis was complete, the solution was layered on top of 1.2 ml of 5.5 M CsCl, followed by centrifugation at 133,000 × g for 18 h at 18°C. The RNA pellet was resuspended in 10 mM Tris-HCl, 1 mM EDTA, pH 7.4, and extracted twice with chloroform-butanol (4:1).
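The "predicted amino acid sequence" referred to above is a conceptual translation of the open reading frame. As a minimal sketch of that step (not the GeneWorks procedure itself, whose internals are not described here), the following Python translates the first 45 nucleotides of the putative fiber coding sequence, transcribed from positions 330-374 of Fig. 1, using the standard genetic code. The function and variable names are illustrative assumptions.

```python
# Standard genetic code (NCBI translation table 1), with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Nucleotides 330-374 as printed in Fig. 1 (hand-transcribed; an assumption
# insofar as the printed figure may contain typesetting errors).
ORF_START = "ATGAAGCGGTCCGTCCCGTCCGACTTTAACCCGGTGTACCCCTAT"

print(translate(ORF_START))  # first 15 residues of the deduced fiber protein
```

The result can be compared with the first 15 residues of the deduced sequence printed beneath the nucleotide sequence in Fig. 1 (M K R S V P S D F N P V Y P Y).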
Ten µg of total cellular RNA was separated in a 0.9% agarose (SeaKem LE, FMC), 2.2 M formaldehyde, 100 mM MOPS, 2 mM EDTA, pH 7.0 gel. Following electrophoresis, the RNA was blotted to Hybond-N (Amersham) by capillary action using 2 M ammonium acetate. Blots were UV crosslinked at 120,000 µJ/cm2 (Stratalinker model 1800, Stratagene) prior to hybridization experiments. An enhanced chemiluminescent system (ECL direct nucleic acid labeling and detection system, catalog number RPN.3000, Amersham) with Kodak X-OMAT AR radiographic film was used, following the supplier's recommendations, for detection following Northern blot hybridizations. The blot was probed with a cloned 1754 bp PCR fragment corresponding to nucleotides 687-2441 (see below) shown in Fig. 1.
The fiber head was cloned into the Bam HI and Xho I sites of the vector pcDNAI/Amp (Invitrogen) by performing the polymerase chain reaction (PCR) with the genomic NADC-1 Sph I B clone as the template. Primers used were primer 1: GCCGCGGATCCATG-ACCCTGTGGACGGGGCCT and primer 2: GCGCCCTCGAG-CTACAGTATCTGAGGGTA. Primer 1 corresponds to nucleotides 687-704 (Fig. 1), shown in bold, and has a 5' extension, shown before the hyphen, which contains a Bam HI restriction endonuclease site (underlined) and an ATG start codon. Primer 2 corresponds to nucleotides 2424-2441 (Fig. 1),
[Fig. 1. Nucleotide sequence of the NADC-1 fiber genomic region (nucleotides 1-2559) with the deduced amino acid sequence of the putative fiber protein (residues 1-703) printed beneath it. Labeled features: "TATA" box, start of fiber, stop of fiber, polyadenylation signal.]
shown in bold, and has a 5' extension, shown before the hyphen, which contains an Xho I restriction endonuclease site (underlined). For PCR, Taq polymerase (Perkin Elmer) was used, and 25 cycles were carried out at 94°C for 10 s, 50°C for 30 s, and 72°C for 30 s. After the final cycle the reactions were held at 72°C for 10 min. Following PCR, the reaction products were digested with Bam HI and Xho I (Gibco-BRL) and cloned into the pcDNAI/Amp vector using standard methodologies.
In vitro transcription and translation was performed using rabbit reticulocyte lysate and T7 RNA polymerase (TNT T7 Coupled Reticulocyte Lysate System, Promega, catalog number L4610) according to the supplier's instructions. Labeling was performed with [35S]-methionine (New England Nuclear). In vitro transcription and translation products were separated in a sodium dodecyl sulfate polyacrylamide gel (10% acrylamide) (Laemmli, 1970) and visualized by autoradiography using Kodak X-OMAT AR radiographic film.
The nucleotide sequence of NADC-1 was determined for the fiber genomic region, which is located between 86 and 92.5 map units (Fig. 1). The first 329 nucleotides in Fig. 1 code for what has been previously reported as the end of the NADC-1 E3 region (Kleiboeker, 1994). The proposed fiber protein begins at nucleotide 330 (Fig. 1) and extends for 2112 nucleotides, thus coding for a protein which is 703 amino acids long with a predicted molecular mass of 76,681 Da. A canonical polyadenylation signal (AATAAA) was found 44 bases downstream of the stop codon. The overall predicted size of the putative NADC-1 fiber protein is larger than that of many adenovirus fiber proteins, but smaller than that of the BAV3 fiber protein (Mittal et al., 1993). Northern hybridization analysis, using a double-stranded, cloned 1754 bp PCR fragment which corresponds to the region coding for the NADC-1 fiber head, identified a single band at 2.4 kb and a group of bands above 4.4 kb in total
Fig. 1. The nucleotide sequence from 85 to 92.5 m.u. of NADC-1 is shown. The sequence was submitted to GenBank and assigned accession number U25120. Below the nucleotide sequence is the deduced amino acid sequence of the putative fiber protein. For the nucleotide sequence, a potential TATA box and a polyadenylation signal are indicated. For the deduced amino acid sequence, N-linked glycosylation sites are underlined and myristylation sites are double underlined.
cellular RNA purified from virus-infected cells when 25%, 50%, and 100% of cells exhibited cytopathic effect (data not shown). As the cytopathic effect progressed, the band at 2.4 kb shifted down, possibly due to degradation or processing of this RNA species. The band at 2.4 kb likely represents the processed mRNA species for the fiber protein, which consists of a 2.1 kb open reading frame plus a 0.2 kb tripartite leader sequence, which is commonly seen with adenoviruses (Sharp, 1984), and a 0.1 to 0.2 kb poly A(+) tail. The bands located above 4.4 kb may represent colinear genomic transcripts or polyadenylated nuclear RNA which are produced during adenovirus transcription (Sharp, 1984).
Comparison of the predicted NADC-1 fiber amino acid sequence was made using a nonredundant BLAST search and comparison (Altschul et al., 1990) of the EMBL, GenBank, Brookhaven Protein Data Bank, and SWISS-PROT databases from the National Center for Biotechnology Information BLAST e-mail server. Although the nucleotide sequences coding for the fiber proteins of numerous adenovirus serotypes have been determined, only those which are similar in either size or sequence to the NADC-1 fiber sequence were further analyzed. As described by Green et al. (1983), the fiber protein of adenoviruses is likely to be separated into three domains, namely the tail (at the N-terminus), the shaft (in the middle), and the head (at the C-terminus). Each structural domain of the predicted NADC-1 fiber protein was found to be most similar to the corresponding domain of a different mammalian adenovirus. However, for each structural region, amino acid sequences from several different adenoviruses had similar levels of homology.
The NADC-1 fiber tail, which is located at the N-terminus of the protein, demonstrated strongest homology to the bovine adenovirus 3 (BAV3) fiber tail (Mittal et al., 1993) (Fig. 2A). The NADC-1 fiber tail was 54% identical to the BAV3 fiber tail over 37 amino acids.
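Percent identity figures such as the one above are a simple ratio of matching positions to the length of the compared region. The sketch below shows one common way to compute it from two pre-aligned sequences. It assumes that identity is counted over all alignment columns, gaps included, which may differ from the convention GeneWorks used, and the two short sequences are hypothetical toy strings, not the sequences of Fig. 2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must already be aligned (equal length, '-' marking gaps).
    Gap columns count toward the denominator but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: 6 matching columns out of 8.
toy_a = "MKRSVPSD"
toy_b = "MKRA-PSD"
print(percent_identity(toy_a, toy_b))

# For scale: 20 identical residues out of 37 would round to the 54% reported
# for the tail domain (the count of 20 is inferred, not stated in the text).
print(round(100 * 20 / 37))
```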
The NADC-1 fiber shaft was only one-third as long as many adenovirus shafts, consisting of six repeating domains, presented according to the structural notation of Green et al. (1983) (Fig. 3A). This is similar in length to the human adenovirus 3 (HAV3) fiber shaft, which also contains only 6 repeating domains (Signäs et al., 1985). In addition, the amino acid sequence of the NADC-1 fiber shaft was most similar to the HAV3 fiber shaft, with 28% of residues identical over 82 amino acids (Fig. 2B). However, several large gaps were present in the amino acid alignment of the HAV3 and NADC-1 fiber shafts.
The predicted fiber head of NADC-1 is much larger than that of other adenoviruses. For NADC-1, the fiber head is 584 amino acids long. The amino acid length of the fiber head of HAV-2 (Hérissé and Galibert, 1981; Hérissé et al.,
Fig. 2. The amino acid alignment for various regions of the NADC-1 fiber protein to other amino acid sequences is shown. In Panel A, the sequences shown correspond to residues 1-37 for NADC-1 and to 1-37 for the BAV3 fiber protein. In Panel B, the sequences shown correspond to residues 38-119 for NADC-1 and to 43-130 for the HAV3 fiber protein. In Panel C, the sequences shown correspond to residues 120-310 for NADC-1 and to 261-449 for the PAV3 fiber protein. For Panel C, the cost to open a gap = 1, the cost to lengthen a gap = 10, the minimum diagonal offset = 4, and the maximum diagonal offset = 10. In Panel D, the sequences shown correspond to residues 393-527 for NADC-1 and to 131-265 for galectin-3.
[Fig. 2. Amino acid alignments. Panel A: BAV3 fiber tail and NADC-1 fiber tail. Panel B: HAV3 fiber shaft and NADC-1 fiber shaft. Panel C: PAV3 fiber head and NADC-1 fiber head. Panel D: galactose-specific lectin and NADC-1 fiber head.]
[Fig. 3. Panel A: repeating domains 1-6 of the NADC-1 fiber shaft, arranged by structural position (A1-A3, a1-a5, B1-B3, b1-b4). Panel B: HAV5 penton and NADC-1 fiber sequences adjacent to the RGD motif.]
Fig. 3. In Panel A, the deduced amino acid sequence of the NADC-1 fiber shaft is presented according to the structural notation of Green et al. (1983). In this model positions A1, A3, a2, B1, and B3 are typically occupied by hydrophobic amino acids and position a5 is occupied by either proline (P) or glycine (G). For the NADC-1 sequence, hydrophobic amino acids are bracketed when located in the A1, A3, a2, B1, or B3 position. In Panel B, the amino acid sequence for the NADC-1 fiber protein from residues 331-363 is shown below the amino acid sequence of the HAV5 penton protein from residues 340-372. In both sequences, the RGD motif is underlined.
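The position of the RGD motif and the alanine and glutamate content of the region preceding it can be checked directly from the Panel B sequence. The following is a minimal sketch, assuming the NADC-1 residues 331-363 as transcribed by hand from the deduced sequence in Fig. 1 (so transcription errors are possible) and assuming that the 30-residue window reported in the text is the stretch immediately upstream of the motif. All names are illustrative.

```python
from collections import Counter

# NADC-1 fiber residues 331-363, hand-transcribed from the deduced sequence
# in Fig. 1 (an assumption: the printed figure is the only source used).
SEGMENT_START = 331
SEGMENT = "AAEEPPAPAEAEAPAPAEAEAEAEPPRKPPRGD"

# Locate the RGD motif and convert the string index to a residue number.
motif_index = SEGMENT.find("RGD")
rgd_residue = SEGMENT_START + motif_index

# Composition of the 30 residues immediately upstream of the motif
# (window choice is an assumption made to match the counts in the text).
upstream = SEGMENT[motif_index - 30:motif_index]
counts = Counter(upstream)

print("RGD begins at residue", rgd_residue)
for residue in ("A", "E"):
    share = 100 * counts[residue] / len(upstream)
    print(f"{residue}: {counts[residue]} of {len(upstream)} ({share:.1f}%)")
```

Under these assumptions the counts agree with the values given in the text for the NADC-1 fiber (11 alanine and 8 glutamate residues out of 30).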
1981) and HAV-5 (Chroboczek and Jacrot, 1987) is 183 and 182, respectively; that of BAV3 is 175 (Mittal et al., 1993); that of canine adenovirus 1 is 178 (Dragulev et al., 1991); and that of PAV3 is 188 (Reddy et al., 1995b).
Different regions of the predicted NADC-1 fiber head demonstrated strongest amino acid sequence similarity to different proteins. The N-terminus of the NADC-1 fiber head, which is indicated by the TLWT sequence (Green et al., 1983), was most similar to the PAV-3 fiber head (Reddy et al., 1995b), with 39.3% of residues identical over 191 amino acids (Fig. 2C). The C-terminus of the NADC-1 fiber protein, beginning with residue 393, was found to be similar to the S-lectin family of proteins (Fig. 2D). This region of the NADC-1 fiber head was found to be 33.3% identical over 135 amino acids to the C-terminus of galectin-3 (galactose-specific lectin 3) (Jia and Wang, 1988). Furthermore, in the conserved domains of the S-lectin family, the NADC-1 fiber head was found to have 21 of 44 amino acids identical to the consensus sequence. Galectin-3, as well as many other proteins to which this region of the NADC-1 fiber head showed similarity (data not shown), is thought to have a biological function involving carbohydrate binding. This function would be consistent with the putative role of the NADC-1 fiber protein in high affinity binding to the host cell.
An RGD motif was identified beginning at residue 361 in the fiber head. This
[Fig. 4. Autoradiogram, lanes 1 and 2, with molecular mass markers in kDa at the left; legible marker positions include 92.5, 46, 30, and 21.5.]
Fig. 4. An autoradiogram of [35S]-methionine labeled in vitro transcription and translation products is shown. Lane 1 contains products generated from the NADC-1 fiber head cloned into pcDNAI/Amp. Lane 2 contains products generated from the parent pcDNAI/Amp plasmid (negative control). The position of molecular weight markers is shown to the left.
motif is not found in other adenovirus fiber proteins, but is found in the penton base protein of HAV-2 and HAV-5 (Neumann et al., 1988). The RGD motif is found in a number of cell adhesion molecules, such as fibronectin and vitronectin, and is capable of mediating cell adhesion via a specific family of receptors termed integrins (Cheresh and Spiro, 1987). In addition, the region just upstream of the RGD motif in the NADC-1 fiber head had an amino acid composition (alanine- and glutamate-rich) similar to that of the region just downstream of the RGD motif in the HAV-2 and HAV-5 penton base (Fig. 3B). For the NADC-1 fiber protein, 11 of 30 residues (36.7%) were alanine (A) and 8 of 30 residues (26.7%) were glutamate (E). For the HAV-2 and HAV-5 penton, 12 of 29 residues (41.4%) were alanine and 7 of 29 residues (24.1%) were glutamate. The biological significance of these similarities, if any, is unknown.
In vitro transcription and translation of a cloned PCR fragment of the NADC-1 fiber head demonstrated that an open reading frame of the appropriate size was present (Fig. 4). The predicted molecular mass of the unprocessed fiber head protein (beginning with the T at residue 120 and extending to the stop codon after residue 703) is 63,386 Da. The top band in Fig. 4, lane 1 has an apparent molecular weight of approximately 63 kDa. Two additional bands are present below the band at 63 kDa. These products are likely generated from translation which initiated at the ATG codons located at nucleotides 852 and 921 (Fig. 1). Proteins initiated at
these start codons and extending to the stop codon have a predicted molecular mass of 57,907 and 55,383 Da, respectively.
In summary, the amino acid sequence of the predicted NADC-1 fiber protein head appears to be unique among adenoviruses for which the sequence of the fiber protein is known. The fiber head of NADC-1 is much larger than other known adenovirus fiber protein heads, and it has amino acid sequence similarity to other adenovirus fiber proteins in the first portion of the fiber head, to adenovirus penton proteins in the middle portion of the fiber head, and to S-lectin proteins in the C-terminus of the fiber head.
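The internal initiation sites proposed above can be cross-checked against the numbering of Fig. 1 with simple coordinate arithmetic. The sketch below assumes only values stated in the text (open reading frame starting at nucleotide 330, 2112 nucleotides long including the stop codon, 703 residues) and maps a nucleotide position to a residue number and codon phase. The function name and return convention are illustrative.

```python
ORF_START_NT = 330      # first nucleotide of the fiber ATG (Fig. 1 numbering)
ORF_LENGTH_NT = 2112    # coding sequence length, stop codon included
PROTEIN_LENGTH = 703    # residues in the predicted fiber protein


def residue_for_nucleotide(nt: int) -> tuple[int, int]:
    """Return (residue number, phase) for a nucleotide inside the ORF.

    Phase 0 means the nucleotide is the first base of its codon, i.e. a
    codon starting there is in frame with the fiber ORF.
    """
    offset = nt - ORF_START_NT
    if not 0 <= offset < ORF_LENGTH_NT:
        raise ValueError("nucleotide lies outside the fiber ORF")
    return offset // 3 + 1, offset % 3


orf_end_nt = ORF_START_NT + ORF_LENGTH_NT - 1
print("ORF spans nucleotides", ORF_START_NT, "to", orf_end_nt)

# Start of the fiber head clone and the two internal ATG codons from the text.
for nt in (687, 852, 921):
    residue, phase = residue_for_nucleotide(nt)
    product_length = PROTEIN_LENGTH - residue + 1
    print(f"nt {nt}: residue {residue}, phase {phase}, "
          f"product of {product_length} residues")
```

Under these assumptions, nucleotide 687 falls on the first base of codon 120, matching the stated start of the fiber head, and both internal ATG codons are in frame with the fiber reading frame, which is consistent with the interpretation of the smaller bands as products of internal initiation.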
Acknowledgements
The author wishes to thank Ms. Deborah Clouser for expert technical assistance, and Ms. Susan Ohlendorf for administrative assistance.
References
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403-410.
Cheresh, D.A. and Spiro, R.C. (1987) Biosynthetic and functional properties of an arg-gly-asp-directed receptor involved in human melanoma cell attachment to vitronectin, fibronectin, and von Willebrand factor. J. Biol. Chem. 262, 17703-17711.
Chroboczek, J. and Jacrot, B. (1987) The sequence of adenovirus fiber: similarities and differences between serotypes 2 and 5. Virology 161, 549-554.
Derbyshire, J.B. (1992) Adenovirus. In: A.D. Leman, B.E. Straw, W.L. Mengeling, S. D'Allaire and D.J. Taylor (Eds.), Diseases of Swine (7th ed.), Iowa State University Press, Ames, Iowa, pp. 225-227.
Devaux, C., Belin, M.-T., Caillet-Boudin, M.-L. and Boulanger, P. (1987) Crystallization, enzymatic cleavage, and the polarity of the adenovirus type 2 fiber. Virology 161, 121-128.
Dragulev, B.P., Sira, S., Abouhaidar, M.G. and Campbell, J.B. (1991) Sequence analysis of putative E3 and fiber genomic regions of two strains of canine adenovirus type 1. Virology 183, 298-305.
Garwes, D.J. and Xuan, H. (1989) Genome typing of three serotypes of porcine adenovirus. Intervirology 30, 234-236.
Ginsberg, H.S., Lundholm-Beauchamp, U., Horstwood, R.L., Pernis, B., Wold, W.S.M., Chanock, R.M. and Prince, G.A. (1989) Role of early region 3 (E3) in pathogenesis of adenovirus disease. Proc. Natl. Acad. Sci. USA 86, 3823-3827.
Green, N.M., Wrigley, N.G., Russel, W.C., Martin, S.R. and McLachlan, A.D. (1983) Evidence for a repeating cross-β sheet structure in the adenovirus fibre. EMBO J. 2, 1357-1365.
Grunhaus, A. and Horwitz, M.S. (1992) Adenoviruses as cloning vectors. Seminars Virol. 3, 237-252.
Hérissé, J. and Galibert, F. (1981) Nucleotide sequence of the EcoRI E fragment of adenovirus 2 genome. Nucleic Acids Res. 9, 1229-1249.
Hérissé, J., Rigolet, M., Dupont De Dinechin, S. and Galibert, F. (1981) Nucleotide sequence of adenovirus 2 DNA fragment encoding for the carboxylic region of the fiber protein and the entire E4 region. Nucleic Acids Res. 9, 4023-4042.
Hirt, B. (1967) Selective extraction of polyoma DNA from infected mouse cell cultures. J. Mol. Biol. 26, 365-369.
Jia, S. and Wang, J.L. (1988) Carbohydrate binding protein 35: Complementary DNA sequence reveals homology with the proteins of the heterogeneous nuclear RNP. J. Biol. Chem. 263, 6009-6011.
Kleiboeker, S.B. (1994) Sequence analysis of putative E3, pVIII and fiber genomic regions of a porcine adenovirus. Virus Res. 31, 17-25.
Kleiboeker, S.B. (1995) Identification and sequence analysis of the E1 genomic region of a porcine adenovirus. Virus Res. 36, 259-268.
Kleiboeker, S.B., Seal, B.S. and Mengeling, W.L. (1993) Genomic cloning and restriction site mapping of a porcine adenovirus: demonstration of genomic stability in a porcine adenovirus. Arch. Virol. 133, 357-368.
Laemmli, U.K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature (London) 227, 680-685.
Mittal, S.K., Prevec, L., Babiuk, L.A. and Graham, F.L. (1993) Sequence analysis of bovine adenovirus type 3 early region 3 and fibre protein genes. J. Gen. Virol. 74, 2825.
Morin, J.E., Lubeck, M.D., Barton, J.E., Conley, A.J., Davis, A.R. and Hung, P.P. (1987) Recombinant adenovirus induces antibody response to hepatitis B virus surface antigen in hamsters. Proc. Natl. Acad. Sci. USA 84, 4626-4630.
Neumann, R., Chroboczek, J. and Jacrot, B. (1988) Determination of the nucleotide sequence for the penton-base gene of human adenovirus type 5. Gene 69, 153-157.
Reddy, P.S., Nagy, É. and Derbyshire, J.B. (1993) Restriction endonuclease analysis and molecular cloning of porcine adenovirus type 3. Intervirol. 36, 161-168.
Reddy, P.S., Tuboly, T., Nagy, É. and Derbyshire, J.B. (1995a) Molecular cloning and physical mapping of porcine adenovirus type 1 and 2. Arch. Virol. 140, 195-200.
Reddy, P.S., Nagy, É. and Derbyshire, J.B. (1995b) Sequence analysis of putative pVIII, E3, and fibre regions of porcine adenovirus type 3. Virus Res. 36, 97-106.
Sanger, F., Nicklen, S. and Coulson, A.R. (1977) DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5464.
Sharp, P.A. (1984) Adenovirus transcription. In: H.S. Ginsberg (Ed.), The Adenoviruses, Plenum Press, New York, New York, pp. 173-204.
Shinagawa, M., Matsuda, A., Ishiyama, T., Goto, H. and Sato, G. (1983) A rapid and simple method for preparation of adenovirus DNA from infected cells. Microbiol. Immunol. 27, 817-822.
Signäs, C., Akusjärvi, G. and Pettersson, U. (1985) Adenovirus 3 fiber polypeptide gene: Implications for the structure of the fiber protein. J. Virol. 53, 672-678.
Tabor, S. and Richardson, C.C. (1987) DNA sequence analysis with a modified bacteriophage T7 DNA polymerase. Proc. Natl. Acad. Sci. USA 84, 4767-4771.
Tuboly, T., Reddy, P.S., Nagy, É. and Derbyshire, J.B. (1995) Restriction endonuclease analysis and physical mapping of the genome of porcine adenovirus type 5. Virus Res. 37, 49-54.