Accepted Manuscript
Evolutionary history of the T cell receptor complex as revealed by small-spotted catshark (Scyliorhinus canicula)
Rita Pettinello, Anthony K. Redmond, Christopher J. Secombes, Daniel J. Macqueen, Helen Dooley
PII: S0145-305X(17)30020-4
DOI: 10.1016/j.dci.2017.04.015
Reference: DCI 2879
To appear in: Developmental and Comparative Immunology
Received Date: 9 January 2017
Revised Date: 18 April 2017
Accepted Date: 18 April 2017
Please cite this article as: Pettinello, R., Redmond, A.K., Secombes, C.J., Macqueen, D.J., Dooley, H., Evolutionary history of the T cell receptor complex as revealed by small-spotted catshark (Scyliorhinus canicula), Developmental and Comparative Immunology (2017), doi: 10.1016/j.dci.2017.04.015. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Evolutionary history of the T cell receptor complex as revealed by small-spotted catshark (Scyliorhinus canicula)
Rita Pettinello1, Anthony K Redmond1, 2, Christopher J Secombes1, Daniel J Macqueen3, Helen Dooley1, 4
1School of Biological Sciences and 2Centre for Genome-Enabled Biology & Medicine and 3Institute of Biological and Environmental Sciences, University of Aberdeen, Aberdeen AB24 2TZ, United Kingdom. 4Dept. Microbiology & Immunology, University of Maryland School of Medicine, Institute of Marine & Environmental Technology, Baltimore MD 21202, USA.
Corresponding author: Rita Pettinello. School of Biological Sciences, University of Aberdeen, Aberdeen AB24 2TZ, United Kingdom. Email: [email protected]
Abstract
In every jawed vertebrate species studied so far, the T cell receptor (TCR) complex is composed of two different TCR chains (α/β or γ/δ) and a number of CD3 subunits responsible for transmitting signals into the T cell. In this study, we characterised all of the TCR and CD3 genes of small-spotted catshark (Scyliorhinus canicula) and analysed their expression in a broad range of tissues. While the TCR complex is highly conserved across jawed vertebrates, we identified a number of differences in catshark, most notably the presence of two copies of both TCRβ and CD3γδ, and the absence of a functionally important proline-rich region from CD3ε. We also demonstrate that TCRβ has duplicated independently multiple times in jawed vertebrate evolution, bringing additional diversity to the TCR complex. This study provides new insights into the evolutionary history of the TCR complex and opens new avenues for future exploration.
Keywords
T cell receptor; CD3; cartilaginous fish; shark; evolution
Abbreviations
T cell receptor (TCR), Immunoglobulin superfamily (IgSF), Major histocompatibility complex (MHC), Rapid amplification of cDNA ends (RACE)-PCR, Quantitative real time PCR (qPCR), No-reverse transcriptase (NRT), Multiple sequence alignment (MSA), No-template control (NTC), peripheral blood lymphocytes (PBL).
1. Introduction
In mammals, the TCR complex is composed of disulphide-linked heterodimers of TCR chains (α/β or γ/δ) associated with an invariant CD3 signalling complex. Each of the four TCR chains contains an extracellular variable (V) immunoglobulin superfamily (IgSF) domain that recognises proteasome-derived peptides displayed in the groove of major histocompatibility complex (MHC) molecules, and a membrane-proximal constant (C) domain involved in signal transmission following TCR engagement (Clevers et al., 1988). As all of the TCR chains have a very short cytoplasmic tail, they need to partner with the CD3 complex, intracellular signalling proteins with long, immunoreceptor tyrosine-based activation motif (ITAM)-containing cytoplasmic tails, which facilitate signal transduction into the T cell. The CD3 complex in mammals is formed of CD3ε, CD3δ, and CD3γ proteins that associate non-covalently into δ/ε and γ/ε heterodimers, plus the disulphide-linked ζ/ζ homodimer. Following TCR engagement, tyrosine kinases, such as Lck and Fyn, phosphorylate the tyrosines in the ITAMs of the CD3 and ζ chains. This creates a docking site for the protein kinase ZAP70 which, in turn, phosphorylates the adaptor protein SLP-76, allowing recruitment of LAT (Guy and Vignali, 2009; Wunderlich et al., 1999) and initiating the downstream phosphorylation cascade (Hořejší et al., 2004; Lin and Weiss, 2001). Previous work has also demonstrated the importance of the adaptor protein Nck for TCR signalling regulation (Takeuchi et al., 2008); Nck can bind to the proline-rich sequence (PXXDY) in the CD3ε cytoplasmic tail (Santiveri et al., 2009) via its SH3 binding domain and to SLP-76 via its SH2 binding domain (Wunderlich et al., 1999). As the CD3ε proline-rich sequence overlaps one of the ITAM domains, there is debate as to whether tyrosine phosphorylation is a prerequisite for Nck binding, as well as precisely what role Nck plays in signal transduction (Barda-Saad et al., 2005; Gil et al., 2002; Paensuwan et al., 2015).
The transmembrane regions of all of the TCR and CD3 proteins contain a number of highly conserved residues that facilitate complex assembly (Call et al., 2002). The CD3γ, δ, and ε chains have an extracellular IgSF domain that mediates heterodimer formation (Gold et al., 1987) and a CXXCXE motif connecting the extracellular domain to the transmembrane region, which plays an important role in the assembly of CD3 with the TCR and subsequent signal transduction. In contrast, the ζ chain is not an IgSF member, has a shorter extracellular domain, and forms dimers via a conserved cysteine present in the transmembrane region (Weissman et al., 1988). ζ, with its longer cytoplasmic tail, carries three ITAMs rather than the single ITAM present in the tail of the other CD3 molecules (Chan et al., 1994). All components of the TCR/CD3 complex need to be fully assembled to allow cell surface display (Clevers et al., 1988).
While most of the genes for the CD3 signalling complex existed in the ancestor of bony vertebrates (Bernot and Auffray, 1991; Dzialo and Cooper, 1997; Liu et al., 2008; Øvergård et al., 2009), the duplication giving rise to CD3γ and CD3δ occurred more recently, in the ancestor of mammals (Bernot and Auffray, 1991). Thus a single locus, known as CD3γδ, is found in non-mammals; CD3γδ has a hybrid chain structure with a ‘CD3δ-like’ extracellular domain and ‘CD3γ-like’ intracellular region (Göbel et al., 2000). In this case the TCR complex is formed by association of the TCR heterodimer with two CD3ε/CD3γδ heterodimers and one ζ homodimer (Berry et al., 2014).
Homologues of the four mammalian TCR chains have previously been identified in nurse shark, little skate and elephant shark (Criscitiello et al., 2010; Rast and Litman, 1994; Venkatesh et al., 2014). Unlike immunoglobulins (Igs), which are present in an atypical ‘cluster’ organisation (Litman et al., 1993; Rast et al., 1994), it is thought that the TCR chains of cartilaginous fish exist in a translocon configuration similar to that of bony vertebrates (Chen et al., 2010; Rast et al., 1997). In addition to the four canonical TCR chains, cartilaginous fish also have a novel TCR chain, NAR-TCR, composed of a TCRδ C domain carrying two independently-rearranging V domains. The N terminal V domain is most closely related to that of IgNAR, a shark-specific Ig that does not associate with light chains (Criscitiello et al., 2006; Greenberg et al., 1996). Some divergent TCR chains are also generated via trans-rearrangement between a TCR V region and that of a (presumably) nearby Ig (Criscitiello et al., 2010). Cartilaginous fish also use somatic hypermutation to increase the diversity of TCR V regions, while mammals only mutate Ig Vs (Chen et al., 2012).
Currently, our understanding of the TCR complex of cartilaginous fish is fragmented; TCR chains have been characterised in nurse shark, while the genes encoding CD3ε, CD3γδ and the ζ chain were recently annotated in the elephant shark genome. Therefore, to improve our understanding of the TCR complex of cartilaginous fish and its evolutionary history in vertebrates, we characterised the complete set of genes encoding TCR chains and CD3 proteins in a single elasmobranch species, the small-spotted catshark (Scyliorhinus canicula), and quantified their mRNA expression patterns in a range of tissues.
2. Materials and Methods
2.1 Small-spotted catshark
Three male and one female captive-bred small-spotted catsharks (Scyliorhinus canicula), approximately three years of age, were maintained in artificial seawater at 8-12°C in indoor tanks at the University of Aberdeen. Animals were overdosed with MS222 prior to sacrifice and tissue harvest. All procedures were conducted in accordance with UK Home Office ‘Animals and Scientific Procedures Act 1986; Amendment Regulations 2012’ on animal care and use.
2.2 RNA isolation and cDNA synthesis
A range of tissues (epigonal organ, gill, Leydig organ, spiral valve, liver, heart, muscle and stomach) were collected from each animal, diced into small pieces and either flash frozen or immersed in RNAlater (Sigma) then stored at -80°C. Total RNA was extracted by mechanical disruption of the tissues in TRIzol reagent (Sigma) and the subsequent addition of 1/3 volume of chloroform. After centrifugation at 4°C the upper phase was collected and the RNA precipitated through the addition of isopropanol and incubation at -80°C for at least 3 h. Following centrifugation at 4°C, the RNA pellet was washed three times with 70% ethanol. The pellet was allowed to air dry after aspirating the remaining ethanol and then re-suspended in RNase-free water. The RNA was quantified via a NanoDrop® ND-1000 spectrophotometer; for all samples the 260/280 and 260/230 ratios were 2.0 +/-0.2. RNA integrity was checked by gel electrophoresis and the RNA was then stored at -80°C for future use.
First strand cDNA for PCR and rapid amplification of cDNA ends (RACE)-PCR was synthesised from 2µg of total RNA extracted from spleen. For use in PCR, first strand cDNA was synthesised using illustra™ Ready-To-Go™ RT-PCR Beads (GE Healthcare) with oligo (dT) primer, while for RACE-PCR it was synthesised using the SMARTer II RACE Kit (Clontech Ltd.). After a genomic DNA removal step, first strand cDNA for quantitative real time PCR (qPCR) was synthesised from 1µg total RNA from each tissue using the QuantiTect Reverse Transcription Kit (Qiagen Ltd.), and a no-reverse transcriptase (NRT) control was prepared from pooled RNA equally representing all samples, with all reagents except the reverse transcriptase. For each kit the manufacturer’s instructions were followed.
2.3 PCR, cloning and sequencing
An in-house, normalised multi-tissue small-spotted catshark transcriptome (sequenced on the Ion Proton™ Sequencer and assembled with Trinity v20131110) (Redmond et al., in preparation) was BLAST searched for TCR and CD3 transcripts, seeding with sequences taken from across vertebrate phylogeny (sequence accession numbers provided in supplemental table 1). Partial sequences for the TCRβ and ζ chains, full sequences for CD3ε and CD3γδ, and the constant regions of the remaining TCRs were identified. PCR primers were designed with the aid of OligoCalc (http://biotools.nubic.northwestern.edu/OligoCalc.html; Kibbe 2007) (supplemental table 2).
Nested 5’ or 3’ RACE-PCR using gene specific primers (primer details provided in supplemental table 2) and the universal primers provided in the RACE kit was performed as per the manufacturer’s instructions to complete partial sequences. Cycling conditions for both rounds of RACE-PCR were as follows: 1 cycle at 95°C for 5 min; 35 cycles at 95°C for 30 sec, 60°C for 30 sec and 72°C for 90 sec; 1 cycle at 72°C for 3 min. Completed sequences, as well as those found in full in the transcriptome, were amplified end-to-end by PCR using gene specific primers (supplemental table 2). PCR products were ligated into pGEM-T Easy vector (Promega) and transformed into heat shock-competent E. coli. Colonies containing vector with inserts were identified through blue–white colour selection when grown on LB agar containing 100 µg/ml ampicillin, 25 µg/ml X-Gal and 25 µg/ml IPTG. Plasmid DNA was prepared from three independent white colonies and screened for insert size by standard PCR using SP6 and T7 primers under the following cycling conditions: 1 cycle at 95°C for 5 min; 25 cycles at 95°C for 30 sec, 60°C for 30 sec, and 72°C for 90 sec; 1 cycle at 72°C for 3 min. Once the insert size was confirmed by gel electrophoresis, plasmids were sent to Eurofins MWG-Biotech for Sanger sequencing. Sequences confirmed in this manner were submitted to NCBI under the accession numbers given in supplemental tables 3 and 4.
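As a rough planning aid, the two cycling programmes described above can be converted into a minimum run time. The following is only an illustrative sketch: the function name `program_seconds` is hypothetical, and ramp times between temperature steps are ignored, so real runs will take longer.

```python
def program_seconds(initial_s, cycles, step_seconds, final_s):
    """Minimum hold time of a three-stage PCR programme, ignoring ramping."""
    return initial_s + cycles * sum(step_seconds) + final_s


# Per-cycle holds taken from the text: 95°C 30 s, 60°C 30 s, 72°C 90 s.
STEPS = (30, 30, 90)

# 5 min initial denaturation, 3 min final extension in both programmes.
race_pcr = program_seconds(5 * 60, 35, STEPS, 3 * 60)    # each RACE-PCR round
colony_pcr = program_seconds(5 * 60, 25, STEPS, 3 * 60)  # insert-size screen

print(race_pcr / 60, colony_pcr / 60)  # minutes: 95.5 70.5
```

The only difference between the two programmes is the cycle number (35 versus 25), which accounts for the 25 min gap in hold time.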
2.4 Sequence analysis
The sequences returned from cloning experiments were used in BLAST searches (Altschul et al., 1990) against the non-redundant NCBI database to check for similarity with other characterized sequences. Subsequently, amino acid comparison was performed by aligning closely-related sequences from different vertebrate classes using MAFFT version 7 (http://mafft.cbrc.jp/alignment/server/index.html; Katoh and Standley, 2013). Transmembrane topology and the presence of a signal peptide were predicted using Phobius (http://phobius.sbc.su.se/; Käll et al. 2004) and TMHMM (http://www.cbs.dtu.dk/services/TMHMM/; Möller et al., 2001). N-linked glycosylation sites were predicted using NetNGlyc 1.0 Server (http://www.cbs.dtu.dk/services/NetNGlyc/; Salim et al., 2003). Conserved domains were identified using CD-Search (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi; Marchler-Bauer et al. 2014). Secondary structure of TCR variable regions was predicted with PSIPRED Protein Sequence Analysis Workbench (http://bioinf.cs.ucl.ac.uk/psipred/).
2.5 Phylogenetic analysis
Multiple sequence alignment (MSA) was performed in PRANK (Löytynoja and Goldman, 2008; Löytynoja and Goldman, 2010). Alignments were adjusted manually in BioEdit 7.2.5 (Hall, 1999) to reduce long gaps, and the best-fitting amino acid substitution model was estimated by maximum likelihood using the IQ-TREE server (Nguyen et al., 2014) (WAG+I+G4 for both CD3 proteins and TCR constant chains).
Bayesian phylogenetic analysis was performed in BEAST v.1.8.1 (Drummond et al., 2012), running two independent Markov chain Monte Carlo (MCMC) chains for each dataset, with a lognormal uncorrelated relaxed clock (Drummond et al., 2006), Yule speciation process, random starting tree, and MCMC chain lengths of 20 million states (sampled every 1000 states). Tracer v1.6 (http://beast.bio.ed.ac.uk/Tracer; Rambaut et al., 2014) was used to assess convergence and mixing of the MCMC traces (effective sample size values were greater than 200 for all estimated parameters). The trees generated from both runs were combined with LogCombiner v1.8.1 after removing 10% as burn-in. A maximum clade credibility tree was then created with TreeAnnotator v1.8.1 and visualised using FigTree v1.4 (http://tree.bio.ed.ac.uk/software/figtree/).
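For orientation, the MCMC settings above imply a fixed number of posterior samples feeding the maximum clade credibility tree. The sketch below works through that arithmetic; the function name is hypothetical, and it assumes the initial state (state 0) is not counted among the logged samples.

```python
def retained_samples(chain_length, sample_every, burn_in_percent, n_chains):
    """Return (logged per chain, kept per chain, kept in total)."""
    per_chain = chain_length // sample_every
    discarded = per_chain * burn_in_percent // 100  # integer maths, no rounding drift
    kept = per_chain - discarded
    return per_chain, kept, kept * n_chains


# 20 million states, sampled every 1000, 10% burn-in, two chains combined.
per_chain, kept, combined = retained_samples(20_000_000, 1000, 10, 2)
print(per_chain, kept, combined)  # 20000 18000 36000
```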
2.6 Quantitative real time PCR (qPCR)
We performed qPCR according to MIQE guidelines (Bustin et al., 2009). RNA was extracted from 9 different tissues from four small-spotted catsharks as described above. qPCR reactions contained 5 µl first strand cDNA, generated as described above, 1.5 µl sense/antisense primers (at 10 nM), 7.5 µl 2x Brilliant III SYBR® Green QPCR Master Mix (Agilent Technology) and 1 µl RNase free water to give a final volume of 15 µl. qPCR primers were manually designed and checked with NetPrimer (http://www.premierbiosoft.com/NetPrimer/AnalyzePrimer.jsp; supplementary table 2). Reactions were performed on a Stratagene Mx3005P system (Agilent Technology) with 1 cycle of 3 min at 95°C then 40 cycles of 20 sec at 95°C and 20 sec at 64°C followed by dissociation analysis, where a single peak was observed in all cases. All samples were run in duplicate on each plate alongside the NRT and a no-template control (NTC). Cq values were calculated from baseline-corrected SYBR fluorescence data with a standardised threshold and the baseline set between 3 and 15 cycles. For each run, the NRT and NTC controls produced negligible amplification, indicating a lack of contamination from carried over genomic DNA or other DNA sources. A cut-off of “no expression” was set at 36 Cq for each experimental qPCR assay, approximating the limit of reliable detectability. The efficiency of each primer pair was calculated using LinRegPCR v.11 (Ruijter et al., 2009), following the authors’ recommendations.
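The reaction composition above can be checked and scaled for a plate with a few lines of code. This is a minimal sketch under stated assumptions: the function name `premix_volumes`, the 10% pipetting overage, and the choice to add cDNA separately to each well are illustrative and are not taken from the manuscript.

```python
# Per-reaction volumes (µl) as stated in the text; they should sum to 15 µl.
PER_REACTION_UL = {
    "cDNA": 5.0,
    "primers": 1.5,
    "2x SYBR master mix": 7.5,
    "RNase-free water": 1.0,
}


def premix_volumes(n_reactions, overage=0.10, template_key="cDNA"):
    """Volumes (µl) of a shared premix for n reactions, excluding template.

    `overage` is an assumed allowance for pipetting loss (hypothetical value).
    """
    scale = n_reactions * (1 + overage)
    return {
        name: round(volume * scale, 2)
        for name, volume in PER_REACTION_UL.items()
        if name != template_key
    }


total = sum(PER_REACTION_UL.values())
print(total)               # 15.0, matching the stated final volume
print(premix_volumes(10))  # premix for 10 wells with 10% overage
```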
Five reference genes (RPL13, RPS13, RPS19, ACTβ and EF1α) were tested using the NormFinder modelling algorithm (Andersen et al., 2004) within Genex v5 (MultiD Analysis AB), accounting for variance in regulation across tissues and individuals. This analysis indicated that the combined expression of RPL13 and RPS13 provided the most appropriate normalization factor. Samples giving poor quality cDNA (showing no expression of reference genes) were excluded from the analysis, leaving only one sample each for peripheral blood lymphocytes (PBL) and pancreas, and three samples for Leydig organ. Raw qPCR data were normalized to the expression of RPL13 and RPS13 and the relative expression calculated on a scale quantitatively comparable across all genes (assigning a value of 1 to the average of the expression of all genes combined across tissues using Genex v5).
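The normalisation described above was carried out in Genex, whose internal calculations are not described here. As a generic illustration only, the sketch below shows one common efficiency-corrected approach in which the target quantity is divided by the geometric mean of the reference-gene quantities. The function names, the default efficiency of 2.0, and the treatment of any Cq at or above 36 as zero expression are assumptions made for this example.

```python
from math import prod


def relative_quantity(cq, efficiency=2.0):
    """Relative starting quantity for a given Cq and amplification efficiency."""
    return efficiency ** (-cq)


def normalised_expression(target_cq, reference_cqs, target_eff=2.0,
                          reference_effs=None, cutoff=36.0):
    """Target expression relative to the geometric mean of reference genes.

    Cq values at or above `cutoff` are treated as "no expression" (assumed
    reading of the 36 Cq cut-off in the text).
    """
    if target_cq >= cutoff:
        return 0.0
    if reference_effs is None:
        reference_effs = [2.0] * len(reference_cqs)
    ref_quantities = [relative_quantity(cq, eff)
                      for cq, eff in zip(reference_cqs, reference_effs)]
    norm_factor = prod(ref_quantities) ** (1 / len(ref_quantities))
    return relative_quantity(target_cq, target_eff) / norm_factor


# Hypothetical Cq values: target gene 25, RPL13 20, RPS13 22.
print(normalised_expression(25, [20, 22]))
```

In practice the per-primer efficiencies estimated with LinRegPCR would replace the default of 2.0.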
To test if differences in gene expression between tissues were statistically significant for each tested gene, we performed Kruskal-Wallis H tests, along with Dunn’s post-hoc test returning p-values adjusted using the Bonferroni correction (Pohlert, 2014) with a 95% confidence level (Theodorsson-Norheim, 1986) using packages within R (R Core Team, 2014). Mean, standard error and standard deviation were also calculated in R with the summarySE function of the Rmisc package (Hope, 2013). Data were visualised with bar chart plots in R using the ggplot2 package (Wickham, 2009).
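The statistical tests above were run with R packages. To make the underlying calculation concrete, the following sketch re-implements the Kruskal-Wallis H statistic (with the standard tie correction) and the Bonferroni adjustment in plain Python. It is an illustration rather than the authors' code: the function names are hypothetical, the example data are invented, and the p-value (obtained by comparing H with a chi-square distribution with k-1 degrees of freedom) is not computed here.

```python
def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of samples (one list per group)."""
    pooled = sorted((value, gi) for gi, group in enumerate(groups) for value in group)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j for tied values
        t = j - i
        tie_term += t ** 3 - t
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j
    h = 12 / (n * (n + 1)) * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    h -= 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)  # zero only if every value is identical
    return h / correction


def bonferroni(p_values):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented expression values for three tissues.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
print(bonferroni([0.01, 0.04, 0.5]))
```

With only four animals per tissue, rank-based tests of this kind have limited power, which is consistent with the cautious interpretation of the expression data in the Results.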
3. Results
3.1 Characterisation of multiple TCRβ and CD3γδ in small-spotted catshark
Putative small-spotted catshark TCR and CD3 transcripts were identified through a BLAST search of our in-house, normalised multi-tissue transcriptome using those of other vertebrates as the query (accession numbers provided in supplementary table 1). Based on our initial assignments using BLAST searches against characterized vertebrate proteins (i.e. subject to confirmation by phylogenetic analysis), we identified single genes for TCRα, TCRγ, TCRδ, CD3ε and ζ, two variants for CD3γδ and two variants for TCRβ in small-spotted catshark (NCBI accession numbers provided in supplementary table 3). Sequences were confirmed by cloning and Sanger sequencing as described above. The discovery of multiple putative CD3γδ and TCRβ genes was unexpected, so to investigate this finding further, we searched additional available cartilaginous fish datasets (consisting of an in-house nurse shark transcriptome [unpublished data], the published transcriptome and genome of little skate (Wang et al., 2012; Wyffels et al., 2014), and the published transcriptome and genome of elephant shark (Venkatesh et al., 2014)), using our small-spotted catshark transcripts as the query; the resultant hits suggest the presence of three TCRβ C variants and two CD3γδ variants in the nurse shark, two TCRβ C and one CD3γδ in little skate, and two TCRβ C and two CD3γδ in the elephant shark (figures 1 and 2), a finding missed by previous studies.
3.2 Multiple independent duplications of TCRβ and CD3γδ in jawed vertebrates
Previous studies used distant outgroups (e.g. Richards & Nelson 2000) or unrooted phylogenies (e.g. Parra et al. 2007) to establish the relationship between the different TCR chains. Here, for the first time, we utilised a relaxed clock rooting method, allowing us to statistically infer the root without including distant outgroups (Drummond et al., 2006), as performed successfully for other gene families over similar evolutionary timescales (Macqueen and Wilcox, 2014; Redmond et al., 2017). Phylogenetic analysis revealed that the shark TCR constant domain and CD3 sequences form clades with their respective counterparts from other vertebrates (figures 1 and 2 respectively). The root for the TCR constant chains falls between TCRα/TCRδ and TCRγ/TCRβ, while the root for the CD3 genes separates CD3ε and CD3γδ (CD3γ and CD3δ in mammals) from ζ. These results are congruent with previous studies (e.g. Richards and Nelson, 2000). The relatively low sequence similarity of the TCRβ and CD3γδ variants (72% and 56% respectively), combined with the fact that both form multiple phylogenetic clusters containing sequences from different cartilaginous fish species, supports the conclusion that the two variants are not allelic forms of the same gene. Each variant was cloned multiple times for both genes, from the mRNA of different individuals, and no variation in sequence was observed (data not shown). While the sequence data for TCRβ is not consistent with the variants being alternate splice forms, the fact that the first 18 amino acids of the two CD3γδ variants are identical (also at the DNA level), in both catshark and nurse shark, suggests the variants are a product of a partial gene duplication and share the exon(s) encoding this region (supplemental figure 1).
Intriguingly, our analysis also shows that TCRβ has duplicated independently multiple times during jawed vertebrate evolution (figure 1), and underwent at least four duplication events in the cartilaginous fish lineage alone: one duplication in the ancestor of the elephant shark (or all chimaeras) after the separation from elasmobranchs, two independent duplications in an ancestor common to nurse shark and small-spotted catshark, and one duplication in an ancestor of the skate lineage (figure 1). As the two TCRβ variants of elephant shark share a much higher similarity (92%) than those of catshark (72%), this suggests that the elephant shark duplication either happened later in this species’ evolutionary history or, alternatively, that on-going gene conversion events have homogenised older gene duplicates.
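The similarity percentages quoted above compare pairs of aligned variants. The manuscript does not state how these values were calculated, so the sketch below shows just one simple convention: percent identity over aligned columns, skipping columns where both sequences have a gap. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical columns between two aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100 * matches / compared


# Toy aligned fragments, not real catshark sequences.
print(percent_identity("AC-E", "ACDE"))  # 75.0
```

Other conventions (for example, excluding all gapped columns) give different values, so comparisons between studies should use the same definition.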
CD3γδ appears to have duplicated at least twice in the ancestor of all cartilaginous fishes, with the partial duplication being a third event in a common ancestor of nurse shark and small-spotted catshark (figure 2). However, support at key nodes was low and so additional analyses were performed without more distant outgroups; these analyses again suggested that at least two duplications occurred in the ancestor of cartilaginous fishes, but the patterns of duplication are different, with the partial duplication occurring in the ancestor of all cartilaginous fishes (supplemental figure 2). The incongruence between these analyses makes the precise timing of the CD3γδ partial duplication difficult to assign. Re-analysis once full sequences from nurse and elephant shark are available might help to resolve the exact timing of this duplication.
3.3 Conservation of TCR constant chain structure in small-spotted catshark
We characterised the four canonical TCR C domains in small-spotted catshark; as in previous studies of cartilaginous fish, analysis of our TCR transcripts suggested the presence of a single gene for TCRα, TCRγ and TCRδ. However, unlike previous studies, we identified two significantly different transcripts for the TCRβ C region.
Comparison of the amino acid sequences of the small-spotted catshark TCR constant regions with other vertebrates shows conservation of gross structure (IgSF constant domain, connecting peptide, transmembrane region and a short intracellular cytoplasmic tail; figure 3). The two canonical cysteines in the extracellular domain necessary for intradomain disulphide bonding are conserved, as are the positively charged residues in the transmembrane region required for association with the CD3 complex (figure 3). Small-spotted catshark TCRα, like TCRα from other cartilaginous and bony vertebrates, lacks the connecting peptide motif ‘FETDXNLN’, which is thought to play a role in signal transduction and CD8 co-receptor function in mammals (Bäckström et al., 1996; Mallaun et al., 2008). The ‘SRLR’ sequence that in mammals mediates association of TCRβ with TCRα chains is conserved in all elasmobranchs analysed (Arnaud et al. 1997) (figure 3). There are no major structural differences between the two TCRβ constant regions and both variants carry the glutamine (position 136 in human) crucial for signalling (Bäckström et al., 1996).
As mentioned earlier, cartilaginous fish TCR genes are thought to be present in the typical translocon organisation, rather than the atypical ‘cluster’ organisation observed for their Ig genes (Litman et al., 1993). In mammals, which also express two TCRβ variants, the locus contains multiple V segments each of which can rearrange with either of two downstream D-Jn-C blocks (figure 4a; translocon configuration). To better understand the TCRβ organisation in catshark, and determine if it had a similar structure to that of mammals, we used 5’ RACE-PCR with C domain specific primers for both TCRβ variants to sequence the V regions. The sequences obtained (figure 4b) show that the two C domains are associated with distinct V and J segments, suggesting two separate clusters, each having one V, (at least) one D and one J segment and a C domain (figure 4a; cluster configuration).
While full characterisation of the small-spotted catshark TCR V region repertoire was not an aim of this project, all of the V region sequences obtained shared the markers typical of IgSF V domains and are presented in FASTA format in supplemental figure 3 for the use of others. Although we did not identify NAR-TCR in this study, this does not exclude its presence in this species. We did identify a single V region that was a hybrid of IgM (Crouch et al., 2013) and TCRδ (supplemental figure 4), indicating that trans-rearrangement occurs in catshark as well as nurse shark.
3.4 Surprising absence of the proline-rich domain in small-spotted catshark CD3ε
CD3 sequences were confirmed by PCR from catshark spleen cDNA. Again the expected gross structure (leader peptide followed by extracellular, transmembrane, and cytoplasmic domains) is conserved in all three CD3 subunits (figure 5). The extracellular domain of CD3ε contains a canonical IgSF domain; however, domain-prediction software did not identify this domain in CD3γδ-A and CD3γδ-B even though they both contain the tryptophan and the two canonical cysteines common to IgSF domains and critical for the β-sandwich tertiary structure. The CXXCXE motif important for TCR complex assembly and signal transmission is conserved in the extracellular region of CD3ε and both CD3γδ variants. The charged amino acid necessary for TCR complex formation is present in the transmembrane region of all the CD3 proteins, with the notable exception of CD3γδ-B, suggesting it may have a different mode of interaction. The ITAM motifs necessary for the transmission of intracellular signals are conserved in all four CD3 proteins, with CD3ε and both CD3γδ variants having a single ITAM and ζ three ITAMs (figure 5). Interestingly, the canonical SH3 binding motif PXXDY, important for Nck binding, is not conserved in the cytoplasmic tail of small-spotted catshark CD3ε. Amino acid comparison shows that the same motif is also missing in nurse shark but is present in elephant shark CD3ε (figure 5a).
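Motif checks of the kind described above can be expressed as simple pattern searches over a protein sequence. The sketch below is illustrative only: CXXCXE and PXXDY are written exactly as named in the text (X meaning any residue), while the ITAM pattern uses a commonly cited consensus, YxxL/I followed by 6 to 8 residues and a second YxxL/I, which is an assumption and not a definition given in the manuscript. The test sequence is invented and is not a catshark CD3 sequence.

```python
import re

MOTIFS = {
    "CXXCXE": re.compile(r"C..C.E"),
    "PXXDY": re.compile(r"P..DY"),
    # Assumed ITAM consensus: YxxL/I, 6-8 residue spacer, YxxL/I.
    "ITAM": re.compile(r"Y..[LI].{6,8}Y..[LI]"),
}


def find_motifs(protein):
    """Map each motif name to the 1-based start positions of its matches."""
    protein = protein.upper()
    return {name: [m.start() + 1 for m in pattern.finditer(protein)]
            for name, pattern in MOTIFS.items()}


# Invented sequence in which PXXDY overlaps the first half of an ITAM.
toy = "AAC" + "GGC" + "GE" + "AA" + "PGGDY" + "GGL" + "AAAAAAA" + "YGGI" + "AA"
print(find_motifs(toy))  # {'CXXCXE': [3], 'PXXDY': [11], 'ITAM': [15]}
```

In the toy sequence the tyrosine that ends PXXDY is also the first tyrosine of the ITAM, mirroring the overlap between the proline-rich sequence and an ITAM noted in the Introduction. A sequence lacking the proline-rich motif would simply return an empty list for `PXXDY`.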
It was apparent from the MSA that the small-spotted catshark ζ extracellular domain is shorter than that of other vertebrates, including the elephant shark (figure 5b). To confirm this unexpected finding, 5’ RACE-PCR was performed and the sequence obtained was identical to the one retrieved from our in-house transcriptome databases, having a stop codon before the sequence shown in figure 5b. BLAST searches in additional small-spotted catshark transcriptome datasets (Mulley et al. 2014) also show a stop codon in the same position, indicating the catshark ζ transcript is indeed shorter than the others (510bp, encoding 169aa). Interestingly, the missing region corresponds to the first exon of mammalian ζ, suggesting this exon has either been lost in catshark or is skipped during mRNA splicing (perhaps due to a point mutation or frameshift). Despite this, the cysteine in the transmembrane region required for ζ homodimer formation is conserved in small-spotted catshark.
In other species, the ‘LG(X)4DPRG’ motif in CD3γδ aids binding to CD3ε (Dietrich et al., 1996); however, this motif is absent from both small-spotted catshark CD3γδ variants. Small-spotted catshark CD3γδ-A has two predicted N-linked glycosylation sites, at N42 and N49, which are absent in CD3γδ-B (figure 5c). Nurse shark CD3γδ-B has only one predicted glycosylation site. While both nurse shark CD3γδ sequences were partial, it was possible to identify the conserved CXXC region and part of the ITAM domain (figure 5c). In nurse shark, as in small-spotted catshark, only one of the two variants has the negatively charged amino acid required for TCR binding, while in the other variant this residue has mutated to serine. The presence of serine in both small-spotted catshark and nurse shark suggests this change occurred in a common ancestor of the elasmobranchs.
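The N-linked glycosylation sites above were predicted with NetNGlyc, which scores candidate sites with a trained model. A much simpler first pass, shown here as an assumption-laden sketch, is to list every N-X-S/T sequon where X is not proline; this finds candidate positions only and will generally report more sites than a predictor would accept. The function name and example sequence are hypothetical.

```python
import re

# Lookahead so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(protein):
    """1-based positions of asparagines in N-X-S/T sequons (X not proline)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Invented sequence: NGS and NAT are sequons, NPT is excluded by the proline.
print(sequon_positions("AANGSAANPTAANATA"))  # [3, 13]
```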
3.6 TCR complex genes are highly expressed in secondary lymphoid tissues
Baseline mRNA levels for each of the TCR-CD3 complex genes in a range of catshark tissues are shown in figure 6. Due to the small sample size (four individuals), it was not possible to identify a statistically significant difference in the expression of each gene; however, all the genes showed a similar pattern, with the highest transcript levels observed in spleen, followed by gills and spiral valve, and the lowest levels in muscle (p-values presented in supplemental table 5), an expression pattern similar to that observed for TCR in nurse shark (Criscitiello et al., 2010). This result is not unexpected, as the spleen is the major secondary lymphoid organ in sharks and where the adaptive immune response primarily occurs (Rumfelt et al., 2002; Zapata and Amemiya, 2000). Further, the gills and spiral valve will come into direct contact with externally-derived pathogens and so would also be expected to have some secondary immune function. While pancreas and PBL were excluded from the statistical analysis due to the presence of only one cDNA sample, TCR/CD3 expression in the one available PBL sample was almost as high as that in spleen, whereas that in the one pancreas sample was much lower than that of other tissues (supplemental figure 5).
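One way to see why four animals per tissue limits statistical power is to ask what the smallest attainable p-value is. The sketch below uses an exact two-sample rank-sum comparison purely as an illustration; this excerpt does not state which test was applied to each comparison, and the number of pairwise comparisons `k` is a hypothetical value.

```python
from math import comb


def min_two_sided_p(n1: int, n2: int) -> float:
    """Smallest two-sided p-value an exact rank-sum test can return (no ties).

    Under the null hypothesis every split of the n1 + n2 ranks into the two
    groups is equally likely, and complete separation of the groups occurs in
    exactly one split per tail.
    """
    return 2 / comb(n1 + n2, n1)


p_min = min_two_sided_p(4, 4)
print(round(p_min, 4))  # 0.0286

# With a Bonferroni correction over k pairwise tissue comparisons (k is a
# hypothetical number), even complete separation cannot reach 0.05.
k = 10
print(min(1.0, p_min * k) < 0.05)  # False
```

Under these assumptions, two tissues with four animals each can never give a corrected p-value below 0.05 once more than one pairwise comparison is made, however consistent the difference looks.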
Expression of CD3ε and CD3γδ-A was significantly higher than that of CD3γδ-B, TCRα, TCRβ-A, and TCRδ (supplementary table 5). TCRγ and TCRδ had similar expression levels, whereas TCRβ-A and TCRβ-B (combined) had almost double the expression of TCRα, although this difference was not statistically significant. The two TCRβ paralogues did not show significantly different expression in any of the tissues examined (p-value 0.1786; figure 6 and supplemental table 6). In contrast, CD3γδ-A showed almost six times higher expression than CD3γδ-B in all of the tissues examined (p-value 0.0063; figure 6 and supplemental table 6).
4. Discussion
This report describes the TCR/CD3 complex genes of the small-spotted catshark and represents the first molecular cloning of the full complement of CD3 subunits from any cartilaginous fish species. As expected, the gross structural features required for CD3-TCR interaction and functioning are highly conserved between cartilaginous fish and other vertebrates (including mammals).
Despite this, a number of notable differences were observed. Firstly, both small-spotted catshark and nurse shark CD3ε lacked the non-canonical PXXDY SH3 binding motif, which in mammals is important for Nck binding (Takeuchi et al., 2008). Although the SH3 binding motif is absent, we were able to find transcripts for both Nck1 and Nck2 in our catshark transcriptomic database, and both possessed the amino acids required for CD3ε binding (supplementary figure 6; Saksela and Permi, 2012), implying they can still associate with CD3ε. Indeed, it has been reported that some proteins, such as SAM68, which do not have a PXXDY motif can bind to Nck (Kesti et al., 2007), while other studies have identified non-canonical SH3 binding motifs where binding is regulated by the conformation of the target molecule (Barnett et al., 2000; Saksela and Permi, 2012). Interestingly, the PXXDY motif is present in elephant shark CD3ε, suggesting that the loss of this motif is restricted to elasmobranchs. The functional significance of this change for elasmobranch CD3ε requires further investigation.
We were also surprised to find two variants of the CD3γδ gene in both small-spotted catshark and nurse shark. However, on closer examination it was clear the CD3γδ-B variant does not possess all of the expected characteristics of a CD3γδ protein; while both variants have the CXXC motif and ITAMs necessary for transmitting phosphorylation signals into the T cell, the negatively charged residue in the transmembrane region necessary for non-covalent binding with the TCR chain is only present in CD3γδ-A. Additionally, there are two potential N-linked glycosylation sites in the extracellular domain of CD3γδ-A that are absent in CD3γδ-B. A previous study in chicken showed that glycosylation of CD3γδ is necessary for binding to CD3ε (Göbel et al., 2000). An absence of glycosylation on CD3γδ has been previously reported in teleosts, where the glycosylation site was present in CD3ε instead (Guo and Nie, 2013); however, this is not the case in small-spotted catshark (figure 5c). Expression analysis of the small-spotted catshark CD3γδ genes indicates that both have high expression in spleen compared to other tissues, but CD3γδ-A expression is almost six times higher than that of CD3γδ-B across all tissues. The differences in sequence and expression between the two paralogues suggest divergence of function following CD3γδ duplication. In the future, it will be important to define the precise role of the two CD3γδ variants in the shark TCR complex.
Another novel finding of our study was the presence of multiple TCRβ variants in cartilaginous fish. Although the presence of multiple TCRβ C chains in elasmobranchs was suggested previously by Rast and colleagues (1997), firm evidence was lacking until the current study. The two TCRβ constant chains found in small-spotted catshark have a similar structure and tissue expression profile. Multiple TCRβ genes have previously been documented in vertebrate species ranging from mammals (2 TCRβ C domains) to axolotl (4 TCRβ Cs) and zebrafish (2 TCRβ Cs) (Fellah et al., 2001, 1993; Meeker et al., 2010; Toyonaga et al., 1985). In both mammals and zebrafish, the TCRβ C domains are organised in a translocon configuration and, although they have independent D and J segments, they share the same V segments (Meeker et al., 2010; Toyonaga et al., 1985). Preliminary analysis indicates the two TCRβ C domains in catshark utilise different V and J segments, suggesting the genes for the TCRβ chain may be present in the ‘atypical’ cluster organisation, where the V, D and J segments are associated with a single set of C regions (figure 4b; Hinds and Litman, 1986), rather than the translocon configuration suggested for the other shark TCR chains (Chen et al., 2009). Unfortunately, the scaffolds containing the TCRβ C regions in the elephant shark genome were not long enough to confirm our hypothesis, and our attempts to amplify the catshark TCR genes in their germline configuration have so far been unsuccessful.
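The inference above rests on a simple observation: in a translocon every C region can be joined to the same pool of V segments, whereas in a cluster organisation each C region is restricted to its own V segments. A toy sketch of that reasoning follows; the function names and all segment names are hypothetical and do not correspond to real gene identifiers.

```python
def shared_v_segments(v_usage: dict) -> set:
    """V segments observed in transcripts of every C region."""
    usage_sets = [set(v) for v in v_usage.values()]
    return set.intersection(*usage_sets) if usage_sets else set()


def suggested_organisation(v_usage: dict) -> str:
    """Crude heuristic: shared V usage hints at a translocon, disjoint usage at clusters."""
    return "translocon-like" if shared_v_segments(v_usage) else "cluster-like"


# Hypothetical V-segment usage per constant region, for illustration only.
mammal_like = {"Cb1": ["V1", "V2", "V3"], "Cb2": ["V2", "V3", "V4"]}
catshark_like = {"Cb-A": ["Va1", "Va2"], "Cb-B": ["Vb1", "Vb2"]}

print(suggested_organisation(mammal_like))    # translocon-like
print(suggested_organisation(catshark_like))  # cluster-like
```

Disjoint usage in a small set of transcripts is only suggestive, since shared V segments could simply be unsampled; this is why germline sequence is needed to settle the question.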
The presence of multiple TCRβ C regions in diverse vertebrate lineages could be due to a duplication in the ancestor of all jawed vertebrates followed by lineage-specific conversion events or, alternatively, recurrent lineage-specific duplications. If the latter is true, this would suggest some selective pressure towards multiple copies of TCRβ C regions, perhaps as another means to increase TCR diversity; why this occurs for TCRβ and not the other TCRs is, as yet, unexplained. While this study revealed new information about the evolutionary history of the TCR complex, it also raises exciting new questions for future exploration.
Acknowledgements
This work was supported by a PhD studentship awarded by the University of Aberdeen, College of Life Sciences and Medicine to RP, and a Royal Society research grant [RG130789] awarded to HD. The authors thank Kathryn Crouch for providing the nurse shark transcriptome assembly, Milena Monte, Kimberley Mackenzie, Heather Ritchie and Abdullah Alzaid for technical advice, and Marianna Chimienti for help with manuscript editing.
References
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410. doi:10.1016/S0022-2836(05)80360-2
Andersen, C.L., Jensen, J.L., Ørntoft, T.F., 2004. Normalization of real-time quantitative reverse transcription-PCR data: a model-based variance estimation approach to identify genes suited for normalization, applied to bladder and colon cancer data sets. Cancer Res. 64, 5245–50. doi:10.1158/0008-5472.CAN-04-0496
Arnaud, J., Huchenq, A., Vernhes, M.C., Caspar-Bauguil, S., Lenfant, F., Sancho, J., Terhorst, C., Rubin, B., 1997. The interchain disulfide bond between TCR alpha beta heterodimers on human T cells is not required for TCR-CD3 membrane expression and signal transduction. Int. Immunol. 9, 615–626. doi:10.1093/intimm/9.4.615
Bäckström, B.T., Milia, E., Peter, A., Jaureguiberry, B., Baldari, C.T., Palmer, E., 1996. A motif within the T cell receptor alpha chain constant region connecting peptide domain controls antigen responsiveness. Immunity 5, 437–47. doi:10.1016/S1074-7613(00)80500-2
Barda-Saad, M., Braiman, A., Titerence, R., Bunnell, S.C., Barr, V.A., Samelson, L.E., 2005. Dynamic molecular interactions linking the T cell antigen receptor to the actin cytoskeleton. Nat. Immunol. 6, 80–89. doi:10.1038/ni1143
Barnett, P., Bottger, G., Klein, A.T., Tabak, H.F., Distel, B., 2000. The peroxisomal membrane protein Pex13p shows a novel mode of SH3 interaction. EMBO J. 19, 6382–91. doi:10.1093/emboj/19.23.6382
Bernot, A., Auffray, C., 1991. Primary structure and ontogeny of an avian CD3 transcript. Proc. Natl. Acad. Sci. 88, 2550–2554. doi:10.1073/pnas.88.6.2550
Berry, R., Headey, S.J., Call, M.J., McCluskey, J., Tregaskes, C.A., Kaufman, J., Koh, R., Scanlon, M.J., Call, M.E., Rossjohn, J., 2014. Structure of the chicken CD3εδ/γ heterodimer and its assembly with the αβT cell receptor. J. Biol. Chem. 289, 8240–51. doi:10.1074/jbc.M113.544965
Bustin, S.A., Benes, V., Garson, J.A., Hellemans, J., Huggett, J., Kubista, M., Mueller, R., Nolan, T., Pfaffl, M.W., Shipley, G.L., Vandesompele, J., Wittwer, C.T., 2009. The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin. Chem. 55, 611–22. doi:10.1373/clinchem.2008.112797
Call, M.E., Pyrdol, J., Wiedmann, M., Wucherpfennig, K.W., 2002. The organizing principle in the formation of the T cell receptor-CD3 complex. Cell 111, 967–979. doi:10.1016/S0092-8674(02)01194-7
Chan, A.C., Desai, D.M., Weiss, A., 1994. The role of protein tyrosine kinases and protein tyrosine phosphatases in T cell antigen receptor signal transduction. Annu. Rev. Immunol. 12, 555–592. doi:10.1146/annurev.iy.12.040194.003011
Chen, H., Bernstein, H., Ranganathan, P., Schluter, S.F., 2012. Somatic hypermutation of TCR γ V genes in the sandbar shark. Dev. Comp. Immunol. 37, 176–83. doi:10.1016/j.dci.2011.08.018
Chen, H., Kshirsagar, S., Jensen, I., Lau, K., Covarrubias, R., Schluter, S.F., Marchalonis, J.J., 2009. Characterization of arrangement and expression of the T cell receptor gamma locus in the sandbar shark. Proc. Natl. Acad. Sci. 106, 8591–8596. doi:10.1073/pnas.0811283106
Chen, H., Kshirsagar, S., Jensen, I., Lau, K., Simonson, C., Schluter, S.F., 2010. Characterization of arrangement and expression of the beta-2 microglobulin locus in the sandbar and nurse shark. Dev. Comp. Immunol. 34, 189–95. doi:10.1016/j.dci.2009.09.008
Clevers, H., Alarcon, B., Wileman, T., Terhorst, C., 1988. The T cell receptor/CD3 complex: a dynamic protein ensemble. Annu. Rev. Immunol. 6, 629–62. doi:10.1146/annurev.iy.06.040188.003213
Criscitiello, M.F., Saltis, M., Flajnik, M.F., 2006. An evolutionarily mobile antigen receptor variable region gene: doubly rearranging NAR-TcR genes in sharks. Proc. Natl. Acad. Sci. U. S. A. 103, 5036–41. doi:10.1073/pnas.0507074103
Criscitiello, M.F., Ohta, Y., Saltis, M., McKinney, E.C., Flajnik, M.F., 2010. Evolutionarily conserved TCR binding sites, identification of T cells in primary lymphoid tissues, and surprising trans-rearrangements in nurse shark. J. Immunol. 184, 6950–60. doi:10.4049/jimmunol.0902774
Crouch, K., Smith, L.E., Williams, R., Cao, W., Lee, M., Jensen, A., Dooley, H., 2013. Humoral immune response of the small-spotted catshark, Scyliorhinus canicula. Fish Shellfish Immunol. 34, 1158–69. doi:10.1016/j.fsi.2013.01.025
Dietrich, J., Neisig, A., Hou, X., Wegener, A.M., Gajhede, M., Geisler, C., 1996. Role of CD3 gamma in T cell receptor assembly. J. Cell Biol. 132, 299–310. doi:10.1083/jcb.132.3.299
Drummond, A.J., Ho, S.Y.W., Phillips, M.J., Rambaut, A., 2006. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4, e88. doi:10.1371/journal.pbio.0040088
Drummond, A.J., Suchard, M.A., Xie, D., Rambaut, A., 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–73. doi:10.1093/molbev/mss075
Dzialo, R.C., Cooper, M.D., 1997. An amphibian CD3 homologue of the mammalian CD3 γ and δ genes. Eur. J. Immunol. 27, 1640–1647. doi:10.1002/eji.1830270708
Fellah, J.S., Durand, C., Kerfourn, F., Charlemagne, J., 2001. Complexity of the T cell receptor Cbeta isotypes in the Mexican axolotl: structure and diversity of the VDJCbeta3 and VDJCbeta4 chains. Eur. J. Immunol. 31, 403–11. doi:10.1002/1521-4141(200102)31:2<403::AID-IMMU403>3.0.CO;2-4
Fellah, J.S., Kerfourn, F., Guillet, F., Charlemagne, J., 1993. Conserved structure of amphibian T-cell antigen receptor beta chain. Proc. Natl. Acad. Sci. U. S. A. 90, 6811–4.
Gil, D., Schamel, W.W.A., Montoya, M., Sánchez-Madrid, F., Alarcón, B., 2002. Recruitment of Nck by CD3ε reveals a ligand-induced conformational change essential for T cell receptor signaling and synapse formation. Cell 109, 901–912. doi:10.1016/S0092-8674(02)00799-7
Göbel, T.W., Dangy, J.-P.P., 2000. Evidence for a stepwise evolution of the CD3 family. J. Immunol. 164, 879–883. doi:10.4049/jimmunol.164.2.879
Gold, D.P., Clevers, H., Alarcon, B., Dunlap, S., Novotny, J., Williams, A.F., Terhorst, C., 1987. Evolutionary relationship between the T3 chains of the T-cell receptor complex and the immunoglobulin supergene family. Proc. Natl. Acad. Sci. U. S. A. 84, 7649–53.
Greenberg, A.S., Hughes, A.L., Guo, J., Avila, D., McKinney, E.C., Flajnik, M.F., 1996. A novel “chimeric” antibody class in cartilaginous fish: IgM may not be the primordial immunoglobulin. Eur. J. Immunol. 26, 1123–9. doi:10.1002/eji.1830260525
Guo, Z., Nie, P., 2013. Sequencing and expression analysis of CD3γ/δ and CD3ε chains in mandarin fish, Siniperca chuatsi. Chinese J. Oceanol. Limnol. 31, 106–117. doi:10.1007/s00343-013-2018-1
Guy, C.S., Vignali, D.A.A., 2009. Organization of proximal signal initiation at the TCR:CD3 complex. Immunol. Rev. 232, 7–21. doi:10.1111/j.1600-065X.2009.00843.x
Hall, T., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41, 95–98.
Hinds, K.R., Litman, G.W., 1986. Major reorganization of immunoglobulin VH segmental elements during vertebrate evolution. Nature 320, 546–9. doi:10.1038/320546a0
Hope, R.M., 2013. Rmisc: Ryan Miscellaneous.
Hořejší, V., Zhang, W., Schraven, B., 2004. Transmembrane adaptor proteins: organizers of immunoreceptor signalling. Nat. Rev. Immunol. 4, 603–616. doi:10.1038/nri1414
Käll, L., Krogh, A., Sonnhammer, E.L.L., 2004. A combined transmembrane topology and signal peptide prediction method. J. Mol. Biol. 338, 1027–36. doi:10.1016/j.jmb.2004.03.016
Katoh, K., Standley, D.M., 2013. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol. Biol. Evol. 30, 772–780. doi:10.1093/molbev/mst010
Kesti, T., Ruppelt, A., Wang, J.-H., Liss, M., Wagner, R., Tasken, K., Saksela, K., 2007. Reciprocal regulation of SH3 and SH2 domain binding via tyrosine phosphorylation of a common site in CD3. J. Immunol. 179, 878–885. doi:10.4049/jimmunol.179.2.878
Kibbe, W.A., 2007. OligoCalc: an online oligonucleotide properties calculator. Nucleic Acids Res. 35, W43-6. doi:10.1093/nar/gkm234
Lin, J., Weiss, A., 2001. T cell receptor signalling. J. Cell Sci. 114, 243–4.
Litman, G.W., Rast, J.P., Shamblott, M.J., Haire, R.N., Hulst, M., Roess, W., Litman, R.T., Hinds-Frey, K.R., Zilch, A., Amemiya, C.T., 1993. Phylogenetic diversification of immunoglobulin genes and the antibody repertoire. Mol. Biol. Evol. 10, 60–72.
Liu, Y., Moore, L., Koppang, E.O., Hordvik, I., 2008. Characterization of the CD3zeta, CD3gammadelta and CD3epsilon subunits of the T cell receptor complex in Atlantic salmon. Dev. Comp. Immunol. 32, 26–35. doi:10.1016/j.dci.2007.03.015
Löytynoja, A., Goldman, N., 2008. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science 320, 1632–1635. doi:10.1126/science.1158395
Löytynoja, A., Goldman, N., 2010. webPRANK: a phylogeny-aware multiple sequence aligner with interactive alignment browser. BMC Bioinformatics 11, 579. doi:10.1186/1471-2105-11-579
Macqueen, D.J., Wilcox, A.H., 2014. Characterization of the definitive classical calpain family of vertebrates using phylogenetic, evolutionary and expression analyses. Open Biol. 4, 130219. doi:10.1098/rsob.130219
Mallaun, M., Naeher, D., Daniels, M.A., Yachi, P.P., Hausmann, B., Luescher, I.F., Gascoigne, N.R.J., Palmer, E., 2008. The T cell receptor’s alpha-chain connecting peptide motif promotes close approximation of the CD8 coreceptor allowing efficient signal initiation. J. Immunol. 180, 8211–21.
Marchler-Bauer, A., Derbyshire, M.K., Gonzales, N.R., Lu, S., Chitsaz, F., Geer, L.Y., Geer, R.C., He, J., Gwadz, M., Hurwitz, D.I., Lanczycki, C.J., Lu, F., Marchler, G.H., Song, J.S., Thanki, N., Wang, Z., Yamashita, R.A., Zhang, D., Zheng, C., Bryant, S.H., 2014. CDD: NCBI’s conserved domain database. Nucleic Acids Res. 43, D222-6. doi:10.1093/nar/gku1221
Meeker, N.D., Smith, A.C.H., Frazer, J.K., Bradley, D.F., Rudner, L.A., Love, C., Trede, N.S., 2010. Characterization of the zebrafish T cell receptor beta locus. Immunogenetics 62, 23–9. doi:10.1007/s00251-009-0407-6
Möller, S., Croning, M.D., Apweiler, R., 2001. Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics 17, 646–53.
Mulley, J.F., Hargreaves, A.D., Hegarty, M.J., Heller, R.S., Swain, M.T., 2014. Transcriptomic analysis of the lesser spotted catshark (Scyliorhinus canicula) pancreas, liver and brain reveals molecular level conservation of vertebrate pancreas function. BMC Genomics 15, 1074. doi:10.1186/1471-2164-15-1074
Nguyen, L.-T., Schmidt, H.A., von Haeseler, A., Minh, B.Q., 2014. IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Mol. Biol. Evol. 32, 268–274. doi:10.1093/molbev/msu300
Øvergård, A.-C., Hordvik, I., Nerland, A.H., Eikeland, G., Patel, S., 2009. Cloning and expression analysis of Atlantic halibut (Hippoglossus hippoglossus) CD3 genes. Fish Shellfish Immunol. 27, 707–713. doi:10.1016/j.fsi.2009.08.011
Paensuwan, P., Hartl, F.A., Yousefi, O.S., Ngoenkam, J., Wipa, P., Beck-Garcia, E., Dopfer, E.P., Khamsri, B., Sanguansermsri, D., Minguet, S., Schamel, W.W., Pongcharoen, S., 2015. Nck binds to the T cell antigen receptor using its SH3.1 and SH2 domains in a cooperative manner, promoting TCR functioning. J. Immunol. 196, 1500958. doi:10.4049/jimmunol.1500958
Parra, Z.E., Baker, M.L., Schwarz, R.S., Deakin, J.E., Lindblad-Toh, K., Miller, R.D., 2007. A unique T cell receptor discovered in marsupials. Proc. Natl. Acad. Sci. U. S. A. 104, 9776–81. doi:10.1073/pnas.0609106104
Pohlert, T., 2014. The Pairwise Multiple Comparison of Mean Ranks Package (PMCMR). R package.
R Core Team, 2014. A language and environment for statistical computing.
Rambaut, A., Suchard, M.A., Drummond, A.J., 2014. Tracer v1.6.
Rast, J., Anderson, M., Litman, R., Margittai, M., Litman, G., Ota, T., Shamblott, M., 1994. Immunoglobulin light chain class multiplicity and alternative organizational forms in early vertebrate phylogeny. Immunogenetics 40. doi:10.1007/BF00188170
Rast, J.P., Anderson, M.K., Strong, S.J., Luer, C., Litman, R.T., Litman, G.W., 1997. alpha, beta, gamma, and delta T Cell antigen receptor genes arise early in vertebrate phylogeny. Immunity 6, 1–11.
Rast, J.P., Litman, G.W., 1994. T-cell receptor gene homologs are present in the most primitive jawed vertebrates. Proc. Natl. Acad. Sci. U. S. A. 91, 9248–52.
Redmond, A.K., Pettinello, R., Dooley, H., 2017. Outgroup, alignment, and modelling improvements indicate that two TNFSF13-like genes existed in the vertebrate ancestor. Immunogenetics in press. doi:10.1007/s00251-016-0967-1
Richards, M.H., Nelson, J.L., 2000. The evolution of vertebrate antigen receptors: a phylogenetic approach. Mol. Biol. Evol. 17, 146–155. doi:10.1093/oxfordjournals.molbev.a026227
Ruijter, J.M., Ramakers, C., Hoogaars, W.M.H., Karlen, Y., Bakker, O., van den Hoff, M.J.B., Moorman, A.F.M., 2009. Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Res. 37, e45. doi:10.1093/nar/gkp045
Rumfelt, L.L., McKinney, E.C., Taylor, E., Flajnik, M.F., 2002. The development of primary and secondary lymphoid tissues in the nurse shark Ginglymostoma cirratum: B-cell zones precede dendritic cell immigration and T-cell zone formation during ontogeny of the spleen. Scand. J. Immunol. 56, 130–48.
Saksela, K., Permi, P., 2012. SH3 domain ligand binding: What’s the consensus and where’s the specificity? FEBS Lett. 586, 2609–2614. doi:10.1016/j.febslet.2012.04.042
Salim, A., Bano, A., Zaidi, Z.H., 2003. Prediction of possible sites for posttranslational modifications in human gamma crystallins: effect of glycation on the structure of human gamma-B-crystallin as analyzed by molecular modeling. Proteins 53, 162–73. doi:10.1002/prot.10493
Santiveri, C.M., Borroto, A., Simón, L., Rico, M., Alarcón, B., Jiménez, M.A., 2009. Interaction between the N-terminal SH3 domain of Nck-alpha and CD3-epsilon-derived peptides: non-canonical and canonical recognition motifs. Biochim. Biophys. Acta 1794, 110–7. doi:10.1016/j.bbapap.2008.09.016
Takeuchi, K., Yang, H., Ng, E., Park, S., Sun, Z.-Y.J., Reinherz, E.L., Wagner, G., 2008. Structural and functional evidence that Nck interaction with CD3ε regulates T-cell receptor activity. J. Mol. Biol. 380, 704–716. doi:10.1016/j.jmb.2008.05.037
Theodorsson-Norheim, E., 1986. Kruskal-Wallis test: BASIC computer program to perform nonparametric one-way analysis of variance and multiple comparisons on ranks of several independent samples. Comput. Methods Programs Biomed. 23, 57–62. doi:10.1016/0169-2607(86)90081-7
Toyonaga, B., Yoshikai, Y., Vadasz, V., Chin, B., Mak, T.W., 1985. Organization and sequences of the diversity, joining, and constant region genes of the human T-cell receptor beta chain. Proc. Natl. Acad. Sci. 82, 8624–8628. doi:10.1073/pnas.82.24.8624
Venkatesh, B., Lee, A.P., Ravi, V., Maurya, A.K., Lian, M.M., Swann, J.B., Ohta, Y., Flajnik, M.F., Sutoh, Y., Kasahara, M., Hoon, S., Gangu, V., Roy, S.W., Irimia, M., Korzh, V., Kondrychyn, I., Lim, Z.W., Tay, B.-H., Tohari, S., Kong, K.W., Ho, S., Lorente-Galdos, B., Quilez, J., Marques-Bonet, T., Raney, B.J., Ingham, P.W., Tay, A., Hillier, L.W., Minx, P., Boehm, T., Wilson, R.K., Brenner, S., Warren, W.C., 2014. Elephant shark genome provides unique insights into gnathostome evolution. Nature 505, 174–179. doi:10.1038/nature12826
Wang, Q., Arighi, C.N., King, B.L., Polson, S.W., Vincent, J., Chen, C., Huang, H., Kingham, B.F., Page, S.T., Rendino, M.F., Thomas, W.K., Udwary, D.W., Wu, C.H., 2012. Community annotation and bioinformatics workforce development in concert-Little Skate Genome Annotation Workshops and Jamborees. Database (Oxford). 2012, bar064. doi:10.1093/database/bar064
Weissman, A., Baniyash, M., Hou, D., Samelson, L., Burgess, W., Klausner, R., 1988. Molecular cloning of the zeta chain of the T cell antigen receptor. Science 239, 1018–1021. doi:10.1126/science.3278377
Wickham, H., 2009. ggplot2: Elegant Graphics for Data Analysis. Springer New York, New York, NY. doi:10.1007/978-0-387-98141-3
Wunderlich, L., Faragó, A., Downward, J., Buday, L., 1999. Association of Nck with tyrosine-phosphorylated SLP-76 in activated T lymphocytes. Eur. J. Immunol. 29, 1068–1075. doi:10.1002/(SICI)1521-4141(199904)29:04<1068::AID-IMMU1068>3.0.CO;2-P
Wyffels, J., King, B.L., Vincent, J., Chen, C., Wu, C.H., Polson, S.W., 2014. SkateBase, an elasmobranch genome project and collection of molecular resources for chondrichthyan fishes. F1000Research 3, 191. doi:10.12688/f1000research.4996.1
Zapata, A., Amemiya, C.T., 2000. Phylogeny of lower vertebrates and their immunological structures. Curr. Top. Microbiol. Immunol. 248, 67–107.
Figure 1. Bayesian phylogenetic analysis of the TCR constant domain from a range of jawed vertebrates including small-spotted catshark (generated under the WAG+I+G4 model). Posterior probability values are included for each node. The alignment was performed using PRANK. Accession numbers for sequences used are presented in supplementary table 3.
Figure 2. Bayesian phylogenetic analysis showing the relationship of CD3ε, CD3γ/δ, and CD3ζ of small-spotted catshark. Other details are as provided in the figure 1 legend. Accession numbers for the sequences used are presented in supplementary table 4.
Figure 3. MSA of small-spotted catshark (a) TCRα, (b) TCRβ, (c) TCRγ, and (d) TCRδ constant domains with other vertebrate TCR molecules. Conserved cysteines are in bold and shaded dark grey, predicted N-linked glycosylation sites are in bold, and other conserved residues are shaded light grey. Positively charged amino acids involved in TCR/CD3 binding are shaded in black. Sequence length and percentage identity with the first sequence are shown at the end of each sequence.
Figure 4. (a) Schematic representation of possible TCRβ gene organisation. The first scheme represents the mammalian single translocon organisation, where both constant regions, each with its own diversity and joining segments, share the same variable regions (Toyonaga et al., 1985). The second scheme represents a multi-cluster organisation, where each TCRβ constant region and its V, D and J segments are present in a separate cluster (Rast et al., 1997). (b) MSA of small-spotted catshark TCRβ variable domains performed using MAFFT. TCRβ-A sequences (accession numbers KY434216-KY434220) are shown with no background and TCRβ-B sequences (accession numbers KY434221-KY434224) are shaded dark grey. Conserved amino acids are shaded light grey; predicted N-linked glycosylation sites are in bold. Predicted β strands are indicated below each alignment. The main differences between TCRβ-A and TCRβ-B are underlined. Sequence length and percentage similarity with the first sequence are shown at the end of each sequence.
Figure 5. MSA of small-spotted catshark (a) CD3ε, (b) CD3ζ, and (c) CD3γδ with other vertebrate CD3 proteins. Other details are as provided in the figure 1 legend. Important residues that differ in sharks are shown in white on a dark grey background (in yellow in the online version). Sequence length, percent amino acid identity compared to small-spotted catshark CD3γδ-A and to small-spotted catshark CD3γδ-A are shown at the end of each sequence respectively.
Figure 6. Constitutive tissue expression levels of catshark TCR/CD3 complex genes in vivo. Expression levels are expressed as arbitrary units normalised against the expression of RPL13 and RPS13 and are provided on a scale that is quantifiably comparable across genes. The results are presented as the mean + SD for four animals with the exception of Leydig organ where only three individuals were used.
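As an illustration of what normalising against two reference genes can look like, the sketch below divides a target quantity by the geometric mean of the RPL13 and RPS13 quantities. This is an assumption: the legend does not say how the two reference genes were combined, the geometric mean is simply a common convention, and every number below is invented for the example.

```python
from statistics import geometric_mean, mean, stdev


def normalised_expression(target: float, references: list) -> float:
    """Target quantity divided by the geometric mean of reference-gene quantities."""
    return target / geometric_mean(references)


# Hypothetical starting quantities (arbitrary units) for four animals:
# (target gene, RPL13, RPS13). These values are invented for illustration.
samples = [
    (12.0, 4.0, 1.0),
    (18.0, 9.0, 1.0),
    (8.0, 2.0, 2.0),
    (20.0, 5.0, 5.0),
]

values = [normalised_expression(t, [rpl13, rps13]) for t, rpl13, rps13 in samples]

# Summary in the same form as the figure: mean and standard deviation.
print(values)
print(mean(values), stdev(values))
```

Because every gene is divided by the same per-sample reference value, the resulting arbitrary units can be compared between genes as well as between tissues, which is the property the legend describes.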
Highlights
• We describe the first cloning of all CD3 subunits from a cartilaginous fish
• The core TCR complex has been highly conserved during vertebrate evolution
• Cartilaginous fish have multiple copies of CD3γδ and the TCRβ constant region
• The TCRβ constant region is copy number variable across jawed vertebrate phylogeny
• Unexpectedly, the proline rich region is absent from small-spotted catshark CD3ε