Evolutionary history of the T cell receptor complex as revealed by small-spotted catshark (Scyliorhinus canicula)

Evolutionary history of the T cell receptor complex as revealed by small-spotted catshark (Scyliorhinus canicula)

Accepted Manuscript Evolutionary history of the T cell receptor complex as revealed by small-spotted catshark (Scyliorhinus canicula) Rita Pettinello,...

9MB Sizes 0 Downloads 46 Views

Accepted Manuscript Evolutionary history of the T cell receptor complex as revealed by small-spotted catshark (Scyliorhinus canicula) Rita Pettinello, Anthony K. Redmond, Christopher J. Secombes, Daniel J. Macqueen, Helen Dooley PII:

S0145-305X(17)30020-4

DOI:

10.1016/j.dci.2017.04.015

Reference:

DCI 2879

To appear in:

Developmental and Comparative Immunology

Received Date: 9 January 2017 Revised Date:

18 April 2017

Accepted Date: 18 April 2017

Please cite this article as: Pettinello, R., Redmond, A.K., Secombes, C.J., Macqueen, D.J., Dooley, H., Evolutionary history of the T cell receptor complex as revealed by small-spotted catshark (Scyliorhinus canicula), Developmental and Comparative Immunology (2017), doi: 10.1016/j.dci.2017.04.015. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

1

Evolutionary history of the T cell receptor complex as revealed by small-spotted catshark

2

(Scyliorhinus canicula)

3 4

Rita Pettinello1, Anthony K Redmond1, 2, Christopher J Secombes1, Daniel J Macqueen3,

5

Helen Dooley1, 4

6 1

8

Biological and Environmental Sciences, University of Aberdeen, Aberdeen AB24 2TZ, United

9

Kingdom.4Dept. Microbiology & Immunology, University of Maryland School of Medicine, Institute

10

School of Biological Sciences and 2Centre for Genome-Enabled Biology & Medicine and 3Institute of

of Marine & Environmental Technology, Baltimore MD21202. USA.

SC

11

RI PT

7

12

Corresponding author: Rita Pettinello. School of Biological Sciences, University of

13

Aberdeen, Aberdeen AB24 2TZ, United Kingdom. Email: [email protected]

M AN U

14

Abstract

16

In every jawed vertebrate species studied so far, the T cell receptor (TCR) complex is

17

composed of two different TCR chains (α/β or γ/δ) and a number of CD3 subunits

18

responsible for transmitting signals into the T cell. In this study, we characterised all of the

19

TCR and CD3 genes of small-spotted catshark (Scyliorhinus canicula) and analysed their

20

expression in a broad range of tissues. While the TCR complex is highly conserved across

21

jawed vertebrates, we identified a number of differences in catshark, most notably the

22

presence of two copies of both TCRβ and CD3γδ, and the absence of a functionally-important

23

proline rich region from CD3ε. We also demonstrate that TCRβ has duplicated independently

24

multiple times in jawed vertebrate evolution, bringing additional diversity to the TCR

25

complex. This study reveals new insights about the evolutionary history of the TCR complex

26

and raises new avenues for future exploration.

EP

AC C

27

TE D

15

28

Keywords

29

T cell receptor; CD3; cartilaginous fish; shark; evolution

30 31

Abbreviations

32

T cell receptor (TCR), Immunoglobulin superfamily (IgSF), Major histocompatibility

33

complex (MHC), Rapid amplification of cDNA ends (RACE)-PCR, Quantitative real time

34

PCR (qPCR), No-reverse transcriptase (NRT), Multiple sequence alignment (MSA), No-

35

template control (NTC), peripheral blood lymphocytes (PBL). 1

ACCEPTED MANUSCRIPT

1. Introduction

37

In mammals, the TCR complex is composed of disulphide-linked heterodimers of TCR

38

chains (α/β or γ/δ) associated with an invariant CD3 signalling complex. Each of the four

39

TCR chains contains an extracellular variable (V) immunoglobulin superfamily (IgSF)

40

domain that recognises proteasome-derived peptides displayed in the groove of major

41

histocompatibility complex (MHC), and a membrane-proximal constant (C) domain involved

42

in signal transmission following TCR engagement (Clevers et al., 1988). As all of the TCR

43

chains have a very short cytoplasmic tail they need to partner with the CD3 complex,

44

intracellular signalling proteins with long, immunoreceptor tyrosine-based activation motif

45

(ITAM)-containing cytoplasmic tails, which facilitate signal transduction into the T cell. The

46

CD3 complex in mammals is formed of CD3ε, CD3δ, and CD3γ proteins that associate non-

47

covalently into δ/ε and γ/ε heterodimers, plus the disulphide linked ζ/ζ homodimer. Following

48

TCR engagement, tyrosine kinases, such as Lck and Fyn, phosphorylate the tyrosines in the

49

ITAMs of the CD3 and ζ chains. This creates a docking site for the protein kinase ZAP70

50

which, in turn, phosphorylates the adaptor protein SLP-76, allowing recruitment of LAT

51

(Guy and Vignali, 2009; Wunderlich et al., 1999) and initiating the downstream

52

phosphorylation cascade (Hořejší et al., 2004; Lin and Weiss, 2001). Previous work has also

53

demonstrated the importance of the adaptor protein Nck for TCR signalling regulation

54

(Takeuchi et al., 2008); Nck can bind to the proline-rich sequence (PXXDY) in the CD3ε

55

cytoplasmic tail (Santiveri et al., 2009) via its SH3 binding domain and to SPL-76 via its SH2

56

binding domain (Wunderlich et al., 1999). As the CD3ε proline-rich sequence overlaps one of

57

the ITAM domains there is debate as to whether tyrosine phosphorylation is a prerequisite for

58

Nck binding, as well as precisely what role Nck plays in signal transduction (Barda-Saad et

59

al., 2005; Gil et al., 2002; Paensuwan et al., 2015).

60

The transmembrane regions of all of the TCR and CD3 proteins contain a number of highly

61

conserved residues that facilitate complex assembly (Call et al., 2002). The CD3γ, δ, and ε

62

chains have an extracellular IgSF domain that mediates heterodimer formation (Gold et al.,

63

1987) and a CXXCXE motif connecting the extracellular domain to the transmembrane

64

region, which plays an important role in the assembly of CD3 with the TCR and subsequent

65

signal transduction. In contrast, the ζ chain is not an IgSF member, has a shorter extracellular

66

domain, and forms dimers via a conserved cysteine present in the transmembrane region

67

(Weissman et al., 1988). ζ, with its longer cytoplasmic tail, encodes three ITAMs rather than

68

the single ITAM present in the tail of the other CD3 molecules (Chan et al., 1994). All

69

components of the TCR/CD3 complex need to be fully assembled to allow cell surface

70

display (Clevers et al., 1988).

AC C

EP

TE D

M AN U

SC

RI PT

36

2

ACCEPTED MANUSCRIPT

While most of the genes for the CD3 signalling complex existed in the ancestor of bony

72

vertebrates (Bernot and Auffray, 1991; Dzialo and Cooper, 1997; Liu et al., 2008; Øvergård

73

et al., 2009), the duplication giving rise to CD3γ and CD3δ occurred more recently, in the

74

ancestor of mammals (Bernot and Auffray, 1991). Thus a single locus, known as CD3γδ, is

75

found in non-mammals; CD3γδ has a hybrid chain structure with a ‘CD3δ-like’ extracellular

76

domain and ‘CD3γ-like’ intracellular region (Göbel et al., 2000). In this case the TCR

77

complex is formed by association of the TCR heterodimer with two CD3ε/CD3γδ

78

heterodimers and one ζ homodimer (Berry et al., 2014).

79

Homologues of the four mammalian TCR chains have previously been identified in nurse

80

shark, little skate and elephant shark (Criscitiello et al., 2010; Rast and Litman, 1994;

81

Venkatesh et al., 2014). Unlike immunoglobulins (Igs), which are present in an atypical

82

‘cluster’ organisation (Litman et al., 1993; Rast et al., 1994), it is thought that the TCR chains

83

of cartilaginous fish exist in a translocon configuration similar to other bony vertebrates

84

(Chen et al., 2010; Rast et al., 1997). In addition to the four canonical TCR chains

85

cartilaginous fish also have a novel TCR chain, NAR-TCR, composed of a TCRδ C domain

86

carrying two independently-rearranging V domains. The N terminal V domain is most closely

87

related to that of IgNAR, a shark-specific Ig that does not associate with light chains

88

(Criscitiello et al., 2006; Greenberg et al., 1996). Some divergent TCR chains are also

89

generated via trans-rearrangement between a TCR V region and that of a (presumably) near-

90

by Ig (Criscitiello et al., 2010). Cartilaginous fish also use somatic hypermutation to increase

91

the diversity of TCR V regions, while mammals only mutate Ig Vs (Chen et al., 2012).

92

Currently, our understanding of the TCR complex of cartilaginous fish is fragmented; TCR

93

chains have been characterised in nurse shark, while the genes encoding CD3ε, CD3γδ and

94

the ζ chain were recently annotated in the elephant shark genome. Therefore, to improve our

95

understanding of the TCR complex of cartilaginous fish and its evolutionary history in

96

vertebrates, we set about characterising the complete set of genes encoding TCR chain and

97

CD3 proteins in a single elasmobranch species, the small-spotted catshark (Scyliorhinus

98

canicula) and quantified their mRNA expression patterns in a range of tissues.

99

AC C

EP

TE D

M AN U

SC

RI PT

71

100

2. Materials and Methods

101

2.1 Small-spotted catshark

102

Three male and one female captive-bred, small-spotted catsharks (Scyliorhinus canicula) of

103

approximately three years of age, were maintained in artificial seawater at 8-12°C in indoor

104

tanks at the University of Aberdeen. Animals were overdosed in MS222 prior to sacrifice and 3

ACCEPTED MANUSCRIPT

105

tissue harvest. All procedures were conducted in accordance with UK Home Office ‘Animals

106

and Scientific Procedures Act 1986; Amendment Regulations 2012’ on animal care and use.

107

2.2 RNA isolation and cDNA synthesis

109

A range of tissues (epigonal organ, gill, Leydig organ, spiral valve, liver, heart, muscle and

110

stomach) were collected from each animal, diced into small pieces and either flash frozen or

111

immersed in RNAlater then stored at -80°C (Sigma). Total RNA was extracted by

112

mechanical disruption of the tissues in TRIzol reagent (Sigma) and the subsequent addition of

113

1/3 volume of chloroform. After centrifugation at 4°C the upper phase was collected and the

114

RNA precipitated through the addition of isopropanol and incubation at -80°C for at least 3 h.

115

Following centrifugation at 4°C, the RNA pellet was washed three times with 70% ethanol.

116

The pellet was allowed to air dry after aspirating the remaining ethanol and then re-suspended

117

in RNase-free water. The RNA was quantified via a NanoDrop® ND-1000

118

spectrophotometer; for all samples the 260/280 and 260/230 ratios were 2.0 +/-0.2. RNA

119

integrity was tested by gel electrophoresis then stored at -80°C for future use.

120

First strand cDNA for PCR and rapid amplification of cDNA ends (RACE)-PCR was

121

synthesised from 2µg of total RNA extracted from spleen. For use in PCR, first strand cDNA

122

was synthesised using illustra™ Ready-To-Go™ RT-PCR Beads (GE Healthcare) with oligo

123

(dT) primer, while for RACE-PCR it was synthesised using the SMARTer II RACE Kit

124

(Clontech Ltd.). After a genomic DNA removal step first strand cDNA for quantitative real

125

time PCR (qPCR) was synthesized from 1µg total RNA from each tissue using the

126

QuantiTect Reverse Transcription Kit (Qiagen Ltd.), and a no-reverse transcriptase (NRT)

127

control was made containing a pooled RNA equally representing all samples and all reagents

128

but not the reverse transcriptase. For each kit the manufacturer’s instructions were followed.

SC

M AN U

TE D

EP

AC C

129

RI PT

108

130

2.3 PCR, cloning and sequencing

131

An in-house, normalised multi-tissue small-spotted catshark transcriptome (sequenced on the

132

Ion Proton ™ Sequencer and assembled with Trinity v20131110) (Redmond et al., in

133

preparation) was BLAST searched for TCR and CD3 transcripts, seeding with sequences

134

taken from across vertebrate phylogeny (sequence accession numbers provided in

135

supplemental table 1). Partial sequences for the TCRβ and ζ chains, full sequences for CD3ε,

136

CD3γδ and the constant regions for the remaining TCRs were identified. PCR primers were

137

designed with the aid of OligoCalc (http://biotools.nubic.northwestern.edu/OligoCalc.html;

138

Kibbe 2007) (supplemental table 2). 4

ACCEPTED MANUSCRIPT

Nested 5’ or 3’ RACE-PCR using gene specific primers (primer details provided in

140

supplemental table 2) and the universal primers provided in the RACE kit was performed as

141

per the manufacturer’s instructions to complete partial sequences. Cycling conditions for both

142

rounds of RACE-PCR were as follows: 1 cycle at 95°C for 5 min; 35 cycles at 95°C for 30

143

sec, 60°C for 30 sec and 72°C for 90 sec; 1 cycle at 72°C for 3 min. Completed sequences, as

144

well as those found in full in the transcriptome, were amplified end-to-end by PCR using

145

gene specific primers (supplemental table 2). PCR products were ligated into pGEM-T Easy

146

vector (Promega) and transformed into heat shock-competent E. coli. Colonies containing

147

vector with inserts were identified through blue–white colour selection when grown on LB

148

agar containing 100 µg/ml ampicillin, 25 µg/ml X-Gal and 25 µg/ml IPTG. Plasmid DNA

149

was prepared from three independent white colonies and screened for insert size by standard

150

PCR using SP6 and T7 primers under the following cycling conditions: 1 cycle at 95°C for 5

151

min; 25 cycles at 95°C for 30 sec, 60°C for 30 sec, and 72°C for 90 sec; 1 cycle of 72°C for 3

152

min. Once the insert size was confirmed by gel electrophoresis, plasmids were sent to Eurofin

153

MWG-Biotech for Sanger sequencing. Sequences confirmed in this manner were submitted

154

to NCBI under the accession numbers given in supplemental table 3 and 4.

M AN U

SC

RI PT

139

155

2.4 Sequence analysis

157

The sequences returned from cloning experiments were used in BLAST searches (Altschul et

158

al., 1990) against the non-redundant NCBI database to check for similarity with other

159

characterized sequences. Subsequently, amino acid comparison was performed by aligning

160

closely-related sequences from different vertebrate classes using MAFFT version 7

161

(http://mafft.cbrc.jp/alignment/server/index.html - Katoh and Standley, 2013). Combined

162

transmembrane topology and the presence of signal peptide were predicted using Phobius

163

(http://phobius.sbc.su.se/; Käll et al. 2004) and Tmhmm

164

(http://www.cbs.dtu.dk/services/TMHMM/; Möller et al., 2001). N-linked glycosylation sites

165

were predicted using NetNGlyc 1.0 Server (http://www.cbs.dtu.dk/services/NetNGlyc/; Salim

166

et al., 2003). Conserved domains were identified using CD-Search

167

(http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi; Marchler-Bauer et al. 2014).

168

Secondary structure of TCR variable regions was predicted with PSIPRED Protein Sequence

169

Analysis Workbench (http://bioinf.cs.ucl.ac.uk/psipred/).

AC C

EP

TE D

156

170 171

2.5 Phylogenetic analysis

172

Multiple sequence alignment (MSA) was performed in PRANK (Loytynoja and Goldman,

173

2008; Löytynoja and Goldman, 2010). Sequences were adjusted manually to reduce long 5

ACCEPTED MANUSCRIPT

gaps in BioEdit 7.2.5 (Hall, 1999), and the best-fitting amino acid substitution model was

175

estimated by maximum likelihood using the IQ-TREE server (Nguyen et al., 2014)

176

(WAG+I+G4 for both CD3 proteins and TCR constant chains).

177

Two Markov chain Monte Carlo (MCMC) chains of Bayesian phylogenetic analysis were

178

executed using BEAST v.1.8.1 (Drummond et al., 2012) for each dataset; with a lognormal

179

uncorrelated relaxed clock (Drummond et al., 2006), yule speciation process, random starting

180

tree, and MCMC chain lengths of 20 million states (sampled every 1000 states). Tracer v1.6

181

(http://beast.bio.ed.ac.uk/Tracer; (Rambaut et al., 2014) was used to assess convergence and

182

mixing of the MCMC traces (effective sample size values were greater than 200 for all

183

estimated parameters). The trees generated from both runs were combined with LogCombiner

184

v1.8.1 after removing 10% as burn-in. A maximum clade credibility tree was then created

185

with TreeAnnotator v1.8.1 and visualised using FigTree v1.4

186

(http://tree.bio.ed.ac.uk/software/figtree/).

M AN U

187

SC

RI PT

174

2.6 Quantitative real time PCR (qPCR)

189

We performed qPCR according to MIQE guidelines (Bustin et al., 2009). RNA was

190

extracted from 9 different tissues from four small-spotted catsharks as described above.

191

qPCR reactions contained 5 µl first strand cDNA, generated as described above, 1.5 µl

192

sense/antisense primers (at 10 nM), 7.5 µl 2x Brilliant III SYBR® Green QPCR Master Mix

193

(Agilent Technology) and 1 µl RNase free water to give a final volume of 15 µl. qPCR

194

primers were manually designed and checked with NetPrimer

195

(http://www.premierbiosoft.com/NetPrimer/AnalyzePrimer.jsp; supplementary table 2).

196

Reactions were performed on a Stratagene Mx3005P system (Agilent Technology) with 1

197

cycle of 3 min at 95°C then 40 cycles of 20 sec at 95°C and 20 sec at 64°C followed by

198

dissociation analysis, where a single peak was observed in all cases. All samples were present

199

in each plate in duplicate with NRT and a no-template control (NTC). Cq values were

200

calculated from baseline-corrected SYBR fluorescence data with a standardised threshold and

201

the baseline set between 3 and 15 cycles. For each run, the NRT and NTC controls produced

202

negligible amplification, indicating a lack of contamination from carried over genomic DNA

203

or other DNA sources. A cut-off of “no expression” was set at 36 Cq for each experimental

204

qPCR assay, approximating the limit of reliable detectability. The efficiency of each primer

205

pair was calculated using LinRegPCR v.11 (Ruijter et al., 2009), following the authors

206

recommendations.

207

Five reference genes (RPL13, RPS13, RPS19, ACTβ and EF1α) were tested using the

208

NormFinder modelling algorithm (Andersen et al., 2004) within Genex v5 (MultiD Analysis

AC C

EP

TE D

188

6

ACCEPTED MANUSCRIPT

AB), accounting for variance in regulation across tissues and individuals. This analysis

210

indicated that the combined expression of RPL13 and RPS13 provided the most appropriate

211

normalization factor. Samples giving poor quality cDNA (showing no expression of reference

212

genes) were excluded from the analysis, leaving only one sample for peripheral blood

213

lymphocytes (PBL) and pancreas, and three samples for Leydig organ. Raw qPCR data were

214

normalized to the expression of RPL13 and RPS13 and the relative expression calculated on a

215

scale quantitatively comparable across all genes (assigning a value of 1 to the average of the

216

expression of all genes combined across tissues using Genex v5).

217

To test if differences in gene expression between tissues were statistically significant for each

218

tested gene, we performed Kruskal-Wallis H tests, along with Dunn’s post-hoc test returning

219

p-values adjusted using the Bonferroni correction (Pohlert, 2014) with a 95% confidence

220

level (Theodorsson-Norheim, 1986) using packages within R (R Core Team, 2014). Mean,

221

standard error and standard deviation were also calculated in R with the summarySE function

222

of the Rmisc package (Hope, 2013). Data were visualised with bar chart plots in R using the

223

ggplot2 package (Wickham, 2009).

SC

3. Results

226

3.1 Characterisation of multiple TCRβ and CD3γδ in small-spotted catshark

TE D

227

M AN U

224 225

RI PT

209

Putative small-spotted catshark TCR and CD3 transcripts were identified through a

229

BLAST search of our in-house, normalised multi-tissue transcriptome using those of other

230

vertebrates as the query (accession numbers provided in supplementary table 1). Based on our

231

initial assignments using BLAST searches against characterized vertebrate proteins (i.e.

232

subject to confirmation by phylogenetic analysis), we identified single genes for TCRα,

233

TCRγ, TCRδ, CD3ε and ζ, two variants for CD3γδ and two variants for TCRβ in small-

234

spotted catshark (NCBI accession numbers provided in supplementary table 3). Sequences

235

were confirmed by cloning and Sanger sequenced as described previously. The discovery of

236

multiple putative CD3γδ and TCRβ genes was unexpected, so to investigate this finding

237

further, we searched additional available cartilaginous fish datasets (consisting of an in-house

238

nurse shark transcriptome [unpublished data], the published transcriptome and genome of

239

little skate (Wang et al., 2012; Wyffels et al., 2014), and the published transcriptome and

240

genome of elephant shark (Venkatesh et al., 2014)), using our small-spotted catshark

241

transcripts as the query; the resultant hits suggest the presence of three TCRβ C variants and

242

two CD3γδ variants in the nurse shark, two TCRβ C and one CD3γδ in little skate, and two

AC C

EP

228

7

ACCEPTED MANUSCRIPT

243

TCRβ C and two CD3γδ in the elephant shark (figures 1 and 2), a finding missed by previous

244

studies.

245 246

3.2 Multiple independent duplications of TCRβ and CD3γδ in jawed vertebrates Previous studies used distant outgroups (e.g. Richards & Nelson 2000) or unrooted

248

phylogenies (e.g. Parra et al. 2007) to establish the relationship between the different TCR

249

chains. Here, for the first time, we utilised a relaxed clock rooting method, allowing us to

250

statistically infer the root without including distant outgroups (Drummond et al., 2006), as

251

performed successfully for other gene families over similar evolutionary timescales

252

(Macqueen and Wilcox, 2014; Redmond et al., 2017). Phylogenetic analysis revealed that the

253

shark TCR constant domain and CD3 sequences form clades with their respective

254

counterparts from other vertebrates (figures 1 and 2 respectively). The root for TCR constant

255

chains falls in between TCRα/TCRδ and TCRγ/TCRβ, while the root for CD3 gene separate

256

CD3ε and CD3γδ (Cd3γ and CD3δ for mammals) and ζ. These results are congruent with

257

previous studies (e.g. Richards and Nelson, 2000). The relatively low homologies of the

258

TCRβ and CD3γδ variants (72% and 56% respectively) combined with the fact that both form

259

multiple phylogenetic clusters, containing sequences from different cartilaginous fish species,

260

supports the conclusion that the two variants are not allelic forms of the same gene. Each

261

variant was cloned multiple times for both genes, from the mRNA of different individuals,

262

and no variation in sequence was observed (data not shown). While the sequence data for

263

TCRβ is not consistent with the variants being alternate splice forms, the fact that the first 18

264

amino acids of the two CD3γδ variants are identical (also at the DNA level), in both catshark

265

and nurse shark, suggests the variants are a product of a partial gene duplication and share the

266

exon(s) encoding this region (supplemental figure 1).

SC

M AN U

TE D

EP

Intriguingly, our analysis also shows that TCRβ has duplicated independently

AC C

267

RI PT

247

268

multiple times during jawed vertebrate evolution (figure 1), and underwent at least four

269

duplication events in the cartilaginous fish lineage alone; one duplication in the ancestor of

270

the elephant shark (or all chimeras) after the separation with elasmobranchs, while two

271

independent duplications happened in an ancestor common to nurse shark, and small-spotted

272

catshark, and one duplication happened in an ancestor of the skate lineage (figure 1). As the

273

two TCRβ variants of elephant shark share a much higher similarity (92%) than those of

274

catshark (72%), it suggests that the elephant shark duplication either happened later in this

275

species’ evolutionary history or, alternatively, that on-going gene conversion events have

276

homogenised older gene duplicates. 8

ACCEPTED MANUSCRIPT

CD3γδ appears to have duplicated at least twice in the ancestor of all cartilaginous

278

fishes with the partial duplication being a third event in a common ancestor of nurse shark

279

and small-spotted catshark (figure 2). However, support at key nodes was low and so

280

additional analysis were performed without more distant outgroups; these analyses again

281

suggested that at least two duplications occurred in the ancestor of cartilaginous fishes, but

282

the patterns of duplication are different, with the partial duplication occurring in the ancestor

283

of all cartilaginous fishes (supplemental figure 2). The incongruence between these analyses

284

make the precise timing of the CD3γδ partial duplication difficult to assign. Re-analysis once

285

full sequences from nurse and elephant shark are available might help to resolve the exact

286

timing of this duplication.

RI PT

277

288

SC

287

3.3 Conservation of TCR constant chain structure in small-spotted catshark We characterised the four canonical TCR C domains in small-spotted catshark; as in

290

previous studies of cartilaginous fish analysis of our TCRs transcripts suggested the presence

291

of a single gene for TCRα, TCRγ and TCRδ. However, unlike previous studies, we identified

292

two significantly different transcripts for the TCRβ C region.

293

Comparison of the amino acid sequences of the small-spotted catshark TCR constant regions

294

with other vertebrates shows conservation of gross structure (IgSF constant domain,

295

connecting peptide, transmembrane region and a short intracellular cytoplasmic tail; figure

296

3). The two canonical cysteines in the extracellular domain necessary for intradomain

297

disulphide bonding are conserved, as are the positively charged residues in the

298

transmembrane region required for association with the CD3 complex (figure 3). Small-

299

spotted catshark TCRα, like TCRα from other cartilaginous and bony vertebrates, lacks the

300

connecting peptide motif ‘FETDXNLN’, which is thought to play a role in signal

301

transduction and CD8 co-receptor function in mammals (Bäckström et al., 1996; Mallaun et

302

al., 2008). The ‘SRLR’ sequence that in mammals mediates association of TCRβ with TCRα

303

chains is conserved in all elasmobranchs analysed (Arnaud et al. 1997) (figure 3). There are

304

no major structural differences between the two TCRβ constant regions and both variants

305

carry the glutamine (position 136 in human) crucial for signalling (Bäckström et al., 1996).

306

As mentioned earlier, cartilaginous fish TCR genes are thought to be present in the typical

307

translocon organisation, rather than the atypical ‘cluster’ organisation observed for their Ig

308

genes (Litman et al., 1993). In mammals, which also express two TCRβ variants, the locus

309

contains multiple V segments each of which can rearrange with either of two downstream D-

310

Jn-C blocks (figure 4a; translocon configuration). To better understand the TCRβ organisation

311

in catshark, and determine if it had a similar structure to that of mammals, we used 5’ RACE-

AC C

EP

TE D

M AN U

289

9

ACCEPTED MANUSCRIPT

PCR with C domain specific primers for both TCRβ variants to sequence the V regions. The

313

sequences obtained (figure 4b) show that the two C domains are associated with distinct V

314

and J segments suggesting two separate clusters, each having one V, (at least) one D and one

315

J segment and a C domain (figure 4a; cluster configuration).

316

While full characterisation of the small-spotted catshark TCR V region repertoire was not an

317

aim of this project, all of the V region sequences obtained shared the markers typical of IgSF

318

V domains and are presented in FASTA format in supplemental figure 3 for the use of others.

319

While we did not identify NAR-TCR in this study it does not exclude its presence in this

320

species. We did identify a single V region that was a hybrid of IgM (Crouch et al., 2013) and

321

TCRδ (supplemental figure 4), indicating trans-rearrangement occurs in catshark as well as

322

nurse shark.

SC

RI PT

312

323

3.4 Surprising absence of the proline-rich domain in small-spotted catshark CD3ε

325

CD3 sequences were confirmed by PCR from catshark spleen cDNA. Again the expected

326

gross structure (leader peptide followed by extracellular, transmembrane, and cytoplasmic

327

domains) is conserved in all three CD3 subunits (figure 5). The extracellular domain of CD3ε

328

contains a canonical IgSF domain, however domain-prediction software did not identify this

329

domain in CD3γδ-A and CD3γδ-B even though they both contain the tryptophan and the two

330

canonical cysteines common to IgSF domains and critical for the β-sandwich tertiary

331

structure. The CXXCXE motif important for TCR complex assembly and signal transmission

332

is conserved in the extracellular region of CD3ε and both CD3γδ variants. The charged amino

333

acid necessary for TCR complex formation is present in the transmembrane region of all the

334

CD3 proteins, with the notable exception of CD3γδ-B, suggesting it may have a different

335

mode of interaction. The ITAM motifs necessary for the transmission of intracellular signals

336

are conserved in all four CD3 proteins, with CD3ε and both CD3γδ having only one ITAM

337

motif and ζ three ITAMs (figure 5). Interestingly the canonical SH3 binding motif PXXDY,

338

important for Nck binding, is not conserved in the cytoplasmic tail of small-spotted catshark

339

CD3ε. Amino acid comparison shows that the same motif is missing in nurse shark however

340

it is present in elephant shark CD3ε (figure 5a).

341

It was apparent from the MSA that the small-spotted catshark ζ extracellular domain is

342

shorter than that of other vertebrates, including the elephant shark (figure 5b). To confirm this

343

unexpected finding 5’ RACE-PCR was performed and the sequence obtained was identical to

344

the one retrieved from our in-house transcriptome databases, having a stop codon before the

345

sequence shown in figure 5b. BLAST searches in additional small-spotted catshark

346

transcriptome datasets (Mulley et al. 2014) also show a stop codon in the same position,

AC C

EP

TE D

M AN U

324

10

ACCEPTED MANUSCRIPT

indicating the catshark ζ transcript is indeed shorter than the others (510bp, encoding 169aa).

348

Interestingly the missing region corresponds to the first exon of mammalian ζ, suggesting this

349

exon has either been lost in catshark or is skipped during mRNA splicing (perhaps due to a

350

point mutation or frameshift). Despite this, the cysteine in the transmembrane region required

351

for ζ homodimer formation is conserved in small-spotted catshark.

352

In other species, the ‘LG(X)4DPRG’ motif in CD3γδ aids binding to CD3ε (Dietrich et al.,

353

1996) however this motif is absent from both small-spotted catshark CD3γδ variants. Small-

354

spotted catshark CD3γδ-A has two predicted N-linked glycosylation sites at N42 and N49

355

which are absent in CD3γδ-B (figure 5c). Nurse shark CD3γδ-B has only one predicted

356

glycosylation site. While both nurse shark CD3γδ sequences were partial, it was possible to

357

identify the conserved CXXC region and part of the ITAM domain (figure 5c). In nurse

358

shark, as in small-spotted catshark, only one of the two variants has the negatively charged

359

amino acid required for TCR binding, while in the other variant this residue has mutated to

360

serine. The presence of serine in both small-spotted catshark and nurse shark suggests this

361

change occurred in a common ancestor of the elasmobranchs.

M AN U

SC

RI PT

347

362 363

3.6 TCR complex genes are highly expressed in secondary lymphoid tissues Baseline mRNA levels for each of the TCR-CD3 complex genes in a range of

365

catshark tissues are shown in figure 6. Due to the small sample size (four individuals), it was

366

not possible to identify a statistical significant difference in the expression of each gene;

367

however all the genes showed a similar pattern, with the highest transcript levels observed in

368

spleen followed by gills and spiral valve, and the lowest levels in muscle (p-values presented

369

in supplemental table 5), an expression pattern similar to that observed for TCR in nurse

370

shark (Criscitiello et al., 2010). This result is not unexpected as the spleen is the major

371

secondary lymphoid organ in sharks and where the adaptive immune response primarily

372

occurs (Rumfelt et al., 2002; Zapata and Amemiya, 2000). Further, the gills and spiral valve

373

will come into direct contact with externally-derived pathogens and so would also be

374

expected to have some secondary immune function. While pancreas and PBL were excluded

375

from the statistical analysis due to the presence of only one cDNA sample, TCR/CD3

376

expression in the one available PBL sample was almost as high as spleen, whereas that in the

377

one pancreas sample was much lower than that of other tissues (supplemental figure 5).

378

The expression of CD3ε and CD3γδ-A were significantly higher than CD3γδ-B, TCRα,

379

TCRβ-A, and TCRδ (supplementary table 5). Although TCRγ and TCRδ had similar

380

expression levels, TCRβ-A and TCRβ-B (combined) had almost double the expression of

381

TCRα, although this difference was not statistically significant. The two TCRβ paralogues

AC C

EP

TE D

364

11

ACCEPTED MANUSCRIPT

382

did not show significantly different expression in any of the tissues examined (p-value

383

0.1786; figure 6 and supplemental table 6). In contrast, CD3γδ-A showed almost six times

384

higher expression than CD3γδ-B in all of the tissues examined (p-value 0.0063; figure 6 and

385

supplemental table 6).

386

4. Discussion

388

This report describes the TCR/CD3 complex genes of the small-spotted catshark and

389

represents the first molecular cloning of the full complement of CD3 subunits from any

390

cartilaginous fish species. As expected the gross structural features required for CD3-TCR

391

interaction and functioning are highly conserved between cartilaginous fish and other

392

vertebrates (including mammals).

393

Despite this, a number of notable differences were observed; firstly, both small-spotted

394

catshark and nurse shark CD3ε lacked the non-canonical PXXDY SH3 binding motif, which

395

in mammals is important for Nck binding (Takeuchi et al., 2008). Although the SH3 binding

396

motif is absent we were able to find transcripts for both Nck1 and Nck2 in our catshark

397

transcriptomic database and both possessed the amino acids required for CD3ε binding

398

(supplementary figure 6; Saksela & Permi 2012) implying they can still associate with CD3ε.

399

Indeed it has been reported that some proteins, such as SAM68, which do not have a PXXDY

400

motif can bind to Nck (Kesti et al., 2007), while other studies have identified non-canonical

401

SH3 binding motifs where binding is regulated by the conformation of the target molecule

402

(Barnett et al., 2000; Saksela and Permi, 2012). Interestingly, the PXXDY motif is present in

403

elephant shark CD3ε, suggesting that the loss of this motif is restricted to elasmobranchs. The

404

functional significance of this change for elasmobranch CD3ε requires further investigation.

405

We were also surprised to find two variants of the CD3γδ gene in both small-spotted catshark

406

and nurse shark. However, on closer examination it was clear the CD3γδ-B variant does not

407

possess all of the expected characteristics of a CD3γδ protein; while both variants have the

408

CXXC motif and ITAMs necessary for transmitting phosphorylation signals into the T cell,

409

the negatively charged residue in the transmembrane region necessary for non-covalent

410

binding with the TCR chain is only present in CD3γδ-A. Additionally there are two potential

411

N-linked glycosylation sites in the extracellular domain of CD3γδ-A that are absent in

412

CD3γδ-B. A previous study in chicken showed that glycosylation of CD3γδ is necessary for

413

binding to CD3ε (Göbel et al., 2000). An absence of glycosylation on CD3γδ has been

414

previously reported in teleosts, where the glycosylation site was present in CD3ε instead

415

(Guo and Nie, 2013) however this is not the case in small-spotted catshark (figure 5c).

416

Expression analysis of small-spotted catshark CD3γδ genes indicate that both have high

AC C

EP

TE D

M AN U

SC

RI PT

387

12

ACCEPTED MANUSCRIPT

expression in spleen compared to other tissues, but CD3γδ-A expression is six times higher

418

than that of CD3γδ-B across all tissues. The differences in sequence and expression between

419

the two paralogues suggest divergence of function following CD3γδ duplication. In the

420

future, it will be important to define the precise role of the two CD3γδ variants in the shark

421

TCR complex.

422

Another novel finding of our study was the presence of multiple TCRβ variants in

423

cartilaginous fish. Although the presence of multiple TCRβ C chains in elasmobranchs was

424

suggested previously by Rast and colleagues (1997), firm evidence was lacking until the

425

current study. The two TCRβ constant chains found in small-spotted catshark have a similar

426

structure and tissue expression profile. Multiple TCRβ genes have previously been

427

documented in vertebrate species ranging from mammals (2 TCRβ C domains) to axolotl (4

428

TCBβ Cs) and zebrafish (2 TCRβ Cs) (Fellah et al., 2001, 1993; Meeker et al., 2010;

429

Toyonaga et al., 1985). In both mammals and zebrafish, the TCRβ C domains are organised

430

in a translocon configuration and although they have independent D and J segments, they

431

share the same V segments (Meeker et al., 2010; Toyonaga et al., 1985). Preliminary analysis

432

indicates the two TCRβ C domains in catshark utilise different V and J segments, suggesting

433

the genes for the TCRβ chain may be present in the ‘atypical’ cluster organisation (where the

434

V, D and J segments are associated with a single set of C regions (figure 4b - Hinds and

435

Litman, 1986)) rather than translocon configuration suggested for the other shark TCR chains

436

(Chen et al., 2009). Unfortunately the scaffolds containing the TCRβ C regions in the

437

elephant shark genome were not long enough to confirm our hypothesis and our attempts to

438

amplify the catshark TCR genes in their germline configuration have so far been

439

unsuccessful.

440

The presence of multiple TCRβ C regions in diverse vertebrate lineages could be due to a

441

duplication in the ancestor of all jawed vertebrates followed by lineage-specific conversion

442

events or, alternatively, recurrent lineage-specific duplications. If the latter is true this would

443

suggest some selective pressure towards multiples copies of TCRβ C regions, perhaps as

444

another means to increase TCR diversity; why this occurs for TCRβ and not the other TCRs

445

is, as yet, unexplained. While this study revealed new information about the evolutionary

446

history of the TCR receptor complex, it also raises exciting new questions for future

447

exploration.

AC C

EP

TE D

M AN U

SC

RI PT

417

448 449

Acknowledgements

450

This work was supported by a PhD studentship awarded by the University of Aberdeen,

451

College of Life Sciences and Medicine to RP, and a Royal Society research grant 13

ACCEPTED MANUSCRIPT

452

[RG130789] awarded to HD. The authors thank Kathryn Crouch for providing the nurse

453

shark transcriptome assembly, Milena Monte, Kimberley Mackenzie, Heather Ritchie and

454

Abdullah Alzaid for technical advice, and Marianna Chimienti for help with manuscript

455

editing.

456

References

458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410. doi:10.1016/S0022-2836(05)80360-2 Andersen, C.L., Jensen, J.L., Ørntoft, T.F., 2004. Normalization of real-time quantitative reverse transcription-PCR data: a model-based variance estimation approach to identify genes suited for normalization, applied to bladder and colon cancer data sets. Cancer Res. 64, 5245–50. doi:10.1158/0008-5472.CAN-04-0496 Arnaud, J., Huchenq, A., Vernhes, M.C., Caspar-Bauguil, S., Lenfant, F., Sancho, J., Terhorst, C., Rubin, B., 1997. The interchain disulfide bond between TCR alpha beta heterodimers on human T cells is not required for TCR-CD3 membrane expression and signal transduction. Int. Immunol. 9, 615–626. doi:10.1093/intimm/9.4.615 Bäckström, B.T., Milia, E., Peter, A., Jaureguiberry, B., Baldari, C.T., Palmer, E., 1996. A motif within the T cell receptor alpha chain constant region connecting peptide domain controls antigen responsiveness. Immunity 5, 437–47. doi:10.1016/S10747613(00)80500-2 Barda-Saad, M., Braiman, A., Titerence, R., Bunnell, S.C., Barr, V.A., Samelson, L.E., 2005. Dynamic molecular interactions linking the T cell antigen receptor to the actin cytoskeleton. Nat. Immunol. 6, 80–89. doi:10.1038/ni1143 Barnett, P., Bottger, G., Klein, A.T., Tabak, H.F., Distel, B., 2000. The peroxisomal membrane protein Pex13p shows a novel mode of SH3 interaction. EMBO J. 19, 6382– 91. doi:10.1093/emboj/19.23.6382 Bernot, A., Auffray, C., 1991. Primary structure and ontogeny of an avian CD3 transcript. Proc. Natl. Acad. Sci. 88, 2550–2554. doi:10.1073/pnas.88.6.2550 Berry, R., Headey, S.J., Call, M.J., McCluskey, J., Tregaskes, C.A., Kaufman, J., Koh, R., Scanlon, M.J., Call, M.E., Rossjohn, J., 2014. Structure of the chicken CD3εδ/γ heterodimer and its assembly with the αβT cell receptor. J. Biol. Chem. 289, 8240–51. doi:10.1074/jbc.M113.544965 Bustin, S.A., Benes, V., Garson, J.A., Hellemans, J., Huggett, J., Kubista, M., Mueller, R., Nolan, T., Pfaffl, M.W., Shipley, G.L., Vandesompele, J., Wittwer, C.T., 2009. The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin. Chem. 55, 611–22. doi:10.1373/clinchem.2008.112797 Call, M.E., Pyrdol, J., Wiedmann, M., Wucherpfennig, K.W., 2002. The organizing principle in the formation of the T cell receptor-CD3 complex. Cell 111, 967–979. doi:10.1016/S0092-8674(02)01194-7 Chan, A.C., Desai, D.M., Weiss, A., 1994. The role of protein tyrosine kinases and protein tyrosine phosphatases in T cell antigen receptor signal transduction. Annu Rev Immunol 12, 555–592. doi:10.1146/annurev.iy.12.040194.003011 Chen, H., Bernstein, H., Ranganathan, P., Schluter, S.F., 2012. Somatic hypermutation of TCR γ V genes in the sandbar shark. Dev. Comp. Immunol. 37, 176–83. doi:10.1016/j.dci.2011.08.018 Chen, H., Kshirsagar, S., Jensen, I., Lau, K., Covarrubias, R., Schluter, S.F., Marchalonis, J.J., 2009. Characterization of arrangement and expression of the T cell receptor gamma locus in the sandbar shark. Proc. Natl. Acad. Sci. 106, 8591–8596. doi:10.1073/pnas.0811283106

AC C

EP

TE D

M AN U

SC

RI PT

457

14

ACCEPTED MANUSCRIPT

EP

TE D

M AN U

SC

RI PT

Chen, H., Kshirsagar, S., Jensen, I., Lau, K., Simonson, C., Schluter, S.F., 2010. Characterization of arrangement and expression of the beta-2 microglobulin locus in the sandbar and nurse shark. Dev. Comp. Immunol. 34, 189–95. doi:10.1016/j.dci.2009.09.008 Clevers, H., Alarcon, B., Wileman, T., Terhorst, C., 1988. The T cell receptor/CD3 complex: a dynamic protein ensemble. Annu. Rev. Immunol. 6, 629–62. doi:10.1146/annurev.iy.06.040188.003213 Criscitiello, M.F., Saltis, M., Flajnik, M.F., 2006. An evolutionarily mobile antigen receptor variable region gene: doubly rearranging NAR-TcR genes in sharks. Proc. Natl. Acad. Sci. U. S. A. 103, 5036–41. doi:10.1073/pnas.0507074103 Criscitiello, M.F.M., Ohta, Y., Saltis, M., McKinney, E.C., Flajnik, M.F., 2010. Evolutionarily conserved TCR binding sites, identification of T cells in primary lymphoid tissues, and surprising trans-rearrangements in nurse shark. J. Immunol. 184, 6950–60. doi:10.4049/jimmunol.0902774 Crouch, K., Smith, L.E., Williams, R., Cao, W., Lee, M., Jensen, A., Dooley, H., 2013. Humoral immune response of the small-spotted catshark, Scyliorhinus canicula. Fish Shellfish Immunol. 34, 1158–69. doi:10.1016/j.fsi.2013.01.025 Dietrich, J., Neisig, A., Hou, X., Wegener, A.M., Gajhede, M., Geisler, C., 1996. Role of CD3 gamma in T cell receptor assembly. J. Cell Biol. 132, 299–310. doi:10.1083/jcb.132.3.299 Drummond, A.J., Ho, S.Y.W., Phillips, M.J., Rambaut, A., 2006. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4, e88. doi:10.1371/journal.pbio.0040088 Drummond, A.J., Suchard, M.A., Xie, D., Rambaut, A., 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–73. doi:10.1093/molbev/mss075 Dzialo, R.C., Cooper, M.D., 1997. An amphibian CD3 homologue of the mammalian CD3 γ and δ genes. Eur. J. Immunol. 27, 1640–1647. doi:10.1002/eji.1830270708 Fellah, J.S., Durand, C., Kerfourn, F., Charlemagne, J., 2001. Complexity of the T cell receptor Cbeta isotypes in the Mexican axolotl: structure and diversity of the VDJCbeta3 and VDJCbeta4 chains. Eur. J. Immunol. 31, 403–11. doi:10.1002/15214141(200102)31:2<403::AID-IMMU403>3.0.CO;2-4 Fellah, J.S., Kerfourn, F., Guillet, F., Charlemagne, J., 1993. Conserved structure of amphibian T-cell antigen receptor beta chain. Proc. Natl. Acad. Sci. U. S. A. 90, 6811–4. Gil, D., Schamel, W.W.A., Montoya, M., Sánchez-Madrid, F., Alarcón, B., 2002. Recruitment of Nck by CD3ϵ reveals a ligand-induced conformational change essential for T cell receptor signaling and synapse formation. Cell 109, 901–912. doi:10.1016/S0092-8674(02)00799-7 Göbel, T.W., Dangy, J.-P.P., Gobel, T.W.F., Dangy, J.-P.P., 2000. Evidence for a stepwise evolution of the CD3 family. J. Immunol. 164, 879–883. doi:10.4049/jimmunol.164.2.879 Gold, D.P., Clevers, H., Alarcon, B., Dunlap, S., Novotny, J., Williams, A.F., Terhorst, C., 1987. Evolutionary relationship between the T3 chains of the T-cell receptor complex and the immunoglobulin supergene family. Proc. Natl. Acad. Sci. U. S. A. 84, 7649–53. Greenberg, a S., Hughes, a L., Guo, J., Avila, D., McKinney, E.C., Flajnik, M.F., 1996. A novel “chimeric” antibody class in cartilaginous fish: IgM may not be the primordial immunoglobulin. Eur. J. Immunol. 26, 1123–9. doi:10.1002/eji.1830260525 Guo, Z., Nie, P., 2013. Sequencing and expression analysis of CD3γ/δ and CD3ɛ chains in mandarin fish, Siniperca chuatsi. Chinese J. Oceanol. Limnol. 31, 106–117. doi:10.1007/s00343-013-2018-1 Guy, C.S., Vignali, D.A.A., 2009. Organization of proximal signal initiation at the TCR:CD3 complex. Immunol. Rev. 232, 7–21. doi:10.1111/j.1600-065X.2009.00843.x Hall, T., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41, 95–98.

AC C

501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552

15

ACCEPTED MANUSCRIPT

EP

TE D

M AN U

SC

RI PT

Hinds, K.R., Litman, G.W., 1986. Major reorganization of immunoglobulin VH segmental elements during vertebrate evolution. Nature 320, 546–9. doi:10.1038/320546a0 Hope, R.M., 2013. Rmisc: Ryan Miscellaneous. Hořejší, V., Zhang, W., Schraven, B., 2004. Transmembrane adaptor proteins: organizers of immunoreceptor signalling. Nat. Rev. Immunol. 4, 603–616. doi:10.1038/nri1414 Käll, L., Krogh, A., Sonnhammer, E.L.L., 2004. A combined transmembrane topology and signal peptide prediction method. J. Mol. Biol. 338, 1027–36. doi:10.1016/j.jmb.2004.03.016 Katoh, K., Standley, D.M., 2013. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol. Biol. Evol. 30, 772–780. doi:10.1093/molbev/mst010 Kesti, T., Ruppelt, A., Wang, J.-H., Liss, M., Wagner, R., Tasken, K., Saksela, K., 2007. Reciprocal regulation of SH3 and SH2 domain binding via tyrosine phosphorylation of a common site in CD3. J. Immunol. 179, 878–885. doi:10.4049/jimmunol.179.2.878 Kibbe, W.A., 2007. OligoCalc: an online oligonucleotide properties calculator. Nucleic Acids Res. 35, W43-6. doi:10.1093/nar/gkm234 Lin, J., Weiss, A., 2001. T cell receptor signalling. J. Cell Sci. 114, 243–4. Litman, G.W., Rast, J.P., Shamblott, M.J., Haire, R.N., Hulst, M., Roess, W., Litman, R.T., Hinds-Frey, K.R., Zilch, A., Amemiya, C.T., 1993. Phylogenetic diversification of immunoglobulin genes and the antibody repertoire. Mol. Biol. Evol. 10, 60–72. Liu, Y., Moore, L., Koppang, E.O., Hordvik, I., 2008. Characterization of the CD3zeta, CD3gammadelta and CD3epsilon subunits of the T cell receptor complex in Atlantic salmon. Dev. Comp. Immunol. 32, 26–35. doi:10.1016/j.dci.2007.03.015 Loytynoja, A., Goldman, N., 2008. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science (80-. ). 320, 1632–1635. doi:10.1126/science.1158395 Löytynoja, A., Goldman, N., 2010. webPRANK: a phylogeny-aware multiple sequence aligner with interactive alignment browser. BMC Bioinformatics 11, 579. doi:10.1186/1471-2105-11-579 Macqueen, D.J., Wilcox, A.H., 2014. Characterization of the definitive classical calpain family of vertebrates using phylogenetic, evolutionary and expression analyses. Open Biol. 4, 130219–130219. doi:10.1098/rsob.130219 Mallaun, M., Naeher, D., Daniels, M.A., Yachi, P.P., Hausmann, B., Luescher, I.F., Gascoigne, N.R.J., Palmer, E., 2008. The T cell receptor’s alpha-chain connecting peptide motif promotes close approximation of the CD8 coreceptor allowing efficient signal initiation. J. Immunol. 180, 8211–21. Marchler-Bauer, A., Derbyshire, M.K., Gonzales, N.R., Lu, S., Chitsaz, F., Geer, L.Y., Geer, R.C., He, J., Gwadz, M., Hurwitz, D.I., Lanczycki, C.J., Lu, F., Marchler, G.H., Song, J.S., Thanki, N., Wang, Z., Yamashita, R.A., Zhang, D., Zheng, C., Bryant, S.H., 2014. CDD: NCBI’s conserved domain database. Nucleic Acids Res. 43, D222-6. doi:10.1093/nar/gku1221 Meeker, N.D., Smith, A.C.H., Frazer, J.K., Bradley, D.F., Rudner, L.A., Love, C., Trede, N.S., 2010. Characterization of the zebrafish T cell receptor beta locus. Immunogenetics 62, 23–9. doi:10.1007/s00251-009-0407-6 Möller, S., Croning, M.D., Apweiler, R., 2001. Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics 17, 646–53. Mulley, J.F., Hargreaves, A.D., Hegarty, M.J., Heller, R.S., Swain, M.T., 2014. Transcriptomic analysis of the lesser spotted catshark (Scyliorhinus canicula) pancreas, liver and brain reveals molecular level conservation of vertebrate pancreas function. BMC Genomics 15, 1074. doi:10.1186/1471-2164-15-1074 Nguyen, L.-T., Schmidt, H.A., von Haeseler, A., Minh, B.Q., 2014. IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Mol.

AC C

553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604

16

ACCEPTED MANUSCRIPT

EP

TE D

M AN U

SC

RI PT

Biol. Evol. 32, 268–274. doi:10.1093/molbev/msu300 Øvergård, A.-C., Hordvik, I., Nerland, A.H., Eikeland, G., Patel, S., 2009. Cloning and expression analysis of Atlantic halibut (Hippoglossus hippoglossus) CD3 genes. Fish Shellfish Immunol. 27, 707–713. doi:10.1016/j.fsi.2009.08.011 Paensuwan, P., Hartl, F.A., Yousefi, O.S., Ngoenkam, J., Wipa, P., Beck-Garcia, E., Dopfer, E.P., Khamsri, B., Sanguansermsri, D., Minguet, S., Schamel, W.W., Pongcharoen, S., 2015. Nck binds to the T cell antigen receptor using its SH3.1 and SH2 domains in a cooperative manner, promoting TCR functioning. J. Immunol. 196, 1500958. doi:10.4049/jimmunol.1500958 Parra, Z.E., Baker, M.L., Schwarz, R.S., Deakin, J.E., Lindblad-Toh, K., Miller, R.D., 2007. A unique T cell receptor discovered in marsupials. Proc. Natl. Acad. Sci. U. S. A. 104, 9776–81. doi:10.1073/pnas.0609106104 Pohlert, T., 2014. The Pairwise Multiple Comparison of Mean Ranks Package (PMCMR). R package. R Core Team, 2014. A language and environment for statistical computing. Rambaut, A., Suchard, M.A., Drummond, A.J., 2014. Tracer v1.6. Rast, J., Anderson, M., Litman, R., Margittai, M., Litman, G., Ota, T., Shamblott, M., 1994. Immunoglobulin light chain class multiplicity and alternative organizational forms in early vertebrate phylogeny. Immunogenetics 40. doi:10.1007/BF00188170 Rast, J.P., Anderson, M.K., Strong, S.J., Luer, C., Litman, R.T., Litman, G.W., 1997. alpha, beta, gamma, and delta T Cell antigen receptor genes arise early in vertebrate phylogeny. Immunity 6, 1–11. Rast, J.P., Litman, G.W., 1994. T-cell receptor gene homologs are present in the most primitive jawed vertebrates. Proc. Natl. Acad. Sci. U. S. A. 91, 9248–52. Redmond, A.K., Pettinello, R., Dooley, H., 2017. Outgroup, alignment, and modelling improvements indicate that two TNFSF13-like genes existed in the vertebrate ancestor. Immunogenetics in press. doi:10.1007/s00251-016-0967-1 Richards, M.H., Nelson, J.L., 2000. The evolution of vertebrate antigen receptors: a phylogenetic approach. Mol. Biol. Evol. 17, 146–155. doi:10.1093/oxfordjournals.molbev.a026227 Ruijter, J.M., Ramakers, C., Hoogaars, W.M.H., Karlen, Y., Bakker, O., van den Hoff, M.J.B., Moorman, A.F.M., 2009. Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Res. 37, e45. doi:10.1093/nar/gkp045 Rumfelt, L.L., McKinney, E.C., Taylor, E., Flajnik, M.F., 2002. The development of primary and secondary lymphoid tissues in the nurse shark Ginglymostoma cirratum: B-cell zones precede dendritic cell immigration and T-cell zone formation during ontogeny of the spleen. Scand. J. Immunol. 56, 130–48. Saksela, K., Permi, P., 2012. SH3 domain ligand binding: What’s the consensus and where’s the specificity? FEBS Lett. 586, 2609–2614. doi:10.1016/j.febslet.2012.04.042 Salim, A., Bano, A., Zaidi, Z.H., 2003. Prediction of possible sites for posttranslational modifications in human gamma crystallins: effect of glycation on the structure of human gamma-B-crystallin as analyzed by molecular modeling. Proteins 53, 162–73. doi:10.1002/prot.10493 Santiveri, C.M., Borroto, A., Simón, L., Rico, M., Alarcón, B., Jiménez, M.A., 2009. Interaction between the N-terminal SH3 domain of Nck-alpha and CD3-epsilon-derived peptides: non-canonical and canonical recognition motifs. Biochim. Biophys. Acta 1794, 110–7. doi:10.1016/j.bbapap.2008.09.016 Takeuchi, K., Yang, H., Ng, E., Park, S., Sun, Z.-Y.J., Reinherz, E.L., Wagner, G., 2008. Structural and functional evidence that Nck interaction with CD3ε regulates T-cell receptor activity. J. Mol. Biol. 380, 704–716. doi:10.1016/j.jmb.2008.05.037 Theodorsson-Norheim, E., 1986. Kruskal-Wallis test: BASIC computer program to perform

AC C

605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656

17

693 694

RI PT

SC

M AN U

TE D

EP

692

ACCEPTED MANUSCRIPT

nonparametric one-way analysis of variance and multiple comparisons on ranks of several independent samples. Comput. Methods Programs Biomed. 23, 57–62. doi:10.1016/0169-2607(86)90081-7 Toyonaga, B., Yoshikai, Y., Vadasz, V., Chin, B., Mak, T.W., 1985. Organization and sequences of the diversity, joining, and constant region genes of the human T-cell receptor beta chain. Proc. Natl. Acad. Sci. 82, 8624–8628. doi:10.1073/pnas.82.24.8624 Venkatesh, B., Lee, A.P., Ravi, V., Maurya, A.K., Lian, M.M., Swann, J.B., Ohta, Y., Flajnik, M.F., Sutoh, Y., Kasahara, M., Hoon, S., Gangu, V., Roy, S.W., Irimia, M., Korzh, V., Kondrychyn, I., Lim, Z.W., Tay, B.-H., Tohari, S., Kong, K.W., Ho, S., Lorente-Galdos, B., Quilez, J., Marques-Bonet, T., Raney, B.J., Ingham, P.W., Tay, A., Hillier, L.W., Minx, P., Boehm, T., Wilson, R.K., Brenner, S., Warren, W.C., 2014. Elephant shark genome provides unique insights into gnathostome evolution. Nature 505, 174–179. doi:10.1038/nature12826 Wang, Q., Arighi, C.N., King, B.L., Polson, S.W., Vincent, J., Chen, C., Huang, H., Kingham, B.F., Page, S.T., Rendino, M.F., Thomas, W.K., Udwary, D.W., Wu, C.H., 2012. Community annotation and bioinformatics workforce development in concert-Little Skate Genome Annotation Workshops and Jamborees. Database (Oxford). 2012, bar064. doi:10.1093/database/bar064 Weissman, A., Baniyash, M., Hou, D., Samelson, L., Burgess, W., Klausner, R., 1988. Molecular cloning of the zeta chain of the T cell antigen receptor. Science (80-. ). 239, 1018–1021. doi:10.1126/science.3278377 Wickham, H., 2009. ggplot2: Elegant Graphics for Data Analysis. Springer New York, New York, NY. doi:10.1007/978-0-387-98141-3 Wunderlich, L., Faragó, A., Downward, J., Buday, L., 1999. Association of Nck with tyrosine-phosphorylated SLP-76 in activated T lymphocytes. Eur. J. Immunol. 29, 1068–1075. doi:10.1002/(SICI)1521-4141(199904)29:04<1068::AIDIMMU1068>3.0.CO;2-P Wyffels, J., King, B.L., Vincent, J., Chen, C., Wu, C.H., Polson, S.W., L. King, B., Vincent, J., Chen, C., Wu, C.H., Polson, S.W., 2014. SkateBase, an elasmobranch genome project and collection of molecular resources for chondrichthyan fishes. F1000Research 3, 191. doi:10.12688/f1000research.4996.1 Zapata, A., Amemiya, C.T., 2000. Phylogeny of lower vertebrates and their immunological structures. Curr. Top. Microbiol. Immunol. 248, 67–107.

AC C

657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691

18

M AN U

SC

RI PT

ACCEPTED MANUSCRIPT

AC C

EP

695

TE D

Figure 1. Bayesian phylogenetic analysis of the TCR constant domain from a range of jawed vertebrates including small-spotted catshark (generated under the WAG+I+G4 model). Posterior probability values are included for each node. The alignment was performed using PRANK. Accession numbers for sequences used are presented in supplementary table 3.

Figure 2. Bayesian phylogenetic analysis showing the relationship of CD3ε, CD3γ/δ, and CD3ζ of small-spotted catshark. Other details are as provided in figure 1 legend. Accession numbers for the sequences used are presented in supplementary table 4. 19

AC C

EP

TE D

M AN U

SC

RI PT

ACCEPTED MANUSCRIPT

Figure 3. MSA of small-spotted catshark (a) TCRα, (b) TCRβ, (c) TCRγ, and (d) TCRδ constant domain with other vertebrate TCRs molecules. Conserved cysteines are in bold and shaded dark grey, predicted N-linked glycosylation sites are in bold, and other conserved residues are shaded light grey. Positively charge amino acids involved in TCR/CD3 binding are shaded in black. Sequences length and percentage of identity with the first sequence are shown at the end of each sequence. 20

AC C

EP

TE D

M AN U

SC

RI PT

ACCEPTED MANUSCRIPT

Figure 4. (a) Schematic representation of possible TCRβ gene organisation. The first scheme represents the mammalian single translocon organisation, where both constant regions with their diversity and joining domain share the same variable regions (Toyonaga et al., 1985). The second scheme represents a multi-cluster organisation, where each TCRβ constant region and their VDJ are present on separate clusters (Rast et al., 1997). (b) MSA of small-spotted catshark TCRβ variables domains performed using MAFFT. TCRβ-A sequences (accession number KY434216-KY434220) are represented with no background and TCRβ-B (accession number KY434221-KY434224) are shaded dark grey. Conserved amino acids are shaded light grey, predicted N-linked glycosylation sites are in bold. Predicted β strands are indicated below each alignment. Main differences between the TCRβ-A and –B are underlined. Sequences length and percentage of similarities with the first sequence are shown at the end of each sequence.

21

AC C

EP

TE D

M AN U

SC

RI PT

ACCEPTED MANUSCRIPT

Figure 5. MSA of small-spotted catshark (a) CD3ε, (b) CD3ζ, and (c) CD3γδ with other vertebrate CD3 proteins. Other details are as provided in figure 1 legend. Important residues that are different in sharks are in white highlighted in dark grey (in yellow for online version). Sequence length, percent amino acid identity compared to small-spotted catshark CD3γδ-A and to small-spotted catshark CD3γδ-A are shown at the end each sequence respectively. 22

696

M AN U

SC

RI PT

ACCEPTED MANUSCRIPT

Figure 6. Constitutive tissue expression levels of catshark TCR/CD3 complex genes in vivo. Expression levels are expressed as arbitrary units normalised against the expression of RPL13 and RPS13 and are provided on a scale that is quantifiably comparable across genes. The results are presented as the mean + SD for four animals with the exception of Leydig organ where only three individuals were used.

EP AC C

698

TE D

697

23

ACCEPTED MANUSCRIPT Highlights •

We describe the first cloning of all CD3 subunits from a cartilaginous fish



The core TCR complex has been highly conserved during vertebrate evolution



Cartilaginous fish have multiple copies of CD3γδ and the TCRβ constant region The TCRβ constant region is copy number variable across jawed vertebrate

RI PT



phylogeny •

Unexpectedly, the proline rich region is absent from small-spotted catshark

AC C

EP

TE D

M AN U

SC

CD3ε