The genomic and transcriptomic analyses of serine proteases and their homologs in an endoparasitoid, Pteromalus puparum


Accepted Manuscript

Lei Yang, Zhe Lin, Qi Fang, Jiale Wang, Zhichao Yan, Zhen Zou, Qisheng Song, Gongyin Ye

PII: S0145-305X(17)30164-7

DOI: 10.1016/j.dci.2017.07.014

Reference: DCI 2942

To appear in: Developmental and Comparative Immunology

Received Date: 25 April 2017

Revised Date: 12 July 2017

Accepted Date: 12 July 2017

Please cite this article as: Yang, L., Lin, Z., Fang, Q., Wang, J., Yan, Z., Zou, Z., Song, Q., Ye, G., The genomic and transcriptomic analyses of serine proteases and their homologs in an endoparasitoid, Pteromalus puparum, Developmental and Comparative Immunology (2017), doi: 10.1016/j.dci.2017.07.014. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Lei Yang1, Zhe Lin2, Qi Fang1, Jiale Wang1, Zhichao Yan1, Zhen Zou2, Qisheng Song3, Gongyin Ye1*

1 State Key Laboratory of Rice Biology & Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China

2 State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China

3 Division of Plant Sciences, College of Agriculture, Food and Natural Resources, University of Missouri, Columbia, Missouri, USA

*Corresponding author: GY Ye, e-mail: [email protected].


Abstract

In insects, serine proteases (SPs) and serine protease homologs (SPHs) constitute a large family of proteins involved in multiple physiological processes such as digestion, development, and immunity. Here we identified 145 SPs and 38 SPHs in the genome of an endoparasitoid, Pteromalus puparum. Gene duplication and tandem repeats were observed in this large SP/SPH family. We then analyzed the expression profiles of SP/SPH genes in response to different microbial infections (the Gram-positive bacterium Micrococcus luteus, the Gram-negative bacterium Escherichia coli, and the entomopathogenic fungus Beauveria bassiana), as well as in different developmental stages and tissues. Some SPs/SPHs also displayed distinct expression patterns in the venom gland, suggesting specific physiological functions as venom proteins. Our findings lay the groundwork for further research on SPs and SPHs expressed in the venom glands.

Key words

Pteromalus puparum; Serine protease; Serine protease homolog; Expression profile; Venom gland

Highlights

• One hundred and eighty-three serine proteases (SPs) and serine protease homologs (SPHs) were identified in the genome of an endoparasitoid, Pteromalus puparum.

• Genome-scale analysis revealed the gene structures and scaffold distributions of the SPs and SPHs, and the abundance of SPs/SPHs in different developmental stages and tissues was also obtained.

• Tandem repeats of SPs and SPHs with similar genomic structures indicated that gene duplication events happened during evolution.

• Six SPs and two SPHs were predicted as venom proteins.

1. Introduction

S1 serine protease (SP)¹ family proteins participate in various physiological processes, such as digestion, development, and immune processes (Cerenius et al., 2008; Loof et al., 2011; Lu et al., 2014; Rawlings et al., 2004). SPs usually contain signal peptides and are produced as zymogens, which are converted to the active enzymes by proteolytic cleavage at a specific site. Trypsin and chymotrypsin are major types of digestive proteases and are usually highly expressed in the midgut immediately after feeding (Brackney et al., 2010; Soares et al., 2011). As a typical member of the SP family, bovine chymotrypsin features a catalytic triad consisting of the Ser195, His57, and Asp102 residues in its catalytic center (Perona and Craik, 1995). A substrate-binding cleft near the active site is the predominant factor determining its substrate specificity. An RNA interference experiment in a fruit fly, Bactrocera dorsalis, indicated that silencing a single trypsin induced the expression of other trypsins, thereby counteracting the effect of the knockdown on digestion (Li et al., 2017). In immune activities, several SP zymogens can be activated sequentially to form a cascade pathway that finally acts on effectors (Ross et al., 2003). The roles of SP cascades have been extensively studied in invertebrates (He et al., 2017; Rao et al., 2010; Wang et al., 2014; Zou et al., 2010). The proteolytic activation of prophenoloxidase (PPO) and the Spätzle-induced synthesis of antimicrobial peptides (AMPs) are mediated by these enzyme cascades (Kanost and Jiang, 2015; Kanost et al., 2004; Kellenberger et al., 2011; Povelones et al., 2013). In Manduca sexta, the hemolymph proteases HP14, HP21 and PAP 1/2/3 are responsible for activating the PPO pathway, and HP8 stimulates AMP synthesis (An et al., 2009, 2013; Wang and Jiang, 2008). In Drosophila melanogaster, genetic analyses reveal that a series of genes in the SP pathways mediate Toll-Dorsal patterning of the embryo for ventralization and activate AMP production (Belvin and Anderson, 1996; Kambris et al., 2006). Studies also show that the Drosophila serine proteases MP1, MP2 and Hayan are involved in melanization (Veillard et al., 2016), while Hayan also participates in a c-Jun N-terminal kinase (JNK)-dependent cytoprotective program and is involved in the systemic wound response (Nam et al., 2012).

¹ Abbreviations used are: SPs, serine proteases; SPHs, serine protease homologs; PPO, prophenoloxidase; AMPs, antimicrobial peptides; HP, hemolymph proteinase; JNK, c-Jun N-terminal kinase; PDVs, polydnaviruses; VLPs, virus-like particles; qPCR, quantitative real-time PCR; ANOVA, analysis of variance; VG, venom gland; DEPC, diethyl pyrocarbonate; cDNA, complementary DNA; cSPs, clip-domain serine proteases; cSPHs, clip-domain serine protease homologs; p. i., post infection; Pp, Pteromalus puparum; Am, Apis mellifera; Dm, Drosophila melanogaster; Ms, Manduca sexta; CCP, complement control protein; LDLA, low-density lipoprotein receptor class A; ANK, Ankyrin repeat region; FZ, Frizzled; CHIT, Chitin-binding; SR, scavenger receptor Cys-rich; CTL, C-type lectin; MSP, modular serine protease; GD, gastrulation defective.

In addition to SPs, many serine protease homolog genes (SPHs) have been identified. SPHs lack amidase activity because the catalytic residues are mutated or absent. These SPHs promote the activation of PPOs (Felfoldi et al., 2011; Gupta et al., 2005; Wang and Jiang, 2017; Wang et al., 2014; Yu and Kanost, 2003). SPHs are also involved in somatic muscle attachment in Drosophila embryos, cell adhesion in the crayfish Pacifastacus leniusculus, and regulation of TEP1 recruitment to microbial surfaces in Anopheles gambiae (Huang et al., 2000; Povelones et al., 2013; Zhang et al., 2013).

Parasitic wasps, which are invaluable in classical and augmentative biological control of various insect pests, are an abundant and diverse hymenopteran group. They lay their eggs inside the body (endoparasitic wasps) or on the external surface (ectoparasitic wasps) of their hosts (Gehrer and Vorburger, 2012; Pennacchio and Strand, 2006). To avoid host immune defenses and allow their offspring to develop inside or outside the host hemocoel for successful parasitism, parasitoids produce several parasitic factors, including venom, polydnaviruses (PDVs), virus-like particles (VLPs), ovarian fluids and teratocytes, which are used alone or in combination, depending largely upon their life strategy (Asgari and Rivers, 2011; de Graaf et al., 2010; Pennacchio and Strand, 2006; Poirie et al., 2014; Teng et al., 2016). To date, studies have mainly focused on the parasitic factors produced by the wasps and the immune responses of their hosts (Gueguen et al., 2013; Thoetkiattikul et al., 2005; Wang et al., 2013; Zhu et al., 2009). Insight into the venom composition of parasitoids reveals that SPs and SPHs are major components of venom and may participate in immune activities (Colinet et al., 2014; de Graaf et al., 2010; Vincent et al., 2010; Yan et al., 2016; Zhu, 2016, 2010). However, the reported physiological functions of SPs and SPHs in the venom glands of parasitoids are limited to a few species. In Cotesia rubecula, a serine protease homolog named Vn50 was isolated from the venom gland and shown to inhibit the host humoral immune response by interfering with the PO cascade (Asgari et al., 2003; Zhang et al., 2004). In Nasonia vitripennis, serine proteases may exert a cytotoxic function in the cell death of a Spodoptera frugiperda cell line (Formesyn et al., 2013).

Recently, genome-wide analyses have helped to identify SP and SPH genes in several insect species, including D. melanogaster (Ross et al., 2003), Apis mellifera (Zou et al., 2006), Bombyx mori (Zhao et al., 2010), Nilaparvata lugens (Bao et al., 2013), M. sexta (Cao et al., 2015) and Plutella xylostella (Lin et al., 2015). SP pathways, which have been revealed by genetic and biochemical analyses in M. sexta, Helicoverpa armigera, A. gambiae, D. melanogaster, Aedes aegypti and other insects, play varied roles in food digestion, embryo development and immune responses (An et al., 2013; Jiang et al., 2003; Kuwar et al., 2015; Paskewitz et al., 2006; Zou et al., 2010). However, our current knowledge about SP and SPH pathways in parasitic wasps is still limited. Therefore, there is a crucial need for extensive analyses that combine techniques such as genomic and transcriptomic analyses to obtain a more comprehensive view of the composition and functional diversity of SPs and SPHs in hymenopteran parasitoids.

Pteromalus puparum is a pupal endoparasitoid wasp of certain Papilionidae and Pieridae species, and is the dominant parasitic species regulating field populations of the small white butterfly, Pieris rapae. This butterfly is an important agricultural pest of plants in the crucifer and caper families, such as cabbage, cauliflower, broccoli and collard, and is distributed worldwide (Cai et al., 2004). P. puparum wasps complete all life stages except the adult stage within the hemocoel of their host, feed on the host, and defend against host immune responses using venom as the key parasitic factor. For this parasitoid, the composition and function of the venom, and its immune interrelationship with its host, have been well investigated by our group (Fang et al., 2016, 2011a, 2011b; Wang et al., 2015, 2013; Yan et al., 2017, 2015; Zhu et al., 2015). To better uncover the interrelationship between P. puparum and its host P. rapae, the whole genome of P. puparum has recently been sequenced by our group, and the available gene annotation information enables us to identify SPs and SPHs in this parasitoid. In the current study, we characterized the SP and SPH genes and present their expression patterns in different developmental stages and tissues based on transcriptomic and qPCR (quantitative real-time PCR) analyses. These data provide a foundation for discovering the potential functions of these genes, which are crucial for the wasps to evade the host immune system and defend against microbial infections.

2. Materials and Methods

2.1 Insect rearing

The laboratory P. rapae and P. puparum cultures were reared at 25 ± 1 °C with a photoperiod of 10 h: 14 h (light: darkness) as described previously (Fang et al., 2010; Zhang et al., 2005) and used in all experiments. Briefly, P. rapae larvae were fed on cabbage leaves until they pupated, and the pupae were then exposed to 2-day-old mated female wasps of P. puparum. Parasitized P. rapae pupae were maintained under the same environmental conditions described above. Once emerged, the new wasp adults were collected, held in plastic finger-type vials and fed with 20% (v/v) honey solution.

2.2 Identification of SP/SPH genes from P. puparum genome

SP and SPH sequences of D. melanogaster, M. sexta, A. mellifera and other insects were downloaded from NCBI GenBank (https://www.ncbi.nlm.nih.gov/genome). These sequences were used as queries to search the P. puparum genome for matches using the BLAST program (E-value 1e-5). Identified genes were validated manually by searching the non-redundant nucleotide database. The predicted sequences were categorized as SPs or SPHs based on the conserved His, Asp and Ser residues of the catalytic triad (Perona and Craik, 1995). Amino acid sequences containing all three residues in the TAAHC, DIAL and GDSGGP motifs were considered SPs. Sequences lacking one or more of these residues were regarded as SPHs (Ross et al., 2003). Signal peptides and transmembrane (TM) regions were predicted with SignalP 4.1 (http://www.cbs.dtu.dk/services/SignalP/) and TMHMM Server v.2.0 (http://www.cbs.dtu.dk/services/TMHMM/). Domain analyses of the retrieved protein sequences were carried out using Pfam (http://pfam.xfam.org/), SMART (http://smart.embl.de/) and PROSITE (http://prosite.expasy.org/). Functions were predicted according to KEGG and UniProtKB/Swiss-Prot annotations.
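The SP/SPH classification rule described above can be illustrated with a minimal Python sketch. It assumes that the three motifs are matched literally and in the order His, Asp, Ser, and that the input is already a confirmed S1-family hit; real sequences often vary at the flanking positions, so an alignment-based check would be needed in practice. The function name and the toy sequences are hypothetical and are not taken from the P. puparum data.

```python
import re

# Conserved motifs around the catalytic His, Asp and Ser residues (from the text).
# Literal matching is an assumption made for illustration only.
TRIAD_MOTIFS = {
    "His": re.compile(r"TAAHC"),
    "Asp": re.compile(r"DIAL"),
    "Ser": re.compile(r"GDSGGP"),
}


def classify(protein_seq: str) -> str:
    """Return 'SP' if all three triad motifs occur in order, otherwise 'SPH'."""
    seq = protein_seq.upper()
    pos = 0
    for residue in ("His", "Asp", "Ser"):
        match = TRIAD_MOTIFS[residue].search(seq, pos)
        if match is None:
            return "SPH"
        pos = match.end()
    return "SP"


# Toy sequences (hypothetical, not from the P. puparum genome).
intact = "MKLL" + "TAAHC" + "VVNG" + "DIAL" + "KLSP" + "GDSGGP" + "LV"
mutated = intact.replace("GDSGGP", "GDGGGP")  # catalytic Ser replaced by Gly
print(classify(intact), classify(mutated))
```

Requiring the motifs in order is a design choice of this sketch; it avoids counting a chance match that lies outside the protease domain layout.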

2.3 Sequence alignment and phylogenetic analysis

P. puparum SP/SPH genes were aligned with those from D. melanogaster, M. sexta, B. mori, A. mellifera, N. vitripennis and other insect species. Multiple sequence alignments were performed with ClustalX 2.0 (Thompson et al., 1997), and the neighbor-joining method was used to construct phylogenetic trees in MEGA5.10 (Tamura et al., 2011) with 1000 bootstrap replicates.

2.4 Sample collection

2.4.1 Immunization by three microorganism species

The Gram-positive bacterium Micrococcus luteus, the Gram-negative bacterium Escherichia coli and the entomopathogenic fungus Beauveria bassiana were used for septic injury. Each bacterium or fungus was freshly collected, washed three times with sterile PBS and resuspended at a density of 5 × 10⁷ colony forming units in PBS, following the protocol of Wang et al. (2017). Two-day-old mated female wasps of P. puparum were anesthetized with carbon dioxide for 30 s. Acupuncture needles that had been pre-dipped in the microbial suspension were used to penetrate the abdominal cuticle of the wasps. Control wasps were pricked with sterile PBS only. These adult females were collected at 6, 24 and 48 h post-prick to analyze bacterium- or fungus-induced gene expression.

2.4.2 Sample collections from different tissues and developmental stages

For tissue extraction, 2-day-old mated female wasps were anesthetized in a −70 °C refrigerator for 5 min and dissected in DEPC (diethyl pyrocarbonate) water with 1 unit/µl RNase inhibitor (Toyobo, Osaka, Japan) on an ice plate under a stereoscope (Olympus, Japan). The tissues, including fat body, gut, ovary, venom gland and the remaining carcass, were isolated, washed, and then pooled separately into centrifuge tubes with Trizol reagent (Invitrogen, USA).

For the different developmental stages, P. puparum embryos were dissected from P. rapae pupae less than 12 h after parasitism. The parasitoid larvae were dissected from P. rapae pupae at 2, 4 and 6 d after parasitism, corresponding to the 1st, 2nd and 3rd instars of the wasps. Yellow wasp pupae, i.e. pupae at the early stage, were collected and divided into females and males. Two-day-old mated female and male wasps were also obtained. P. puparum at the different developmental stages were collected, washed and pooled separately into centrifuge tubes with Trizol reagent (Invitrogen, USA).

2.5 RNA extraction and qPCR

Wasps from the different developmental stages, immune-challenged samples, and tissue samples of mature female wasps were used for RNA extraction. Total RNA was extracted using Trizol reagent according to the manufacturer's protocol. The concentration of total RNA was estimated by measuring the absorbance at 260 nm. Single-stranded complementary DNAs (cDNAs) were synthesized using the PrimeScript™ One Step RT-PCR Kit (Takara, Japan). The larval single-stranded cDNA was made from equivalent mixtures of RNA samples extracted from 1st, 2nd and 3rd instar wasp larvae. All primer sequences (Table S1) were designed using Primer 3 (Untergasser et al., 2012) and synthesized commercially (Sangon, China). qPCR was performed on a BIO-RAD CFX96™ Real-Time System (Bio-Rad, USA) using the iQ™ SYBR Green Supermix Kit (Takara, Japan), according to the manufacturers' instructions. The cDNA (2 µl) was used as a template in each 25 µl reaction mixture. The cycling conditions for qPCR were as follows: enzyme activation at 95 °C for 30 s, followed by 40 cycles of denaturation at 95 °C for 5 s and annealing at 60 °C for 34 s. We tested the expression stability of several candidate reference genes in different developmental stages, tissues and immune-challenged wasps. The expression level of P. puparum actin 1 was the most stable across the different microbial infections, and the transcript level of 18S rRNA showed good stability across tissues and developmental stages. Therefore, we chose actin 1 as the internal control for the relative expression levels of genes in the infection experiments, whereas development- and tissue-specific relative expression levels were normalized to the 18S rRNA reference gene. Dissociation curves were checked at the end of the PCR reactions. The mRNA expression levels were quantified using the 2^−ΔΔCt method (Livak and Schmittgen, 2001). Error bars represent the means ± standard deviations from three biological replicates. One-way analysis of variance (ANOVA) was applied to the expression profiles across developmental stages and tissues, and two-way ANOVA was applied to the expression profiles of the immune response.
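As an illustration of the 2^−ΔΔCt calculation (Livak and Schmittgen, 2001), the following Python sketch computes a fold change from four Ct values. It assumes roughly 100% amplification efficiency for both the target and the reference gene; the function name and the Ct values are hypothetical and are not measurements from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method."""
    # Normalize the target gene to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies 3 cycles earlier after challenge
# while the reference gene is unchanged.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold)  # 8.0
```

A value above 1 indicates induction relative to the control, and a value below 1 indicates repression.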

2.6 Constructing, sequencing and analyzing RNA-seq libraries

The construction and sequencing of cDNA libraries were performed by Nextomics Biosciences (Nextomics, Wuhan, China). Developmental stage samples and tissue samples were prepared for deep sequencing analysis. The life stages sampled were newly laid P. puparum embryos (n=200), 1st, 2nd and 3rd instar wasp larvae in equal quantity (n=10), and female and male wasps in the pupal and adult stages (n=20). For tissue analysis, ovary, venom gland and carcass without venom gland were prepared from female adults. The isolated RNA was purified using Sera-mag Magnetic Oligo (dT) beads (Illumina, USA) and then reverse transcribed using N6 primers, followed by synthesis of the second cDNA strand. After end-repair processing and adaptor ligation, the products were amplified by PCR and purified using the QIAquick Gel Extraction Kit (Qiagen, Germany). The larval mRNA sequencing library was made from equivalent mixtures of RNA samples extracted from 1st, 2nd and 3rd instar wasp larvae. Nine mRNA libraries were then sequenced using an Illumina HiSeq 2100 instrument (Illumina, USA), and raw reads were sorted by barcode.

2.7 Analysis of RNA-seq Data

Raw reads from each library were filtered to remove low-quality reads, and the resulting reads were termed clean reads. Expression levels were estimated with TopHat and Cufflinks (Trapnell et al., 2012, 2010). The FPKM values of P. puparum SP and SPH genes are listed in Table S2. The expression profiles of SP/SPH genes in different developmental stages were visualized using the R statistical program version 3.1.3. Differentially expressed genes between venom gland and carcass without venom gland were identified using the R package DEGSeq v1.2.2 (Wang et al., 2010). The p-values were adjusted using the method of Benjamini and Hochberg (1995). Corrected p-value < 0.001, log2 (FPKM_VG/FPKM_Carcass) > 1 and FPKM_VG (venom gland) > 10 were set as the thresholds. P. puparum SPs and SPHs that were differentially expressed in the venom gland are listed in Table S3. Based on previous venom proteomic information (Yan et al., 2016), some of these SP and SPH genes were marked and defined as putative venom proteins.
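The thresholding step above can be sketched in Python as follows. This is not the DEGSeq implementation: it takes raw p-values as given, applies a plain Benjamini-Hochberg adjustment, and then applies the three stated cutoffs. Treating a carcass FPKM of zero as an infinite ratio is an assumption of this sketch, and the gene names and values are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def venom_enriched(genes):
    """genes: list of dicts with keys 'name', 'p', 'fpkm_vg', 'fpkm_carcass'."""
    adjusted = benjamini_hochberg([g["p"] for g in genes])
    hits = []
    for gene, q in zip(genes, adjusted):
        if gene["fpkm_vg"] <= 10:
            continue  # FPKM_VG > 10 is required
        if gene["fpkm_carcass"] == 0:
            log2_ratio = math.inf  # assumption: absent from carcass counts as enriched
        else:
            log2_ratio = math.log2(gene["fpkm_vg"] / gene["fpkm_carcass"])
        if q < 0.001 and log2_ratio > 1:
            hits.append(gene["name"])
    return hits


# Hypothetical genes, not values from Table S2 or Table S3.
genes = [
    {"name": "geneA", "p": 1e-6, "fpkm_vg": 120.0, "fpkm_carcass": 3.0},
    {"name": "geneB", "p": 1e-6, "fpkm_vg": 8.0, "fpkm_carcass": 1.0},
    {"name": "geneC", "p": 0.2, "fpkm_vg": 50.0, "fpkm_carcass": 40.0},
]
print(venom_enriched(genes))  # ['geneA']
```

In this toy input, geneB fails the FPKM_VG cutoff and geneC fails both the adjusted p-value and the fold-change cutoffs.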

3. Results

3.1 Identification and characterization of SP and SPH genes in P. puparum

One hundred and eighty-three predicted SP and SPH genes were identified in the genome of P. puparum, most of which were similar to members of the chymotrypsin (S1) family (Table S4). The sequences of these 183 SP/SPH genes are listed in Table S5, and NCBI descriptions are provided in Table S6. The total number of SP/SPH genes in P. puparum (183) is lower than the 204 in D. melanogaster and the 441 in A. aegypti, but far higher than the 57 in A. mellifera and the 90 in N. lugens (Table S7). We also identified SP/SPH genes in silico in the genomes of N. vitripennis (Werren et al., 2010) and Microplitis demolitor (Burke et al., 2014). The number of SP/SPH genes in the P. puparum genome is comparable to that in N. vitripennis (143 genes) and far exceeds that in M. demolitor (74 genes) (Table S7). Based on the presence or absence of the conserved His, Asp and Ser residues in the catalytic triad, the P. puparum SP/SPH genes were classified into 145 SP genes and 38 SPH genes (Table S4).

In P. puparum, the 183 SP and SPH genes are spread across 47 different scaffolds (Table S4). Most scaffolds carry fewer than five SP/SPH genes, whereas 140 SP/SPH genes are located on 14 scaffolds. We mapped the locations of genes on scaffolds with five or more SP/SPH genes (Fig. 1 & Fig. S1). The results indicated that large clusters of SP/SPH genes are common in the genome of P. puparum. Twenty-four SP/SPH genes, which form two clusters, are located on scaffold 17, with the genes of cluster 17-1 placed adjacently. A similarly striking pattern exists on scaffold 5, where 37 SP/SPH genes are grouped into three clusters. Most of the SP/SPH genes on scaffold 5 were annotated as chymotrypsin or trypsin (Table S4). Genes in cluster 5-2 share extremely similar structures at the genome and protein levels, and 16 of the 18 genes were predicted to be SPs. These SPs, except PpSP55, were predicted to be activated by proteolytic cleavage between Arg and Ile, and consisted of a signal peptide and a single serine protease domain encoded by two- or three-exon gene structures (Table S4).

In the following sections, we briefly describe the P. puparum SPs and SPHs based on their structures and functions.
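The selection of scaffolds carrying five or more SP/SPH genes, described earlier in this section, amounts to a simple tally. The Python sketch below shows one way to do it, assuming a gene-to-scaffold mapping is available; the function name, gene names and scaffold names are hypothetical and do not reproduce Table S4.

```python
from collections import Counter


def scaffolds_with_many_genes(gene_to_scaffold, min_genes=5):
    """Return {scaffold: gene count} for scaffolds holding >= min_genes genes."""
    counts = Counter(gene_to_scaffold.values())
    return {scaffold: n for scaffold, n in counts.items() if n >= min_genes}


# Hypothetical annotation: gene name -> scaffold name (not real Table S4 data).
toy_annotation = {f"gene{i}": "scaffoldA" for i in range(6)}
toy_annotation.update({"gene6": "scaffoldB", "gene7": "scaffoldB"})
print(scaffolds_with_many_genes(toy_annotation))  # {'scaffoldA': 6}
```

Identifying clusters within a scaffold would additionally require gene coordinates and a distance criterion, which the text does not specify.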

3.2 Structure features of the SPs and SPHs

3.2.1 Single domain SPs

Nearly half of the P. puparum SP genes (84/183) encode proteins shorter than 300 residues that contain only a signal peptide and one serine protease domain (Table S4). Previous studies indicated that such enzymes are most likely trypsins or chymotrypsins expressed in the gut, where they function in digestion. We analyzed the expression patterns of several SPs in the tissues of female wasps (Fig. 2); single domain SPs are marked with a black frame. PpSP17, PpSP72 and PpSP90 displayed similar expression patterns, with strikingly high expression in the gut and very low levels in other tissues. The results also showed that PpSP14 was expressed specifically in the fat body, while PpSP21, PpSP56, PpSP83, PpSP95 and PpSP99 were exclusively expressed in the venom gland. These results indicated that these proteases are widely distributed in different tissues, where they exert diverse functions.

3.2.2 Clip-domain SPs/SPHs (cSPs/cSPHs)

Clip-domain SPs (cSPs) and clip-domain SPHs (cSPHs) are involved in signal-amplifying reactions and play a significant role in mediating innate immunity by cleaving and activating downstream SPs/SPHs. There were 16 cSPs and 9 cSPHs predicted in the P. puparum genome. Clip domains are sequences of 37-55 amino acids held together by three disulfide bonds to form compact structures. The lengths of the clip domains in P. puparum cSPs/cSPHs vary from 30 to 55 amino acids (Fig. S2).

The domains of cSPs/cSPHs from P. puparum, A. mellifera, D. melanogaster and M. sexta were aligned (Fig. S3), and the phylogenetic analysis revealed that P. puparum cSPs/cSPHs clustered together with the cSP/cSPH genes of the other three insects (Fig. 3). The phylogenetic tree showed that cSPs and cSPHs could be roughly divided into three clades. A close orthologous relationship was observed between PpcSPH9, MsSPH53 and masquerade proteins from D. melanogaster. All of them contain 5 clip domains followed by a serine protease domain. PpcSP9 clustered with AmHP8, MsHP8, MsPAP1 and PpcSP8. We examined the microbe-induced (E. coli, M. luteus and B. bassiana) expression of PpcSP8 and PpcSP9. Expression of PpcSP8 was barely induced by bacteria or fungus (data not shown), whereas the expression of PpcSP9 was enhanced by treatment with M. luteus and B. bassiana at 24 h and 48 h post infection (p. i.). The expression of PpcSP9 was highest in the fungus-infected samples at 24 h p. i. PpcSP6 shares high amino acid sequence identity with AmHP21 and MsHP6. qPCR results showed the induction of PpcSP6 at 24 h and 48 h post fungus infection (Fig. 4). The expression of PpcSPH1 and PpcSPH8 was also increased after immune challenge. The expression of PpcSPH1 was dramatically activated by each of the three microbes at 6 h p. i. and then gradually decreased in the E. coli- and M. luteus-infected samples, while it reached a peak at 24 h p. i. in the B. bassiana-infected samples. PpcSPH8 was highly expressed in the M. luteus- and B. bassiana-treated samples only at 24 h and 48 h p. i. (Fig. 4).

3.2.3 Other complex domain SPs/SPHs

Ten of the one hundred and eighty-three SP/SPH genes are far larger than a typical serine protease gene, and these SP/SPH genes contain additional domains and modules (Fig. 5). These modules include the complement control protein (CCP) domain, low-density lipoprotein receptor class A (LDLA) domain, CUB domain, Ankyrin repeat region (ANK), Frizzled (FZ) domain, SEA domain, Kringle domain, PAN domain, Chitin-binding (CHIT) domain, scavenger receptor (SR) domain and C-type lectin (CTL) domain. PpSP24 contains a TM region, an FZ domain, LDLA repeats, an SR domain and a serine protease domain. This domain structure is similar to that of Drosophila Corin, and PpSP24 may act as a pro-atrial natriuretic peptide-converting enzyme (Yan et al., 2000). PpSP74 contains 4 LDLA repeats and 2 CCP domains followed by a serine protease domain, whereas PpSP81 possesses 2 more LDLA repeats than PpSP74. Both PpSP74 and PpSP81 have domain architectures quite similar to those of M. sexta HP14b and the modular serine protease (MSP) from D. melanogaster (DmMSP) (Fig. 5). PpSP81 was hardly induced by immune challenge (data not shown). Analysis of the response of PpSP74 to each of the three microbes showed relatively low expression in the B. bassiana-challenged samples at 6 h and 48 h p. i. but a striking induction at 24 h p. i. (Fig. 4).

339

3.3 Possible function of SPs/SPHs in development

AC C

EP

TE D

M AN U

SC

RI PT

322

340

Extracellular serine protease processing events are essential during embryonic

341

development. Drosophila embryonic dorsal-ventral polarity relies on the serine

342

proteinase cascade in the egg perivitelline space (Belvin and Anderson, 1996). Several

ACCEPTED MANUSCRIPT identified SPs such as Nudel, gastrulation defective (GD), Snake and Easter are the

344

key components associated with DV axis establishment. Binding to Pipe-sulfated

345

glycoprotein, Nudel can be autoactivated firstly and then activate GD zymogen. GD

346

interacts with Snake, which in turn activates terminal proteinase of the signal cascade,

347

Easter. There were 3 GD genes, 1 Nudel-like gene, 2 Snake genes and 7 Easter genes

348

predicted in P. puparum genome. We only identified one Nudel-like gene PpSP62 in P.

349

puparum genome and this gene is a large mosaic protein consisting of a TM region

350

with serine protease domain embedded between the second and third LDLA domains.

351

The number of LDLA domains is 4, far less than LDLA domains in Nudel genes in A.

352

mellifera, D. melanogaster, Culex quinquefasciatu, but the same with the number of

353

domains in N. vitripennis and Trichogramma pretiosum Nudel-like genes (Fig. S4).

The presence of the transmembrane region suggests that PpSP62 is located at the vitelline membrane and serves as a signal transduction component. A sequence alignment was made between PpSP62, 8 Nudel genes and 2 Nudel-like genes (Fig. S5). The phylogenetic tree revealed that the genes from Hymenoptera were divided into two clades. PpSP62 clustered closely with the Nudel-like genes from N. vitripennis and Trichogramma pretiosum, whereas the Nudel genes of A. mellifera, Megachile rotundata and M. demolitor formed the other clade (Fig. 6a). PpSP67, PpSP121 and PpSP129 are similar to the GD gene, which usually encodes a putative signal peptide and a serine protease domain.

The phylogenetic tree revealed that these three GD genes in P. puparum clustered with the N. vitripennis GD gene, with PpSP121 being the most closely related to it (Fig. S6 & Fig. 6b). Based on the transcriptome data, only PpSP129 had relatively high expression in the embryo stage; the FPKM values of PpSP67 and PpSP121 in embryos were less than 1.0 (Table S2). The qPCR analyses verified these results, indicating that only PpSP129 may function in embryonic development (Fig. 7). PpcSP3 and PpcSP6, found in the genome of P. puparum, were predicted to be Snake genes (Table S4). Like most Snake genes characterized thus far, these two genes encode signal peptides, clip domains and serine protease domains. We also identified 7 Easter genes. Easter processes Spätzle into the activating ligand for the Toll receptor and thereby triggers the intracellular Toll-Dorsal pathway. Four Easter genes are closely linked in the same orientation on scaffold 16 (Fig. 1). These 4 Easter-like genes are distributed in a cluster spanning only 14.5 kb.
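The kind of linkage described here (several genes on one scaffold, in the same orientation, within a short span) can be checked from gene coordinates. The following is a minimal sketch and not the authors' pipeline: the `Gene` record, the `tandem_clusters` helper, the 15 kb window and all coordinates are hypothetical placeholders chosen for illustration, and the grouping ignores any unrelated genes lying between cluster members.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Gene:
    name: str
    scaffold: str
    start: int   # start coordinate in bp
    end: int     # end coordinate in bp (end >= start)
    strand: str  # "+" or "-"


def tandem_clusters(genes, max_span=15_000, min_size=2):
    """Greedily group same-scaffold, same-strand genes whose combined
    span (first start to last end) does not exceed max_span bp."""
    clusters = []
    ordered = sorted(genes, key=lambda g: (g.scaffold, g.strand, g.start))
    for _, group in groupby(ordered, key=lambda g: (g.scaffold, g.strand)):
        current = []
        for gene in group:
            # Close the running cluster once adding this gene would exceed the window.
            if current and gene.end - current[0].start > max_span:
                if len(current) >= min_size:
                    clusters.append(current)
                current = []
            current.append(gene)
        if len(current) >= min_size:
            clusters.append(current)
    return clusters


# Hypothetical coordinates, for illustration only (not taken from the genome).
example_genes = [
    Gene("easterA", "scaffold16", 1_000, 3_000, "+"),
    Gene("easterB", "scaffold16", 4_000, 6_500, "+"),
    Gene("easterC", "scaffold16", 8_000, 10_500, "+"),
    Gene("easterD", "scaffold16", 12_000, 15_500, "+"),
    Gene("distant", "scaffold16", 90_000, 92_000, "+"),
    Gene("single", "scaffold7", 500, 2_500, "-"),
]

cluster_names = [[g.name for g in c] for c in tandem_clusters(example_genes)]
print(cluster_names)
```

A fixed window is the simplest criterion; a real analysis might instead require adjacency in the annotation or a maximum intergenic distance.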

375

Another two Easter genes located in scaffolds 7 (Fig. S1) were also closely linked,

376

indicating Easter genes in P. puparum undergo gene duplication. A search of the P.

377

puparum genome showed 12 P. puparum Stubble genes. Stubble has been

378

characterized as transmembrane protein in intracellular signaling during leg and wing

379

imaginal disc morphogenesis (Bayer et al., 2003). Eleven of P. puparum Stubble

380

genes lacked the transmembrane domains which may be due to the incomplete of

381

N-terminal sequence. The number of Stubble genes in P. puparum is more than 8 in P.

382

xylostella, 5 in N. lugens and 1 in D. melanogaster. The length of P.puparum Stubble

383

genes varies from 262 to 946 amino acids, consisting of 5-12 exons (Table S4). These

384

implied the abundance of Stubble-like genes in P. puparum may also be conductive to

385

other physiological processes.

386

3.4 Expression profile of SPs/SPHs


Several transcriptome datasets from different developmental stages and tissues of P. puparum have been acquired. We obtained RNA-seq data from different developmental stages of P. puparum, including embryos, larvae, female and male pupae, and female and male adult wasps. Samples of venom gland, ovary and carcass from adult female wasps were also collected for transcriptome sequencing. Expression profiles of the serine proteinase genes were determined from these datasets.

The transcriptome data (Table S2) revealed that some P. puparum SPs and SPHs had very low FPKM values in the existing datasets. Some of these SP/SPH genes may be pseudogenes. We nevertheless retained all of the SPs and SPHs, considering that some genes may be highly expressed only in a specific tissue and barely or not at all in others. The PpSP and PpSPH genes with FPKM values greater than 1 (133 in total) were profiled (Fig. 8). Hierarchical clustering was used to represent the relative expression values of these genes, which could be roughly divided into four groups.
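The preparation of expression values for such a profile can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether the FPKM > 1 filter applies to any sample or to all samples (the sketch assumes any sample), nor how relative values were scaled (the sketch assumes log2(FPKM + 1) followed by per-gene z-scores, a common choice before hierarchical clustering). The function names, gene names and FPKM values are hypothetical.

```python
import math
from statistics import mean, pstdev


def filter_expressed(fpkm_table, threshold=1.0):
    """Keep genes whose FPKM exceeds the threshold in at least one sample (assumed rule)."""
    return {
        gene: samples
        for gene, samples in fpkm_table.items()
        if max(samples.values()) > threshold
    }


def row_zscores(samples):
    """Scale one gene's log2(FPKM + 1) values to zero mean and unit variance."""
    logged = {name: math.log2(value + 1.0) for name, value in samples.items()}
    mu = mean(logged.values())
    sd = pstdev(logged.values())
    if sd == 0:
        # A flat profile carries no relative information.
        return {name: 0.0 for name in logged}
    return {name: (value - mu) / sd for name, value in logged.items()}


def peak_sample(samples):
    """Return the sample in which the gene is most highly expressed."""
    return max(samples, key=samples.get)


# Hypothetical FPKM values, for illustration only.
example_fpkm = {
    "geneA": {"embryo": 0.2, "larva": 0.5, "adult_female": 0.1},
    "geneB": {"embryo": 3.0, "larva": 63.0, "adult_female": 7.0},
}

kept = filter_expressed(example_fpkm)
scaled = {gene: row_zscores(samples) for gene, samples in kept.items()}
for gene, samples in kept.items():
    print(gene, "peaks in", peak_sample(samples))
```

The scaled rows could then be passed to any hierarchical clustering routine; that step is left out here to keep the sketch dependency-free.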


A small number of SPs/SPHs displaying their highest expression in embryos were placed in group I. Interestingly, more than half of these genes displayed similar expression patterns and were detected at high levels in both the embryo and female adult stages. The SP and SPH genes highly expressed in the adult female, the carcass and the ovary were also categorized into group I. Group II, the largest group, comprises genes highly expressed in the larval period. This group contains 53 SPs/SPHs, accounting for nearly 40% of the profiled genes in this superfamily. The genes in group III can be divided into three parts. In the first two parts, eight and eleven SPs presented distinct expression patterns in mature female and male stages, respectively. Eleven SPs and SPHs with high transcript levels in both female and male adult stages were located in the third part. In group IV, eight SPs/SPHs in the upper part were expressed mainly in the female or male pupal stage, and eighteen SPs/SPHs in the lower part were most abundant in the venom gland.

Screening the transcriptome of P. puparum adult females yielded several genes with notably high FPKM values in the venom gland. To control the false positive rate, differentially expressed genes in the venom gland were defined by an expression cut-off of FPKM_VG (venom gland) > 10, a venom gland to carcass expression ratio of log2(FPKM_VG/FPKM_Carcass) > 1, and a corrected p-value < 0.001. Sixteen of the eighteen PpSPs/PpSPHs profiled in group IV (Table S3) were differentially expressed in the venom gland relative to the carcass. Among them, six PpSPs and two PpSPHs contain signal peptides and were identified in the proteomic database, and were therefore considered venom proteins. In our previous work, eight PpSPs and one PpSPH were identified as venom proteins by combined transcriptomic and proteomic analysis (Yan et al., 2016). The difference between the two results is reasonable, since the two analyses were mapped to different assembly sequences. We also used qPCR to validate the RNA-seq data (Fig. 2). PpSP56, PpSP99, PpSP115, PpSPH1, PpSPH23 and PpSPH26 had much higher transcript levels in the venom gland than in other tissues. The expression of PpSPH1 in the venom gland was 1,000 times that in the fat body, the tissue with the second highest expression in adult females. PpSP56, PpSP99, PpSP115, PpSPH23 and PpSPH26 showed similar expression patterns, with transcript levels in the venom gland dozens of times higher than those in other tissues. Compared with the number of SP and SPH genes specifically expressed in the venom gland, only a few were most abundant in the ovary. PpSP62 and PpSPH12, which showed relatively high expression in the ovary, were categorized into the lower part of group I.
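The three venom gland criteria stated in this section (FPKM_VG > 10, log2(FPKM_VG/FPKM_Carcass) > 1, corrected p-value < 0.001) can be written as a single filter. The sketch below is illustrative only: the function names, the handling of a zero carcass FPKM (treated here as an infinite ratio) and the example records are assumptions, not details given in the text.

```python
import math


def log2_ratio(fpkm_vg, fpkm_carcass):
    """log2(FPKM_VG / FPKM_Carcass), with zero values handled explicitly (assumed convention)."""
    if fpkm_vg <= 0:
        return -math.inf
    if fpkm_carcass <= 0:
        return math.inf
    return math.log2(fpkm_vg / fpkm_carcass)


def is_venom_gland_enriched(fpkm_vg, fpkm_carcass, corrected_p,
                            min_fpkm=10.0, min_log2_ratio=1.0, max_p=0.001):
    """Apply the three cut-offs; all comparisons are strict, as in the text."""
    return (
        fpkm_vg > min_fpkm
        and log2_ratio(fpkm_vg, fpkm_carcass) > min_log2_ratio
        and corrected_p < max_p
    )


# Hypothetical records: (gene, FPKM_VG, FPKM_Carcass, corrected p-value).
example_records = [
    ("geneA", 120.0, 4.0, 1e-5),   # passes all three cut-offs
    ("geneB", 15.0, 10.0, 1e-5),   # ratio too low
    ("geneC", 8.0, 0.5, 1e-6),     # venom gland FPKM too low
    ("geneD", 50.0, 5.0, 0.01),    # corrected p-value too high
]

enriched = [
    gene
    for gene, vg, carcass, p in example_records
    if is_venom_gland_enriched(vg, carcass, p)
]
print(enriched)
```

Because the ratio cut-off is strict, a gene with exactly twofold higher FPKM in the venom gland does not pass.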

4. Discussion


SPs and SPHs belong to a large family whose members function in biological processes such as food digestion, development and immunity. In the present study, P. puparum SP and SPH genes were identified by comparison with genes reported in other insects and confirmed by inspection of the catalytic triad residues. In total, 145 SPs and 38 SPHs were identified in the genome of P. puparum. The ratio of SPs to SPHs is close to that in D. melanogaster or A. mellifera.

SP/SPH genes arranged in tandem repeats were observed in scaffolds of the P. puparum genome. These P. puparum SPs/SPHs, which have similar structures or functions, formed large clusters in several scaffolds. This phenomenon has also been widely reported in other species, such as D. melanogaster, B. mori and N. lugens (Bao et al., 2013; Ross et al., 2003; Zhao et al., 2010). Previous studies showed that gene expansion does not occur in the SPs/SPHs of the honey bee, A. mellifera, which may be due to its sociality, a trait that shapes the diversity and evolution of the genes (Gadau et al., 2012; Simola et al., 2013; Viljakainen et al., 2009). The marked expansion of this superfamily in P. puparum, with members of similar structure or function, is therefore likely the result of gene duplication and unequal crossing over.

SPs with signal peptides and single serine protease domains are likely to be gut-related serine proteinases. SPs in P. puparum presented distinct expression patterns. SPs most abundantly expressed in gut tissue may be related to digestion. Among the others, SPs highly expressed in the fat body may play a role in the wasp immune system, whereas SPs mainly expressed in the venom gland may function in regulating the host immune system. Single serine proteinase domain SPs, which account for half of the SP gene family in P. puparum, may also participate in various physiological processes. Further research is needed to identify the functions of these candidate SP genes.

P. puparum SP and SPH proteins range in size from 175 to 2242 residues, with an average of 360. Our results showed that 35 of the 183 P. puparum SP and SPH genes contain other modules. Clip domains constitute the largest group of regulatory domains in P. puparum. These cSPs/cSPHs play vital roles in immune reactions and embryonic development. PpcSPH9 contains five clip domains and shares a similar structure with masquerade. The Drosophila masquerade gene participates in somatic muscle attachment in the Drosophila embryo (Murugasu-Oei et al., 1995).

Mutations in this Drosophila SPH affect axonal guidance and taste behavior (Murugasu-Oei et al., 1996). The five clip domains of PpcSPH9 may allow the attached SPH domain to interact with multiple partners. We examined the microbe-induced (E. coli, M. luteus and B. bassiana) expression of the clip-domain SP/SPH genes. The qPCR results indicated that PpcSP9 can be induced by the Gram-positive bacterium and the fungus, suggesting that PpcSP9 may also be involved in the immune response. PpcSPH1 responded quickly to infection by any of the three microbes and may act as a vital cofactor upstream in the proteinase cascade. PpcSPH8 was highly expressed in the M. luteus and B. bassiana samples at 24 h and 48 h p. i., indicating that it may be regulated by the Toll pathway, which is activated by Gram-positive bacteria and fungi. Some SPs contain other types of domain additions. PpSP74 shares structural similarity with M. sexta HP14b and DmMSP. Previous studies showed that M. sexta HP14 and DmMSP can detect bacteria and fungi through peptidoglycan-recognition molecules, then autoactivate and trigger the serine proteinase cascade that activates the prophenoloxidase system or the Toll pathway (Buchon et al., 2009; Kim et al., 2008; Wang and Jiang, 2010). The expression of PpSP74 was strikingly induced 24 h after B. bassiana infection, indicating that PpSP74 may also be a modular serine protease that can be recruited into the PG recognition complex.

Previous studies described several kinds of SP and SPH genes that determine the dorsal-ventral polarity of embryos in Drosophila (Cho et al., 2010). Among them, Drosophila Nudel is the initiator of this SP cascade. In the structure of Nudel genes in most insects, the SP and SPH domains are separated by two or three LDLA domains (Fig. S4). However, we found only Nudel-like genes in the parasitoids P. puparum, N. vitripennis and T. pretiosum. The sequence of PpSP62 was confirmed by gene cloning. These Nudel-like genes are much shorter than Nudel genes. The phylogenetic trees also showed that Nudel and Nudel-like genes in Hymenoptera formed two clades. These findings may indicate that some Nudel-like genes in hymenopteran parasitoids, such as PpSP62, have lost part of their C-terminal sequences during evolution.

RNA-seq data provided us with the temporospatial expression patterns of the genes. Nearly 2/5 of the profiled P. puparum SPs/SPHs were expressed predominantly at the larval stage. P. puparum wasps grow up inside the hemocoel of their host and may encounter defense factors, such as proteinase inhibitors secreted by the host to block serine protease activity, which is similar to the interaction between herbivorous insects and their host plants (Azzouz et al., 2005; Chougule et al., 2005). Meanwhile, the larval stage is a period of rapid growth, in which large amounts of proteinases must be secreted to digest sufficient food. Therefore, the wasps need to generate abundant SPs/SPHs to cope with the defense stress and grow quickly in a host-parasitoid co-evolution system (Saadat et al., 2014). The number of SPs/SPHs mainly expressed in the female or male pupal stages is 8, far fewer than in the larval stage (53). In the pupal stage, the host provides the immature wasps with a natural barrier until eclosion and protects them from attack by enemies. In addition, the pupae stop ingesting food. Reduced expression of SPs/SPHs involved in immunity and digestion is therefore expected. Examination of expression across adult female tissues shows that several P. puparum SPs/SPHs are highly expressed in the venom gland. In most host-parasitoid interactions, the venom secreted by the venom gland is essential for successful parasitism (Asgari and Rivers, 2011; Fang et al., 2011b; Martinson et al., 2014). Wasps introduce parasitoid factors to immobilize their host. These include protein secretions from venom glands and PDVs, among others. Abundant SPs/SPHs have been discovered as venom proteins in parasitoids (Colinet et al., 2014; de Graaf et al., 2010) and other venomous animals (Mukherjee et al., 2016). The serine proteases in N. vitripennis venom induced apoptosis in the Sf21 cell line (Formesyn et al., 2013). A clip-SPH called Vn50, isolated from the venom of C. rubecula, blocked the melanization of its host by significantly reducing the proteolysis of proPO (Asgari et al., 2003; Zhang et al., 2004). Previous studies also revealed that a clip-SP from Bombus ignitus venom (Bi-VSP) is a multifunctional enzyme. In the immunity of B. ignitus, Bi-VSP acts as a proPO-activating factor, thereby triggering the PO cascade. Bi-VSP also exhibits fibrin(ogen)olytic activity in mammals and can induce a lethal melanization response in target insects (Choo et al., 2010). Since N. vitripennis and P. puparum are close relatives belonging to the same family, P. puparum venom SPs and SPHs may also act as cytotoxic compounds in cell death-related processes. Studies have also shown that serine proteases and their homologs with clip domains generate striking melanization in the host or in enemies. We hypothesize that clip-domain SPs/SPHs specifically expressed in the venom gland of P. puparum can also participate in the serine protease cascade of the host (P. rapae) and interfere with signal transduction in the host immune system.

In summary, P. puparum serine proteases and their homologs have been identified and profiled across different developmental stages, tissues and microbe-challenged samples, providing a better understanding of the roles of these proteins in digestion, development and immunity in this parasitoid. Further research is needed to gain more insight into their physiological functions and to provide feasible target genes for use in the biological control of insect pests.

Funding:

The study was supported by grants from the National Natural Science Foundation of China (Grant nos. 31472038 and 31272098, http://www.nsfc.gov.cn/), the Major International (Regional) Joint Research Project of the National Natural Science Foundation (Grant no. 31620103915, http://www.nsfc.gov.cn/), and the China National Science Fund for Distinguished Young Scholars (Grant no. 31025021, http://www.nsfc.gov.cn/), as well as the Foundation for Innovative Research Group from the Ministry of Agriculture, China.

References

An CJ, Ishibashi J, Ragan EJ, Jiang HB, Kanost MR. Functions of Manduca sexta hemolymph proteinases HP6 and HP8 in two innate immune pathways. J Biol Chem. 2009; 284:19716-19726.

An CJ, Zhang MM, Chu Y, Zhao ZW. Serine protease MP2 activates prophenoloxidase in the melanization immune response of Drosophila melanogaster. PLoS One. 2013; 8:e79533.

Asgari S, Rivers DB. Venom proteins from endoparasitoid wasps and their role in host-parasite interactions. Annu Rev Entomol. 2011; 56:313-335.

Asgari S, Zhang GM, Zareie R, Schmidt O. A serine proteinase homolog venom protein from an endoparasitoid wasp inhibits melanization of the host hemolymph. Insect Biochem Mol Biol. 2003; 33:1017-1024.

Azzouz H, Cherqui A, Campan EDM, Rahbe Y, Duport G, Jouanin L, Kaiser L, Giordanengo P. Effects of plant protease inhibitors oryzacystatin I and soybean Bowman-Birk inhibitor on the aphid Macrosiphum euphorbiae (Homoptera Aphididae) and its parasitoid Aphelinus abdominalis (Hymenoptera Aphelinidae). J Insect Physiol. 2005; 51:75-86.

Bao YY, Qu LY, Zhao D, Chen LB, Jin HY, Xu LM, Cheng JA, Zhang CX. The genome- and transcriptome-wide analysis of innate immunity in the brown planthopper Nilaparvata lugens. BMC Genomics. 2013; 14:160.

Bayer CA, Halsell SR, Fristrom JW, Kiehart DP, Von Kalm L. Genetic interactions between the RhoA and stubble-stubbloid loci suggest a role for a type II transmembrane serine protease in intracellular signaling during Drosophila imaginal disc morphogenesis. Genetics. 2003; 165:1417-1432.

Belvin MP, Anderson KV. A conserved signaling pathway: The Drosophila Toll-Dorsal pathway. Annu Rev Cell Dev Biol. 1996; 12:393-416.

Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Stat Soc Ser B. 1995; 57:289-300.

Brackney DE, Isoe J, Black WCI, Zamora J, Foy BD, Miesfeld RL, Olson KE. Expression profiling and comparative analyses of seven midgut serine proteases from the yellow fever mosquito Aedes aegypti. J Insect Physiol. 2010; 56:736-744.

Buchon N, Poidevin M, Kwon HM, Guillou A, Sottas V, Lee BL, Lemaitre B. A single modular serine protease integrates signals from pattern-recognition receptors upstream of the Drosophila Toll pathway. Proc Natl Acad Sci U.S.A. 2009; 106:12442-12447.

Burke GR, Walden KKO, Whitfield JB, Robertson HM, Strand MR. Widespread genome reorganization of an obligate virus mutualist. PLoS Genetics. 2014; 10:e1004660.

Cai J, Ye GY, Hu C. Parasitism of Pieris rapae (Lepidoptera: Pieridae) by a pupal endoparasitoid Pteromalus puparum (Hymenoptera: Pteromalidae): effects of parasitization and venom on host hemocytes. J Insect Physiol. 2004; 50:315-322.

Cao XL, He Y, Hu YX, Zhang XF, Wang Y, Zou Z, Chen YR, Blissard GW, Kanost MR, Jiang HB. Sequence conservation phylogenetic relationships and expression profiles of nondigestive serine proteases and serine protease homologs in Manduca sexta. Insect Biochem Mol Biol. 2015; 62:51-63.

Cerenius L, Lee BL, Soderhall K. The proPO-system: pros and cons for its role in invertebrate immunity. Trends Immunol. 2008; 29:263-271.

Choo YM, Lee KS, Yoon HJ, Kim BY, Sohn MR, Roh JY, Je YH, Kim NJ, Kim I, Woo SD, Sohn HD, Jin BR. Dual function of a bee venom serine protease: prophenoloxidase-activating factor in arthropods and fibrin(ogen)olytic enzyme in mammals. PLoS One. 2010; 5:e10393.

Chougule NP, Giri AP, Sainani MN, Gupta VS. Gene expression patterns of Helicoverpa armigera gut proteases. Insect Biochem Mol Biol. 2005; 35:355-367.

Colinet D, Anselme C, Deleury E, Mancini D, Poulain J, Azema-Dossat C, Belghazi M, Tares S, Pennacchio F, Poirie M, Gatti JL. Identification of the main venom protein components of Aphidius ervi a parasitoid wasp of the aphid model Acyrthosiphon pisum. BMC Genomics. 2014; 15:342.

de Graaf DC, Aerts M, Brunain M, Desjardins CA, Jacobs FJ, Werren JH, Devreese B. Insights into the venom composition of the ectoparasitoid wasp Nasonia vitripennis from bioinformatic and proteomic studies. Insect Mol Biol. 2010; 19:11-26.

Fang Q, Wang BB, Ye XH, Wang F, Ye GY. Venom of parasitoid Pteromalus puparum impairs host humoral antimicrobial activity by decreasing host cecropin and lysozyme gene expression. Toxins. 2016; 88:203-221.

Fang Q, Wang F, Gatehouse JA, Gatehouse AMR, Chen XX, Hu C, Ye GY. Venom of parasitoid Pteromalus puparum suppresses host Pieris rapae immune promotion by decreasing host C-type lectin gene expression. PLoS One. 2011a; 6:e26888.

Fang Q, Wang L, Zhu JY, Li YM, Song QS, Stanley DW, Akhtar ZR, Ye GY. Expression of immune-response genes in lepidopteran host is suppressed by venom from an endoparasitoid Pteromalus puparum. BMC Genomics. 2010; 11:484.

Fang Q, Wang L, Zhu YK, Stanley DW, Chen XX, Hu C, Ye GY. Pteromalus puparum venom impairs host cellular immune responses by decreasing expression of its scavenger receptor gene. Insect Biochem Mol Biol. 2011b; 41:852-862.

Felfoldi G, Eleftherianos I, Ffrench-Constant RH, Venekei I. A serine proteinase homologue SPH-3 plays a central role in insect immunity. J Immunol. 2011; 186:4828-4834.

Formesyn EM, Heyninck K, de Graaf DC. The role of serine- and metalloproteases in Nasonia vitripennis venom in cell death related processes towards a Spodoptera frugiperda Sf21 cell line. J Insect Physiol. 2013; 59:795-803.

Gadau J, Helmkampf M, Nygaard S, Roux J, Simola DF, Smith CR, Suen G, Wurm Y, Smith CD. The genomic impact of 100 million years of social evolution in seven ant species. Trends Genet. 2012; 28:14-21.

Gehrer L, Vorburger C. Parasitoids as vectors of facultative bacterial endosymbionts in aphids. Biol Lett. 2012; 8:613-615.

Gueguen G, Kalamarz ME, Ramroop J, Uribe J, Govind S. Polydnaviral ankyrin proteins aid parasitic wasp survival by coordinate and selective inhibition of hematopoietic and immune NF-kappa B signaling in insect hosts. PLoS Pathog. 2013; 9:e1003580.

Gupta S, Wang Y, Jiang HB. Manduca sexta prophenoloxidase (proPO) activation requires proPO-activating proteinase (PAP) and serine proteinase homologs (SPHs) simultaneously. Insect Biochem Mol Biol. 2005; 35:241-248.

He Y, Wang Y, Yang F, Jiang HB. Manduca sexta hemolymph protease-1 activated by an unconventional non-proteolytic mechanism mediates immune responses. Insect Biochem Mol Biol. 2017; 84:23-31.

Huang TS, Wang HY, Lee SY, Johansson MW, Soderhall K, Cerenius L. A cell adhesion protein from the crayfish Pacifastacus leniusculus a serine proteinase homologue similar to Drosophila masquerade. J Biol Chem. 2000; 275:9996-10001.

Jiang HB, Wang Y, Yu XQ, Zhu YF, Kanost M. Prophenoloxidase-activating proteinase-3 (PAP-3) from Manduca sexta hemolymph: a clip-domain serine proteinase regulated by serpin-1J and serine proteinase homologs. Insect Biochem Mol Biol. 2003; 33:1049-1060.

Kambris Z, Brun S, Jang IH, Nam HJ, Romeo Y, Takahashi K, Lee WJ, Ueda R, Lemaitre B. Drosophila immunity: A large-scale in vivo RNAi screen identifies five serine proteases required for Toll activation. Curr Biol. 2006; 16:808-813.

Kanost MR, Jiang HB. Clip-domain serine proteases as immune factors in insect hemolymph. Curr Opin Insect Sci. 2015; 11:47-55.

Kanost MR, Jiang HB, Yu XQ. Innate immune responses of a lepidopteran insect Manduca sexta. Immunol Rev. 2004; 198:97-105.

Kellenberger C, Leone P, Coquet L, Jouenne T, Reichhart JM, Roussel A. Structure-function analysis of Grass clip serine protease involved in Drosophila Toll pathway activation. J Biol Chem. 2011; 286:12300-12307.

Kim CH, Kim SJ, Kan H, Kwon HM, Roh KB, Jiang R, Yang Y, Park JW, Lee HH, Ha NC, Kang HJ, Nonaka M, Soderhall K, Lee BL. A three-step proteolytic cascade mediates the activation of the peptidoglycan-induced Toll pathway in an insect. J Biol Chem. 2008; 283:7599-7607.

Kuwar SS, Pauchet Y, Vogel H, Heckel DG. Adaptive regulation of digestive serine proteases in the larval midgut of Helicoverpa armigera in response to a plant protease inhibitor. Insect Biochem Mol Biol. 2015; 59:18-29.

Li YL, Hou MZ, Shen GM, Lu XP, Wang Z, Jia FX, Wang JJ, Dou W. Functional analysis of five trypsin-like protease genes in the oriental fruit fly Bactrocera dorsalis (Diptera: Tephritidae). Pestic Biochem Physiol. 2017; 136:52-57.

Lin H, Xia X, Yu L, Vasseur L, Gurr GM, Yao F, Yang G, You M. Genome-wide identification and expression profiling of serine proteases and homologs in the diamondback moth Plutella xylostella (L). BMC Genomics. 2015; 16:1054.

Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(-ΔΔC(T)) method. Methods. 2001; 25:402-408.

Loof TG, Morgelin M, Johansson L, Oehmcke S, Olin AI, Dickneite G, Norrby-Teglund A, Theopold U, Herwald H. Coagulation an ancestral serine protease cascade exerts a novel function in early immune defense. Blood. 2011; 118:2589-2598.

Lu AR, Zhang QL, Zhang J, Yang B, Wu K, Xie W, Luan YX, Ling E. Insect prophenoloxidase: the view beyond immunity. Front Physiol. 2014; 5:252.

Martinson EO, Wheeler D, Wright J, Mrinalini, Siebert AL, Werren JH. Nasonia vitripennis venom causes targeted gene expression changes in its fly host. Mol Ecol. 2014; 23:5918-5930.

Mukherjee AK, Kalita B, Mackessy SP. A proteomic analysis of Pakistan Daboia russelii russelii venom and assessment of potency of Indian polyvalent and monovalent antivenom. J Proteomics. 2016; 144:73-86.

Murugasu-Oei B, Rodrigues V, Yang XH, Chia W. Masquerade: a novel secreted serine protease-like molecule is required for somatic muscle attachment in the Drosophila embryo. Genes Dev. 1995; 9:139-154.

Murugasu-Oei B, Balakrishnan R, Yang XH, Chia W, Rodrigues V. Mutations in masquerade a novel serine-protease-like molecule affect axonal guidance and taste behavior in Drosophila. Mech Dev. 1996; 57:91-101.

Nam H, Jang I, You H, Lee K, Lee W. Genetic evidence of a redox-dependent systemic wound response via Hayan protease-phenoloxidase system in Drosophila. EMBO J. 2012; 31:1253-1265.

Paskewitz SM, Andreev O, Shi L. Gene silencing of serine proteases affects melanization of Sephadex beads in Anopheles gambiae. Insect Biochem Mol Biol. 2006; 36:701-711.

Pennacchio F, Strand MR. Evolution of developmental strategies in parasitic hymenoptera. Annu Rev Entomol. 2006; 51:233-258.

Perona JJ, Craik CS. Structural basis of substrate-specificity in the serine proteases. Protein Sci. 1995; 4:337-360.

Poirie M, Colinet D, Gatti J. Insights into function and evolution of parasitoid wasp venoms. Curr Opin Insect Sci. 2014; 6:52-60.

Povelones M, Bhagavatula L, Yassine H, Tan LA, Upton LM, Osta MA, Christophides GK. The CLIP-domain serine protease homolog SPCLIP1 regulates complement recruitment to microbial surfaces in the malaria mosquito Anopheles gambiae. PLoS Pathog. 2013; 9:e1003623.

Rao XJ, Ling E, Yu XQ. The role of lysozyme in the prophenoloxidase activation system of Manduca sexta: An in vitro approach. Dev Comp Immunol. 2010; 34:264-271.

Rawlings ND, Tolle DP, Barrett AJ. Evolutionary families of peptidase inhibitors. Biochem J. 2004; 378:705-716.

Ross J, Jiang H, Kanost MR, Wang Y. Serine proteases and their homologs in the Drosophila melanogaster genome: an initial analysis of sequence conservation and phylogenetic relationships. Gene. 2003; 304:117-131.

Saadat D, Bandani AR, Dastranj M. Comparison of the developmental time of Bracon hebetor (Hymenoptera: Braconidae) reared on five different lepidopteran host species and its relationship with digestive enzymes. Eur J Entomol. 2014; 111:495-500.

Simola DF, Wissler L, Donahue G, Waterhouse RM, Helmkampf M, Roux J, Nygaard S, Glastad KM, Hagen DE, Viljakainen L, Reese JT, Hunt BG, Graur D, Elhaik E, Kriventseva EV, Wen J, Parker BJ, Cash E, Privman E, Childers CP, Munoz-Torres MC, Boomsma JJ, Bornberg-Bauer E, Currie CR, Elsik CG, Suen G, Goodisman MAD, Keller L, Liebig J, Rawls A, Reinberg D, Smith CD, Smith CR, Tsutsui N, Wurm Y, Zdobnov EM, Berger SL, Gadau J. Social insect genomes exhibit dramatic evolution in gene composition and regulation while preserving regulatory features linked to sociality. Genome Res. 2013; 23:1235-1247.

Soares TS, Watanabe RMO, Lemos FJA, Tanaka AS. Molecular characterization of genes encoding trypsin-like enzymes from Aedes aegypti larvae and identification of digestive enzymes. Gene. 2011; 489:70-75.

Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: Molecular Evolutionary Genetics Analysis Using Maximum Likelihood Evolutionary Distance and Maximum Parsimony Methods. Mol Biol Evol. 2011; 28:2731-2739.

Teng ZW, Xu G, Gan SY, Chen X, Fang Q, Ye GY. Effects of the endoparasitoid Cotesia chilonis (Hymenoptera: Braconidae) parasitism venom and calyx fluid on cellular and humoral immunity of its host Chilo suppressalis (Lepidoptera: Crambidae) larvae. J Insect Physiol. 2016; 85:46-56.

Theopold U, Li D, Fabbri M, Scherfer C, Schmidt O. The coagulation of insect hemolymph. Cell Mol Life Sci. 2002; 59:363-372.

Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997; 25:4876-4882. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012; 7:562-578. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform

ACCEPTED MANUSCRIPT switching during cell differentiation. Nat Biotechnol. 2010; 28:511-U174. Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, Rozen SG. Primer3-new capabilities and interfaces. Nucleic Acids Res. 2012; 40:e115. Veillard F, Troxler L, Reichhart JM. Drosophila melanogaster clip-domain serine proteases: Structure function and regulation. Biochimie. 2016; 122:255-269. Viljakainen L, Evans JD, Hasselmann M, Rueppell O, Tingek S, Pamilo P. Rapid evolution of immune proteins in social insects. Mol Biol Evol. 2009; 26:1791-1801.

RI PT

Vincent B, Kaeslin M, Roth T, Heller M, Poulain J, Cousserans F, Schaller J, Poirie M, Lanzrein B, Drezen J, Moreau SJM. The venom composition of the parasitic wasp Chelonus inanitus resolved by combined expressed sequence tags analysis and proteomic approach. BMC Genomics. 2010; 11:693.

Wang L, Fang Q, Qian C, Wang F, Yu XQ, Ye GY. Inhibition of host cell encapsulation through inhibiting immune gene expression by the parasitic wasp venom calreticulin. Insect Biochem Mol Biol. 2013; 43:936-946.

SC

Wang L, Zhu JY, Qian C, Fang Q, Ye GY. Venom of the parasitoid wasp pteromalus puparum contains an odorant binding protein. Arch Insect Biochem Physiol. 2015; 88:101-110.

Wang LK, Feng ZX, Wang X, Wang XW, Zhang XG. DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics. 2010; 26:136-138.

Wang RJ, Lin Z, Jiang H, Li J, Saha TT, Lu Z, Lu Z, Zou Z. Comparative analysis of peptidoglycan recognition proteins in endoparasitoid wasp Microplitis mediator. Insect Sci. 2017; 24:2-16.

Wang Y, Jiang H. A positive feedback mechanism in the Manduca sexta prophenoloxidase activation system. Insect Biochem Mol Biol. 2008; 38:763-769.

Wang Y, Jiang H. Binding properties of the regulatory domains in Manduca sexta hemolymph proteinase-14, an initiation enzyme of the prophenoloxidase activation system. Dev Comp Immunol. 2010; 34:316-322.

Wang Y, Jiang H. Prophenoloxidase activation and antimicrobial peptide expression induced by the recombinant microbe binding protein of Manduca sexta. Insect Biochem Mol Biol. 2017; 83:35-43.

Wang Y, Lu Z, Jiang H. Manduca sexta proprophenoloxidase activating proteinase-3 (PAP3) stimulates melanization by activating proPAP3, proSPHs and proPOs. Insect Biochem Mol Biol. 2014; 50:82-91.

Werren JH, Richards S, Desjardins CA, Niehuis O, Gadau J, Colbourne JK, Beukeboom LW, Desplan C, Elsik CG, Grimmelikhuijzen CJP, Kitts P, Lynch JA, Murphy T, Oliveira D, Smith CD, van de Zande L, Worley KC, Zdobnov EM, Aerts M, Albert S, Anaya VH, Anzola JM, Barchuk AR, Behura SK, Bera AN, Berenbaum MR, Bertossa RC, Bitondi MMG, Bordenstein SR, Bork P, Bornberg-Bauer E, Brunain M, Cazzamali G, Chaboub L, Chacko J, Chavez D, Childers CP, Choi JH, Clark ME, Claudianos C, Clinton RA, Cree AG, Cristino AS, Dang PM, Darby AC, de Graaf DC, Devreese B, Dinh HH, Edwards R, Elango N, Elhaik E, Ermolaeva O, Evans JD, Foret S, Fowler GR, Gerlach D, Gibson JD, Gilbert DG, Graur D, Grunder S, Hagen DE, Han Y, Hauser F, Hultmark D, Hunter HC, Jhangian SN, Jiang HY, Johnson RM, Jones AK, Junier T, Kadowaki T, Kamping A, Kapustin Y, Kechavarzi B, Kim J, Kim J, Kiryutin B, Koevoets T, Kovar CL, Kriventseva EV, Kucharski R, Lee H, Lee SL, Lees K, Lewis LR, Loehlin DW, Logsdon JM, Lopez JA, Lozado RJ, Maglott D, Maleszka R, Mayampurath A, Mazur DJ, McClure MA, Moore AD, Morgan MB, Muller J, Munoz-Torres MC, Muzny DM, Nazareth LV, Neupert S, Nguyen NB, Nunes FMF, Oakeshott JG, Okwuonu GO, Pannebakker BA, Pejaver VR, Peng ZG, Pratt SC, Predel R, Pu LL, Ranson H, Raychoudhury R, Rechtsteiner A, Reese JT, Reid JG, Riddle M, Robertson IM, Romero-Severson J, Rosenberg M, Sackton TB, Sattelle DB, Schluns H, Schmitt T, Schneider M, Schuler A, Schurko AM, Shuker DM, Simoes ZLP, Sinha S, Smith Z, Solovyev V, Souvorov A, Springauf A, Stafflinger E, Stage DE, Stanke M, Tanaka Y, Telschow A, Trent C, Vattathil S, Verhulst EC, Viljakainen L, Wanner KW, Waterhouse RM, Whitfield JB, Wilkes TE, Williamson M, Willis JH, Wolschin F, Wyder S, Yamada T, Yi SV, Zecher CN, Zhang L, Gibbs RA, Nasonia Genome Working Group. Functional and evolutionary insights from the genomes of three parasitoid Nasonia species. Science. 2010; 327:343-348.

Yan W, Wu FY, Morser J, Wu QY. Corin, a transmembrane cardiac serine protease, acts as a pro-atrial natriuretic peptide-converting enzyme. Proc Natl Acad Sci U.S.A. 2000; 97:8525-8529.

Yan ZC, Fang Q, Liu Y, Xiao S, Yang L, Wang F, An CJ, Werren JH, Ye GY. A venom serpin splicing isoform of the endoparasitoid wasp Pteromalus puparum suppresses host prophenoloxidase cascade by forming complexes with host hemolymph proteinases. J Biol Chem. 2017; 292:1038-1051.

Yan ZC, Fang Q, Wang L, Liu JD, Zhu Y, Wang F, Li F, Werren JH, Ye GY. Insights into the venom composition and evolution of an endoparasitoid wasp by combining proteomic and transcriptomic analyses. Sci Rep. 2016; 6:19604.

Yu XQ, Kanost MR. Manduca sexta lipopolysaccharide-specific immulectin-2 protects larvae from bacterial infection. Dev Comp Immunol. 2003; 27:189-196.

Zhang GM, Lu ZQ, Jiang HB, Asgari S. Negative regulation of prophenoloxidase (proPO) activation by a clip-domain serine proteinase homolog (SPH) from endoparasitoid venom. Insect Biochem Mol Biol. 2004; 34:477-483.

Zhang QX, Liu HP, Chen RY, Shen KL, Wang KJ. Identification of a serine proteinase homolog (Sp-SPH) involved in immune defense in the mud crab Scylla paramamosain. PLoS One. 2013; 8:e63787.

Zhang Z, Ye GY, Cai J, Hu C. Comparative venom toxicity between Pteromalus puparum and Nasonia vitripennis (Hymenoptera: Pteromalidae) toward the hemocytes of their natural hosts, non-target insects and cultured insect cells. Toxicon. 2005; 46:337-349.

Zhao P, Wang GH, Dong ZM, Duan J, Xu PZ, Cheng TC, Xiang ZH, Xia QY. Genome-wide identification and expression analysis of serine proteases and homologs in the silkworm Bombyx mori. BMC Genomics. 2010; 11:405.

Zhu JY. Deciphering the main venom components of the ectoparasitic ant-like bethylid wasp Scleroderma guani. Toxicon. 2016; 113:32-40.

Zhu JY, Fang Q, Wang L, Hu C, Ye GY. Proteomic analysis of the venom from the endoparasitoid wasp Pteromalus puparum (Hymenoptera: Pteromalidae). Arch Insect Biochem Physiol. 2010; 75:28-44.

Zhu JY, Ye GY, Dong SZ, Fang Q, Hu C. Venom of Pteromalus puparum (Hymenoptera: Pteromalidae) induced endocrine changes in the hemolymph of its host Pieris rapae (Lepidoptera: Pieridae). Arch Insect Biochem Physiol. 2009; 71:45-53.

Zhu Y, Ye XH, Liu Y, Yan ZC, Stanley D, Ye GY, Fang Q. A venom gland extracellular chitin-binding-like protein from pupal endoparasitoid wasps Pteromalus puparum selectively binds chitin. Toxins. 2015; 7:5098-5113.

Zou Z, Lopez DL, Kanost MR, Evans JD, Jiang H. Comparative analysis of serine protease-related genes in the honey bee genome: possible involvement in embryonic development and innate immunity. Insect Mol Biol. 2006; 15:603-614.

Zou Z, Shin SW, Alvarez KS, Kokoza V, Raikhel AS. Distinct melanization pathways in the mosquito Aedes aegypti. Immunity. 2010; 32:41-53.

Figure Legends

Fig. 1. Scaffold location of Pteromalus puparum SP and SPH genes. Gene names and predicted functions are shown on the right of each bar. The distance between two adjacent genes (in kilobases, kb) is shown on the left; distances that are not drawn to scale with the bar length are marked in red. The scaffold number is shown at the top of each bar.

Fig. 2. Tissue-specific expression of SP and SPH genes. Total RNA was extracted from the gut (G), fat body (FB), ovary (O), venom (V) and carcass (C, the remaining body) of adult female P. puparum and used to analyze the expression patterns of these SPs and SPHs by qPCR. P. puparum 18S rRNA was used as the housekeeping gene. Data are presented as means ± standard deviations from three biological replicates. Significant differences were determined by one-way ANOVA and are indicated by different lowercase letters (a-c) (p < 0.05).

Fig. 3. Phylogenetic analysis of clip domain SP/SPH genes. The domain amino acid sequences from Pteromalus puparum (Pp), Apis mellifera (Am), Manduca sexta (Ms) and Drosophila melanogaster (Dm) were aligned. The phylogenetic tree was constructed by the neighbor-joining method using MEGA 5.10. Red spots at the nodes denote bootstrap values greater than 500 from 1000 trials.

Fig. 4. Expression levels of SP/SPH genes following different immune challenges. Expression levels of SP/SPH genes following infection with a Gram-negative bacterium (Escherichia coli), a Gram-positive bacterium (Micrococcus luteus) or an entomopathogenic fungus (Beauveria bassiana) were analyzed by qPCR. Time points along the x-axis represent hours post-infection. Data are presented as means ± standard deviations from three biological replicates. P. puparum actin 1 was used as the housekeeping gene. A two-way ANOVA was used to determine the combined effects of infection and time. Different lowercase letters (a-c) indicate significant differences among time points after infection with the same pathogen (p < 0.05), and different capital letters (A-C) indicate significant differences among pathogens at the same time point (p < 0.05).

Fig. 5. Domain organizations of SPs/SPHs with complex domains in Pteromalus puparum.

Fig. 6. Phylogenetic analysis of GD (a) and Nudel (b) genes. The phylogenetic tree was constructed by the neighbor-joining method using MEGA 5.10. Red spots at the nodes denote bootstrap values greater than 500 from 1000 trials. The GenBank accession numbers for the sequences used in the phylogenetic analysis in Fig. 6a: PpSP62 (PPU07067-RA, Pteromalus puparum); Nv_Nudel-like (XP_003424379.2, Nasonia vitripennis); Tp_Nudel-like (XP_014224565.1, Trichogramma pretiosum); Md_Nudel (XP_008548271.1, Microplitis demolitor); Am_Nudel (XP_006559739.1, Apis mellifera); Mr_Nudel (XP_012141152.1, Megachile rotundata); Tc_Nudel (XP_015840900.1, Tribolium castaneum); Ap_Nudel (XP_001949959.4, Acyrthosiphon pisum); Dm_Nudel (NP_523947.2, Drosophila melanogaster); Cq_Nudel (XP_001843380.1, Culex quinquefasciatus) and Cb_Nudel (XP_019889456.1, Cerapachys biroi). Amino acid sequences used for the phylogenetic tree in Fig. 6b: PpSP67 (PPU07692-RA, Pteromalus puparum); PpSP121 (PPU13886-RA, Pteromalus puparum); PpSP129 (PPU16927-RA, Pteromalus puparum); NvGD (XP_003427708.1, Nasonia vitripennis); Am_GD (XP_006563318.1, Apis mellifera); MrGD (XP_012143735.1, Megachile rotundata); Dm_GD (NP_001259478.1, Drosophila melanogaster); Bm_GD (XP_012548092.1, Bombyx mori) and Tc_GD (KYB27136.1, Tribolium castaneum).

Fig. 7. Confirmation of developmental stage expression of SP and SPH genes. Total RNA was extracted from embryos (E), larvae (L), female pupae (FP), male pupae (MP), female adults (FA) and male adults (MA) and used to analyze the expression patterns of these SPs and SPHs by qPCR. P. puparum 18S rRNA was used as the housekeeping gene. Data are presented as means ± standard deviations from three biological replicates. Significant differences were determined by one-way ANOVA and are indicated by different lowercase letters (a-d) (p < 0.05).

Fig. 8. Expression profiles of Pteromalus puparum SP and SPH genes across different developmental stages and tissues. Log2 FPKM values for the SPs/SPHs are represented by bar colors: darker red represents higher expression values and darker green represents lower expression values.
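The log2 FPKM scale described in the Fig. 8 legend can be illustrated with a minimal Python sketch. The pseudocount of 1, the min-max rescaling, the function names and the example FPKM values are all assumptions made for illustration; the legend does not state how zero values were handled or how colors were assigned.

```python
import math

def log2_fpkm(fpkm, pseudocount=1.0):
    """log2-transform an FPKM value; the pseudocount is an assumption that avoids log2(0)."""
    if fpkm < 0:
        raise ValueError("FPKM must be non-negative")
    return math.log2(fpkm + pseudocount)

def scale_for_heatmap(values):
    """Rescale values to [0, 1]: 0 is the lowest (green end), 1 the highest (red end)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: place everything at the midpoint of the scale.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical FPKM values for one gene across four samples (not from the paper).
fpkm_by_sample = {"embryo": 0.0, "larva": 3.0, "pupa": 15.0, "adult": 255.0}
log_values = {name: log2_fpkm(v) for name, v in fpkm_by_sample.items()}
scaled = scale_for_heatmap(list(log_values.values()))
```

With these hypothetical inputs the log2 values span 0 to 8, so the rescaled values spread the samples evenly enough to be distinguished on a two-color gradient, which is the usual reason for log-transforming FPKM before plotting.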

Supplementary Material

Fig. S1. Scaffold location of Pteromalus puparum SP and SPH genes. Gene names and predicted functions are shown on the right of each bar. The distance between two adjacent genes (in kilobases, kb) is shown on the left; distances that are not drawn to scale with the bar length are marked in red. The scaffold number is shown at the top of each bar.

Fig. S2. Alignment of 25 Pteromalus puparum clip domain sequences by Clustal X2. PpcSPH9 has five clip domains, represented by PpcSPH9-1, -2, -3, -4 and -5. The six conserved Cys residues are marked in black.

Fig. S3. Alignment of serine proteinase domains from clip SPs/SPHs of Pteromalus puparum (Pp), Apis mellifera (Am), Manduca sexta (Ms) and Drosophila melanogaster (Dm) by Clustal X2.

Fig. S4. Domain organizations of Nudel and Nudel-like genes in 11 insect species. PpSP62 (PPU07067-RA, Pteromalus puparum); Nv_Nudel-like (XP_003424379.2, Nasonia vitripennis); Tp_Nudel-like (XP_014224565.1, Trichogramma pretiosum); Cb_Nudel (XP_019889456.1, Cerapachys biroi); Am_Nudel (XP_006559739.1, Apis mellifera); Dm_Nudel (NP_523947.2, Drosophila melanogaster); Cq_Nudel (XP_001843380.1, Culex quinquefasciatus); Tc_Nudel (XP_015840900.1, Tribolium castaneum); Mr_Nudel (XP_012141152.1, Megachile rotundata); Md_Nudel (XP_008548271.1, Microplitis demolitor); Ap_Nudel (XP_001949959.4, Acyrthosiphon pisum).

Fig. S5. Alignment of amino acid sequences from Nudel or Nudel-like genes of Pteromalus puparum and 10 other insects by Clustal X2.

Fig. S6. Alignment of serine protease domains of GD homolog genes from Pteromalus puparum and 6 other insects by Clustal X2.

Table S1. Primers used for qPCR analysis of gene expression.

Table S2. FPKM values of the Pteromalus puparum serine proteases and their homologs at different developmental stages and in different tissues, obtained from the RNA-seq data.

Table S3. Differentially expressed Pteromalus puparum serine proteases and their homologs in the venom gland.

Table S4. Prediction of serine proteases and their homologs in Pteromalus puparum.

Table S5. The predicted amino acid sequences of 183 Pteromalus puparum serine proteases and their homologs.

Table S6. NCBI BLAST descriptions of serine proteases and their homologs in Pteromalus puparum.

Table S7. Gene counts for clip-domain and non-clip-domain serine proteases and their homologs in ten insect species.
