In silico identification of the neuropeptidome of the pond wolf spider Pardosa pseudoannulata

General and Comparative Endocrinology 285 (2020) 113271

General and Comparative Endocrinology journal homepage: www.elsevier.com/locate/ygcen

Na Yu, Chenyang Han, Zewen Liu

Key Laboratory of Integrated Management of Crop Diseases and Pests (Ministry of Education), College of Plant Protection, Nanjing Agricultural University, Weigang 1, Nanjing 210095, China

Keywords: C-type allatostatin; Chelicerate; Expression profile; Neuropeptide; Pardosa pseudoannulata

Abstract

Neuropeptides have been documented in numerous arthropod species via in silico prediction from transcriptomic and genomic data. We recently sequenced the genome and nine transcriptomes of a chelicerate species, the pond wolf spider Pardosa pseudoannulata. Here, 43 neuropeptide families encoded by 87 neuropeptide genes were identified, of which 84 genes had complete open reading frames. The neuropeptide genes often had paralogs, and paralogous genes showed different expression profiles across the nine transcriptomes. Six crustacean hyperglycemic hormone/ion transport peptide-like (CHH/ITP) genes were predicted, and CHH/ITP1 was expressed at a much higher level than the others. The orcokinin 1 and orcokinin 2 genes were expressed in the brain at similar levels. Interestingly, however, the orcokinin 1 gene was ubiquitously expressed in appendages, whereas the orcokinin 2 gene was highly enriched in the venom gland. The expression profiling of neuropeptide genes offers clues for further functional investigation. Paralogous genes were also found to be clustered on scaffolds, such as nine insulin-like peptide genes on three scaffolds and six pyrokinin genes on two scaffolds, suggesting local gene duplication. In contrast, the four C-type allatostatin family members were scattered across five scaffolds, unlike the closely associated locations reported in many arthropod species, including several spiders. The comprehensive inventory of P. pseudoannulata neuropeptides presented here expands the repository of chelicerate neuropeptides and furthers our understanding of neuropeptide evolution and function.

1. Introduction

Neuropeptides are a diverse class of neuronal regulators that are evolutionarily conserved in metazoans (Elphick et al., 2018; Mirabeau and Joly, 2013). They are involved in many physiological processes and behaviours, including water balance (Cannell et al., 2016; Coast et al., 2001; Sajadi et al., 2018), development (Hahn and Denlinger, 2011; Ikeya et al., 2002), nociception (Bachtel et al., 2018) and aggression (Bubak et al., 2019; Hoopfer, 2016). Neuropeptide studies in arthropods have been greatly advanced by large-scale transcriptome and genome sequencing, and in silico prediction of neuropeptides provides informative evidence for functional and evolutionary studies, especially in species for which few neuronal and molecular manipulation tools are available. DINeR (http://www.neurostresspep.eu/diner), a database for insect neuropeptide research within the nEUROSTRESSPEP programme (http://www.neurostresspep.eu/home), archives the growing number of neuropeptides that have been intensively and functionally studied in insects.

Given its important evolutionary position and unique biology, Chelicerata is attracting researchers' interest, although progress lags far behind that in Insecta. Neuropeptides have, in a few cases, been identified via mass spectrometry, and have more extensively been predicted from sequencing data in spiders, scorpions, mites and ticks (Christie, 2008, 2015; Christie and Chi, 2015; Christie et al., 2011; Veenstra, 2016a; Veenstra et al., 2012). Neuropeptides in chelicerate species show both conservation and specificity relative to their insect and crustacean homologs, in accordance with the evolution and diversity of peptidergic signalling in metazoans (Jekely, 2013). In particular, chelicerate neuropeptides and their receptors often have paralogs, with one plausible explanation being the effect of one or more ancient whole genome duplications (Di et al., 2015; Gendreau et al., 2017; Kenny et al., 2016; Veenstra, 2016a). However, most neuropeptides have been identified in silico from expressed sequence tags (ESTs) and transcriptomic data, so truncated precursors are not uncommon, which has impeded an in-depth understanding of the chelicerate neuropeptide repertoire. Transcriptome data validate the accuracy of gene transcripts, while genome data offer a global view of gene structure and locus. The combination of both would

Corresponding author. E-mail address: [email protected] (Z. Liu).

https://doi.org/10.1016/j.ygcen.2019.113271 Received 24 June 2019; Received in revised form 27 July 2019; Accepted 12 September 2019 Available online 13 September 2019 0016-6480/ © 2019 Elsevier Inc. All rights reserved.

Table 1. Neuropeptide precursor genes in P. pseudoannulata.

Column key: Code, neuropeptide code in P. pseudoannulata; Scaffold No., scaffold number in the genome; Exons, number of exons in the ORF of the precursor gene; ORF&, ORF completeness of the precursor gene; Length, precursor protein length (aa); SPψ, signal peptide prediction for the precursor protein; Paracopy No., number of paracopies in the precursor protein.

#   Neuropeptide
    Code                Scaffold No.      Exons  ORF&   Length  SPψ   Paracopy No.
1   Achatin
    Parps_Achatin       4888              1      c      192     1     11
2   Agatoxin-like peptide
    Parps_AGT1          913469            3      c      101     1     1
    Parps_AGT2          913469            3      c      95      1     1
    Parps_AGT3          4559              3      c      109     1     1
    Parps_AGT4          43                3      c      92      1     1
3   Allatostatin A
    Parps_AstA          2679              1      c      432     1     20
4   Allatostatin B
    Parps_AstB          3746              1      c      225     1     9
5   Allatostatin C
    Parps_AstC          2534 + 1901       2      c      87      1     1
6   Allatostatin CC
    Parps_AstCC1        1637              2      c      87      1     1
    Parps_AstCC2        111528            2      c      118     1     1
7   Allatostatin CCC
    Parps_AstCCC        1449              2      c      96      1     1
8   Allatotropin
    Parps_AT            213943            3      c      103     1     1
9   Bursicon
    Parps_Bursα         1637              2      c      174     1     1
    Parps_Bursβ         1637              2      c      150     1     1
10  Calcitonin-like
    Parps_Cal1          1265              3      c      115     1     1
    Parps_Cal2          3762 + 218706     3      c      106     1     1
    Parps_Cal3          2844 + 218706     1      c      88      0     1
11  Crustacean cardioactive peptide
    Parps_CCAP          3961              3      c      93      1     1
12  CCHamide
    Parps_CCHa1         751               3      c      96      1     1
    Parps_CCHa2         676362            3      c      107     1     1
13  CCRFamide
    Parps_CCRFa1        2423              1      c      99      1     1
    Parps_CCRFa2        276               1      c      93      1     1
14  Crustacean hyperglycemic hormone/Ion transport peptide-like
    Parps_CHH/ITP1      544486            2      c      104     1     1
    Parps_CHH/ITP2      499224            2      c      104     1     1
    Parps_CHH/ITP3      4443              2      c      99      1     1
    Parps_CHH/ITP4      409057 + 135151   2      c      107     1     1
    Parps_CHH/ITP5      409057            2      c      104     1     1
    Parps_CHH/ITP6      232245            2      c      108     1     1
15  Corazonin
    Parps_Crz           4051              2      c      90      1     1
16  Diuretic hormone 31
    Parps_DH31-1        1001              3      c      111     1     1
    Parps_DH31-2        732501 + 259622   3      c      109     1     1
17  Diuretic hormone 44
    Parps_DH44-1        384187            2      c      86      1     1
    Parps_DH44-2        2265              2      c      177     1     1
18  EFLamide
    Parps_EFLa1-1       2888              2      c      288     1     13
    Parps_EFLa1-2       2888              2      c      157     1     7
    Parps_EFLa2         141               2      c      163     1     1
19  Eclosion hormone
    Parps_EH1           608707            2      c      80      1     1
    Parps_EH2           4693              1      c      90      1     1
    Parps_EH3           4290              2      c      77      1     1
20  Elevenin
    Parps_Ele1          126072            2      c      108     1     1
    Parps_Ele2          1782              4      c      105     1     1
21  Ecdysis-triggering hormone
    Parps_ETH           74316             3      c      126     1     1
22  FMRFamide
    Parps_FMRFa         450770            2      c      178     1     2
23  Gonadotropin releasing hormone-related peptide
    Parps_GnRH          4051              2      c      87      1     1
24  GPA2/GPB5
    Parps_GPA2          260967            2      c      115     1     1
    Parps_GPB5-1        1233              2      c      164     1     1
    Parps_GPB5-2        924609            2      c      170     1     1
25  Insulin-like peptide
    Parps_ILP1          1148              3      c      132     1     1
    Parps_ILP2          4675              2      nc-n   113     1-n   1
    Parps_ILP3          4675              2      nc-n   110     1-n   1
    Parps_ILP4          4675              2      nc-n   103     1-n   1
    Parps_ILP5          4675              3      c      119     1     1
    Parps_ILP6          4226              3      c      113     1     1
    Parps_ILP7          4226              3      c      117     1     1
    Parps_ILP8          140531            3      c      199     1     1
    Parps_ILP9          140531            3      c      219     1     1
    Parps_ILP10         140531            3      c      215     1     1
    Parps_ILP11         3883              3      c      183     1     1
26  Leucokinin
    Parps_K             139590            1      c      276     1     11
27  KYMGLamide
    Parps_KYMGLa        1823              1      c      870     0     22
28  Myosuppressin
    Parps_MS1           843               2      c      423     1     15
    Parps_MS2-1         2388              2      c      1530    1     53
    Parps_MS2-2         2388              3      c      1083    1     22
29  Neuropeptide F
    Parps_NPF1          451 + 2720        1      c      88      1     1
    Parps_NPF2          451               2      c      93      1     1
30  Neuropeptide-like precursor
    Parps_NPLP          214               2      c      271     1     3
31  Natalisin
    Parps_NTL1-1        493093            3      c      386     1     9
    Parps_NTL1-2        493093            2      c      678     1     19
32  Orcokinin
    Parps_OK1           2727              2      c      434     0     16
    Parps_OK2           202958            2      c      208     1     6
33  PGWX3GLamide
    Parps_PGWX3GLa1     2418              2      c      205     1     2
    Parps_PGWX3GLa2     1823              1      c      870     0     1
34  Pyrokinin
    Parps_PK1           395994            2      c      127     1     2
    Parps_PK2           395994            2      c      200     1     4
    Parps_PK3           395994            2      c      299     1     8
    Parps_PK4           395994            2      c      473     1     9
    Parps_PK5           913469            2      c      137     1     2
    Parps_PK6           913469            4      c      406     1     2
35  Periviscerokinin
    Parps_PVRK          4950              1      c      1048    1     30
36  Proctolin
    Parps_Proc1         423848            2      c      85      1     1
    Parps_Proc2         4610              2      c      89      1     1
    Parps_Proc3         1332              2      c      90      1     1
37  Relaxin-like
    Parps_Relaxin       749               3      c      140     1     2
38  RYamide
    Parps_RYa           1346              2      c      94      1     1
39  SIFamide
    Parps_SIFa          74316             2      c      74      1     1
40  Sulfakinin
    Parps_SK            263759            1      c      101     1     2
41  Short neuropeptide F
    Parps_sNPF1         3569 + 656479     2      c      100     1     1
    Parps_sNPF2         74316             2      c      90      1     1
42  Tachykinin
    Parps_TK            5101              2      c      134     1     2
43  Trissin
    Parps_Tris          74316             2      c      135     1     1

& c, complete ORF; nc, incomplete ORF with both the 5′ start codon and the 3′ stop codon missing; nc-n, incomplete ORF with the 5′ start codon missing.
ψ 1, signal peptide (SP) predicted; 0, no SP predicted despite the complete protein; 1-n, SP predicted despite the incomplete N-terminus; 0-n, no SP predicted, probably due to the incomplete N-terminus.

help to achieve a more accurate and complete neuropeptide precursor prediction. We sequenced the genome of an araneomorph spider, the pond wolf spider Pardosa pseudoannulata, as well as the transcriptomes of nine tissues. Taking advantage of the high-quality sequencing data, we attempted to comprehensively catalogue the neuropeptides of this spider species.

2. Material and methods

In silico identification of neuropeptides in P. pseudoannulata (Parps) was conducted via local BLAST and manual correction. Neuropeptide precursor protein sequences from the spiders Stegodyphus mimosarum (Stemi), Latrodectus hesperus (Lathe) and Parasteatoda tepidariorum (Parte), the scorpion Mesobuthus martensii (Mesma), the spider mite Tetranychus urticae (Tetur), and the crabs Scylla paramamosain and Carcinus maenas were collected from publications (Bao et al., 2015; Veenstra, 2016a,b). Databases were constructed with the genome data (SBLA00000000) and the transcriptomes of nine tissues (four pairs of legs, SRR8083389–SRR8083392; chelicerae, SRR8083393; pedipalp, SRR8083394; venom gland, SRR8083395; brain, SRR8083396; and fat body, SRR8083398). The assembled P. pseudoannulata genome is 4.26 Gb with a scaffold N50 of 699.15 kb. The gene set contains 23,310 genes, of which 98.8% are supported by homology evidence or transcriptomic data and 92.0% are annotated in public databases. Information on the transcriptomes is listed in Supplementary Table 1.

TBLASTN was first conducted with the query proteins against the transcriptomes. The resulting transcripts were used as queries in the next round of BLASTN when their corresponding protein products contained typical neuropeptide motifs, such as conserved amino acid core structures and convertase cleavage sites. BLASTN was then conducted with the neuropeptide precursor nucleic acid sequences as queries against the genome database. The resulting scaffold sequences were retrieved, and transcripts were aligned to their corresponding scaffolds for exon and intron recognition with online Splign (https://www.ncbi.nlm.nih.gov/sutils/splign/splign.cgi?textpage=online&level=form, last accessed 27 July 2019; Kapustin et al., 2008). For a transcript with an incomplete open reading frame (ORF), Augustus (http://bioinf.uni-greifswald.de/augustus/submission.php, last accessed 27 July 2019) was used to predict a gene, with the truncated transcript supplied under "expert options". The neuropeptide precursor protein sequences were then obtained with ExPASy (https://web.expasy.org/translate/, last accessed 27 July 2019). Signal peptides were predicted with the SignalP-5.0 Server (http://www.cbs.dtu.dk/services/SignalP/, last accessed 27 July 2019; Armenteros et al., 2019). Convertase cleavage sites, typically K/R/KK/RR/KR/RK, were manually examined following Veenstra (2000). The mature neuropeptides were manually identified when conserved motifs or amino acid residues from known arthropod neuropeptides were recognized. For some neuropeptides with a conserved C-terminal motif, the N-terminal extensions were decided according to their homologs in other chelicerates as predicted in Christie (2015) and Veenstra (2016a). The nomenclature of neuropeptides followed Coast and Schooley (2011).

Sequence alignment was conducted in MEGA (version 7.0.18, Kumar et al., 2016) and illustrated in GeneDoc (Nicholas, 1995). Sequence logos were generated with WebLogo (http://weblogo.berkeley.edu/logo.cgi, last accessed 27 July 2019; Crooks et al., 2004). Gene structures and locations on scaffolds were drawn with IBS (http://ibs.biocuckoo.org/, last accessed 27 July 2019; Liu et al., 2015). The FPKM (fragments per kilobase of exon model per million mapped fragments) values of the predicted P. pseudoannulata neuropeptide precursor transcripts were then retrieved from the normalized transcriptomes of the nine tissues and illustrated in GraphPad Prism (version 7.00 for Windows, GraphPad Software, La Jolla, California, USA, www.graphpad.com; Swift, 1997). Figures were processed in Adobe Photoshop CS5.
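As an illustration of the cleavage-site inspection described in the methods, the following minimal Python sketch locates dibasic convertase sites (KK, KR, RR, RK) in a precursor and splits it into candidate peptides. It is not the procedure used in this study, where sites were examined manually following Veenstra (2000). The toy sequence and the function names are hypothetical, and monobasic K/R sites are deliberately left out because they depend on sequence context.

```python
import re

# Dibasic convertase motifs named in the methods (KK, KR, RR, RK).
# Monobasic K/R sites are context dependent and are NOT handled here.
DIBASIC = re.compile(r"(?=([KR]{2}))")


def find_dibasic_sites(precursor):
    """Return (0-based position, motif) for every dibasic pair, overlaps included."""
    return [(m.start(), m.group(1)) for m in DIBASIC.finditer(precursor)]


def split_at_dibasic(precursor):
    """Split a precursor at runs of two or more basic residues; drop empty pieces."""
    return [piece for piece in re.split(r"[KR]{2,}", precursor) if piece]


# Hypothetical toy precursor, not a sequence from this study.
toy = "AAFDEIDKRSSNFDEIDRRGGLLKKTT"
sites = find_dibasic_sites(toy)
peptides = split_at_dibasic(toy)
print(sites)
print(peptides)
```

A scan of this kind would only shortlist candidate sites; deciding which of them are actually processed still requires the manual comparison with known homologs described above.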

3. Results

3.1. Overview

In total, 87 neuropeptide precursor genes were identified, encoding 43 neuropeptide families that constitute the majority of known arthropod neuropeptides (Table 1, Supplementary File 1). Transcripts of all genes were found via TBLASTN in the brain transcriptome except for the ecdysis-triggering hormone (ETH) gene, whose transcript was found in the transcriptomes of legs, pedipalp and chelicerae. A general feature of the P. pseudoannulata neuropeptidome was that 20 of the 43 families had paralogs. Neuropeptide precursors without a paralog were achatin, allatostatin A (AstA), allatostatin B, allatostatin C, allatotropin, bursicon, crustacean cardioactive peptide, corazonin (Crz), ETH, FMRFamide, gonadotropin releasing hormone (GnRH)-like peptide, GPA2, leucokinin, KYMGLamide, neuropeptide-like precursor, periviscerokinin (PVRK), relaxin, RYamide, SIFamide, sulfakinin, tachykinin and trissin. Another notable observation was that precursors often contained a large number of paracopies, such as 20 in AstA, 22 in KYMGLa, 30 in PVRK and up to 53 in myosuppressin, as also reported in other spiders (Veenstra, 2016a). The KYMGLa precursor was predicted to have 22 paracopies of KYMGLa and one copy of PGWX3GLa (Supplementary File 1).

Four genes encoding agatoxin-like peptide (ALP) were discovered with the consensus motif XC1X6C2X6C3C4X4C5XC6X6C7XC8X9G (X = any amino acid residue), and the sequences aligned well with the ALPs from other arthropods (Fig. 1) (Sturm et al., 2016). The GnRH-related peptides predicted in this spider species were Crz and GnRH-like peptide.

Fig. 1. Amino acid sequence alignment and consensus sequence of the agatoxin-like peptide (ALP) in Arthropoda. The eight conserved Cys residues are marked with asterisks in the sequence alignment (A). The consensus sequence (B) was generated with the sequences in (A). The arthropod ALP sequences were retrieved from Sturm et al. (2016).

Six pyrokinin (PK) genes were predicted to encode 2–9 paracopies of PK peptides with the consensus C-terminal sequence (F/L/P)X1PRX2amide (X1 = I, T, Q, V, Y; X2 = I, L, V) (Fig. 2A). One PVRK gene encoded a large precursor for 30 paracopies of PVRK peptides with a characteristic C-terminal PYPRXamide (X = P, L, A, T, V) (Fig. 2B), as is also the case in S. mimosarum and L. hesperus (Veenstra, 2016a). However, unlike PK precursor 2 in S. mimosarum, which encodes PVRKs, none of the six PK precursors encoded a peptide containing PYPRXa. Two orcokinin (OK) precursors were identified, OK1 and OK2. The OK1 precursor encoded 16 paracopies of OK-A with the N-terminal consensus sequence NFDEID. The OK2 precursor encoded 3 paracopies of OK-A, 2 paracopies of OK-B and 1 copy of orcomyotropin (MLDSVSGWAFGES). OK-B possessed the characteristic N-terminal LD residues followed by GGG or GG residues two or three residues later (Jiang et al., 2015).
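The C-terminal consensus sequences given in Section 3.1 for pyrokinin, (F/L/P)X1PRX2amide, and periviscerokinin, PYPRXamide, can be written as regular expressions. The sketch below is only an illustration under stated assumptions: the peptides are hypothetical, the function name is invented, and a trailing lowercase "a" is assumed to denote C-terminal amidation. Because PYPRLa and PYPRVa satisfy both patterns as written, the PVRK pattern is tested first.

```python
import re

# C-terminal consensus patterns transcribed from Section 3.1; a trailing
# lowercase "a" is assumed to mark C-terminal amidation.
PVRK_CTERM = re.compile(r"PYPR[PLATV]a$")
PK_CTERM = re.compile(r"[FLP][ITQVY]PR[ILV]a$")


def classify_cterm(peptide):
    """Label a mature peptide as 'PVRK', 'PK' or None from its C-terminus."""
    # PVRK is tested first because PYPRLa and PYPRVa also fit the PK pattern.
    if PVRK_CTERM.search(peptide):
        return "PVRK"
    if PK_CTERM.search(peptide):
        return "PK"
    return None


# Hypothetical peptides, for illustration only.
examples = ["SSFTPRLa", "AAPYPRPa", "AAPYPRVa", "GGFTPRG"]
labels = [classify_cterm(p) for p in examples]
print(labels)
```

The overlap between the two patterns is one reason why motif matches of this kind are best treated as a first filter rather than a final assignment.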

Fig. 2. Consensus amino acid sequences of pyrokinin and periviscerokinin. The consensus sequences were generated with amino acid residues predicted from the six pyrokinin precursors (A) and the single periviscerokinin precursor (B) in Supplementary File 1.

The production of two transcripts of different sizes by alternative splicing was predicted for the EFLamide (EFLa), myosuppressin (MS) and natalisin (NTL) genes. Two EFLa genes were predicted. One (EFLa1) encoded two alternative transcripts, one specific for -EFLa and the other specific for -EFLGGPa (Supplementary File 1), while the second (EFLa2) encoded a single copy of -EFLGGPa, as also reported in other chelicerates (Veenstra, 2016a). Two MS genes were predicted, MS1 and MS2. Alternative splicing of MS2 was supported by transcriptomic data, with transcript MS2-2 shorter than MS2-1 by 1341 bp, consistent with the 447-residue difference between the two precursors in Table 1. The two NTL precursors produced via alternative splicing were named NTL1-1 and NTL1-2. A nucleic acid fragment of 876 bp was present in transcript NTL1-2 but absent from NTL1-1 (Table 1). Both NTL and its homolog tachykinin (TK) ended with a C-terminal Arg residue; peptides containing a Pro residue at position 3 from the C-terminus were designated NTL, while the others were named TK (Supplementary File 1).

3.2. Clustering of neuropeptide precursor genes

Full-length coding sequences were found for all neuropeptide genes except three insulin-like peptide (ILP) genes (ilp2, ilp3, ilp4). The 87 genes were located on 66 scaffolds, and some paralogous genes clustered on the same scaffold (Table 1). The 11 predicted ILP precursor genes were located on five scaffolds. Scaffold_4675 (1894 kb) contained ILP2-ILP5, with ILP3 predicted to possess a long intron of up to 12 kb. ILP6 and ILP7 were present on scaffold_4226 (480 kb), and ILP8-ILP10 were located on scaffold_140531 (223 kb) (Fig. 3A, Table 1). The ILP1 and ILP11 genes were located on scaffold_1148 and scaffold_3883, respectively (Table 1). Six PK genes clustered on two scaffolds, with PK1-PK4 on scaffold_395994 (764 kb) and PK5-PK6 on scaffold_913469 (4155 kb) (Fig. 3B, Table 1). The PK1-PK5 genes each possessed two coding exons in their ORFs, while the PK6 gene had four.

3.3. Allatostatin C/CC/CCC

Four genes encoded four allatostatin C (AstC) family members: one AstC, two AstCC and one AstCCC. The AstC family members exhibited the sequence characteristics found in many arthropod species (Veenstra, 2016c). The four neuropeptides possessed the two conserved Cys residues that form a disulphide bridge and shared similar amino acid residues within the bridge, while they differed from each other in the N-terminal extension (Fig. 4). The convertase cleavage site was KR in the AstC precursor but RR in the AstCC precursors. A Pro residue was present within the disulphide bridge in AstC, while an Ala residue occupied the same position in AstCC. An N-terminal pyroglutamate was also predicted in the mature AstCC. Lastly, AstC and AstCC were non-amidated at the C-terminus, whereas AstCCC possessed an amidated C-terminus.

Fig. 3. Clustering of homologous genes. Gene structure and location of insulin-like peptide (ILP2-ILP10) genes (A) and pyrokinin (PK1-PK6) genes (B). Only the regions (grey line) containing target genes are illustrated on the scaffolds. Orange arrows represent genes and indicate the direction of each gene relative to the scaffold. Green blocks represent exons in ORFs, and black lines between exons are introns. Distances between genes are scaled differently from intron lengths.
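The clustering of paralogs can be checked directly from the scaffold numbers in Table 1. The following minimal Python sketch groups the ILP and PK genes by scaffold and keeps the scaffolds that carry at least two of them. The dictionary is copied from Table 1, whereas the function name and the `min_genes` parameter are illustrative assumptions rather than part of the original analysis.

```python
from collections import defaultdict

# Scaffold numbers for the ILP and PK genes as listed in Table 1.
GENE_TO_SCAFFOLD = {
    "ILP1": "1148", "ILP2": "4675", "ILP3": "4675", "ILP4": "4675",
    "ILP5": "4675", "ILP6": "4226", "ILP7": "4226", "ILP8": "140531",
    "ILP9": "140531", "ILP10": "140531", "ILP11": "3883",
    "PK1": "395994", "PK2": "395994", "PK3": "395994", "PK4": "395994",
    "PK5": "913469", "PK6": "913469",
}


def clusters_by_scaffold(gene_to_scaffold, min_genes=2):
    """Group genes by scaffold and keep scaffolds carrying at least min_genes."""
    grouped = defaultdict(list)
    for gene, scaffold in gene_to_scaffold.items():
        grouped[scaffold].append(gene)
    return {s: genes for s, genes in grouped.items() if len(genes) >= min_genes}


clusters = clusters_by_scaffold(GENE_TO_SCAFFOLD)
for scaffold, genes in clusters.items():
    print(f"scaffold_{scaffold}: {', '.join(genes)}")
```

Run on these entries, the grouping recovers the three ILP clusters and two PK clusters described in Section 3.2. Sharing a scaffold is only a proxy for physical linkage, since separate scaffolds may still derive from one chromosome.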

Fig. 4. Amino acid sequence alignment of AstC family members in Arthropoda. The internal disulphide bridge formed by two Cys residues is indicated with a solid line. Sequence similarity is shaded. "a" at the end of AstCCC represents C-terminal amidation. Parps, Pardosa pseudoannulata; Lathe, Latrodectus hesperus; Parte, Parasteatoda tepidariorum; Stemi, Stegodyphus mimosarum; Mesma, Mesobuthus martensii; Tetur, Tetranychus urticae; Ambva, Amblyomma variegatum.

3.4. Tissue expression profiling of neuropeptide genes

Taking advantage of the nine transcriptomes, expression profiles could be generated for the majority of the neuropeptide precursor transcripts (75/90) (Fig. 5, Supplementary File 2). The brain was the only tissue in which most of the neuropeptide genes were abundantly expressed. Four neuropeptide genes, ALP4, crustacean hyperglycemic hormone/ion transport peptide-like 1 (CHH/ITP1), Proctolin 1 and short neuropeptide F 1 (sNPF1), were highly expressed in the brain with FPKM values above 2000. Although they were expressed at similar levels in the brain, the two OK genes showed tissue-specific expression elsewhere: the OK1 transcript was present in appendages, while the OK2 transcript was enriched in the venom gland with an FPKM value of up to 6729 (Fig. 5, Supplementary File 2). The neuropeptide genes that stood out in the fat body were AstCCC and calcitonin-like 1 (Cal1). Different expression levels were observed among paralogous genes. ALP4 was expressed at a much higher level than the other three ALP genes, and CHH/ITP1 was the most highly expressed of the six CHH/ITP genes.

4. Discussion

Neuropeptides have been intensively studied in a number of chelicerate species by Christie, Veenstra and other researchers (Christie, 2008, 2015, 2016; Christie et al., 2011; Veenstra, 2016a; Veenstra et al., 2012). Their extensive collection of neuropeptide information provided a valuable starting point for the present study, which in turn expands our knowledge of chelicerate neuropeptides. Although the search was conducted as thoroughly as possible, several neuropeptides were not found in the present study, including adipokinetic hormone (AKH), AKH/corazonin-related peptide (ACP), CNMamide, pigment dispersing factor (PDF) and sex peptide. Crz and GnRH-like peptide of the AKH/Corazonin/ACP/GnRH superfamily were identified while AKH and ACP were missing, consistent with the situation reported in other chelicerate species including S. mimosarum, L. hesperus, M. martensii, T. urticae and Ixodes scapularis (Christie, 2008; Veenstra, 2016a). The chelicerate GnRH-like peptides differ considerably in amino acid composition from mammalian GnRH and from arthropod AKH and ACP. Nevertheless, Veenstra (2016a) suggests that chelicerate GnRH-like peptides are ACP orthologs, as they share some structural similarity. Therefore, the term GnRH-like peptide (GnRH) was used in the present study. CNMa was also absent in these chelicerate species. PDF was found in S. mimosarum and M. martensii but not in T. urticae or P. pseudoannulata. However, receptors for PDF seemed to exist in P. pseudoannulata when we searched for neuropeptide receptors (data not shown). A recent study reported that Coleoptera PDFs show significant changes in peptide sequence, which explains why PDF had not previously been identified in Tribolium castaneum (Veenstra, 2019). However, we could not find any hit via BLAST with all currently available PDF sequences. Given the presence of PDF receptor orthologs in P. pseudoannulata, it is possible that P. pseudoannulata PDF differs in sequence from S. mimosarum PDF.

The presence of neuropeptide paralogs is ubiquitous in spider genomes (Veenstra, 2016a). We found that 46% of the identified P. pseudoannulata neuropeptide families had at least two paralogs. Clustering of homologous genes was also found in S. mimosarum, with 12 ILP genes in four different contigs (Veenstra, 2016a), and is explained by repeated local gene duplication. In the case of AstC/CC/CCC, there was likely a local gene triplication in an early arthropod, as AstC family members are closely associated on the same chromosome (Veenstra, 2016c). In contrast, the four genes encoding AstC, AstCC and AstCCC in P. pseudoannulata were scattered across four (or five, to be precise) different scaffolds ranging in size from 340 kb to 8106 kb (Supplementary Table 2). We cannot exclude the possibility that some or all of the five scaffolds originated from one chromosome, although each of the four genes was at least 64 kb away from the nearest end of its scaffold.

AstC signalling has recently been reported to modulate nociception and immunity as well as the circadian activity pattern in Drosophila melanogaster (Bachtel et al., 2018; Diaz et al., 2019). No function has yet been revealed in chelicerates. AstCCC differed from AstC and AstCC in its C-terminal amidation (Fig. 4). In the American lobster, Homarus americanus, amidated and non-amidated AstC family members are differentially distributed in the stomatogastric nervous system, suggesting distinct modulatory effects (Christie et al., 2018). The relatively abundant expression of the AstCCC precursor gene in the fat body in addition to the brain suggests that it will be interesting to investigate the hormonal function of AstCCC in fat body physiology (Fig. 5, Supplementary File 2). Moreover, the receptors for AstC family members have not yet been satisfactorily characterized. Although AstC and AstCC can activate the same receptor in T. castaneum (Audsley et al., 2013), it is plausible that the spider has more than one receptor, as is the case in decapods (Christie et al., 2015). It will thus be particularly interesting to know whether the spider has more than one homolog of the AstC receptor and, if so, whether these receptors are differentially expressed and whether they differentiate between the three spider AstC-related peptides.

Fig. 5. Expression levels of neuropeptide precursor genes in nine tissues. Expression levels were presented with FPKM values extracted from normalized transcriptomic data of nine tissues including four pairs of legs, pedipalp, chelicerae, brain, venom gland and fat body. FPKM values larger than 2000 were marked on the heatmap. The FPKM values generating the heatmap were listed in Supplementary File 2.
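For readers less familiar with the unit, FPKM normalizes a fragment count by both transcript length and sequencing depth: FPKM = 10^9 × F / (L × N), where F is the number of fragments mapped to the transcript, L is the exon model length in bp and N is the total number of mapped fragments. The sketch below applies this standard definition to hypothetical counts. The values in this study were retrieved from the normalized transcriptomes rather than computed in this way, and the function and argument names are illustrative.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon model per million mapped fragments."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical counts, not values from Supplementary File 2.
value = fpkm(fragments=500, exon_length_bp=1000, total_mapped_fragments=10_000_000)
highlighted = value > 2000  # threshold used for marking values in Fig. 5
print(value, highlighted)
```

Because the measure is scaled by length and depth, values are comparable between transcripts of different sizes within a library, which is what the comparison of paralogs in Fig. 5 relies on.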

Expression of neuropeptides in different tissues may shed light on their potential functions. Four ALP genes were expressed in the tested tissues at various levels, with ALP4 the most enriched. The extremely high expression of the ALP4 precursor in the brain may suggest a dominant role there. However, experiments are required to determine the expression levels of the ALP neuropeptides, as well as their functions in tissues such as the venom gland (ALP2 and ALP4, Supplementary File 2). OK1 was expressed not only in the brain but also in all appendages, though at lower levels. Interestingly, OK2 was expressed in the venom gland to an extreme extent (Fig. 5). OK was considered a component of chelicerate venom with myotropic activity when it was predicted from venom gland ESTs of the scorpion Scorpiops jendeki (Christie et al., 2011). Therefore, it is reasonable to propose that OK1 and OK2 mediate different neuronal and physiological effects, given their different structures and tissue distributions.

The present work collected neuropeptide information from transcriptomic and genomic data with the goal of expanding the number of known chelicerate neuropeptides and laying a foundation for functional studies. Neuropeptides can only function when they interact with their receptors. Identification of neuropeptide receptors is another interesting and challenging task to be completed before we can draft an overview of the neuropeptidergic system of P. pseudoannulata.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by the National Natural Science Foundation [grant number 31701823] and Postgraduate Research & Practice Innovation Program of Jiangsu Province [grant number SJCX19_0118]. The funding source had no involvement in the conduct of the research and preparation of the manuscript.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ygcen.2019.113271.

References

Armenteros, J.J.A., Tsirigos, K.D., Sonderby, C.K., Petersen, T.N., Winther, O., Brunak, S., von Heijne, G., Nielsen, H., 2019. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat. Biotechnol. 37, 420–423.
Audsley, N., Vandersmissen, H.P., Weaver, R., Dani, P., Matthews, J., Down, R., Vuerinckx, K., Kim, Y.J., Vanden Broeck, J., 2013. Characterisation and tissue distribution of the PISCF allatostatin receptor in the red flour beetle Tribolium castaneum. Insect Biochem. Mol. Biol. 43, 65–74.
Bachtel, N.D., Hovsepian, G.A., Nixon, D.F., Eleftherianos, I., 2018. Allatostatin C modulates nociception and immunity in Drosophila. Sci. Rep. 8, 7501.
Bao, C.C., Yang, Y.N., Huang, H.Y., Ye, H.H., 2015. Neuropeptides in the cerebral ganglia of the mud crab, Scylla paramamosain: transcriptomic analysis and expression profiles during vitellogenesis. Sci. Rep. 5, 17055.
Bubak, A.N., Watt, M.J., Renner, K.J., Luman, A.A., Costabile, J.D., Sanders, E.J., Grace, J.L., Swallow, J.G., 2019. Sex differences in aggression: differential roles of 5-HT2, neuropeptide F and tachykinin. PLoS ONE 14, e0203980.
Cannell, E., Dornan, A.J., Halberg, K.A., Terhzaz, S., Dow, J.A.T., Davies, S.A., 2016.
Homarus americanus: new insights from high-throughput nucleotide sequencing. PLoS ONE 10, e0145964.
Christie, A.E., Miller, A., Fernandez, R., Dickinson, E.S., Jordan, A., Kohn, J., Youn, M.C., Dickinson, P.S., 2018. Non-amidated and amidated members of the C-type allatostatin (AST-C) family are differentially distributed in the stomatogastric nervous system of the American lobster Homarus americanus. Invert. Neurosci. 18, 2.
Christie, A.E., Nolan, D.H., Ohno, P., Hartline, N., Lenz, P.H., 2011. Identification of chelicerate neuropeptides using bioinformatics of publicly accessible expressed sequence tags. Gen. Comp. Endocrinol. 170, 144–155.
Coast, G.M., Schooley, D.A., 2011. Toward a consensus nomenclature for insect neuropeptides and peptide hormones. Peptides 32, 620–631.
Coast, G.M., Webster, S.G., Schegg, K.M., Tobe, S.S., Schooley, D.A., 2001. The Drosophila melanogaster homologue of an insect calcitonin-like diuretic peptide stimulates V-ATPase activity in fruit fly Malpighian tubules. J. Exp. Biol. 204, 1795–1804.
Crooks, G.E., Hon, G., Chandonia, J.M., Brenner, S.E., 2004. WebLogo: a sequence logo generator. Genome Res. 14, 1188–1190.
Di, Z., Yu, Y., Wu, Y., Hao, P., He, Y., Zhao, H., Li, Y., Zhao, G., Li, X., Li, W., Cao, Z., 2015. Genome-wide analysis of homeobox genes from Mesobuthus martensii reveals Hox gene duplication in scorpions. Insect Biochem. Mol. Biol. 61, 25–33.
Diaz, M.M., Schlichting, M., Abruzzi, K.C., Long, X., Rosbash, M., 2019. Allatostatin-C/AstC-R2 is a novel pathway to modulate the circadian activity pattern in Drosophila. Curr. Biol. 29, 13–22.
Elphick, M.R., Mirabeau, O., Larhammar, D., 2018. Evolution of neuropeptide signalling systems. J. Exp. Biol. 221, jeb151092.
Gendreau, K.L., Haney, R.A., Schwager, E.E., Wierschin, T., Stanke, M., Richards, S., Garb, J.E., 2017. House spider genome uncovers evolutionary shifts in the diversity and expression of black widow venom proteins associated with extreme toxicity. BMC Genomics 18, 178.
Hahn, D.A., Denlinger, D.L., 2011. Energetics of insect diapause. In: Berenbaum, M.R., Carde, R.T., Robinson, G.E. (Eds.), Annual Review of Entomology, vol. 56. pp. 103–121.
Hoopfer, E.D., 2016. Neural control of aggression in Drosophila. Curr. Opin. Neurobiol. 38, 109–118.
Ikeya, T., Galic, M., Belawat, P., Nairz, K., Hafen, E., 2002. Nutrient-dependent expression of insulin-like peptides from neuroendocrine cells in the CNS contributes to growth regulation in Drosophila. Curr. Biol. 12, 1293–1300.
Jekely, G., 2013. Global view of the evolution and diversity of metazoan neuropeptide signaling. Proc. Natl. Acad. Sci. U.S.A. 110, 8702–8707.
Jiang, H., Kim, H.G., Park, Y., 2015. Alternatively spliced orcokinin isoforms and their functions in Tribolium castaneum. Insect Biochem. Mol. Biol. 65, 1–9.
Kapustin, Y., Souvorov, A., Tatusova, T., Lipman, D., 2008. Splign: algorithms for computing spliced alignments with identification of paralogs. Biol. Direct. 3, 20.
Kenny, N.J., Chan, K.W., Nong, W., Qu, Z., Maeso, I., Yip, H.Y., Chan, T.F., Kwan, H.S., Holland, P.W.H., Chu, K.H., Hui, J.H.L., 2016. Ancestral whole-genome duplication in the marine chelicerate horseshoe crabs. Heredity (Edinb) 116, 190–199.
Kumar, S., Stecher, G., Tamura, K., 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33, 1870–1874.
Liu, W., Xie, Y., Ma, J., Luo, X., Nie, P., Zuo, Z., Lahrmann, U., Zhao, Q., Zheng, Y., Zhao, Y., Xue, Y., Ren, J., 2015. IBS: an illustrator for the presentation and visualization of biological sequences. Bioinformatics 31, 3359–3361.
Mirabeau, O., Joly, J.-S., 2013. Molecular evolution of peptidergic signaling systems in bilaterians. Proc. Natl. Acad. Sci. U.S.A. 110, E2028–E2037.
Nicholas, K., 1995. GeneDoc: analysis and visualization of genetic variation. Embnew. News 4, 28–30.
Sajadi, F.C., Al Dhaheri, A., Paluzzi, J.V., 2018. Anti-diuretic action of a CAPA neuropeptide against a subset of diuretic hormones in the disease vector Aedes aegypti. J. Exp. Biol. 221, jeb177089.
Sturm, S., Ramesh, D., Brockmann, A., Neupert, S., Predel, R., 2016. Agatoxin-like peptides in the neuroendocrine system of the honey bee and other insects. J. Proteomics 132, 77–84.
Swift, M.L., 1997. GraphPad prism, data analysis, and scientific graphing. J. Chem. Inf. Comput. Sci. 37, 411–412.
Veenstra, J.A., 2000. Mono- and dibasic proteolytic cleavage sites in insect neuroendocrine peptide precursors. Arch. Insect Biochem. Physiol. 43, 49–63.
Veenstra, J.A., 2016a. Neuropeptide evolution: chelicerate neurohormone and neuropeptide genes may reflect one or more whole genome duplications. Gen. Comp. Endocrinol. 229, 41–55.
Veenstra, J.A., 2016b. Similarities between decapod and insect neuropeptidomes. PeerJ 4, e2043.
Veenstra, J.A., 2016c. Allatostatins C, double C and triple C, the result of a local gene triplication in an ancestral arthropod. Gen. Comp. Endocrinol. 230–231, 153–157.
Veenstra, J.A., Rombauts, S., Grbic, M., 2012. In silico cloning of genes encoding neuropeptides, neurohormones and their putative G-protein coupled receptors in a spider mite. Insect Biochem. Mol. Biol. 42, 277–295.
Veenstra, J.A., 2019. Coleoptera genome and transcriptome sequences reveal numerous differences in neuropeptide signaling between species. PeerJ 7, e7144.
The corticotropin-releasing factor-like diuretic hormone 44 (DH44) and kinin neuropeptides modulate desiccation and starvation tolerance in Drosophila melanogaster. Peptides 80, 96–107. Christie, A.E., 2008. Neuropeptide discovery in Ixodoidea: an in silico investigation using publicly accessible expressed sequence tags. Gen. Comp. Endocrinol. 157, 174–185. Christie, A.E., 2015. In silico characterization of the neuropeptidome of the Western black widow spider Latrodectus hesperus. Gen. Comp. Endocrinol. 210, 63–80. Christie, A.E., 2016. Expansion of the neuropeptidome of the globally invasive marine crab Carcinus maenas. Gen. Comp. Endocrinol. 235, 150–169. Christie, A.E., Chi, M., 2015. Neuropeptide discovery in the Araneae (Arthropoda, Chelicerata, Arachnida): elucidation of true spider peptidomes using that of the Western black widow as a reference. Gen. Comp. Endocrinol. 213, 90–109. Christie, A.E., Chi, M., Lameyer, T.J., Pascual, M.G., Shea, D.N., Stanhope, M.E., Schulz, D.J., Dickinson, P.S., 2015. Neuropeptidergic signaling in the American lobster
