Comprehensive functional analysis and mapping of SSR markers in the chickpea genome (Cicer arietinum L.)


Journal Pre-proof Comprehensive functional analysis and mapping of SSR markers in the Chickpea genome (Cicer arietinum L.) AliAkbar Asadi, Amin Ebrahimi, Sajad Rashidi-Monfared, Mohammad Basiri, Javad Akbari-Afjani

PII: S1476-9271(18)30607-8
DOI: https://doi.org/10.1016/j.compbiolchem.2019.107169
Reference: CBAC 107169
To appear in: Computational Biology and Chemistry
Received Date: 25 August 2018
Revised Date: 16 November 2019
Accepted Date: 18 November 2019

Please cite this article as: Asadi A, Ebrahimi A, Rashidi-Monfared S, Basiri M, Akbari-Afjani J, Comprehensive functional analysis and mapping of SSR markers in the Chickpea genome (Cicer arietinum L.), Computational Biology and Chemistry (2019), doi: https://doi.org/10.1016/j.compbiolchem.2019.107169

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier.

Comprehensive functional analysis and mapping of SSR markers in the Chickpea genome (Cicer arietinum L.)

AliAkbar Asadi1, Amin Ebrahimi2, Sajad Rashidi-Monfared1*, Mohammad Basiri1, Javad Akbari-Afjani1

1. Agricultural Biotechnology Department, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran.

2. Agronomy and Plant Breeding Department, Faculty of Agriculture, Shahrood University of Technology, Semnan, Iran.

Ψ: Present address: Plant Biotechnology Department, National Institute of Genetic Engineering and Biotechnology, Tehran, Iran.

* Corresponding author. E-mail address: [email protected] (S. Rashidi-Monfared)


Graphical Abstract

Highlights

Uniformity testing showed that the gSSRs were uniformly distributed across the eight chickpea chromosomes.

A total of 1,798 simple sequence repeat (SSR)-containing transcript assembly contigs (TACs) were mapped onto the chickpea chromosomes.

The TAC-SSRs appeared to be unevenly distributed across the chickpea genome.

Transcription regulation proteins were identified for 13.44% of the SSR-containing TACs (253 TAC-SSRs).

The maximum number of TAC-SSRs (56.6%, or 141 TAC-SSRs) was observed in genes encoding proteins that play a common role in various stresses.


Abstract

Plant molecular breeding largely depends on the relationship between molecular markers and major traits. Herein, a total of 32,962 genomic simple sequence repeats (SSRs) were detected in the whole genome of chickpea, with an average density of 94.93 SSRs/Mb. A uniformity test across the chickpea chromosomes indicated that the genomic SSRs (gSSRs) were evenly distributed across the genome. Moreover, 48,667 transcriptome sequences were analyzed and 1,949 SSR-containing transcript assembly contigs (TACs) were identified. Di- and trinucleotide SSRs were the most frequent SSR motifs within the transcriptome sequences. The AT and TTA motifs in the genome and the AG and TTC motifs in the transcriptome showed the highest frequencies among the di- and trinucleotide repeat motifs, respectively. The SSR-containing TACs were compared to the GenBank non-redundant database using BLASTX, and gene ontology (GO) analysis was subsequently performed using the QuickGO browser to reduce complexity and highlight the biological processes associated with the SSR-containing TACs. The identified SSR-containing TACs were categorized into 35 functionally related gene groups. The characterized SSR-containing TACs were mapped onto the chickpea chromosomes using BLASTN, and a total of 1,798 SSR-containing TACs were mapped onto the chickpea genome. Based on the functional analysis, 249 and 242 of the mapped SSR-containing TACs were found in genes encoding putative stress-related proteins and transcription factors, respectively. The results presented here can be applied to improve and speed up chickpea breeding programs.

Key words: Chickpea, Functional analysis, EST-SSR mapping, gSSRs, TAC-SSRs

1. Introduction

In plant research, particularly in plant breeding, useful alleles can be identified by pedigree analysis, morphological traits, molecular markers, and sequencing projects. Molecular markers have great potential to speed up the development of improved cultivars. Among the PCR-based markers, microsatellites are typically co-dominant, multi-allelic, and abundantly distributed across genomes, in both coding and non-coding regions (Naghavi et al., 2012; Asadi and Rashidi-Monfared, 2014). Although a number of simple sequence repeat (SSR) markers have been developed previously, recent studies have shown that developing new SSRs from different library sources can help to further saturate chromosome regions with large gaps. In recent years, SSRs have also been identified in genes and expressed sequence tags (ESTs) (Asp et al., 2007). ESTs in legumes are cost-effective DNA marker resources because they may be functionally more informative than SSRs, which are generally located in unexpressed chromosomal regions. With the current emphasis on functional genomics, coding sequences are rapidly accumulating in the genomic and EST databases of a large number of species. EST-SSR markers have several advantages over other genomic DNA-based markers, including the detection of variation in coding sequences, untranslated regions (3′-UTR and 5′-UTR), and introns. Moreover, they are more conserved than genomic SSRs (gSSRs) and have a higher level of transferability to closely related species (Li et al., 2004). It has been documented that many microsatellites are located within coding and regulatory regions of genomes and can be considered special modifiers that alter gene expression, RNA stability, splicing efficiency, protein structure, and RNA-protein interactions (Ellegren, 2004; Li et al., 2004). Because microsatellite alleles are highly polymorphic, they could provide a large pool of heritable phenotypic variants for subsequent screening (Naghavi et al., 2012). Taking these points into consideration, EST-SSR markers could lead to direct gene tagging for the QTL mapping of agronomically important traits, thereby improving the efficiency of marker-assisted selection (MAS) in biological studies, especially in plant breeding (Wang et al., 2011).

Chickpea (Cicer arietinum L.) is the third most important grain legume crop in the world. It is a rich source of protein (20–25%) and is particularly relevant in cropping systems because of its capacity to fix nitrogen through symbiosis. Owing to its small, diploid (2n = 2x = 16) genome, chickpea can be considered an important model system for legume genetics and genomics research (Kudapa et al., 2014). Nonetheless, chickpea yields are limited by several biotic and abiotic stresses. Conventional methods have been, and are still being, applied to improve tolerance to environmental stresses, while molecular breeding approaches have the potential to speed up the development of improved cultivars when coupled with conventional breeding. Consequently, an understanding of genome organization, together with genome sequencing, could be used effectively for the development of molecular markers. Several studies have reported EST sequences from chickpea (Boominathan et al., 2004; Romo et al., 2004; Buhariwalla et al., 2005; Coram and Pang, 2005; Choudhary et al., 2009; Varshney et al., 2009; Agarwal et al., 2012; Kudapa et al., 2014). Among these studies, Buhariwalla et al. (2005) and Choudhary et al. (2009) demonstrated the application of ESTs as a source of genetic markers, and Varshney et al. (2009) and Agarwal et al. (2012) carried out comparative and functional analyses of EST-SSR markers in this plant. Kudapa et al. (2014) generated a more extensive chickpea transcriptome assembly from multiple source databases, aligned the chickpea transcript assembly contigs (TACs) to the genome sequence of the model legume Medicago truncatula, and identified intron spanning region (ISR) markers from the chickpea genome. Although a number of SSR markers have been developed in several studies, recent research has shown that developing new SSRs from different library sources could contribute to the further saturation of linkage maps in chromosomal regions where large gaps remain.

The goals of the present study were (1) to analyze the frequency of gSSRs across the chickpea genome and of SSRs in the comprehensive transcript assembly contigs (TAC-SSRs), (2) to characterize and analyze the function of the TAC-SSRs, and (3) to map the characterized TAC-SSRs onto the chickpea chromosomes. The results of the present study can be used to develop useful tools for future chickpea studies, including taxonomic, molecular breeding, genomics, and biotic/abiotic stress studies.

2. Material and Methods

2.1. Retrieval of DNA sequences

The DNA sequences of the chickpea chromosomes (accession nos. NC_021160, NC_021161, NC_021162, NC_021163, NC_021164, NC_021165, NC_021166, and NC_021167) were downloaded from the National Center for Biotechnology Information (NCBI) database (https://www.ncbi.nlm.nih.gov/genbank).

2.2. Retrieval of TAC sequences

The 48,667 chickpea comprehensive transcript assembly contigs (TACs), which were developed using Sanger and next-generation sequencing platforms, were used in the present study (Kudapa et al., 2014). The data have been deposited in the NCBI and are available under accession number PRJNA175619 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA175619/).

2.3. Microsatellite mining

SSR Locator software, a tool for detecting and characterizing micro- and minisatellites, was used to identify the gSSRs and the TAC-SSRs (Maia et al., 2008). Motifs of 1 to 10 nucleotides in size were considered, and only class I (≥20 bp) SSRs were studied in the current investigation. These repeats have been described as the most efficient and highly polymorphic loci for the development of molecular markers (Temnykh et al., 2001). SSR Locator was configured to locate SSRs with a minimum repeat length of 20 bp: mono- (x20), di- (x10), tri- (x7), tetra- (x5), penta- (x4), and hexanucleotides (x4) for microsatellites, and hepta- (x3), octa- (x3), nona- (x3), and decanucleotides (x3) for minisatellites, where x denotes the number of repeat units. Compound or imperfect microsatellites, in which the motifs are not continuous and some nucleotides lie between two or more motifs, were defined as ≥2 SSRs interrupted by ≤100 bases. SSR Locator identified 32,962 gSSRs and 1,949 TAC-SSRs, and the TAC-SSRs were used in the subsequent investigations. The Kolmogorov-Smirnov (K-S) test was performed to assess the uniformity of the gSSR distribution on the eight chickpea chromosomes using SPSS software, version 20 (SPSS I 2013). All subsequent analyses and diagrams were produced using Microsoft Excel 2013.

2.4. Nucleotide BLAST, mapping and primer design

The characterized TAC-SSRs were mapped onto the chromosomes using BLASTN. Of the 1,949 TAC-SSRs, 1,798 were selected and mapped onto the eight chickpea chromosomes. Of the mapped TAC-SSRs, 576 were located at the start or end of the TAC sequence. Primer pairs were designed from the regions flanking the repeat units of the mapped TAC-SSRs with Primer3 (v. 0.4.0) (http://frodo.wi.mit.edu). The primers were designed with the following parameters: primer length, 20–24 bp, with 20 bp as the optimum; GC content, 40–60%, with 50% as the optimum; Tm, 50–60 °C; and product size range, 200–300 bp. The designed primers were tested with BLASTN to evaluate their annealing positions on the chickpea genome and to confirm the absence of introns in the binding sites of each primer (supplementary materials Table S5, Excel file).

2.5. Functional annotation and map construction

To study the various biological processes possibly regulated by the 1,949 TAC-SSRs, all SSR-containing TACs were used in a search for homologous proteins at the NCBI. The TAC-SSRs (1,949) were compared to the GenBank non-redundant database using BLASTX with an E-value threshold of 1e-7 and default values for the other algorithm parameters. Subsequently, gene ontology (GO) enrichment analysis was performed with the gene ontology and GO annotations browser (http://www.ebi.ac.uk/QuickGO/) to reduce complexity and highlight the biological processes associated with the TAC-SSRs. The TAC-SSRs were classified into 35 functional groups based on their functions. Enzymes and transcription factors (TFs) showed the highest frequencies among the predicted functions. The enzymes were further grouped into six categories. Moreover, to identify the TFs, we compared the SSR-containing TACs to plant-specific transcription factor databases, including PlantTFDB (http://planttfdb.cbi.pku.edu.cn/), PlnTFDB (http://plntfdb.bio.uni-potsdam.de/v3.0/), and the Plant Stress Responsive Transcription Factor DataBase (STIFDB; http://caps.ncbs.res.in/stifdb2/), which provides a comprehensive collection of the abiotic stress-responsive TFs in Arabidopsis thaliana and Oryza sativa L., using BLAST searches with an E-value threshold of 1e-07. Subsequently, the plant TF families were identified, and the distribution map of the TFs and the well-known functional TAC-SSRs on the chickpea genome was drawn using MapChart 2.1 (Voorrips, 2002).

2.6. EST-SSR Analysis

Twenty-three landraces of Kabuli chickpea were procured from the Gene Bank at the University of Tehran, Karaj, Iran (https://utcan.ut.ac.ir/en). Total genomic DNA was isolated from the leaves using the CTAB method (Saghai-Maroof et al., 1984) with minor modifications. The concentration of the extracted DNA was determined using a spectrophotometer (Epoch Microplate Spectrophotometer, Biotek, USA). Ten EST-SSR primer pairs were used to amplify the genomic DNA of the 23 landraces. The polymerase chain reaction (PCR) mixture contained 20 ng DNA, 15 mM MgCl2, 0.77 mM primer, 0.15 mM dNTPs, 1X PCR buffer, and 0.5 unit of Taq DNA polymerase (Smart taq DNA polymerase, Sinaclon™, Iran) in a total volume of 13 µL. The PCR was performed as follows: 4 min of denaturation at 95 °C, followed by 35 cycles of 1 min of denaturation at 95 °C, 30 s of annealing at 53–56 °C (depending on the primer), and 30 s of elongation at 72 °C, with a final extension at 72 °C for 10 min. The PCRs of the EST-SSRs were carried out in a C1000™ Thermal Cycler (Bio-Rad, USA). The PCR products were analyzed on 1.5% agarose gels and then on 10% polyacrylamide gels in 1x TBE buffer run at 400 V for 1 h, followed by silver staining according to a reported protocol (Bassam et al., 1991). The resulting images were scored manually using DNA Molecular Weight Marker VIII (19–1114 bp) (Roche Molecular Biochemical, USA). Six primer pairs generated 10 polymorphic markers (Table 1). A binary data matrix (presence = 1 and absence = 0) was obtained from scoring the polymorphic bands. Cluster analysis by the unweighted pair-group method using arithmetic averages (UPGMA), which produced the dendrogram, was performed on the genetic

distance matrix using the computer program NTSYSpc version 2.02 (Rohlf, 2000). The discriminatory power of the primers, or polymorphism information content (PIC), was calculated using

$$\mathrm{PIC} = 1 - \sum_{i} P_i^{2},$$

where $P_i$ is the frequency of the $i$th marker for an EST-SSR primer.
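As an illustration of the PIC calculation, the following Python sketch assumes the commonly used form PIC = 1 − Σ Pi² (the form given by Anderson et al., 1993). The function name `pic` and the example frequencies are hypothetical and are not taken from the study's data.

```python
def pic(frequencies):
    """Polymorphism information content, assuming PIC = 1 - sum(Pi^2).

    `frequencies` holds the frequency of each marker (allele) amplified
    by one primer pair; the values are expected to sum to 1.
    """
    total = sum(frequencies)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("marker frequencies should sum to 1")
    return 1.0 - sum(p * p for p in frequencies)


# Hypothetical example: two markers observed at frequencies 0.7 and 0.3.
example_pic = pic([0.7, 0.3])
```

Under this assumed form, a primer pair that amplifies a single monomorphic band has a PIC of 0, and two equally frequent markers give 0.5.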

3. Results

3.1. Identification of Genomic SSRs

A total of 32,962 gSSR markers were identified in the eight chromosomes of chickpea. The total length of the Cicer arietinum L. genome is 347,247,377 bp, of which 1,399,129 bp is covered by the characterized SSRs. Chromosome coverage with a total gSSR length of 1,399,129 bp was calculated to be 0.25%. The gSSR distribution and chromosome coverage percentage in the chickpea genome are presented in supplementary materials Table S1. The frequency of gSSRs was counted per 5 Mb of each chromosome to assess the uniformity of the gSSR distribution along the chromosomes, and the average frequency of gSSRs was calculated (94.93 SSRs/Mb). Chromosome 6 showed the highest frequency (98.38 SSRs/Mb), whereas chromosome 1 showed the lowest (91.19 SSRs/Mb). The frequency of microsatellites on each chromosome can reflect their density in the coding and non-coding regions of the genome. The mean, highest, lowest, and standard deviation (SD) values of the gSSRs across the eight chickpea chromosomes are presented in Table 2, and their distributions on each chromosome are shown in supplementary materials Fig. S1. The mean and SD of the gSSR distribution for all eight chickpea chromosomes were 457.80 and 90.40, respectively. Chromosome 8 had the highest SD (160.59) and chromosome 4 the lowest (45.02). As shown in supplementary materials Table S1, chromosome 6, the longest chromosome in the chickpea genome (59,463,898 bp), had the highest number of gSSRs (5,850), whereas its chromosome coverage was similar to that of the other chromosomes. Furthermore, chromosome 8, the shortest chromosome (16,477,302 bp), had the lowest number of gSSRs (1,522), whereas its chromosome coverage was 0.35%.
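The per-chromosome densities follow directly from the SSR counts and chromosome lengths. A minimal Python sketch, using the chromosome 6 and chromosome 8 counts and lengths quoted above; the function name `ssr_density_per_mb` is illustrative and not part of the original analysis.

```python
def ssr_density_per_mb(ssr_count, length_bp):
    """Number of SSRs per megabase (1 Mb = 1,000,000 bp)."""
    return ssr_count / (length_bp / 1_000_000)


# Counts and lengths as quoted in the text for chromosomes 6 and 8.
chr6_density = ssr_density_per_mb(5_850, 59_463_898)
chr8_density = ssr_density_per_mb(1_522, 16_477_302)

print(f"chromosome 6: {chr6_density:.2f} SSRs/Mb")  # prints 98.38
print(f"chromosome 8: {chr8_density:.2f} SSRs/Mb")  # prints 92.37
```

The chromosome 6 value reproduces the 98.38 SSRs/Mb reported above.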

The gSSRs identified in the present study were categorized according to motif type on each chromosome. Mono-, di-, tri-, tetra-, penta-, hexa-, hepta-, octa-, nona-, and decanucleotide SSR motifs were detected on each chromosome (supplementary materials Fig. S2). The dinucleotide repeat motifs showed the highest frequency on all chromosomes, followed by the trinucleotide repeats; chromosome 6 showed the highest frequency of both di- and trinucleotide motifs. The most frequent motifs of each class, from mono- to decanucleotide repeats, are presented in supplementary materials Table S2. The AT dinucleotide motif showed the highest frequency across the chromosomes. The most frequent trinucleotide motif was AAT on chromosomes 1, 3, and 7 and TTA on the other chickpea chromosomes. After the di- and trinucleotide motifs, the mononucleotide motifs A and T showed the highest frequency. Among the tetranucleotide motifs, TTTA had the highest frequency on the majority of the chromosomes. In a previous study, GA and TA were the frequent dinucleotide motifs and ATT and AAG were the dominant trinucleotide motifs (Datta et al., 2015). Taken together, AT, GA, TA, AAT, TTA, ATT, and AAG are the most frequent motifs in the chickpea genome. As illustrated in supplementary materials Table S1, the A+T content of the chickpea genome is about 69.74%, and these nucleotides, the major nucleotides of the chickpea genome, make up most of the motifs.

SSRs may change within microsatellite loci, thereby becoming imperfect microsatellites (Mudunuri and Nagarajaram, 2007). Table 3 shows the distribution of the gSSRs in terms of perfect and imperfect microsatellites. The gSSRs with one locus (perfect gSSRs) were the most frequent (90.86%), followed by gSSRs with two loci, or imperfect gSSRs (7.51%), three loci (1.10%), five loci (0.29%), and four loci (0.21%).

3.2. Identification of TAC-SSRs

The 48,667 chickpea comprehensive TACs used in the present study were developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads, and 139,214 Sanger ESTs from 17 chickpea genotypes (Kudapa et al., 2014). Overall, 1,949 SSR markers were identified in the 48,667 TACs, which covered around 44.74 Mb of the genome (Table 3). The frequency of SSR markers in the chickpea TACs was approximately 43.6 SSRs/Mb. Trinucleotide and dinucleotide motifs accounted for 31.21% and 26.10% of the TAC-SSRs, respectively (supplementary materials Fig. S3). Trinucleotide motifs were previously reported to be the most abundant (51.2%) in the chickpea transcriptome sequences, followed by di-, tetra-, and pentanucleotide motifs (Choudhary et al., 2009). In the present study, tri- and dinucleotide motifs showed the highest frequencies, followed by hexanucleotide motifs (18.16%), which were not described in the previous studies.
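The repeat thresholds described in Section 2.3 (x20 for mono-, x10 for di-, x7 for tri-, x5 for tetra-, x4 for penta- and hexa-, and x3 for hepta- to decanucleotide motifs) can be illustrated with a small regular-expression scan for perfect SSRs. This Python sketch only illustrates the thresholding idea: it is not the SSR Locator algorithm, it does not merge compound loci, and the names `MIN_REPEATS` and `find_perfect_ssrs` are hypothetical.

```python
import re

# Minimum number of repeat units per motif length, as listed in Section 2.3.
MIN_REPEATS = {1: 20, 2: 10, 3: 7, 4: 5, 5: 4, 6: 4, 7: 3, 8: 3, 9: 3, 10: 3}


def find_perfect_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for size, min_units in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least (min_units - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_units - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT"),
            # so each locus is reported only under its shortest motif.
            if any(size % d == 0 and motif == motif[:d] * (size // d)
                   for d in range(1, size)):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Toy sequence: (AT)10 and (TTC)7 embedded in short flanking bases.
toy = "GGC" + "AT" * 10 + "CCG" + "TTC" * 7 + "G"
print(find_perfect_ssrs(toy))  # [(3, 'AT', 10), (26, 'TTC', 7)]
```

Start positions are 0-based. A run one unit shorter than its threshold, such as (AT)9, is not reported.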

Among the characterized TAC-SSRs, the AG motif showed the highest frequency among the dinucleotide motifs, followed by TC. Among the trinucleotide motifs, TTC and GAA had the highest frequencies. In a previous study, the most frequent dinucleotide motifs were GA, TA, and GT, and AAG and ATT showed the highest frequencies among the trinucleotide motifs (Choudhary et al., 2009). Herein, the AG, TC, TTC, GAA, TTTC, AAAAT, and GAGAAA motifs were identified as new motifs from the transcriptome sequences of chickpea. Earlier studies on chickpea microsatellites reported that the TAA motif is the most abundant (Udupa et al., 1999), whereas in the present study the TTC and GAA motifs were dominant. Triplet motifs could potentially encode amino acids and could be used as a source of multiple functional markers (Asadi and Rashidi-Monfared, 2014). Among the identified TAC-SSRs, there were 1,873 (96.1%) perfect (one locus), 70 (3.6%) two-loci or imperfect, 5 (0.25%) three-loci, and 1 (0.05%) four-loci TAC-SSRs (Table 3). In previous research, 19 perfect and 3 imperfect EST-SSRs were identified from chickpea (Datta et al., 2015). Choudhary et al. (2009) also reported that 84.1% of the identified SSRs were perfect repeats and 11.7% were imperfect repeats in the chickpea transcriptome.

It has been noted that imperfect or compound microsatellites are more stable than perfect microsatellites, as they are less prone to slippage mutations, and that they are known to play a role in gene regulation (Mudunuri and Nagarajaram, 2007). Therefore, the perfect TAC-SSRs could be a source of variation in coding regions. In the present study, the imperfect TTTC/TC/TC/CACACTT, GAA/AGA/CAAAAT, ACAAC/ACAAC/ACAAC, ATT/TTA, CT/ATTCA, and CTCCTT/CACCAG motifs were detected, which could be a stable and new source of EST-SSR markers in chickpea. Consequently, the motifs newly identified here, along with the motifs introduced in previous studies, can cover most of the chickpea genome and thus have potential application in the construction of genetic linkage maps and in genetic diversity assessments.

In this work, the distribution of the TAC-SSRs and gSSRs with different motifs was also studied across different repeat numbers. The distribution of tetra-, penta-, hexa-, hepta-, octa-, nona-, and decanucleotides in the TAC-SSRs and the gSSRs was generally skewed toward smaller numbers of repeats. A few higher repeat numbers were observed in the di- and trinucleotide TAC-SSRs and gSSRs. The frequencies of the TAC-SSRs and the gSSRs at different repeat numbers are presented in Table 4. The most frequent classes were gSSRs with 10 tandem repeats (2,844) and TAC-SSRs with 4 tandem repeats (338), and the least frequent were gSSRs with 9 tandem repeats (692) and TAC-SSRs with 3 tandem repeats (nine). Previously, 3 to 39, 2 to 14, and 2 to 6 repeats were reported for di-, tri-, and tetranucleotide motifs, respectively (Datta et al., 2015), and those results are consistent with this investigation. It is worth noting that an increase or decrease in the repeat number of microsatellites in coding regions often leads to shifts in reading frames, thereby causing changes in protein products, and in non-coding regions is known to affect gene regulation (Mudunuri and Nagarajaram, 2007).

3.3. Functional analysis and Mapping of TAC-SSRs

In this investigation, 1,949 TAC-SSR loci were identified. The SSR-containing TAC sequences were compared to the GenBank non-redundant database using BLASTX to assign putative functions, and the gene ontology and GO annotations browser was used for gene annotation. We also used previously published literature to confirm the functions (Asadi and Rashidi-Monfared, 2014). We detected 1,269 sequences (65.51%) showing homology with known proteins and 680 sequences (34.49%) that were homologues of hypothetical or unknown proteins. One of the goals of the present study was to identify the exact physical location of the characterized and well-annotated TAC-SSRs on the chickpea genome and to generate a functional map of TAC-SSRs with well-known functions. A total of 1,802 of the 1,949 TAC-SSRs were successfully mapped to the chickpea genome. The map of the TAC-SSRs and their exact locations on each chromosome are presented in supplementary materials Fig. S6. Chromosome 6 had the highest number of TAC-SSRs (279), whereas chromosome 8 had the lowest (115). The SSR-containing TACs were categorized into 35 groups based on the results of the functional annotation analysis and their biological roles (supplementary materials Table S3). Of these, enzymes (23.76%) showed the highest frequency among the predicted functions. The enzymes were categorized into six groups (hydrolases, isomerases, ligases, lyases, oxidoreductases, and transferases). With 23 members, the transferases showed the highest frequency among the TAC-SSRs, followed by hydrolases (12) and lyases (13) (Fig. 2). Transcriptional regulatory proteins were identified for 13.44% of the SSR-containing TACs (253 TAC-SSRs) based on conserved domains. These putatively assigned transcripts are distributed in 46 TF gene families (supplementary materials Table S4). The TF families were mapped on each chickpea chromosome (supplementary materials Fig. S6). The distribution of TFs showed that chromosomes 4 and 6 contained 44 and 36 TFs, respectively, compared with chromosomes 1, 2, 3, 5, 7, and 8, which contained 32, 26, 33, 23, 30, and 18 TFs, respectively. The SSR-containing TFs were thus not uniformly distributed on the chickpea chromosomes. Among the TF families, the bHLH family showed the highest number of transcripts (9.88%), followed by C3H (5.93%), MYB (5.54%), C2H2 (5.53%), WRKY and bZIP (5.14%), ERF (4.74%), TCP and GRAS (3.95%), and TALE (3.56%).

3.4. Experimental study

DNA fingerprinting was performed on 23 accessions of Iranian chickpea using 10 EST-SSR primer pairs (supplementary materials Fig. S4 and S5). Of the 10 EST-SSR primer pairs, four (TAC codes: 1220, 779, 1129, and 1370) amplified monomorphic bands, while six amplified polymorphic profiles (Table 1). A total of 10 polymorphic markers were detected among all the chickpea accessions. The average number of EST-SSR markers generated was 1.7 per primer pair, and the level of polymorphism was 52.94%. The number of polymorphic markers detected with each primer pair varied from one (1220, 1287, 808, and 384) to four (1626). The PIC for each primer pair ranged from 0.22 to 0.49, with a mean value of 0.40 (Table 1). A high PIC value was obtained with primer pair #1393, making it more efficient than the other primers. High PIC values indicate greater power of a primer to detect polymorphism between populations (Table 1) (Anderson et al., 1993; De Riek et al., 2001). The UPGMA tree constructed based on the Dice similarity coefficient is depicted in Fig. 1 and shows that the 23 populations were separated into two main clusters (C1 and C2). According to the molecular dendrogram, there are two main branches within the C1 and C2 clusters, defined as sub-cluster 1 (SC1) and sub-cluster 2 (SC2). Datta et al. (2015) submitted 35 EST sequence-derived microsatellite markers from 15 wilt-resistant chickpea cultivars. Twenty-two of the 35 primer pairs amplified EST-SSR fragments and generated 35 alleles, with an average of 1.5 alleles. They indicated that the identified EST-SSR markers are useful tools in intraspecific and interspecific diversity assessments in Cicer species (Datta et al., 2015).

4. Discussion

The rapid growth of nucleotide sequence data, particularly ESTs, together with advances in computer science, has provided a unique opportunity to gain much information about biological processes. Although both experimental and in silico methods have been used to characterize SSR markers, in silico methods are widely used because of their high speed and efficiency. Herein, a total of 32,962 gSSR and 1,949 TAC-SSR markers were identified in the eight chickpea chromosomes and the 48,667 chickpea comprehensive TACs, respectively (supplementary materials Table S5, Excel file). Interestingly, Kolmogorov–Smirnov uniformity testing showed that the gSSRs were uniformly distributed along the eight chickpea chromosomes (Table 2), indicating a non-random distribution during chickpea genome evolution. This could be due to their effects on chromatin organization, regulation of gene activity, recombination, DNA replication, and the cell cycle (Li et al., 2004; Zou, 2012). Several studies on the use of microsatellite markers for mapping multiple quantitative traits in plants have been conducted and, consequently, numerous SSR markers have been identified (Buerstmayr et al., 2012; Sandlin et al., 2012; Gonthier et al., 2013; Würschum et al., 2013; Singh et al., 2014). Attempts to identify SSR markers linked to important traits have great potential to speed up the development of improved cultivars and increase the efficiency of MAS (Gupta and Rustgi, 2004). It has been proposed that SSRs derived from transcriptome sequences are better candidates than gSSR markers for gene tagging and for constructing linkage maps. Because EST-SSRs enhance the role of genetic markers by assaying variation in transcribed genes of known function, gene tagging with them should give "perfect" marker–trait associations. Furthermore, because they derive from conserved coding regions, they show a high level of transferability to close and wild relatives of the plants (Zhang et al., 2005). An increase or decrease in the repeat motifs of microsatellites in protein-coding genes can activate or inactivate the corresponding enzymes or truncate the protein (Li et al., 2004). Alteration of the motif size (particularly for trinucleotide motif types) of an SSR embedded in a coding region can potentially affect the structure of the corresponding enzyme or protein. In the coding regions of genomic DNA, selection pressure against frameshift mutations effectively impedes the expansion of repeat motifs, except for trinucleotide repeats. These repeats comprise a special class of microsatellites in coding DNA that undergo extensive repeat expansions, which is considered a mutational mechanism (Ellegren, 2004). Therefore, it is necessary to investigate the polymorphism of these SSRs in different chickpea genotypes to identify new biotic and abiotic stress-related markers. In this project, therefore, a comprehensive functional analysis of EST-SSRs and their physical mapping on the chickpea genome were performed.

The TAC-SSRs appeared to be unevenly distributed across the chickpea genome. Several studies have revealed that EST-SSRs are unevenly distributed on genomes (Allender et al., 2007; Gao et al., 2014; Sun et al., 2013). Given the high A/T content of the chickpea genome (69.74%), these nucleotides are the major nucleotides of most of the high-frequency repeat motifs. It is worth noting that the genomic sequences of dicots are generally AT-rich compared with those of monocots. The genomes of dicot plants have microsatellites with AT-rich motifs and, unlike those of monocots, lack GC-rich motifs. The relationship between microsatellite evolution and chromosomal duplications has not been well documented. It seems that duplicated regions of genomes experience different selection pressures than other regions, which could be a reason for the motif preference and frequency in monocots and dicots (Sonah et al., 2011). Of the characterized TAC-SSRs, 249 and 242 were found in genes encoding putative stress-related proteins and TFs, respectively. As indicated in Table 5, plant biotic and abiotic stresses were categorized into different groups. The biotic stress group included bacterium, fungus, insect, nematode, and virus. Moreover, chromosome 2 showed the highest frequency of SSR-containing TACs involved in responses to biotic stresses compared with the other chromosomes (14 markers). The functional map showed that the largest number of TAC-SSR markers involved in biotic stress responses was related to the response to bacterial pathogens (15 markers). Abiotic stresses were also classified into five groups, namely cold, salt, heat, osmotic, and drought (Table 5). According to Fig. 3 and Table 5, chromosome 3 contains the highest number of salt (22 markers) and

cold (22 markers) stress-response TAC-SSRs. Most of the mapped SSR-containing TACs were functionally associated with responses to salt and cold stresses. Additionally, some other mapped-TAC-SSRs which are functionally associated with responses to salt and cold stresses and, were located on chromosomes 1, 3, and 5. Table 5 shows that the maximum number of TAC-SSRs (56.6% or 141 TAC-SSRs) was observed in genes encoding proteins that play a common role in various stresses, which in turn could be the reason for cross protection and tolerance in plants. Several studies have indicated many components that are involved in cross-talk among stress (biotic and abiotic) signaling pathways. A range of molecular mechanisms that act together in a complex regulatory network conduct the interaction between biotic and abiotic stresses. TFs, kinase cascades and the reactive oxygen species are key components of this cross-talk (Cao et al., 2011; Fujita et al., 2006). Common stress-related TAC-SSRs were distributed on all chickpea chromosomes, as chromosome 6 possessed the highest number of TAC-SSRs (28 markers) and

ro of

chromosome 4 showed the lowest number of TAC-SSRs (8 markers, supplementary materials Table S5, Excel file).
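The frameshift argument made above for SSRs in coding regions can be illustrated with a short sketch. It assumes a simplified model in which an expansion or contraction adds or removes whole motif units, and it only checks whether the resulting length change is a multiple of three. The function names and the example motifs are illustrative assumptions and are not part of the analysis pipeline used in this study.

```python
def preserves_reading_frame(motif: str, unit_change: int) -> bool:
    """Return True if gaining/losing `unit_change` motif units keeps the codon frame.

    Simplified model (an assumption): the only effect considered is the net
    change in sequence length, which must be a multiple of 3 to avoid a frameshift.
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    length_change = len(motif) * unit_change
    return length_change % 3 == 0


def frame_report(motifs, unit_change=1):
    """Map each motif to whether a change of `unit_change` units preserves the frame."""
    return {m: preserves_reading_frame(m, unit_change) for m in motifs}


# Illustrative motifs of length 2 to 6; a single-unit expansion is assumed.
for motif, ok in frame_report(["AG", "GAA", "AGAA", "TTTAC", "GAAGAT"]).items():
    status = "in-frame" if ok else "frameshift"
    print(f"{motif} (+1 unit): {status}")
```

Under this model a single-unit change is in-frame only when the motif length is a multiple of three, which matches the reasoning that non-trinucleotide expansions are selected against in coding DNA; a dinucleotide repeat that gains exactly three units would also stay in frame.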

TFs are important cellular proteins that regulate many biological processes. It is well documented that TFs modulate biotic and abiotic stress responses by activating many genes that play important roles in controlling different biological functions (Alves et al., 2014; Chen et al., 2012; Kakeshpour et al., 2015; Liu et al., 2013; Mizoi et al., 2012; Nakashima et al., 2009; Yang et al., 2014). Based on the well-annotated TF families in the STIFDB, the bHLH, MYB, bZIP and WRKY transcription factor families, which contained a high frequency of the identified SSRs, are mainly involved in regulating the expression of different genes in response to ABA, cold, drought, light and salt stresses (Naika et al., 2013). In addition, some of the SSR-containing genes play a role in signal transduction, i.e. they could sense stress and induce the cell to prepare an appropriate response. Plants acclimate to environmental stresses through a cascade or network of events that starts with stress sensing and ends with the expression of suitable target genes, which adjust biological functions to maintain homeostasis. The key components of the stress-response relationship are stress triggers, signals, transducers, transcription regulators, target genes and stress responses, including morphological, biochemical and physiological changes (Danquah et al., 2014). Polymorphisms identified by TAC-SSRs embedded in sequences that play important roles in different biological processes could represent different potential functions in response to environmental stresses. Indeed, the vigor and responsiveness of plants to biotic and abiotic stresses result from the sustained re-adjustment of their physiological activity, which is determined by their genetic background (Pastori and Foyer, 2002; Pandolfi et al., 2012; Danquah et al., 2014). Therefore, applying SSRs of known function in breeding programs is more informative and is essential for breeding for responses to different biotic and abiotic stresses (Lindemose et al., 2013). Furthermore, SSRs of known function and position could be applied in breeding programs for the development of chromosome segment substitution lines (CSSLs). Substitution lines are considered a powerful tool for the introgression of valuable genes from wild species into cultivated crops (Xu et al., 2010). Such SSRs not only reduce the difficulty of developing substitution lines, but also support high-resolution mapping and genomic targeting of traits of interest.

Author Contribution Statement

Sajad Rashidi-Monfared (SRM) designed, organized and interpreted the project and all analyses. SRM, Amin Ebrahimi (AE), AliAkbar Asadi (AAA), Mohammad Basiri (MB) and Javad Akbari-Afjani (JAA) performed the analysis. AE, MB and AAA performed all mapping and functional analyses under the control and coordination of SRM. JAA performed the experimental study. SRM drafted the manuscript. AAA and MB helped to edit the draft of the manuscript. SRM, AAA and AE read and approved the final manuscript.

5. Acknowledgments

This research was supported by Tarbiat Modares University, Iran.

References

Agarwal G, Jhanwar S, Priya P, Singh VK, Saxena MS, Parida SK, Jain M (2012). Comparative Analysis of Kabuli Chickpea Transcriptome with Desi and Wild Chickpea Provides a Rich Resource for Development of Functional Markers. PLoS ONE 7(12). https://doi.org/10.1371/journal.pone.0052443

Allender CJ, Allainguillaume J, Lynn J, King GJ (2007). Simple sequence repeats reveal uneven distribution of genetic diversity in chloroplast genomes of Brassica oleracea L. and (n = 9) wild relatives. Theor Appl Genet 114: 609–618. https://doi.org/10.1007/s00122-006-0461-5

Alves M, Dadalto S, Gonçalves A, de Souza G, Barros V, Fietto L (2014). Transcription Factor Functional Protein-Protein Interactions in plant defense responses. Proteomes 2: 85–106. https://doi.org/10.3390/proteomes2010085

Anderson JA, Churchill G, Autrique J, Tanksley S, Sorrells M (1993). Optimizing parental selection for genetic linkage maps. Genome 36: 181–186. https://doi.org/10.1139/g93-024

Asadi AA, Rashidi-Monfared S (2014). Characterization of EST-SSR markers in durum wheat EST library and functional analysis of SSR-containing EST fragments. Molecular Genetics and Genomics 289(4): 625–640. https://doi.org/10.1007/s00438-014-0839-z

Asp T, Frei UK, Didion T, Nielsen KK, Lübberstedt T (2007). Frequency, type, and distribution of EST-SSRs from three genotypes of Lolium perenne, and their conservation across orthologous sequences of Festuca arundinacea, Brachypodium distachyon and Oryza sativa. BMC Plant Biol 7: 36. https://doi.org/10.1186/1471-2229-7-36

Bassam B, Caetano-Anollés G, Gresshoff PM (1991). Fast and sensitive silver staining of DNA in polyacrylamide gels. Anal Biochem 196: 80–83. https://doi.org/10.1016/0003-2697(91)90120-I

Buerstmayr M, Huber K, Heckmann J, Steiner B, Nelson JC, Buerstmayr H (2012). Mapping of QTL for Fusarium head blight resistance and morphological and developmental traits in three backcross populations derived from Triticum dicoccum × Triticum durum. Theor Appl Genet 125: 1751–1765.

Boominathan P, Shukla R, Kumar A, Manna D, Negi D, Verma PK, Chattopadhyay D (2004). Long term transcript accumulation during the development of dehydration adaptation in Cicer arietinum. Plant Physiol 135: 1608–1620.

Buhariwalla HK, Jayashree B, Eshwar K, Crouch JH (2005). Development of ESTs from chickpea roots and their use in diversity analysis of the Cicer genus. BMC Plant Biol 5: 16.

Cao FY, Yoshioka K, Desveaux D (2011). The roles of ABA in plant–pathogen interactions. J Plant Res 124: 489–499. https://doi.org/10.1007/s10265-011-0409-y

Chen L, Song Y, Li S, Zhang L, Zou C, Yum D (2012). The role of WRKY transcription factors in plant abiotic stresses. Biochim Biophys Acta (BBA)-Gene Regul Mech 1819: 120–128. https://doi.org/10.1016/j.bbagrm.2011.09.002

Choudhary S, Sethy NK, Shokeen B, Bhatia S (2009). Development of chickpea EST-SSR markers and analysis of allelic variation across related species. Theor Appl Genet 118: 591–608. https://doi.org/10.1007/s00122-008-0923-z

Danquah A, de Zelicourt A, Colcombet J, Hirt H (2014). The role of ABA and MAPK signaling pathways in plant abiotic stress responses. Biotechnol Adv 32(1): 40–52.

Datta S, Kaashyap M, Gupta P (2015). Development of EST derived microsatellite markers in chickpea and their validation in diversity analysis. Indian Journal of Biotechnology 14(1): 55–58.

De Riek J, Calsyn E, Everaert I, Van Bockstaele E, De Loose M (2001). AFLP based alternatives for the assessment of distinctness, uniformity and stability of sugar beet varieties. Theor Appl Genet 103: 1254–1265. https://doi.org/10.1007/s001220100710

Ellegren H (2004). Microsatellites: simple sequences with complex evolution. Nat Rev Genet 5: 435–445. https://doi.org/10.1038/nrg1348

Fujita M, Fujita Y, Noutoshi Y, Takahashi F, Narusaka Y, Yamaguchi-Shinozaki K (2006). Crosstalk between abiotic and biotic stress responses: a current view from the points of convergence in the stress signaling networks. Curr Opin Plant Biol 9: 436–442. https://doi.org/10.1016/j.pbi.2006.05.014

Gao C, Yin J, Mason AS, Tang Z, Ren X, Li C, Zeshan A, Donghui F, Jiana L (2014). Regularities in simple sequence repeat variations induced by a cross of resynthesized Brassica napus and natural Brassica napus. POJ 7: 35–46.

Gonthier L, Blassiau C, Mörchen M, Cadalen T, Poiret M, Hendriks T, et al. (2013). High-density genetic maps for loci involved in nuclear male sterility (NMS1) and sporophytic self-incompatibility (S-locus) in chicory (Cichorium intybus L., Asteraceae). Theor Appl Genet 126: 2103–2121.

Gupta PK, Rustgi S (2004). Molecular markers from the transcribed/expressed region of the genome in higher plants. Funct Integr Genomics 4: 139–162.

Kakeshpour T, Nayebi S, Rashidi-Monfared S, Moieni A, Karimzadeh G (2015). Identification and expression analyses of MYB and WRKY transcription factor genes in Papaver somniferum L. Physiol Mol Biol Plants 21: 465–478. https://doi.org/10.1007/s12298-015-0325-z

Kudapa H, Azam S, Sharpe AG, Taran B, Li R, Deonovic B, Cameron C, Farmer AD, Cannon SB, Varshney RK (2014). Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using sanger and next generation sequencing platforms: development and applications. PLoS ONE 9: e86039. https://doi.org/10.1371/journal.pone.0086039

Li B, Xia Q, Lu C, Zhou Z, Xiang Z (2004). Analysis on frequency and density of microsatellites in coding sequences of several eukaryotic genomes. Geno Prot Bioinfo 2: 24–31. https://doi.org/10.1016/S1672-0229(04)02004-2

Lindemose S, O’Shea C, Jensen MK, Skriver K (2013). Structure, function and networks of transcription factors involved in abiotic stress responses. Int J Mol Sci 14: 5842–5878.

Liu T, Zhu S, Tang Q, Yu Y, Tang S (2013). Identification of drought stress-responsive transcription factors in ramie (Boehmeria nivea L. Gaud). BMC Plant Biol 13: 130. https://doi.org/10.1186/1471-2229-13-130

da Maia LC, Palmieri DA, de Souza VQ, Kopp MM, de Carvalho FIF, Costa de Oliveira A (2008). SSR Locator: Tool for Simple Sequence Repeat Discovery Integrated with Primer Design and PCR Simulation. Int J Plant Genomics 2008: 412696. https://doi.org/10.1155/2008/412696

Mizoi J, Shinozaki K, Yamaguchi-Shinozaki K (2012). AP2/ERF family transcription factors in plant abiotic stress responses. Biochim Biophys Acta (BBA)-Gene Regul Mech 1819: 86–96. https://doi.org/10.1016/j.bbagrm.2011.08.004

Mudunuri SB, Nagarajaram HA (2007). IMEx: imperfect microsatellite extractor. Bioinformatics 23: 1181–1187. https://doi.org/10.1093/bioinformatics/btm097

Naghavi MR, Rashidi-Monfared S, Humberto G (2012). Genetic diversity in Iranian chickpea (Cicer arietinum L.) landraces as revealed by microsatellite markers. Czech J Genet Plant Breed 48: 131–138.

Naika M, Shameer K, Mathew OK, Gowda R, Sowdhamini R (2013). STIFDB2: an updated version of plant stress-responsive transcription factor database with additional stress signals, stress-responsive transcription factor binding sites and stress-responsive genes in Arabidopsis and rice. Plant Cell Physiol 54: e8.

Nakashima K, Ito Y, Yamaguchi-Shinozaki K (2009). Transcriptional regulatory networks in response to abiotic stresses in Arabidopsis and grasses. Plant Physiol 149: 88–95. https://doi.org/10.1104/pp.108.129791

Pandolfi C, Mancuso S, Shabala S (2012). Physiology of acclimation to salinity stress in pea (Pisum sativum L.). Environ Exp Bot 84: 44–51.

Pastori GM, Foyer CH (2002). Common components, networks, and pathways of cross-tolerance to stress. The central role of “redox” and abscisic acid-mediated controls. Plant Physiol 129: 460–468.

Rohlf FJ (2000). NTSYS 2.1: Numerical taxonomic and multivariate analysis system. Exeter Publishing, Setauket, New York.

Romo S, Labrador E, Dopico B (2004). Water stress-regulated gene expression in Cicer arietinum seedlings and plants. Plant Physiol Biochem 39: 1017–1026.

Saghai-Maroof MA, Soliman KM, Jorgensen RA, Allard RW (1984). Ribosomal DNA spacer-length polymorphisms in barley: Mendelian inheritance, chromosomal location, and population dynamics. Proceedings of the National Academy of Sciences 81(24): 8014–8018.

Sandlin K, Prothro J, Heesacker A, Khalilian N, Okashah R, Xiang W, et al. (2012). Comparative mapping in watermelon [Citrullus lanatus (Thunb.) Matsum. et Nakai]. Theor Appl Genet 125: 1603–1618.

Singh A, Knox RE, DePauw RM, Singh AK, Cuthbert RD, Campbell HL, et al. (2014). Stripe rust and leaf rust resistance QTL mapping, epistatic interactions, and co-localization with stem rust resistance loci in spring wheat evaluated over three continents. Theor Appl Genet 127: 2465–2477.

Sonah H, Deshmukh RK, Sharma A, Singh VP, Gupta DK, Raju N, Gacche RN, Rana JC, Singh NK, Sharma TR (2011). Genome-wide distribution and organization of microsatellites in plants: an insight into marker development in Brachypodium. PLoS ONE 6: e21298. https://doi.org/10.1371/journal.pone.0021298

SPSS I (2013). SPSS Statistics, version 20. IBM.

Sun L, Yang W, Zhang Q, Cheng T, Pan H, Xu ZJ, Chen C (2013). Genome-wide characterization and linkage mapping of simple sequence repeats in mei (Prunus mume Sieb. et Zucc.). PLoS ONE 8: e59562. https://doi.org/10.1371/journal.pone.0059562

Temnykh S, Declerck G, Lukashova A, Lipovich L, Cartinhour S, McCouch S (2001). Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential. Genome Res 11: 1441–1452. https://doi.org/10.1101/gr.184001

Udupa SM, Robertson LD, Weigand F, Baum M, Kahl G (1999). Allelic variation at (TAA)n microsatellite loci in a world collection of chickpea (Cicer arietinum L.) germplasm. Mol Gen Genet MGG 261: 354–363.

Varshney RK, Thiel T, Stein N, Langridge P, Graner A (2002). In silico analysis on frequency and distribution of microsatellites in ESTs of some cereal species. Cell Mol Biol Lett 7: 537–546.

Varshney RK, Hiremath PJ, Lekha P, Kashiwagi J, Balaji J, Deokar AA, Hoisington DA (2009). A comprehensive resource of drought- and salinity-responsive ESTs for gene discovery and marker development in chickpea (Cicer arietinum L.). BMC Genomics 10. https://doi.org/10.1186/1471-2164-10-523

Voorrips RE (2002). MapChart: software for the graphical presentation of linkage maps and QTLs. J Hered 93: 77–78. https://doi.org/10.1093/jhered/93.1.77

Wang Z, Li J, Luo Z, Huang L, Chen X, Fang B, et al. (2011). Characterization and development of EST-derived SSR markers in cultivated sweetpotato (Ipomoea batatas L.). BMC Plant Biol 11: 139.

Würschum T, Langer SM, Longin CFH, Korzun V, Akhunov E, Ebmeyer E, et al. (2013). Population structure, genetic diversity and linkage disequilibrium in elite winter wheat assessed with SNP and SSR markers. Theor Appl Genet 126: 1477–1486.

Xu J, Zhao Q, Du P, Xu C, Wang B, Feng Q, Liu Q, Tang S, Gu M, Han B, Liang G (2010). Developing high throughput genotyped chromosome segment substitution lines based on population whole-genome re-sequencing in rice (Oryza sativa L.). BMC Genomics 11: 656. https://doi.org/10.1186/1471-2164-11-656

Yang ZT, Wang MJ, Sun L, Lu SJ, Bi DL, Sun L, Song ZT, Zhang SS, Zhou SF, Liu JX (2014). The Membrane-Associated Transcription Factor NAC089 Controls ER-Stress-Induced Programmed Cell Death in Plants. PLoS Genet 10: e1004243. https://doi.org/10.1371/journal.pgen.1004243

Zhang LY, Bernard M, Leroy P, Feuillet C, Sourdille P (2005). High transferability of bread wheat EST-derived SSRs to other cereals. Theor Appl Genet 111: 677–687.

Zou C, Lu C, Zhang Y, Song G (2012). Distribution and characterization of simple sequence repeats in Gossypium raimondii genome. Bioinformation 8: 801. https://doi.org/10.6026/97320630008801

7. Supplementary materials

Supplementary materials are available on the publisher’s web site along with the published article.


Fig. 1. Cluster analysis of 23 accessions of Iranian chickpea using Dice’s similarity coefficient and the UPGMA algorithm, based on the identified SSR markers.


Fig. 2. Relative share and classification of SSR-containing enzymes according to the BLAST results.

Fig. 3. Chromosome location of SSR-containing TACs involved in different types of stress, shown in eight panels (Chromosome 1 to Chromosome 8). The map was drawn using MapChart software. (B: Bacterium; F: Fungus; V: Virus; N: Nematode; Vir: Viroid; BF: Bacterium and Fungus; VF: Virus and Fungus; C: Cold; D: Drought; H: Heat; O: Osmotic; S: Salt; CS: Cold and Salt; OS: Osmotic and Salt; SD: Salt and Drought; CB: Common Biotic; CA: Common Abiotic; CBA: Common Biotic and Abiotic. Scaling in Mb/mm.)

Table 1. Polymorphism information content (PIC) of six EST-SSR primer pairs in the 23 chickpea landraces

NO.  TAC code  SSR type   Function                                  Chromosome location  PIC
1    1626      (AG)20     Response to salt stress                   1                    0.385
2    972       (TAT)12    Response to drought stress                8                    0.489
3    1287      (AGAA)6    Response to osmotic and salt stresses     1                    0.385
4    1393      (GAA)12    Response to salt stress                   4                    0.499
5    808       (TTTAC)6   Response to osmotic and salt stresses     7                    0.226
6    384       (AGA)8     Response to salt and osmotic stresses     4                    0.226
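PIC values such as those in Table 1 are computed from the allele frequencies at each locus. The sketch below assumes the commonly used form PIC = 1 - sum(p_i^2), the form usually attributed to Anderson et al. (1993) in the reference list; whether exactly this formula was applied here is an assumption, and the allele calls in the example are invented for illustration.

```python
from collections import Counter


def pic(alleles):
    """Polymorphism information content for one locus.

    Uses PIC = 1 - sum(p_i ** 2), where p_i is the frequency of allele i
    among the scored accessions (assumed formula, see the lead-in).
    """
    if not alleles:
        raise ValueError("at least one allele call is required")
    counts = Counter(alleles)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical locus scored in 23 accessions: allele "A" in 12, allele "B" in 11.
calls = ["A"] * 12 + ["B"] * 11
print(f"PIC = {pic(calls):.3f}")
```

With only two alleles this measure cannot exceed 0.5, so values close to 0.5 indicate two alleles at nearly equal frequency, while low values indicate that one allele dominates.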

Table 2. Mean, lowest, highest and Kolmogorov-Smirnov uniformity testing for the frequency and distribution of the gSSRs per 5 Mb of the 8 chromosomes of chickpea

Chromosomes  Mean    Lowest  Highest  Standard deviation  Test value¶
1            441     329     549      79.94               0.72
2            419.88  229     530      94.25               1.1
3            482.75  413     582      53.71               0.79
4            473.5   406     546      45.02               0.43
5            452.5   225     658      140.38              0.52
6            487.3   359     566      77.60               1.03
7            469.9   314     564      77.30               0.85
8            380.5   162     545      160.59              0.66
Total        457.80  -       -        90.4                -

¶ The value of the Kolmogorov-Smirnov test is > 0.05, indicating that gSSRs on chickpea ...
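The uniformity testing summarized in Table 2 can be sketched as a one-sample Kolmogorov-Smirnov test against a uniform distribution. The version below is a minimal illustration that works on SSR start positions rather than on counts per 5 Mb, and it is not the SPSS procedure used in this study. It returns the distance D, the scaled statistic Z = sqrt(n) * D and an asymptotic p-value; which of these quantities corresponds to the "Test value" column is not stated in the table. The function name and the example positions are hypothetical.

```python
import math


def ks_uniform(positions, length):
    """One-sample Kolmogorov-Smirnov test of positions against Uniform(0, length).

    Returns (D, Z, p): the KS distance, Z = sqrt(n) * D, and the asymptotic
    p-value from the Kolmogorov distribution.
    """
    n = len(positions)
    if n == 0 or length <= 0:
        raise ValueError("need at least one position and a positive length")
    xs = sorted(p / length for p in positions)
    d = 0.0
    for i, x in enumerate(xs):
        # The empirical CDF jumps from i/n to (i+1)/n at x; the uniform CDF equals x.
        d = max(d, (i + 1) / n - x, x - i / n)
    z = math.sqrt(n) * d
    if z < 0.2:
        # The series below converges slowly for tiny z, where p is effectively 1.
        p = 1.0
    else:
        series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
                     for k in range(1, 101))
        p = min(max(2.0 * series, 0.0), 1.0)
    return d, z, p


# Hypothetical SSR start positions on a hypothetical sequence of length 1000.
evenly_spread = [(i + 0.5) * 10 for i in range(100)]
clustered = list(range(100))
for name, pos in (("evenly spread", evenly_spread), ("clustered", clustered)):
    d, z, p = ks_uniform(pos, 1000)
    print(f"{name}: D={d:.3f}, Z={z:.2f}, p={p:.3g}")
```

A p-value above 0.05 means that uniformity is not rejected, which is the sense in which the footnote to Table 2 reads the test.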

Table 3. Distribution pattern of perfect and imperfect SSRs in genomic DNA and transcriptome of the chickpea

                          Number of loci (perfect and imperfect SSRs)
Data          Chromosome  1       2      3    4   5   Total number
Genomic data  1           3,984   342    64   9   11  4,410
Genomic data  2           3,079   223    34   11  12  3,359
Genomic data  3           3,507   300    40   12  3   3,862
Genomic data  4           4,335   352    38   5   5   4,735
Genomic data  5           4,034   396    62   13  20  4,525
Genomic data  6           5,341   409    72   14  14  5,850
Genomic data  7           4,258   362    43   6   30  4,699
Genomic data  8           1,412   93     12   2   3   1,522
Genomic data  Total       29,950  2,477  365  72  98  32,962
Coding data   -           1,873   70     5    1   0   1,949

Table 4. Frequency of different types of motifs in TAC-SSRs and gSSRs in the chickpea (others: > 6)

                          Number of repeats
Motif length  Location    3     4     5     6    7    8     9    10    >10    Total  %Total
Di-           Genome      -     -     -     -    -    -     -    2318  13291  15609  42.93
Di-           Transcript  -     -     -     -    -    -     -    123   275    398    1.09
Tri-          Genome      -     -     -     -    620  967   608  459   5902   9556   26.28
Tri-          Transcript  -     -     -     -    251  109   50   22    44     476    1.31
Tetra-        Genome      -     -     1674  587  246  92    54   54    118    2825   7.77
Tetra-        Transcript  -     -     56    18   7    4     1    0     2      88     0.24
Penta-        Genome      -     1779  393   70   25   10    7    5     89     2378   6.54
Penta-        Transcript  -     115   25    5    6    0     0    0     0      151    0.42
Hexa-         Genome      -     906   212   69   27   22    7    4     63     1310   3.60
Hexa-         Transcript  -     199   52    15   8    3     0    0     0      277    0.76
Others        Genome      2588  327   81    29   18   14    16   4     89     3166   8.71
Others        Transcript  95    24    3     1    0    0     0    0     1      124    0.34
Total SSRs    Genome      2588  3012  2360  755  936  1105  692  2844  19552  -      -
Total SSRs    Transcript  95    338   136   39   272  116   51   145   322    -      -

Table 5. Chromosomal location and putative stress-related function of the SSR-containing TACs

Columns: Stress; Chromosome Number; TAC-SSRs Frequency.

Biotic Stress: Bacterium; Fungus; Virus; Nematode; Viroid; Fungus and Virus; Insect, Bacterium, Fungus; Bacterium and Fungus.

Abiotic Stress: Cold; Drought; Heat; Salt; Osmotic; Salt and Cold; Salt and Osmotic; Drought and Salt.

Common Stress: Biotic; Abiotic; Biotic and Abiotic.