Usefulness of COMT gene polymorphisms in North African populations


Sami Boussetta, Lotfi Cherni, Andrew J. Pakstis, Nesrine Ben Salem, Sarra Elkamel, Houssein Khodjet-el-Khil, Kenneth K. Kidd, Amel Ben Ammar Elgaaied

PII: S0378-1119(19)30146-5
DOI: https://doi.org/10.1016/j.gene.2019.02.021
Reference: GENE 43602

To appear in: Gene

Received date: 23 August 2018
Revised date: 8 January 2019
Accepted date: 1 February 2019

Please cite this article as: S. Boussetta, L. Cherni, A.J. Pakstis, et al., Usefulness of COMT gene polymorphisms in North African populations, Gene, https://doi.org/10.1016/j.gene.2019.02.021

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Sami Boussetta (1), Lotfi Cherni (1,2), Andrew J. Pakstis (3), Nesrine Ben Salem (1), Sarra Elkamel (1), Houssein Khodjet-el-Khil (1,4), Kenneth K. Kidd (3), Amel Ben Ammar Elgaaied (1)

1 Laboratory of Genetics, Immunology and Human Pathology, Faculty of Science of Tunis, University of Tunis El Manar, 2092 Tunis, Tunisia
2 High Institute of Biotechnology, University of Monastir, 5000 Monastir, Tunisia
3 Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA
4 Department of Biomedical Sciences, College of Health Sciences, Qatar University, Doha, Qatar

Abstract: The COMT gene encodes catechol-O-methyltransferase, an enzyme that plays a major role in the regulation of synaptic catecholamine neurotransmitters. We investigated 4 markers of the COMT gene (rs2020917, rs4818, rs4680, rs9332377) in 6 Tunisian populations and a pool of Libyans. Our objective was to determine the distribution of allelic, genotypic and haplotypic frequencies in comparison with the populations of the 1000 Genomes Project and 59 populations from the Kidd Lab dataset.

The allelic frequencies established for these SNPs in the North African populations are similar to those of Europeans and South Asians. Linkage disequilibrium between these SNPs and haplotype frequencies differ between populations, and the clustering of populations by geographic origin in principal component analysis (PCA) was clearer when haplotypic frequencies were used. COMT activity prediction by haplotype genotyping could be limited to the rs4818-rs4680 micro-haplotypes. The low activity haplotype (CG) displays its highest frequency in African populations (55%); in the 59 Kidd Lab populations we also found that Sub-Saharan Africans, Native Americans, and some East Asian and Pacific Island populations all have frequencies in the 50-81% range for (CG), whereas its lowest frequency was found in Europeans (10%); these results were also confirmed for Southwest Asians. North Africans and South Asians have intermediate and approximately similar frequencies (20% and 25%). Europeans show the highest frequencies of haplotypes with predicted high and medium activity, in contrast to Africans. North Africans and South Asians present similar results for all categories of COMT activity predicted by haplotype genotyping. The high level of genetic diversity of COMT haplotypes not only allows populations to be distinguished according to their settlement history, origin and ethnicity, it also constitutes a basis for studies of association of COMT gene polymorphisms with pathologies and drug response, and for forensic investigation in North African populations.

1. INTRODUCTION

The COMT gene is located on chromosomal band 22q11.2, extends over 28 kb, contains six exons, and encodes catechol-O-methyltransferase (Weinshilboum and Raymond, 1977; Grossman et al., 1992; Lundström et al., 1995; Bearden et al., 2005). This enzyme plays a major role in the regulation of synaptic catecholamine neurotransmitters (Badner and Gershon, 2002; Anna et al., 2017). The COMT enzyme transfers a methyl group (-CH3) onto catecholamines such as dopamine, epinephrine (adrenaline) and norepinephrine, as well as onto various drugs or substances with a catechol structure (Lachman et al., 1996; Huang et al., 2016). The catecholamines are degraded following this methylation. There are two isoforms of the COMT enzyme: the short form is soluble (S-COMT) and expressed in peripheral tissues, while the long form is bound to the cell membrane (MB-COMT) and mainly expressed by brain neurons (Nissinen and Mannisto, 2010). Both forms are normal developmental isoforms resulting from alternative transcription initiation positions.

COMT is a therapeutic target since the regulation of catecholamines is impaired in several pathologies such as hypertension (Annerbrink et al., 2008; Htun et al., 2011), cardiovascular diseases (Hall et al., 2014), asthma (Fenech and Hall, 2002) and Parkinson's disease (Jimenez-Jimenez et al., 2014; Lin et al., 2017; Xiao et al., 2017). One of the most common therapeutic strategies is to use a drug that targets COMT enzymatic activity. However, this enzyme activity was shown to vary with a polymorphism that results in an amino acid substitution. Many variants of the COMT gene have been described as associated with the risk of various neuropsychiatric diseases such as schizophrenia (Lo Bianco et al., 2013; Lacerda-Pinheiro et al., 2014), panic disorder (Lonsdorf et al., 2010; Konishi et al., 2014; Asselmann et al., 2018), bipolar disorder (Hosang et al., 2017; Miskowiak et al., 2017; Minassian et al., 2018) and anorexia nervosa (Brandys et al., 2012; Favaro et al., 2013; Peng et al., 2016).

The most studied polymorphism in this gene is rs4680, a non-synonymous SNP at codon position 108 of the soluble isoform and position 158 of the membrane-bound isoform (Bertocci et al., 1991; Grossman et al., 1992; Winqvist et al., 1992; Tenhunen et al., 1994; Lundström et al., 1995; Nedic et al., 2011). The ancestral allele (G) is located in a triplet encoding the amino acid valine. The triplet with the derived allele (A) encodes the amino acid methionine. The amino acid substitution alters the structure and lowers the enzymatic activity of COMT: the Met allele is associated with a 3- to 4-fold decrease in COMT activity compared to the Val allele (Lachman et al., 1996). Hence, individuals who have the Met/Met genotype catabolize dopamine at a lower rate than individuals with the homozygous ancestral state Val/Val. Heterozygous individuals have an intermediate activity level as the two alleles are co-dominant. The decrease in COMT activity due to the Met/Met genotype promotes the accumulation of dopamine. This increase of dopamine is associated with several neuropsychological disorders (Dobryakova et al., 2015). In addition to the demonstrated importance of the 108 S-COMT / 158 MB-COMT polymorphism, some studies have shown that other, synonymous SNPs in this gene may affect its expression and decrease the enzymatic activity of COMT (Nackley et al., 2006).
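The codon change behind rs4680 can be illustrated with a minimal sketch. It assumes the codon pair implied by a single G-to-A change at the first codon position, GTG (Val) to ATG (Met); the partial codon table and the function name are illustrative and are not taken from the study.

```python
# Partial codon table: only the two codons needed for this illustration.
CODON_TO_AA = {"GTG": "Val", "ATG": "Met"}

def rs4680_amino_acid(allele: str) -> str:
    """Return the amino acid encoded when `allele` (G or A) occupies the
    first position of the codon, assuming the codon is GTG or ATG."""
    if allele not in ("G", "A"):
        raise ValueError("rs4680 alleles are G (ancestral) or A (derived)")
    codon = allele + "TG"
    return CODON_TO_AA[codon]

for allele in ("G", "A"):
    print(allele, rs4680_amino_acid(allele))
```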

SNP rs4818 is a synonymous variant that does not change the amino acid sequence of the enzyme. It has been shown to be associated with the activity level of COMT, the high activity allele of rs4818 being (G) (Sagud et al., 2018). This polymorphism has been shown to produce greater variation in COMT activity than rs4680 (Nackley et al., 2006). The homozygous (GG) state is associated with a high activity of the enzyme, the heterozygous genotype has an intermediate activity, and the (CC) genotype has a weak enzymatic activity (Barbosa et al., 2012; Sagud et al., 2018).

The two SNPs rs4818 and rs4680 presented above are part of the haploblock (rs6269, rs4633, rs4818, rs4680), which presents 3 major haplotypes that influence the enzymatic activity of COMT. This activity is inversely related to pain sensitivity in a chronic pain syndrome: the haplotype (GCGG) has a high activity associated with low pain sensitivity (LPS), the haplotype (ATCA) has an intermediate activity with average pain sensitivity (APS), while the haplotype (ACCG), which has a low enzymatic activity, is considered associated with high pain sensitivity (HPS) (Diatchenko et al., 2005; Nackley et al., 2006).

Haplotype activity can be predicted according to a classification found in the pharmacogenomics database PharmGKB (https://www.pharmgkb.org/haplotype/PA165947990); this classification was based on previous studies (Nackley et al., 2006; Bitsios and Roussos, 2011).

Moreover, micro-haplotypes from the COMT gene are very powerful forensic markers since the SNPs that compose them are very close together, with an extremely low recombination rate, which makes it possible to distinguish between human groups and justifies their major utility in genetic anthropology (Kidd et al., 2013; Kidd et al., 2017). Haplotypes from the COMT gene have already been used as micro-haplotypes in a panel developed by the Kidd Lab (Kidd et al., 2014) and they can be found in the ALFRED database (https://alfred.med.yale.edu/alfred/microhaps.asp). The distance between the two SNPs rs4818 and rs4680 is 65 bp and their micro-haplotype name is mh22KK-060. These haplotypic markers give more informative results than single SNPs.

There are very few data about COMT gene polymorphisms in North African populations; only recent work on Tunisian patients with cancer has studied the effect of rs4680 of the COMT gene on Mu opioid receptor function and the clinical efficacy of morphine, without finding any association (Chatti et al., 2016; Chatti et al., 2017).

Our investigation of the COMT gene focused on 4 markers (rs2020917, rs4818, rs4680, rs9332377), especially rs4680, in North African populations. Our objective is to determine the distribution of allelic, genotypic and haplotypic frequencies in comparison with other populations. The results will be discussed according to the complexity of the settlement history of the North African populations on the one hand, and according to the functional effect of the variants and haplotypes in relation to environmental adaptation, way of life, and pharmacogenetics on the other.

2. Materials and methods

2.1. DNA samples and COMT SNP typing

We collected samples from 275 North African individuals, including 219 Tunisians from 6 populations well distributed throughout Tunisia: Kesra (n = 39) in the north; Sousse (n = 37), Mahdia (n = 31) and Kairouan (n = 31) in the center; Smar (n = 50) in the south; and the population of Kerkennah island (n = 31); in addition to 56 Libyans. All individuals sampled were unrelated and healthy (Figure 1), and all gave informed consent for the study of DNA sequence variants.

Total human genomic DNA was isolated from peripheral blood samples collected into EDTA tubes using the phenol-chloroform method.

The 4 SNPs (rs2020917, rs4818, rs4680, rs9332377) extend over 27 of the 28 kb of the COMT gene and were typed in 3 µl reactions using TaqMan® Assay-on-Demand assays following the manufacturer's protocol. Assays were obtained from Applied Biosystems, Thermo Fisher (AB TaqMan catalog numbers C__11731880_1_, C___2538750_10, C__25746809_50 and C__29614343_10, respectively). 384-well plates were read on an AB7900 thermal cycler using SDS software. These SNP frequency results are in Appendix Table A; see also the ALFRED database (Rajeevan et al., 2012; Cherni et al., 2016) at https://alfred.med.yale.edu.

Figure 1: Map locations of the 7 North African populations analyzed in this study.

2.2. Statistical analysis

The analysis of allelic and genotypic frequencies was performed using Plink 1.09 software (Purcell et al., 2007) (http://pngu.mgh.harvard.edu/purcell/plink/), and haplotypes were determined with Phase v2.1.1 software (Stephens et al., 2001; Stephens and Scheet, 2005). Linkage disequilibrium (LD) between the studied SNPs was determined with Haploview software for all the North African populations.

For comparative analyses we used data from the 59 Kidd Lab populations (Cherni et al., 2016; Brissenden et al., 2015) and the 26 worldwide populations of the 1000 Genomes Project (1KG). The haplotypic data for the 4 SNPs were downloaded from the LDlink website (Machiela and Chanock, 2015) for all populations of the 1000 Genomes Project. Data obtained from the 7 North African populations analyzed in this study were merged with data from the 1KG project subset (Population list file, supplementary material). The Fst distance matrix was calculated using the ARLEQUIN v3.11 computer package (Excoffier and Lischer, 2010) and Principal Component Analysis (PCA) was performed with PAST software (Hammer et al., 2001).
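To make the Fst distance concrete, here is a minimal sketch of Wright's Fst for one biallelic SNP and two populations, computed from expected heterozygosities. It is a simplification with equal population weights and is not the estimator implemented in ARLEQUIN; the function name is hypothetical, and the two inputs are the rs4680 G allele frequencies reported for Sousse and Smar in Table 1.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's Fst = (HT - HS) / HT for one biallelic locus and
    two equally weighted populations (illustrative simplification)."""
    h1 = 2 * p1 * (1 - p1)          # expected heterozygosity, population 1
    h2 = 2 * p2 * (1 - p2)          # expected heterozygosity, population 2
    hs = (h1 + h2) / 2              # mean within-population heterozygosity
    p_bar = (p1 + p2) / 2           # allele frequency of the pooled population
    ht = 2 * p_bar * (1 - p_bar)    # total expected heterozygosity
    return 0.0 if ht == 0 else (ht - hs) / ht

# rs4680 G allele frequency: Sousse (0.6351) vs Smar (0.5), from Table 1
fst = fst_two_populations(0.6351, 0.5)
print(round(fst, 4))
```

A full distance matrix would repeat this for every pair of populations and combine the 4 SNPs; how that combination is done depends on the software and is not specified here.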

3. RESULTS

3.1. Allelic and genotypic frequencies

The analysis of allelic and genotypic frequencies in the North African populations for all the SNPs studied in this work shows no significant deviation (at P<0.01) from Hardy-Weinberg equilibrium (Table S1, supplementary material) in the Tunisian and Libyan populations analyzed. Moreover, the allele frequencies of 3 SNPs (rs2020917, rs4818, rs4680) are almost identical in all populations except Sousse, where the frequency of the A allele coding for Met at SNP rs4680 is 36.5%, lower than in the other North African populations, where it varies between 45% and 50%. Hence, the population of Sousse presents the lowest frequency (8%) of the Met/Met genotype among the analyzed populations and the highest frequency (35%) of the Val/Val genotype (Table 1).
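The Hardy-Weinberg check can be sketched as a chi-square test with one degree of freedom. The genotype counts used below (13 Val/Val, 21 Val/Met, 3 Met/Met) are an assumption: they are back-calculated from the Sousse percentages quoted above (35% and 8% of n = 37) and agree with the Table 1 allele counts (47 G, 27 A), but the actual counts are in Table S1. The function name is illustrative, and the software used in the study may apply a different test.

```python
import math

def hwe_chi_square(n_gg: int, n_ga: int, n_aa: int):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.
    Returns (chi2, p_value)."""
    n = n_gg + n_ga + n_aa
    p = (2 * n_gg + n_ga) / (2 * n)   # frequency of allele G
    q = 1 - p                         # frequency of allele A
    expected = (n * p * p, n * 2 * p * q, n * q * q)
    observed = (n_gg, n_ga, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Assumed Sousse genotype counts at rs4680 (see the note above)
chi2, p_value = hwe_chi_square(13, 21, 3)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2f}")
```

Under these assumed counts the p-value is well above the 0.01 threshold used in the text, which is consistent with the reported absence of significant deviation.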

PT E

Table1: Allele frequencies in North African populations at 4SNPs (rs2020917, rs4818, rs4680, rs9332377) of the COMT gene (2n: Number of analyzed chromosome) Kerkennah

Kesra

Mahdia

Smar

Sousse

Libya

(2n= 62)

(2n= 62)

(2n= 78)

(2n= 42)

(2n= 100)

(2n= 74)

(2n= 112)

21 (0.3387)

C

41 (0.6613)

rs2020917

19 (0.3065)

21 (0.2692)

20 (0.3226)

22 (0.22)

32 (0.4324)

33 (0.2946)

43 (0.6935)

57 (0.7308)

42 (0.6774)

78 (0.78)

42 (0.5676)

79 (0.7054)

AC

T

CE

Kairouan

rs4818

G

19 (0.3065)

19 (0.3065)

25 (0.3205)

21 (0.3387)

31 (0.31)

35 (0.473)

36 (0.3214)

C

43 (0.6935)

43 (0.6935)

53 (0.6795)

41 (0.6613)

69 (0.69)

39 (0.527)

76 (0.6786)

rs4680 A

30 (0.4839)

30 (0.4839)

35 (0.4487)

28 (0.4516)

50 (0.5)

27 (0.3649)

55 (0.4911)

G

32 (0.5161)

32 (0.5161)

43 (0.5513)

34 (0.5484)

50 (0.5)

47 (0.6351)

57 (0.5089)

rs9332377 T

11 (0.1774)

10 (0.1613)

15 (0.1923)

15 (0.2419)

9 (0.09)

13 (0.1757)

18 (0.1607)

C

35 (0.8226)

52 (0.8387)

63 (0.8077)

47 (0.7581)

91 (0.91)

61 (0.8243)

94 (0.8393)

8

ACCEPTED MANUSCRIPT

In all 26 populations of the 1000 Genomes Project taken together, the G (Val) allele is the most common allele, with an average frequency of 63.1% (Table 2). In the American populations, frequencies are similar to that average value. The populations of East Asia and Africa have the highest average frequencies of the G (Val) allele, at 72%. Our results on SNP rs4680 show that the allelic frequencies of the North African populations studied in this paper are most similar to those of Europe and South Asia (Table 2).

Table 2: Allelic frequencies of rs4680 of the COMT gene in worldwide populations. Values are frequencies with allele counts in parentheses.

Population             G (Val)        A (Met)        Data from
ALL (1 KG Project)     0.631 (3159)   0.369 (1849)   1 KG
America                0.622 (432)    0.378 (262)    1 KG
  CLM                  0.638 (120)    0.362 (68)     1 KG
  MXL                  0.602 (77)     0.398 (51)     1 KG
  PEL                  0.647 (110)    0.353 (60)     1 KG
  PUR                  0.601 (125)    0.399 (83)     1 KG
East Asia              0.720 (726)    0.280 (282)    1 KG
  CDX                  0.731 (136)    0.269 (50)     1 KG
  CHB                  0.684 (141)    0.316 (65)     1 KG
  CHS                  0.719 (151)    0.281 (59)     1 KG
  JPT                  0.716 (149)    0.284 (59)     1 KG
  KHV                  0.753 (149)    0.247 (49)     1 KG
South Asia             0.559 (547)    0.441 (431)    1 KG
  BEB                  0.558 (96)     0.442 (76)     1 KG
  GIH                  0.563 (116)    0.437 (90)     1 KG
  ITU                  0.554 (113)    0.446 (91)     1 KG
  PJL                  0.479 (92)     0.521 (100)    1 KG
  STU                  0.637 (130)    0.363 (74)     1 KG
Europe                 0.500 (503)    0.500 (503)    1 KG
  CEU                  0.535 (106)    0.465 (92)     1 KG
  FIN                  0.409 (81)     0.591 (117)    1 KG
  GBR                  0.473 (86)     0.527 (96)     1 KG
  IBS                  0.528 (113)    0.472 (101)    1 KG
  TSI                  0.547 (117)    0.453 (97)     1 KG
Sub-Saharan Africa     0.719 (951)    0.281 (371)    1 KG
  ACB                  0.677 (130)    0.323 (62)     1 KG
  ASW                  0.730 (89)     0.270 (33)     1 KG
  ESN                  0.722 (143)    0.278 (55)     1 KG
  LWK                  0.712 (141)    0.288 (57)     1 KG
  GWD                  0.743 (168)    0.257 (58)     1 KG
  MSL                  0.765 (130)    0.235 (40)     1 KG
  YRI                  0.694 (150)    0.306 (66)     1 KG
North Africa           0.539 (295)    0.461 (255)    This study
  Kairouan             0.5161 (32)    0.484 (30)     This study
  Kerkennah            0.5161 (32)    0.484 (30)     This study
  Kesra                0.5513 (43)    0.449 (35)     This study
  Mahdia               0.548 (34)     0.452 (28)     This study
  Smar                 0.500 (50)     0.500 (50)     This study
  Sousse               0.635 (47)     0.365 (27)     This study
  Libya                0.509 (57)     0.491 (55)     This study

MXL: Mexican Ancestry from Los Angeles USA, PUR: Puerto Ricans from Puerto Rico, CLM: Colombians from Medellin, Colombia, PEL: Peruvians from Lima, Peru, CHB: Han Chinese in Beijing, China, JPT: Japanese in Tokyo, Japan, CHS: Southern Han Chinese, CDX: Chinese Dai in Xishuangbanna, China, KHV: Kinh in Ho Chi Minh City, Vietnam, GIH: Gujarati Indian from Houston, Texas, PJL: Punjabi from Lahore, Pakistan, BEB: Bengali from Bangladesh, STU: Sri Lankan Tamil from the UK, ITU: Indian Telugu from the UK, CEU: Utah Residents (CEPH) with Northern and Western European Ancestry, TSI: Toscani in Italy, FIN: Finnish in Finland, GBR: British in England and Scotland, IBS: Iberians in Spain, YRI: Yoruba in Ibadan, Nigeria, LWK: Luhya in Webuye, Kenya, GWD: Gambians in Western Divisions of Gambia, MSL: Mende in Sierra Leone, ESN: Esan in Nigeria, ASW: Americans of African ancestry in SW USA, ACB: African Caribbeans in Barbados.
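The North Africa row of Table 2 can be cross-checked against the per-population allele counts. The sketch below, with illustrative variable names, sums the rs4680 counts from Table 1 and computes both the pooled frequency and the unweighted mean of the seven population frequencies. The reported value of 0.539 appears to match the unweighted mean, whereas the pooled counts give 295/550, which is slightly lower.

```python
# rs4680 allele counts (G, A) per population, taken from Table 1.
counts = {
    "Kairouan": (32, 30), "Kerkennah": (32, 30), "Kesra": (43, 35),
    "Mahdia": (34, 28), "Smar": (50, 50), "Sousse": (47, 27),
    "Libya": (57, 55),
}

total_g = sum(g for g, _ in counts.values())
total_a = sum(a for _, a in counts.values())

# Pooled frequency: every chromosome carries the same weight.
pooled_g = total_g / (total_g + total_a)

# Unweighted mean: every population carries the same weight.
mean_g = sum(g / (g + a) for g, a in counts.values()) / len(counts)

print(total_g, total_a, round(pooled_g, 3), round(mean_g, 3))
```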

3.2. Heterozygosity analysis

We compared the North African populations according to their level of heterozygosity on the basis of their distance from the centroid, using GeDis 2.1 software available online at (http://www.ehu.eus/~ggppegaj/XVsoftware.html). Mahdia, Kesra and Sousse appear to be the highest outliers; the populations of Kairouan, Kerkennah and Libya are close to the theoretical regression line; and the population of Smar lies significantly under the regression line (Figure F1, supplementary material).

When we considered all populations from the 1000 Genomes Project along with the North Africans considered in this study, the plot of the correlation between distance from the centroid and heterozygosity level based on the 4 SNPs of the COMT gene (Figure 2) reveals an excess of heterozygosity among the North African populations. The populations of Mahdia, Kesra and Sousse appear to be the highest outliers and no population is close to the theoretical regression line. Only the population of Smar was under the regression line, suggesting limited gene flow in its genetic background compared to the other North African populations.
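The heterozygosity axis of such a plot can be approximated from allele frequencies alone. This is a minimal sketch, assuming expected heterozygosity 2p(1-p) per biallelic SNP averaged over the 4 SNPs; the allele frequencies are taken from Table 1, the function name is illustrative, and this is not the GeDis implementation.

```python
def mean_expected_heterozygosity(freqs):
    """Average of 2p(1-p) over biallelic SNPs, given one allele
    frequency p per SNP."""
    return sum(2 * p * (1 - p) for p in freqs) / len(freqs)

# Allele frequencies at rs2020917 (T), rs4818 (G), rs4680 (A), rs9332377 (T)
smar = [0.22, 0.31, 0.50, 0.09]
sousse = [0.4324, 0.473, 0.3649, 0.1757]

h_smar = mean_expected_heterozygosity(smar)
h_sousse = mean_expected_heterozygosity(sousse)
print(round(h_smar, 3), round(h_sousse, 3))
```

Under these assumptions Smar has a lower average heterozygosity than Sousse, which is in line with their respective positions relative to the regression line.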

Figure 2: Plot of average heterozygosity vs distance from allele frequency centroid for all populations (1000 Genomes populations and North Africans) according to heterozygosity at 4 SNPs (rs2020917, rs4818, rs4680, rs9332377) of the COMT gene (H: heterozygosity; Ri: distance from centroid)

We performed a Principal Component Analysis (PCA) using the Fst distance matrix based on the frequencies of the 4 SNPs, comparing the North African populations with the other populations reported in the 1000 Genomes Project. The two axes absorb 83.2% of the variation, with 64.2% for the first axis (PC1) and 18.9% for the second (PC2). This analysis shows a grouping of African populations in a single cluster. It also shows a dispersion of American populations, which is expected given the heterogeneity of these populations. Moreover, we observed a grouping of North African populations that are close to the Eurasian populations. However, this analysis revealed that the population of Sousse has a very distinct genetic structure compared to the other North African populations (SM1, supplementary material).

3.3. Linkage disequilibrium

We compared the linkage disequilibrium (LD) structure of the COMT markers (rs2020917, rs4818, rs4680, rs9332377) between the studied populations. Table 3 gives r2 and D' values for each pair of SNPs in the North African populations. The two pairs (rs4818-rs4680) and (rs4680-rs9332377) have the highest LD levels.

Populations

Kerkennah

Mahdia

Table 3: Linkage disequilibrium between every pair of the studied SNPs (rs2020917, rs4818, rs4680, rs9332377) of the COMT gene in North African populations

Sousse

Pairs of SNP 0.569

rs2020917

rs4680

0.149

0.538

rs2020917

rs9332377

0.165

rs4818

rs4680

0.869

rs4818

rs9332377

0.636

rs4680

rs9332377

0.753

NU

0.488

Kesra

Libya

0.666

0.854

0.834

0.585

0.644

0.880

0.868

0.644

0.516

0.716

MA

rs4818

0.550

0.010

0.072

0.128

0.148

0.857

0.905

1

1

1

1

0.377

0.774

0.364

0.497

0.443

0.686

1

1

1

1

1

D

0.026

PT E

0.615

0.238

0.301

0.377

0.457

0.600

0.268

0.365

0.011

0.114

0.339

0.212

0.199

0.080

0.207

rs9332377

0.002

0

0.084

0

0.002

0.011

0.010

rs4680

0.356

0.310

0.422

0.449

0.414

0.384

0.457

rs9332377

0.176

0.088

0.142

0.029

0.121

0.099

0.190

rs9332377

0.116

0.099

0.122

0.099

0.202

0.194

0.185

Pairs of SNP rs4818

rs2020917

rs4680

rs2020917 rs4818

AC

CE

rs2020917

rs4680

Kairouan

D'

rs2020917

rs4818

Smar

r2

As shown in Figure 3, the North African populations studied display different levels of linkage disequilibrium (LD): the population of Sousse has the highest values, followed by the Libyan population. The two SNPs that are in the same LD block (rs4818-rs4680) have moderate LD as measured by r2 and high LD as measured by D'. Except in the Kerkennah and Mahdia populations, LD is also complete (D' = 1) between SNPs rs4680 and rs9332377.
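The difference between the two LD measures can be illustrated with a minimal sketch that computes D' and r2 from two-SNP haplotype frequencies. The input uses the Smar rs4818-rs4680 haplotype frequencies from Table 4 (CG 0.19, CA 0.50, GG 0.31) and assumes that the fourth haplotype (GA) is absent; function and variable names are illustrative, and Haploview's estimates from genotype data may differ.

```python
def ld_stats(f_gg: float, f_ga: float, f_cg: float, f_ca: float):
    """Return (D', r2) for two biallelic SNPs from haplotype frequencies.
    First letter: rs4818 allele (G/C); second letter: rs4680 allele (G/A)."""
    total = f_gg + f_ga + f_cg + f_ca
    if abs(total - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must sum to 1")
    p_g1 = f_gg + f_ga              # frequency of G at the first SNP
    p_g2 = f_gg + f_cg              # frequency of G at the second SNP
    d = f_gg - p_g1 * p_g2          # raw disequilibrium coefficient D
    r2 = d * d / (p_g1 * (1 - p_g1) * p_g2 * (1 - p_g2))
    if d >= 0:
        d_max = min(p_g1 * (1 - p_g2), (1 - p_g1) * p_g2)
    else:
        d_max = min(p_g1 * p_g2, (1 - p_g1) * (1 - p_g2))
    return abs(d) / d_max, r2

d_prime, r2 = ld_stats(f_gg=0.31, f_ga=0.0, f_cg=0.19, f_ca=0.50)
print(round(d_prime, 3), round(r2, 3))
```

With one of the four possible haplotypes absent, D' reaches 1 while r2 stays moderate, which is the pattern described above for the rs4818-rs4680 pair.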

Figure 3: Linkage disequilibrium patterns between SNPs (rs2020917, rs4818, rs4680, rs9332377) of the COMT gene in North African populations

3.4. Analysis of haplotype frequencies

We calculated the haplotype frequency distribution formed by the 4 SNPs of the COMT gene in each population (Figure 4). Thirteen different haplotypes were found in the North African populations studied. The six most common haplotypes occur in all seven populations and account for 80 to 95% of the variation in each group. The most abundant haplotype, CCAC, accounting for 224/550 (or 40.7%) of all North African chromosomes, is least frequent in Sousse (32%) and most frequent in Smar (48%). The least frequent haplotypes (CGAT and TCGT) were each encountered only once, in the population of Mahdia.

Haplotypic diversity of the 4 SNPs studied has also been compared to the 26 populations of the 1000 Genomes Project (Figure 5a). Fourteen different haplotypes were found and the populations from each geographical area have a similar haplotypic profile. The haplotypic frequencies are similar among all the Africans; a different pattern of haplotypic frequencies is shared by all the East Asian populations. Still another overall profile seems similar among Europeans, South West Asians and North Africans.

Haplotype legend for Figures 4, 5a and 5b: CCAC, CCAT, CCGC, CCGT, CGAC, CGAT, CGGC, CGGT, TCAC, TCAT, TCGC, TCGT, TGAC, TGGC, TGGT.

Figure 4: Haplotypic distribution of the COMT gene SNPs (rs2020917, rs4818, rs4680, rs9332377) in Tunisian and Libyan populations.

Figure 5a: Worldwide (1000 Genomes Project data) haplotypic distribution of the 4 SNPs (rs2020917, rs4818, rs4680, rs9332377) of the COMT gene

Figure 5b: Worldwide (Kidd Lab data) haplotypic distribution of the 4 SNPs (rs2020917, rs4818, rs4680, rs9332377) of the COMT gene. Africa_Central (Biaka, Lisongo, Mbuti); Africa_East (Chagga, Ethiopian Jews, Masai, Sandawe, Zaramo); Africa_West (Hausa, Ibo, Yoruba); East_Asia (Chinese_Taiwan, Japanese, Koreans, Laotians, Malaysians, Outer Mongolians, Tsaatan, Chinese_San-Francisco, Ami, Atayal, Hakka); Europe_Central (Hungarians); Europe_North (Chuvash, Danes, Finns, Irish, Russians_Archangelsk, Russians_Vologda); Europe_South (Roman Jews, Sardinians, Adygei); European_ancestry (EuroAmericans); North_America (Plains AmerIndians, Maya, SouthWest_AmerIndians, Pima_Mexico); Pacific (Micronesians, Nasioi, Melanesia, Papuans_NewGuinea, Samoans); Siberia (Khanty, Komi_Zyriane, Yakut); South_America (Guihiba, Karitiana, Quechua, Rondonian_Surui, Ticuna); South Central_Asia (Keralites, Thoti, Kachari); South East_Asia (Cambodians); South West_Asia (Ashkenazi Jews, Druze, Kuwaiti, Samaritans, Yemenite Jews); West African & European ancestry (African Americans).

Figure 6a: Principal component analysis using haplotypic frequencies of 4 SNPs (rs2020917, rs4818, rs4680, rs9332377) of the COMT gene (PC1 and PC2)

Figure 6b: Principal component analysis using haplotypic frequencies of 4 SNPs (rs2020917, rs4818, rs4680, rs9332377) of the COMT gene (PC1 and PC3)

The principal component analysis (PCA) was also performed using the haplotype frequencies estimated for each population. Results reveal a grouping of the populations according to their geographical origins (Figure 6a). This grouping is more obvious than the one obtained with the PCA based on the Fst distance matrix computed from the individual SNP frequencies. The PCA performed using haplotypic frequencies shows that PC1 (56.88%) separates Europeans from the other populations (Figure 6a). PC2 (23.16%) separates Africans and East Asians from the other populations; PC3 (10.77%), especially in combination with PC1, separates the North African populations from the South Asian populations (Figures 6a & b).

The North Africans studied in this paper are closer to the European cluster than to the South Asians. According to this multivariate analysis, the population of Sousse is still differentiated from the other North African populations but to a lesser extent. In order to confirm what we observed with the populations of the 1000 Genomes Project, we used the results for the same 4 SNPs from 59 Kidd Lab populations, well classified according to their geographical and predominant ancestries, in order to carry out a more refined evaluation that respects a micro-geographical approach. These results confirm those found with the populations of the 1000 Genomes Project; they also show that the populations of North Africa have a profile similar to that of the populations of Europe and South West Asia.

3.5. Haplotype activity prediction

We compared the haplotype frequencies of the haploblock (rs6269, rs4633, rs4818, rs4680) in the 5 population groups (Africans, Americans, East Asians, Europeans and South Asians) with the frequencies of the micro-haplotypes limited to only two SNPs (rs4818-rs4680), since the two SNPs rs6269 and rs4633 were not typed in the Tunisian and Libyan population samples (SM2, supplementary material). Within each group, we found very similar or even identical frequencies for the haploblocks and the corresponding micro-haplotypes, due to very strong LD. An exception is seen for African populations (Figure 7b), where LD is lower than in the other groups; the same results on the haplotypic frequencies were also confirmed with data from the 59 Kidd Lab populations for European, African, South West Asian and South Central Asian populations.

The results are different for populations of the Americas, reflecting varying amounts of European, West African, and Native American ancestry and a high level of admixture between the Europeans and the West Africans (Gravel et al., 2013).

We can therefore assume that the rs4818-rs4680 haplotype (GG) is associated with high activity (Sagud et al., 2018), while the other haplotypes are of either medium or low activity. Predicting activity from haplotypes rather than from rs4680 alone is important since the ancestral allele (G) is present in both high and low activity haplotypes (Diatchenko et al., 2005).
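The activity classes used here can be written as a simple lookup. This is a minimal sketch, assuming the three-class mapping described in this paper for rs4818-rs4680 micro-haplotypes (GG high, CA medium, CG low); the function name and the "unclassified" label for any other haplotype are illustrative assumptions.

```python
# Predicted COMT activity per rs4818-rs4680 micro-haplotype
# (first letter: rs4818 allele, second letter: rs4680 allele).
ACTIVITY = {"GG": "high", "CA": "medium", "CG": "low"}

def predict_activity(haplotype: str) -> str:
    """Map a two-letter micro-haplotype to its predicted activity class."""
    return ACTIVITY.get(haplotype.upper(), "unclassified")

# The G allele of rs4680 alone does not determine the class:
# it occurs in both the high (GG) and the low (CG) activity haplotypes.
for hap in ("GG", "CG", "CA"):
    print(hap, predict_activity(hap))
```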

According to the results found in Tunisian populations, the low activity haplotype has similar frequencies in all populations (18 to 23%), whereas the population of Sousse displayed the lowest frequency (35%) of the medium activity haplotype and the highest frequency (46%) of the high activity haplotype (Table 4 and Figure 7a).

Table 4: Frequencies of (rs4818-rs4680) haplotypes found in North African populations classified according to their activity.

Haplotypes (rs4818-rs4680)

Population   Haplotype 1       Haplotype 2         Haplotype 3
             Low activity CG   Medium activity CA  High activity GG
Kairouan     0.19              0.5                 0.29
Kesra        0.23              0.45                0.32
Kerkennah    0.21              0.48                0.31
Mahdia       0.23              0.44                0.32
Smar         0.19              0.5                 0.31
Sousse       0.18              0.35                0.46
Libya        0.18              0.49                0.32

Figure 7: Frequencies of (rs4818-rs4680) haplotypes (CG, GG, CA) classified according to their activity in North African populations (a) compared to worldwide populations (b). AFR: Africans, AMR: Americans, EAS: East Asia, EUR: Europe, SAS: South Asia, NAF: North Africa

NU

The results show that the Low activity haplotype of rs4818-rs4680 (CG) has its highest frequency in African populations (55%) and its lowest in European populations (10%), while North African and South Asian populations have intermediate and approximately similar frequencies (20% and 25%). In contrast to Africans, Europeans show the highest frequencies of the high and medium activity haplotypes. North Africans and South Asians present similar results for all categories of COMT activity predicted by haplotype genotyping (Figure 7b).

The results from the 1000 Genomes Project, and especially from the Kidd Lab populations, make it possible to estimate the frequency of each COMT gene haplotype according to its level of activity in worldwide populations, taking into account geographical regions or predominant ancestries.

4. Discussion

In the present paper we analyzed 4 SNPs (rs2020917, rs4818, rs4680 and rs9332377) of the COMT gene in 6 Tunisian populations and in Libyans. The results are discussed first in light of the complex settlement history of the North African populations and then according to the functional effect of the variants and haplotypes in relation to environmental adaptation, way of life, and pharmacogenetics.

The analysis of SNP allelic and haplotypic frequencies by PCA, distance from centroid, and heterozygosity revealed three main features of North African populations. First, most of the North African populations are similar to one another. Second, the level of heterozygosity suggests a high level of admixture in the North African populations. Third, the allelic and haplotypic frequencies of the North African populations are similar to those of Europe and Southwest Asia but are very distinct from those of Sub-Saharan Africans and East Asians. Our results are in agreement with those of Mukherjee et al. (2010), though they derive from different haplotype combinations of the COMT gene. They are also consistent with previous studies using other markers (Elkamel et al., 2017; Frigi et al., 2017; Hajjej et al., 2017; Elkamel et al., 2018) and with the settlement history of these populations. North Africa has a complex human demographic history: many waves of migration from Africa, Europe and the Middle East have colonized this region (Cherni et al., 2016) during pre-historic and modern times.

In this work, clustering of populations according to their geographic origin was more clearly visualized using haplotypic frequencies than using the Fst distance matrix based on individual SNP frequencies, which shows that haplotypes are more informative for distinguishing between populations of different ethnicity or geographic origin. The distinctive geographical clustering of populations from different continental regions is quite impressive considering that these results are based on variation at only 4 SNPs in one gene. Such results usually are not seen until one includes data from many, often dozens, of independent polymorphisms. Since other studies have also shown that COMT gene frequencies differ according to ethnicity (Palmatier et al., 1999; DeMille et al., 2002; Mukherjee et al., 2010; Gonzalez-Castro et al., 2013) in populations sampled worldwide, COMT gene polymorphisms could be useful for forensic studies.

Within the cluster of North African populations, the Sousse population sample is somewhat different from the other six populations. The reasons for this isolation, which has also been shown by other genetic studies (Fadhlaoui-Zid et al., 2015), need to be analyzed in more depth. It seems that the particular genetic structure of Sousse is related more to the high level of admixture of this population than to a lack of gene flow. Indeed, this city was founded by Phoenicians around 1100 BC. Interestingly, it is one of the rare Punic towns that escaped destruction by the Romans during the Punic wars. The presence of the Phoenician Y chromosome at a frequency of about 10% in Sousse (Fadhlaoui-Zid et al., 2012; Fadhlaoui-Zid et al., 2015) confirms the continuity of this population since its foundation. Moreover, its location on the sea and its economic role, leading to commercial and human exchanges, would explain the diversity observed at the haplotypic level.

While analysis of haplotype distribution gives information about population relationships, the study of linkage disequilibrium may be related to the time of settlement of a population and to its level of admixture. LD profiles in the 22q11.2 region that contains the COMT gene vary with both the region of the gene and the geographical origin of the population (Mukherjee et al., 2010). High LD was demonstrated in the 5' and 3' regions but not in the coding region. The 3 studied SNPs (rs2020917, rs4818, rs4680) belong to the high LD regions in European, Asian (South West Asia and East Asia) and Native American populations (Mukherjee et al., 2010). In this study we found the same results using a different strategy, based on a small number of SNPs well distributed over the entire gene. In this region we found high levels of LD in North African populations too, with the two pairs (rs4818-rs4680) and (rs4680-rs9332377) displaying the highest LD levels. The two SNPs (rs4818-rs4680) analyzed in this study are located in the coding region of the COMT gene, which is characterized by low haplotype diversity and a low level of LD except in Eurasian populations (Mukherjee et al., 2010) and in the populations of North Africa, as shown in this study. This is probably due to a less ancient origin and/or to a high level of admixture. However, one cannot exclude drift and selection effects that might accompany population settlement history, including adaptation to geographical latitude.

The second objective of this study is to analyze the functional effect of the variants and haplotypes in relation to adaptation to environment and way of life. The COMT gene plays a key role as
an enzyme that metabolizes catecholamines, especially the neurotransmitter dopamine in the nervous system, which may explain its impact on behavior, including stress, panic and intelligence. Those effects seem to be associated with COMT activity, itself related to COMT genetic polymorphisms. The levels of COMT activity vary not only between individuals but also between the tissues of the same person. Val158Met seems to be the most important SNP affecting COMT function. The co-dominance of these alleles explains the tri-modal distribution (Low / Medium / High) of COMT activity. Studies have shown that the enzymatic activity of COMT has decreased during evolution because of the occurrence of such non-synonymous SNPs (Chen et al., 2004). Moreover, the influence of COMT activity on dopamine metabolism suggested that individuals can be classified according to their genotypes as warriors or worriers (Stein et al., 2006). This impact on behavior might have been under selective pressure during prehistory. Indeed, there is a significant correlation between the rs4680 polymorphism and population lifestyle, with the (A: Met) allele associated with an agricultural economy and the (G: Val) allele with a hunter-gatherer lifestyle (Piffer, 2013).

In this study, the activity of the 14 haplotypes described in human populations in the 1000 Genomes Project and the North African data set was predicted according to a classification from the pharmacogenomics database PharmGKB (https://www.pharmgkb.org/haplotype/PA165947990), based on previous studies (Nackley et al., 2006; Bitsios and Roussos, 2011). The (CCAC) haplotype, predicted to have average enzymatic activity because of the pattern (rs4818-rs4680: CA), is found in the majority of populations, especially in the European, North African and South-West Asian populations. On the other hand, the (CCGC) haplotype, which is expected to result in low enzymatic activity and is characterized by the pattern (rs4818-rs4680: CG), has its highest frequencies in the populations of Africa and East Asia, while the lowest frequencies were observed in European, South-West Asian and North African populations. In the present work, the predicted activities in North African populations have not been confirmed at the functional level. Such confirmation would be of great interest, since COMT is one of the genes encoding drug metabolizing enzymes (DME), which makes it a candidate gene in pharmacogenomics. Hence, association studies of COMT polymorphism/activity with psychiatric diseases and with response to treatment could be of great interest.

Indeed, some studies that examined the morphine response have reported that the (C) allele of rs4818 was associated with a lower administered dose than that given to individuals with the (G) allele (Rakvag et al., 2008).

In a study of Tunisian cancer patients in which rs4680 was analyzed in relation to the response to morphine (Chatti et al., 2016), the results showed no association. Interestingly, the patients came from three main Tunisian regions (Tunis, Sousse, and Sfax), and the authors pooled all the samples. For comparison, we also pooled our Tunisian populations and found that the allelic frequencies of the Tunisian patients (Chatti et al., 2016) and of the healthy Tunisian population (the present study) are the same, but there is a small difference in genotype frequencies, with an excess of heterozygotes among the healthy individuals (Table 5). The population of Sousse presents allelic and genotypic frequencies different from those of the two pooled samples, owing to the genetic structure that distinguishes it from other populations of North Africa. Hence, pooling it with other populations might bias the analysis, which may explain the absence of an association with the response to morphine. This study on the population genetics of COMT gene polymorphisms highlights the importance of considering the very precise geographic origin of subjects, along with age and sex, in case/control studies performed in Tunisia, since pooling heterogeneous populations can result in false positives and false negatives. Indeed, the Tunisian population is a mosaic of sub-populations, each community displaying its own genetic structure owing to the high level of endogamy practiced in this country (Cherni et al., 2016).

Table 5: Comparison between the allelic and genotypic frequencies of rs4680 in this study and in Chatti et al. (2016).

rs4680  Tunisian Patients (Chatti et al., 2016)  Healthy Tunisians (This Study)  Healthy Sousse (This Study)
A       (117) 45.30%                             (200) 45.60%                    (27) 36.49%
G       (141) 54.70%                             (238) 54.50%                    (47) 63.51%
AA      (30) 23.30%                              (44) 20%                        (3) 8.1%
AG      (57) 44.20%                              (112) 50.80%                    (21) 56.8%
GG      (42) 32.60%                              (63) 29%                        (13) 35.1%
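The allele frequencies in Table 5 follow directly from the genotype counts, and the same counts allow a quick check of observed against expected heterozygosity. The sketch below is a minimal illustration, not the procedure used in this study: the function names `allele_freqs` and `hwe_chi2` are hypothetical, the genotype counts are copied from Table 5, and the test shown is the ordinary chi-square goodness-of-fit test for Hardy-Weinberg proportions with one degree of freedom.

```python
def allele_freqs(n_aa, n_ag, n_gg):
    """Return (freq_A, freq_G) from rs4680 genotype counts."""
    n = n_aa + n_ag + n_gg
    freq_a = (2 * n_aa + n_ag) / (2 * n)
    return freq_a, 1.0 - freq_a


def hwe_chi2(n_aa, n_ag, n_gg):
    """Chi-square statistic (1 d.f.) for departure from Hardy-Weinberg."""
    n = n_aa + n_ag + n_gg
    p, q = allele_freqs(n_aa, n_ag, n_gg)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ag, n_gg)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Genotype counts (AA, AG, GG) copied from Table 5.
TABLE5 = {
    "Tunisian patients (Chatti et al., 2016)": (30, 57, 42),
    "Healthy Tunisians (this study)": (44, 112, 63),
    "Healthy Sousse (this study)": (3, 21, 13),
}

for sample, counts in TABLE5.items():
    freq_a, freq_g = allele_freqs(*counts)
    observed_het = counts[1] / sum(counts)
    expected_het = 2 * freq_a * freq_g
    print(f"{sample}: A={freq_a:.3f} G={freq_g:.3f} "
          f"Hobs={observed_het:.3f} Hexp={expected_het:.3f} "
          f"chi2={hwe_chi2(*counts):.2f}")
```

A statistic above 3.84 would indicate a departure from Hardy-Weinberg proportions at the 5% level. With a sample as small as the Sousse one (37 individuals), an exact test may be preferable to the chi-square approximation.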

5. Conclusions

Our results show that rs4818 and rs4680 of the COMT gene help build micro-haplotypes that distinguish between populations according to their ethnic or geographic origin. These data would be very useful for forensic investigation using micro-haplotypes, given their high level of genetic diversity. Micro-haplotypes would also make it possible to predict COMT enzymatic activity. Comparison of our results with other populations shows that COMT enzymatic activity differs between ethnic groups according to the frequency of each haplotype, with North Africans showing a profile similar to that of Europeans and South West Asians.

The data obtained from healthy individuals in different populations should constitute a basis for new studies of the association of COMT gene polymorphisms with pathologies and drug metabolism in North African populations.

Acknowledgments

This work was partially supported by the Tunisian Ministry of Higher Education and Scientific Research as well as by the University of Tunis El Manar. Special thanks go to the thousands of individuals around the world who volunteered to give blood and saliva samples to make this study possible.

References

Anna, K., Anna, M., Andrzej, J., Jakub, K., Anna, W., Maciej, K., Sylwia, F., Margit, B., Kirk, B. and Marcin, W., 2017. COMT and BDNF gene variants help to predict alcohol consumption in alcohol-dependent patients. Journal of addiction medicine 11, 114-118.
Annerbrink, K., Westberg, L., Nilsson, S., Rosmond, R., Holm, G. and Eriksson, E., 2008. Catechol O-methyltransferase val158-met polymorphism is associated with abdominal obesity and blood pressure in men. Metabolism 57, 708-11.
Asselmann, E., Hertel, J., Beesdo-Baum, K., Schmidt, C.O., Homuth, G., Nauck, M., Grabe, H.J. and Pane-Farre, C.A., 2018. Interplay between COMT Val158Met, childhood adversities and sex in predicting panic pathology: Findings from a general population sample. J Affect Disord 234, 290-296.
Badner, J.A. and Gershon, E.S., 2002. Meta-analysis of whole-genome linkage scans of bipolar disorder and schizophrenia. Mol Psychiatry 7, 405-11.
Barbosa, F.R., Matsuda, J.B., Mazucato, M., de Castro Franca, S., Zingaretti, S.M., da Silva, L.M., Martinez-Rossi, N.M., Junior, M.F., Marins, M. and Fachin, A.L., 2012. Influence of catechol-O-methyltransferase (COMT) gene polymorphisms in pain sensibility of Brazilian fibromialgia patients. Rheumatol Int 32, 427-30.
Bearden, C.E., Jawad, A.F., Lynch, D.R., Monterossso, J.R., Sokol, S., McDonald-McGinn, D.M., Saitta, S.C., Harris, S.E., Moss, E., Wang, P.P., Zackai, E., Emanuel, B.S. and Simon, T.J., 2005. Effects of COMT genotype on behavioral symptomatology in the 22q11.2 deletion syndrome. Child neuropsychology : a journal on normal and abnormal development in childhood and adolescence 11, 109-117.
Bertocci, B., Garotta, G., Da Prada, M., Lahm, H.-W., Zürcher, G., Virgallita, G. and Miggiano, V., 1991. Immunoaffinity purification and partial amino acid sequence analysis of catechol-O-methyltransferase from pig liver. Biochimica et Biophysica Acta (BBA) - Protein Structure and Molecular Enzymology 1080, 103-109.
Bitsios, P. and Roussos, P., 2011. Tolcapone, COMT polymorphisms and pharmacogenomic treatment of schizophrenia. Pharmacogenomics 12, 559-66.
Brandys, M.K., Slof-Op't Landt, M.C., van Elburg, A.A., Ophoff, R., Verduijn, W., Meulenbelt, I., Middeldorp, C.M., Boomsma, D.I., van Furth, E.F., Slagboom, E., Kas, M.J. and Adan, R.A., 2012. Anorexia nervosa and the Val158Met polymorphism of the COMT gene: meta-analysis and new data. Psychiatr Genet 22, 130-6.
Chatti, I., Creveaux, I., Woillard, J.B., Langlais, S., Amara, A., Ben Fatma, L., Saad, A., Gribaa, M. and Libert, F., 2016. Association of the OPRM1 and COMT genes' polymorphisms with the efficacy of morphine in Tunisian cancer patients: Impact of the high genetic heterogeneity in Tunisia? Therapie 71, 507-513.
Chatti, I., Woillard, J.B., Mili, A., Creveaux, I., Ben Charfeddine, I., Feki, J., Langlais, S., Ben Fatma, L., Saad, A., Gribaa, M. and Libert, F., 2017. Genetic Analysis of Mu and Kappa Opioid Receptor and COMT Enzyme in Cancer Pain Tunisian Patients Under Opioid Treatment. Iran J Public Health 46, 1704-1711.
Chen, J., Lipska, B.K., Halim, N., Ma, Q.D., Matsumoto, M., Melhem, S., Kolachana, B.S., Hyde, T.M., Herman, M.M., Apud, J., Egan, M.F., Kleinman, J.E. and Weinberger, D.R., 2004. Functional analysis of genetic variation in catechol-O-methyltransferase (COMT): effects on mRNA, protein, and enzyme activity in postmortem human brain. Am J Hum Genet 75, 807-21.
Cherni, L., Pakstis, A.J., Boussetta, S., Elkamel, S., Frigi, S., Khodjet-El-Khil, H., Barton, A., Haigh, E., Speed, W.C., Ben Ammar Elgaaied, A., Kidd, J.R. and Kidd, K.K., 2016. Genetic variation in Tunisia in the context of human diversity worldwide. Am J Phys Anthropol 161, 62-71.
DeMille, M.M., Kidd, J.R., Ruggeri, V., Palmatier, M.A., Goldman, D., Odunsi, A., Okonofua, F., Grigorenko, E., Schulz, L.O., Bonne-Tamir, B., Lu, R.B., Parnas, J., Pakstis, A.J. and Kidd, K.K., 2002. Population variation in linkage disequilibrium across the COMT gene considering promoter region and coding region variation. Hum Genet 111, 521-37.

Diatchenko, L., Slade, G.D., Nackley, A.G., Bhalang, K., Sigurdsson, A., Belfer, I., Goldman, D., Xu, K., Shabalina, S.A., Shagin, D., Max, M.B., Makarov, S.S. and Maixner, W., 2005. Genetic basis for individual variations in pain perception and the development of a chronic pain condition. Hum Mol Genet 14, 135-43.
Dobryakova, E., Genova, H.M., DeLuca, J. and Wylie, G.R., 2015. The Dopamine Imbalance Hypothesis of Fatigue in Multiple Sclerosis and Other Neurological Disorders. Frontiers in Neurology 6, 52.
Elkamel, S., Boussetta, S., Khodjet-El-Khil, H., Benammar Elgaaied, A. and Cherni, L., 2018. Ancient and recent Middle Eastern maternal genetic contribution to North Africa as viewed by mtDNA diversity in Tunisian Arab populations. Am J Hum Biol 30, e23100.
Elkamel, S., Cherni, L., Alvarez, L., Marques, S.L., Prata, M.J., Boussetta, S., Benammar-Elgaaied, A. and Khodjet-El-Khil, H., 2017. The Orientalisation of North Africa: New hints from the study of autosomal STRs in an Arab population. Ann Hum Biol 44, 180-190.
Excoffier, L. and Lischer, H.E., 2010. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour 10, 564-7.
Fadhlaoui-Zid, K., Chennakrishnaiah, S., Zemni, R., Grinberg, S., Herrera, R.J. and Benammar-Elgaaied, A., 2012. Sousse, Tunisia: tumultuous history and high Y-STR diversity. Electrophoresis 33, 3555-63.
Fadhlaoui-Zid, K., Garcia-Bertrand, R., Alfonso-Sanchez, M.A., Zemni, R., Benammar-Elgaaied, A. and Herrera, R.J., 2015. Sousse: extreme genetic heterogeneity in North Africa. J Hum Genet 60, 41-49.
Favaro, A., Clementi, M., Manara, R., Bosello, R., Forzan, M., Bruson, A., Tenconi, E., Degortes, D., Titton, F., Di Salle, F. and Santonastaso, P., 2013. Catechol-O-methyltransferase genotype modifies executive functioning and prefrontal functional connectivity in women with anorexia nervosa. J Psychiatry Neurosci 38, 241-8.
Fenech, A. and Hall, I.P., 2002. Pharmacogenetics of asthma. British Journal of Clinical Pharmacology 53, 3-15.
Frigi, S., Mota-Vieira, L., Cherni, L., van Oven, M., Pires, R., Boussetta, S. and El-Gaaied, A.B.A., 2017. Mitochondrial DNA analysis of Tunisians reveals a mosaic genetic structure with recent population expansion. Homo 68, 298-315.
Gonzalez-Castro, T.B., Tovilla-Zarate, C., Juarez-Rojop, I., Pool Garcia, S., Genis, A., Nicolini, H. and Lopez Narvaez, L., 2013. Distribution of the Val108/158Met polymorphism of the COMT gene in healthy Mexican population. Gene 526, 454-8.
Gravel, S., Zakharia, F., Moreno-Estrada, A., Byrnes, J.K., Muzzio, M., Rodriguez-Flores, J.L., Kenny, E.E., Gignoux, C.R., Maples, B.K., Guiblet, W., Dutil, J., Via, M., Sandoval, K., Bedoya, G., The Genomes, P., Oleksyk, T.K., Ruiz-Linares, A., Burchard, E.G., Martinez-Cruzado, J.C. and Bustamante, C.D., 2013. Reconstructing Native American Migrations from Whole-Genome and Whole-Exome Data. PLoS Genetics 9, e1004023.
Grossman, M.H., Emanuel, B.S. and Budarf, M.L., 1992. Chromosomal mapping of the human catechol-O-methyltransferase gene to 22q11.1→q11.2. Genomics 12, 822-825.
Hajjej, A., Almawi, W.Y., Hattab, L., El-Gaaied, A. and Hmida, S., 2017. The investigation of the origin of Southern Tunisians using HLA genes. J Hum Genet 62, 419-429.
Hall, K.T., Nelson, C.P., Davis, R.B., Buring, J.E., Kirsch, I., Mittleman, M.A., Loscalzo, J., Samani, N.J., Ridker, P.M., Kaptchuk, T.J. and Chasman, D.I., 2014. Polymorphisms in catechol-O-methyltransferase modify treatment effects of aspirin on risk of cardiovascular disease. Arterioscler Thromb Vasc Biol 34, 2160-7.
Hammer, Ø., Harper, D. and Ryan, P., 2001. PAST-palaeontological statistics, ver. 1.89. Palaeontologia electronica 4.
Hosang, G.M., Fisher, H.L., Cohen-Woods, S., McGuffin, P. and Farmer, A.E., 2017. Stressful life events and catechol-O-methyl-transferase (COMT) gene in bipolar disorder. Depress Anxiety 34, 419-426.

Htun, N.C., Miyaki, K., Song, Y., Ikeda, S., Shimbo, T. and Muramatsu, M., 2011. Association of the catechol-O-methyl transferase gene Val158Met polymorphism with blood pressure and prevalence of hypertension: interaction with dietary energy intake. Am J Hypertens 24, 1022-6.
Huang, E., Zai, C.C., Lisoway, A., Maciukiewicz, M., Felsky, D., Tiwari, A.K., Bishop, J.R., Ikeda, M., Molero, P., Ortuno, F., Porcelli, S., Samochowiec, J., Mierzejewski, P., Gao, S., Crespo-Facorro, B., Pelayo-Terán, J.M., Kaur, H., Kukreti, R., Meltzer, H.Y., Lieberman, J.A., Potkin, S.G., Müller, D.J. and Kennedy, J.L., 2016. Catechol-O-Methyltransferase Val158Met Polymorphism and Clinical Response to Antipsychotic Treatment in Schizophrenia and Schizo-Affective Disorder Patients: a Meta-Analysis. International Journal of Neuropsychopharmacology 19, pyv132.
Jimenez-Jimenez, F.J., Alonso-Navarro, H., Garcia-Martin, E. and Agundez, J.A., 2014. COMT gene and risk for Parkinson's disease: a systematic review and meta-analysis. Pharmacogenet Genomics 24, 331-9.
Kidd, K.K., Pakstis, A.J., Speed, W.C., Lagace, R., Chang, J., Wootton, S., Haigh, E. and Kidd, J.R., 2014. Current sequencing technology makes microhaplotypes a powerful new type of genetic marker for forensics. Forensic Sci Int Genet 12, 215-24.
Kidd, K.K., Pakstis, A.J., Speed, W.C., Lagace, R., Chang, J., Wootton, S. and Ihuegbu, N., 2013. Microhaplotype loci are a powerful new type of forensic marker. Forensic Science International: Genetics Supplement Series 4, e123-e124.
Kidd, K.K., Speed, W.C., Pakstis, A.J., Podini, D.S., Lagace, R., Chang, J., Wootton, S., Haigh, E. and Soundararajan, U., 2017. Evaluating 130 microhaplotypes across a global set of 83 populations. Forensic Sci Int Genet 29, 29-37.
Konishi, Y., Tanii, H., Otowa, T., Sasaki, T., Motomura, E., Fujita, A., Umekage, T., Tochigi, M., Kaiya, H., Okazaki, Y. and Okada, M., 2014. Gender-specific association between the COMT Val158Met polymorphism and openness to experience in panic disorder patients. Neuropsychobiology 69, 165-74.
Lacerda-Pinheiro, S.F., Pinheiro Junior, R.F., Pereira de Lima, M.A., Lima da Silva, C.G., Vieira dos Santos Mdo, S., Teixeira Junior, A.G., Lima de Oliveira, P.N., Ribeiro, K.D., Rolim-Neto, M.L. and Bianco, B.A., 2014. Are there depression and anxiety genetic markers and mutations? A systematic review. J Affect Disord 168, 387-98.
Lachman, H.M., Papolos, D.F., Saito, T., Yu, Y.M., Szumlanski, C.L. and Weinshilboum, R.M., 1996. Human catechol-O-methyltransferase pharmacogenetics: description of a functional polymorphism and its potential application to neuropsychiatric disorders. Pharmacogenetics 6, 243-50.
Lin, C.H., Chaudhuri, K.R., Fan, J.Y., Ko, C.I., Rizos, A., Chang, C.W., Lin, H.I. and Wu, Y.R., 2017. Depression and Catechol-O-methyltransferase (COMT) genetic variants are associated with pain in Parkinson's disease. Sci Rep 7, 6306.
Lo Bianco, L., Blasi, G., Taurisano, P., Di Giorgio, A., Ferrante, F., Ursini, G., Fazio, L., Gelao, B., Romano, R., Papazacharias, A., Caforio, G., Sinibaldi, L., Popolizio, T., Bellantuono, C. and Bertolino, A., 2013. Interaction between catechol-O-methyltransferase (COMT) Val158Met genotype and genetic vulnerability to schizophrenia during explicit processing of aversive facial stimuli. Psychol Med 43, 279-92.
Lonsdorf, T.B., Ruck, C., Bergstrom, J., Andersson, G., Ohman, A., Lindefors, N. and Schalling, M., 2010. The COMTval158met polymorphism is associated with symptom relief during exposure-based cognitive-behavioral treatment in panic disorder. BMC Psychiatry 10, 99.
Lundström, K., Tenhunen, J., Tilgmann, C., Karhunen, T., Panula, P. and Ulmanen, I., 1995. Cloning, expression and structure of catechol-O-methyltransferase. Biochimica et Biophysica Acta (BBA) - Protein Structure and Molecular Enzymology 1251, 1-10.
Machiela, M.J. and Chanock, S.J., 2015. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics 31, 3555-7.

Minassian, A., Young, J.W., Geyer, M.A., Kelsoe, J.R. and Perry, W., 2018. The COMT Val158Met Polymorphism and Exploratory Behavior in Bipolar Mania. Mol Neuropsychiatry 3, 151-156.
Miskowiak, K.W., Kjaerstad, H.L., Stottrup, M.M., Svendsen, A.M., Demant, K.M., Hoeffding, L.K., Werge, T.M., Burdick, K.E., Domschke, K., Carvalho, A.F., Vieta, E., Vinberg, M., Kessing, L.V., Siebner, H.R. and Macoveanu, J., 2017. The catechol-O-methyltransferase (COMT) Val158Met genotype modulates working memory-related dorsolateral prefrontal response and performance in bipolar disorder. Bipolar Disord 19, 214-224.
Mukherjee, N., Kidd, K.K., Pakstis, A.J., Speed, W.C., Li, H., Tarnok, Z., Barta, C., Kajuna, S.L.B. and Kidd, J.R., 2010. The complex global pattern of genetic variation and linkage disequilibrium at Catechol-O-methyl transferase (COMT). Molecular psychiatry 15, 216-225.
Nackley, A.G., Shabalina, S.A., Tchivileva, I.E., Satterfield, K., Korchynskyi, O., Makarov, S.S., Maixner, W. and Diatchenko, L., 2006. Human catechol-O-methyltransferase haplotypes modulate protein expression by altering mRNA secondary structure. Science 314, 1930-3.
Nedic, G., Nikolac, M., Sviglin, K.N., Muck-Seler, D., Borovecki, F. and Pivac, N., 2011. Association study of a functional catechol-O-methyltransferase (COMT) Val108/158Met polymorphism and suicide attempts in patients with alcohol dependence. Int J Neuropsychopharmacol 14, 377-88.
Nissinen, E. and Mannisto, P.T., 2010. Biochemistry and pharmacology of catechol-O-methyltransferase inhibitors. Int Rev Neurobiol 95, 73-118.
Palmatier, M.A., Kang, A.M. and Kidd, K.K., 1999. Global variation in the frequencies of functionally different catechol-O-methyltransferase alleles. Biol Psychiatry 46, 557-67.
Peng, S., Yu, S., Wang, Q., Kang, Q., Zhang, Y., Zhang, R., Jiang, W., Qian, Y., Zhang, H., Zhang, M., Xiao, Z. and Chen, J., 2016. Dopamine receptor D2 and catechol-O-methyltransferase gene polymorphisms associated with anorexia nervosa in Chinese Han population: DRD2 and COMT gene polymorphisms were associated with AN. Neurosci Lett 616, 147-51.
Piffer, D., 2013. Correlation of the COMT Val158Met polymorphism with latitude and a hunter-gather lifestyle suggests culture–gene coevolution and selective pressure on cognition genes due to climate. Anthropological Science 121, 161-171.
Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M.A., Bender, D., Maller, J., Sklar, P., de Bakker, P.I., Daly, M.J. and Sham, P.C., 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81, 559-75.
Rajeevan, H., Soundararajan, U., Kidd, J.R., Pakstis, A.J. and Kidd, K.K., 2012. ALFRED: an allele frequency resource for research and teaching. Nucleic Acids Res 40, D1010-5.
Rakvag, T.T., Ross, J.R., Sato, H., Skorpen, F., Kaasa, S. and Klepstad, P., 2008. Genetic variation in the catechol-O-methyltransferase (COMT) gene and morphine requirements in cancer patients with pain. Mol Pain 4, 64.
Sagud, M., Tudor, L., Uzun, S., Perkovic, M.N., Zivkovic, M., Konjevod, M., Kozumplik, O., Vuksan Cusa, B., Svob Strac, D., Rados, I., Mimica, N., Mihaljevic Peles, A., Nedic Erjavec, G. and Pivac, N., 2018. Haplotypic and Genotypic Association of Catechol-O-Methyltransferase rs4680 and rs4818 Polymorphisms and Treatment Resistance in Schizophrenia. Frontiers in Pharmacology 9.
Stein, D.J., Newman, T.K., Savitz, J. and Ramesar, R., 2006. Warriors versus worriers: the role of COMT gene variants. CNS Spectr 11, 745-8.
Stephens, M. and Scheet, P., 2005. Accounting for Decay of Linkage Disequilibrium in Haplotype Inference and Missing-Data Imputation. American Journal of Human Genetics 76, 449-462.
Stephens, M., Smith, N.J. and Donnelly, P., 2001. A new statistical method for haplotype reconstruction from population data. Am J Hum Genet 68, 978-89.

Tenhunen, J., Salminen, M., Lundstrom, K., Kiviluoto, T., Savolainen, R. and Ulmanen, I., 1994. Genomic organization of the human catechol O-methyltransferase gene and its expression from two distinct promoters. Eur J Biochem 223, 1049-59.
The Genomes Project, C. A global reference for human genetic variation.
Weinshilboum, R.M. and Raymond, F.A., 1977. Inheritance of low erythrocyte catechol-O-methyltransferase activity in man. American journal of human genetics 29, 125-135.
Winqvist, R., Lundstrom, K., Salminen, M., Laatikainen, M. and Ulmanen, I., 1992. The human catechol-O-methyltransferase (COMT) gene maps to band q11.2 of chromosome 22 and shows a frequent RFLP with BglI. Cytogenet Cell Genet 59, 253-7.
Xiao, Q., Qian, Y., Liu, J., Xu, S. and Yang, X., 2017. Roles of functional catechol-O-methyltransferase genotypes in Chinese patients with Parkinson's disease. Transl Neurodegener 6, 11.

Abbreviations:
COMT: Catechol-O-methyl-transferase
SNP: Single nucleotide polymorphism
PCA: Principal component analysis
LD: Linkage disequilibrium
Met: Methionine
Val: Valine
1KG Project: 1000 Genomes project
LPS: Low pain sensitivity
APS: Average pain sensitivity
HPS: High pain sensitivity

Highlights

- There are very few studies that have studied the COMT gene in North Africa.
- Determine the distribution of allelic, genotypic and haplotypic frequencies and make comparison to other populations.
- Predicting the enzymatic activity of COMT by using the micro-haplotypes (rs4818-rs4680) and their frequency in the studied populations constitutes a basis for new studies of COMT gene polymorphism in North African populations.