Whole exome sequencing of sporadic patients with Currarino Syndrome: A report of three trios


Accepted Manuscript

Ingunn Holm, Mari Spildrejorde, Barbro Stadheim, Kristin L. Eiklid, Pubudu S. Samarakoon

PII: S0378-1119(17)30281-0
DOI: 10.1016/j.gene.2017.04.030
Reference: GENE 41879

To appear in: Gene

Received date: 1 November 2016
Revised date: 6 March 2017
Accepted date: 19 April 2017

Please cite this article as: Ingunn Holm, Mari Spildrejorde, Barbro Stadheim, Kristin L. Eiklid, Pubudu S. Samarakoon, Whole exome sequencing of sporadic patients with Currarino Syndrome: A report of three trios. Gene (2017), doi: 10.1016/j.gene.2017.04.030

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Order of Authors: Ingunn Holm1, M.Sc.; Mari Spildrejorde1, M.Sc.; Barbro Stadheim, MD1; Kristin L Eiklid, PhD1; Pubudu S Samarakoon, PhD1.

1 Department of Medical Genetics, Oslo University Hospital, Oslo, Norway.

Abstract

Currarino Syndrome is a rare congenital malformation syndrome described as a triad of anorectal, sacral and presacral anomalies. Currarino Syndrome (CS) is reported to be both familial and sporadic. Familial CS is today known as an autosomal dominant disorder caused by mutations in the transcription factor MNX1. The aim of this study was to look for genetic causes of Currarino Syndrome in sporadic patients after ruling out other causes, such as chromosome aberrations, disease-causing variants in possible MNX1 cooperating transcription factors and aberrant methylation in the promoter of the MNX1 gene. The hypothesis was that MNX1 was affected through interactions with other transcription factors or through other regulatory elements, possibly leading to abnormal function of the gene. We performed whole exome sequencing with an additional 6 Mb custom-made region on chromosome 7 (GRCh37/hg19, chr7:153,138,664-159,138,663) to detect sequence variation in regulatory elements in non-coding regions around the MNX1 gene. We did not find any variants in genes of interest shared between the patients. However, after analyzing the whole exome sequencing data with Filtus, the in-house SNV filtration program, we did find some interesting variants in possibly relevant genes that could explain these patients' phenotypes. The most promising genes were ETV3L, ARID5A and NCAPD3. To our knowledge this is the first report of whole exome sequencing in sporadic CS patients.

Abbreviations: CS; Currarino Syndrome, MNX1; Motor Neuron and Pancreas Homeobox, Mb; Megabase, SNV; single nucleotide variant, ETV3L; ETS Variant 3 Like, ARID5A; AT-rich interactive domain-containing protein 5A, NCAPD3; non-SMC condensin II complex subunit D3, ARM; Anorectal malformations, REK; Regional Committees for Medical and Health Research Ethics, FISH; fluorescence in situ hybridization, bp; basepair, BAM; Binary Alignment/Map, GATK; Genome Analysis Toolkit, CNV; Copy Number Variant, DP; reading depth, QD; QualByDepth, PolyPhen; Polymorphism Phenotyping, SIFT; Sorting Intolerant From Tolerant, PTF1a; Pancreas Specific Transcription Factor, 1a, PCSK5; Proprotein Convertase Subtilisin/Kexin Type 5, ETS; erythroblast transformation-specific, SOX9; SRY-box 9, SRY; sex-determining region Y, MRI; Magnetic resonance imaging

1. Introduction

Anorectal malformations (ARM) are a complex group of congenital malformations which involve the rectum, the distal anus, the urinary tract and the genital tract in most of the cases. ARMs are associated with other congenital abnormalities in 20-70 % of the cases (Stoll et al., 2007). Currarino Syndrome (CS) is a rare congenital malformation syndrome described as a triad of anorectal, sacral and presacral anomalies. At least two of these features must be present to fulfill the clinical criteria. Patients present with a variable clinical picture, with constipation as the most common symptom (Monclair, 2013). In CS the anorectal malformation is either anal stenosis or agenesis (Alamo, 2013). Malformations of the sacrum can be variable, but a sickle-shaped or absent sacral bone is the most characteristic finding (Crétolle, 2008). These malformations are often associated with a presacral mass of variable type, including anterior meningocele, teratoma or duplicated rectum (Crétolle, 2008). CS can also be associated with a number of other malformations, the most common being genitourinary malformations (e.g. horseshoe or duplex kidneys, duplex ureters), bicornuate uterus, septate vagina and neurogenic bladder, but spinal cord anomalies (e.g. tethered cord) are also commonly described (Lynch, 2000).

Currarino Syndrome can be familial or sporadic, and clinical signs and symptoms are similar in both cases. Familial CS is today known as an autosomal dominant disorder caused by mutations in the transcription factor motor neuron and pancreatic homeobox 1 (MNX1). The syndrome is associated with loss of function mutations, and haploinsufficiency is thought to be the mechanism of CS. The penetrance is incomplete, asymptomatic mutation carriers are common and expression is highly variable even among family members. Because of the clinical variation it is difficult to determine the incidence, but 1-9:100 000 is used as an estimate (Orphanet). In sporadic patients, less than one third have detectable MNX1 mutations (Belloni, 2000; Garcia-Barceló, 2009; Zu, 2011). MNX1 is also involved in the development of the pancreas, but generally no pancreatic abnormalities are found in CS patients (Dalgin, 2011). The gene is also known to cause neonatal diabetes mellitus in humans, but the patients in this study did not suffer from any kind of diabetes.

The aim of this study was to look for genetic causes of Currarino Syndrome in sporadic patients without detectable disease-causing variants in MNX1. We wanted to look for variants in other genes related to CS, either common to all three sporadic patients or in genes known to interact with MNX1. The hypothesis was that MNX1 could be negatively affected through interactions with other transcription factors or through other regulatory elements, which subsequently could lead to abnormal function of the gene.

2. Material and methods

2.1 Patients

The project was approved by the Regional Committees for Medical and Health Research Ethics (REK, Number 2013/807) and all participants have given written informed consent. Three patients were included in this study in addition to their parents. All patients had previously been evaluated by a clinician experienced with Currarino Syndrome and the clinical diagnosis was confirmed. The patients had variable features, but they all had different degrees of sacral anomalies, including sacrococcygeal agenesis, sacral dysgenesis in addition to coccygeal agenesis, and duplicated coccygeal bone. All patients had different variants of presacral masses, including anterior meningocele in combination with lipoma, lipomyelocele and teratoma. Two patients had various degrees of anorectal malformations. The three patients had different additional signs, namely vesicoureteric reflux with recurrent urinary tract infections, limb reduction defect and tethered cord. The genetic cause of the limb reduction defect in patient 2 is not known. Limb reduction defect is not reported to be associated with CS, so we assume that this is an isolated defect. None of the patients had genital abnormalities. The cognitive status was reported as normal with no intellectual disability. No facial dysmorphism, deafness or microcephalus was reported (Table 1). The parents did not have any known clinical signs related to Currarino Syndrome and were reported as healthy and not affected.

2.2 Material

EDTA-blood and PAX-blood (PreAnalytiX) were collected from the patients. DNA was extracted from EDTA-blood using QIAsymphony DSP DNA Midi Kit 96 (version 1) on QIAsymphony (QIAGEN) according to the manufacturer's instructions, and RNA was extracted from PAX-blood using PAXgene Blood RNA Kit according to standard protocols.

Surgery was performed on patient 2 during the sample collection period and a biopsy from the removed presacral mass (lipomyelocele) was obtained. According to the pathologist this mass consisted of mature mesenchymal tissue without neurogenic or endodermal tissue. One part of the biopsy was used for direct DNA and RNA isolation and a separate part was cultivated and grown in vitro. After approximately 14 days a fibroblast culture was established. DNA and RNA were extracted from the fibroblast culture and the biopsy using MasterPure Complete DNA and RNA Purification kit (Epicentre) according to standard protocols. DNA from the EDTA-blood, the biopsy and the fibroblast culture was used in further analysis for this patient.

2.3 Methods

Before whole exome sequencing of the trios, the patients had been investigated extensively with different conventional methods to look for aberrations that could explain the patients' phenotypes. This included karyotyping, array comparative genomic hybridization (105k), fluorescence in situ hybridization (FISH) with MNX1 probe, Sanger sequencing of coding regions and intron/exon boundaries of MNX1 (NM_005515.3) and bisulfite sequencing of the promoter region of MNX1. All methods showed normal results, thus whole exome sequencing was performed on the trios.

2.3.1 Whole exome sequencing and sequencing analysis

Exome capture was performed with the SureSelect XT Human All Exon v5 Plus library (Agilent Technologies) according to manufacturer's instructions (Illumina, San Diego, CA). An additional 6 Mb custom region on chromosome 7 (GRCh37/hg19, chr7:153,138,664-159,138,663) was included to be able to sequence regulatory elements in non-coding regions around MNX1 and neighboring genes (Fig. 1). Appropriate amounts of enriched DNA libraries were sequenced on a HiSeq 2000 (Illumina) with 100 bp paired-end reads. Sequence alignment was performed with NovoAlign (V2.07.17). Next, using GATK (V2.4-9), the initial BAM files were realigned and the base quality scores were recalibrated. After marking the duplicates with Picard (V1.74), the final set of alignment data (BAM files) was generated. This final alignment dataset was used for the Single Nucleotide Variant (SNV) prediction and Copy Number Variant (CNV) prediction programs. SNV prediction was performed using the GATK UnifiedGenotyper, and joint variant calling was performed to call SNVs simultaneously across multiple samples. Detected SNVs were then annotated using ANNOVAR.

2.3.2 Analysis of whole exome sequencing data

2.3.2.1 Variant analysis

Here, several approaches were used and all models of inheritance were applied for candidate gene identification. The first approach was to filter the variants through a candidate gene list. The gene list was based on a literature search using the keywords transcription factors, development and embryogenesis. Second, a de novo analysis of the whole exome was performed. When no relevant genes were found, the third approach was to filter the samples through other inheritance models: autosomal dominant, homozygous recessive and compound heterozygous.
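The single-variant inheritance models named above can be illustrated with a minimal Python sketch. This is not the Filtus implementation: the genotype encoding (number of alternate alleles, 0, 1 or 2) and the function name `classify_trio` are assumptions made for illustration, and the compound heterozygous model, which needs two variants in the same gene, is left out.

```python
# Alleles a parent can transmit, keyed by the parent's alternate-allele count.
# Assumption: genotypes are encoded as 0 (ref/ref), 1 (ref/alt) or 2 (alt/alt).
TRANSMITTED = {0: {0}, 1: {0, 1}, 2: {1}}


def classify_trio(child, mother, father):
    """Classify one variant in a trio by the pattern of alternate-allele counts."""
    if child == 0:
        return "not carried"
    # All child genotypes that Mendelian transmission from these parents allows.
    possible = {m + f for m in TRANSMITTED[mother] for f in TRANSMITTED[father]}
    if child not in possible:
        # Present in the child although neither parent carries the allele.
        if mother == 0 and father == 0:
            return "de novo"
        return "Mendelian inconsistency"
    if child == 2:
        return "homozygous recessive"
    return "inherited heterozygous"


# Heterozygous child, both parents reference: candidate de novo variant.
print(classify_trio(1, 0, 0))
# Heterozygous child and heterozygous mother: candidate for the dominant model.
print(classify_trio(1, 1, 0))
```

Under this encoding, a variant seen only in the patient is reported as de novo, while one shared with a single heterozygous parent is kept for the dominant model, where incomplete penetrance in the parent has to be assumed.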

Following the identification and annotation of SNVs in the sample collection, computationally predicted variants were filtered using Filtus (Vigeland et al., 2016), the in-house SNV filtration program. These filtered SNVs were then analyzed to identify disease-causing SNVs. First, "bad genes" were filtered out. This is a list of genes that have been reported to give false positive results in exome sequencing, obtained from Fajardo et al. (Fajardo, 2012). Next, variants registered in an in-house database were filtered out. This database contains previously detected variants from the Department of Medical Genetics with a frequency > 0.5 %. The read depth (DP) was set to a minimum of 9 reads and the QualByDepth (QD), which is the variant quality score normalized by allele depth, was set to a minimum of 2. We then excluded variants in regions with no known genes and variants found in dbSNP build 137.
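The filtration cascade described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Filtus code: the dictionary keys (`gene`, `dp`, `qd`, `inhouse_freq`, `in_dbsnp`), the function names and the example records are hypothetical, and the thresholds are taken from the text (DP at least 9, QD at least 2, in-house frequency above 0.5 % removed).

```python
def passes_filters(variant, bad_genes, min_dp=9, min_qd=2.0, max_inhouse_freq=0.005):
    """Return True if a variant survives every filtration step."""
    gene = variant["gene"]
    if gene is None:                      # region with no known gene
        return False
    if gene in bad_genes:                 # gene prone to false positive calls
        return False
    if variant["inhouse_freq"] > max_inhouse_freq:   # common in the in-house database
        return False
    if variant["dp"] < min_dp or variant["qd"] < min_qd:   # low depth or low quality
        return False
    if variant["in_dbsnp"]:               # already registered in dbSNP
        return False
    return True


def filter_variants(variants, bad_genes):
    """Apply the full cascade to a list of variant records."""
    return [v for v in variants if passes_filters(v, bad_genes)]


# Hypothetical records; gene names are placeholders, not real findings.
variants = [
    {"gene": "GENE_A", "dp": 35, "qd": 12.0, "inhouse_freq": 0.0, "in_dbsnp": False},
    {"gene": "GENE_B", "dp": 5, "qd": 12.0, "inhouse_freq": 0.0, "in_dbsnp": False},
    {"gene": "BAD_1", "dp": 40, "qd": 15.0, "inhouse_freq": 0.0, "in_dbsnp": False},
    {"gene": None, "dp": 40, "qd": 15.0, "inhouse_freq": 0.0, "in_dbsnp": False},
    {"gene": "GENE_C", "dp": 40, "qd": 15.0, "inhouse_freq": 0.02, "in_dbsnp": False},
]
kept = filter_variants(variants, bad_genes={"BAD_1"})
print([v["gene"] for v in kept])
```

Keeping each criterion as a separate early return mirrors the step-by-step counts reported in Table 2, where every filter removes a further share of the variants.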

The Integrative Genomics Viewer (IGV) version 2.3 (Robinson, 2011) was used to visualize data and to check variants manually. dbSNP build 147 (https://www.ncbi.nlm.nih.gov/projects/SNP/), the 1000 Genomes Project (http://www.internationalgenome.org/), the Exome Variant Server (http://evs.gs.washington.edu/EVS/) and the ExAC Browser database (http://exac.broadinstitute.org/) were used to check for known variants.

Similar to SNVs, CNVs identified in exome samples were also analyzed to detect disease-causing variants. Here, annotated CNVs were filtered using the candidate gene list, and rare CNVs were selected based on an in-house CNV database, CNV score, a high-quality stringent map of the Database of Genomic Variants (DGV), 1000 Genomes CNVs and ClinVar variants (https://www.ncbi.nlm.nih.gov/clinvar/). Remaining rare CNVs were evaluated in light of clinical phenotype and assessed using IGV. However, disease-causing CNVs were not found in the exomes tested in this study.

2.3.3 Verification of variants

A complete list of all genes or nearby genes with sequence variants detected after filtering with Filtus in each patient, with all models of inheritance, is given in Supplement 1. All variants were validated by Sanger sequencing of DNA from leucocytes from both patients and parents. Primers were designed with Primer3Plus (http://primer3plus.com/) and purchased from Eurofins MWG Operon. Primer sequences can be obtained on request. The sequencing products were sequenced on an ABI 3730xl DNA Analyzer (Life Technologies) and analyzed using SeqScape Software (Life Technologies).

3. Results

3.1 Trio 1 – de novo variant

In patient 1, 114 883 SNVs were detected in 17 608 genes. Six de novo variants were detected after filtering with Filtus (Table 2), of which five were located in non-coding regions. The sixth was located within the coding region of the gene ETV3L (NM_001004341.2), in exon 3 at position c.376G>A (p.Val126Ile). It was confirmed with Sanger sequencing and was not detected in the parents. The variant was registered in dbSNP build 147 as rs371804655, but without frequency data. It was not detected among 1300 in-house blood donor alleles, in the 1000 Genomes Project or in the Exome Variant Server database. In the ExAC Browser (Beta version) it was not reported among 6614 Europeans (Non-Finnish), but was seen twice in the South Asian population (2/16352 alleles). The overall allele frequency was 0.00001651 (2/121170 alleles, about 0.0017 %) (Table 3). PolyPhen (Adzhubei, 2010) predicted this variant to be benign and SIFT (Ng, 2001) predicted it to be tolerated (Table 3).
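The ExAC allele counts quoted for the ETV3L variant (2 of 16352 South Asian alleles, 2 of 121170 alleles overall) make it easy to recompute the frequency and to keep the proportion and the percentage apart. The helper name `allele_frequency` below is only illustrative.

```python
def allele_frequency(alt_count, total_alleles):
    """Allele frequency as a proportion (alternate alleles / all alleles observed)."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return alt_count / total_alleles


overall = allele_frequency(2, 121170)      # counts from Table 3
south_asian = allele_frequency(2, 16352)   # counts from the text

# Print each value both as a proportion and as a percentage.
for label, freq in [("overall", overall), ("South Asian", south_asian)]:
    print(f"{label}: proportion {freq:.8f}, percentage {freq * 100:.6f} %")
```

The proportion and the percentage differ by a factor of 100, so it is worth stating explicitly which of the two a reported value refers to.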

3.2 Trio 2 – dominant variant

In patient 2, 114 242 SNVs were detected in 17 459 genes (Table 2). No relevant de novo variants were detected, but a dominant variant in a possibly relevant gene, ARID5A (NM_001319085.1), was found. This missense variant was located in exon 3 at position c.164G>T (p.Arg55Leu), within an AT-rich interactive domain (http://www.uniprot.org/). The variant was also detected in the patient's mother. This was confirmed with Sanger sequencing in blood and abnormal tissue from the patient, both in DNA isolated directly from the removed presacral mass and in DNA from cultivated fibroblasts from the presacral mass, and in blood from the mother. In dbSNP build 147 the variant was registered as rs61748139, with an allele frequency in the ESP cohort population of 99.7 % (G) and 0.3 % (T). In the ExAC Browser (Beta version) this variant was reported with an allele frequency of 0.001628 in the European (Non-Finnish) population and an overall allele frequency of 0.00115 (139/120838 alleles, about 0.115 %). SIFT predicted this variant as tolerated and PolyPhen as possibly damaging (Table 3). In this patient we also found a variant in intron 1 of MNX1 (c.692-38G>A) in both the patient and his mother. None of the other participants in this study had intron variants in MNX1 with frequency <1%. According to Alamut splice prediction software (www.interactivebiosoftware.com/alamut-visual), the variant could possibly lead to a cryptic acceptor splice site in MNX1. To determine whether the variant could lead to a splicing error, mRNA of exon 1-2 was Sanger sequenced, with a negative result (Supplement 2). It is therefore unlikely that this intronic variant is causative of this patient's disease.

3.3 Trio 3 – de novo variant

In patient 3, 115 918 SNVs were found in 17 471 genes. Eleven de novo variants were detected after filtering with Filtus (Table 2). One de novo variant of particular interest was found in intron 11 of the gene NCAPD3 (NM_015261.2) at position c.1468+2T>C. Alamut predicted it to cause a loss of the donor splice site after exon 11 (Table 3), making skipping of exon 11 likely. To determine whether the variant resulted in differential splicing, sequencing of cDNA reverse-transcribed from mRNA was performed. The c.1468+2T>C variant was found to cause an insertion of 95 bp of intron 11, which leads to a premature stop codon (TGA) in position c.1468+46 in the inserted sequence (Fig. 2). The variant was named p.S490Sfs*15.
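Why a 95 bp intronic insertion shifts the reading frame, and how a premature stop codon is then found, can be shown with a short Python sketch. The function names and the toy sequence are assumptions made for illustration; the sequence is not taken from NCAPD3.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_frameshift(inserted_bp):
    """An insertion shifts the reading frame unless its length is a multiple of 3."""
    return inserted_bp % 3 != 0


def first_stop_codon(seq):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3].upper() in STOP_CODONS:
            return i // 3
    return None


# 95 is not divisible by 3, so the retained intron sequence changes the frame.
print(is_frameshift(95))

# Toy sequence read in frame as AGT GGA TGA CCC; the third codon is a stop.
print(first_stop_codon("AGTGGATGACCC"))
```

Because the retained intronic sequence is read in the frame set by the upstream exon, the first stop codon met in that frame truncates the protein, which is the situation described for the NCAPD3 transcript.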

4. Discussion

4.1 Hypothesis

The hypothesis was that MNX1 could be negatively affected through interactions with other transcription factors or through other regulatory elements, which subsequently could lead to abnormal function of the gene. Other events leading to abnormal expression of MNX1 have been described in the literature. Thompson et al. have shown that binding of PTF1a upstream of MNX1 is involved in expression of MNX1 (Thompson, 2012). It is also known that mutations in PCSK5 will affect the expression of MNX1 (Tsuda, 2011). Ultra-conserved elements upstream of homeobox genes are also associated with genes that are involved in control of development (Woolfe, 2005). Based on this information we searched for sequence variants in other transcription factor genes known to interact with MNX1 and in non-coding regions around MNX1. Furthermore, we searched for variants in other genes. We have previously examined conserved regions around MNX1 in these patients by Sanger sequencing, but did not detect any variation of interest (unpublished). In this study we wanted to extend this region to a 6 Mb sequence. Therefore, we performed whole exome sequencing with an additional custom-made region on chromosome 7 to be able to detect sequence variation in regulatory elements in non-coding regions around the MNX1 gene. We did not find any relevant variants in genes of interest shared between the patients. However, in each CS patient, we did find variants on other chromosomes worth investigating as they may be of relevance to CS.

4.2 ETV3L (c.376G>A)

ETV3L (ETS Variant 3 Like) belongs to the ETS transcription factors, one of the largest families of transcription factors. ETS transcription factors are implicated in the development of different tissues and in cancer progression, and are involved in the equilibrium between proliferation and differentiation. They also have a function in cell cycle control, cell migration, apoptosis and angiogenesis. This group of genes can act as transcriptional repressors and/or activators. Only two papers about ETV3L are published on PubMed.

We believe that, despite the software prediction of c.376G>A as a benign variant, the low frequency as well as the predicted gene function of ETV3L make the gene a candidate for further investigation in the future.

4.3 ARID5A (c.164G>T)

ARID5A (AT-rich interactive domain-containing protein 5A) contains a helix-turn-helix motif DNA-binding domain that mediates several downstream functions, and seems to play an important role in development, tissue-specific gene expression and regulation of cell growth (Patsialou, 2005). It has been demonstrated that ARID5A interacts with SOX9 and enhances SOX9-induced chondrocyte-specific transcription. ARID5A is highly expressed in cartilage and induced during chondrocyte differentiation (Amano, 2011). The c.164G>T variant in ARID5A was also found in the patient's mother, who has no known manifestation of Currarino Syndrome. However, we cannot exclude that the mother has hidden signs, as the parents have not been fully examined.

4.4 NCAPD3 (c.1468+2T>C)

NCAPD3 (non-structural maintenance of chromosomes condensin II complex subunit D3) is one of three non-structural maintenance of chromosomes subunits that define condensin II (Ono, 2003). Condensin complexes I and II play essential roles in the assembly and segregation of the chromosomes during mitosis. NCAPD3 is a subunit of the condensin II protein complex and is involved in chromosome condensation. Defects in NCAPD3 can disrupt the condensation of chromosomes and give segregation errors. Furthermore, NCAPD3 is highlighted as an outcome predictor in Pancreatic Ductal Adenocarcinoma (Dawkins, 2016). MNX1 is also known to be involved in early pancreatic development and has recently been identified to have a role in the development of pancreatic cancer (Bailey, 2016).

The sequence variant (c.1468+2T>C) found in NCAPD3 in patient 3 led to an insertion of 95 bp of intron 11, and a premature stop codon was detected in position c.1468+46 (Fig. 2). Thus, it is likely that this transcript will result in a truncated protein. More research is needed to determine whether this gene is relevant to CS or not.

4.5 Future perspective

We did not succeed in finding a common Currarino Syndrome gene for the sporadic patients. Possible reasons are variants left undetected because of poor coverage, deep intronic or intragenic variation that has not been picked up, or variants removed during filtration. Intronic variants and rare missense variants found in single patients are also hard to interpret. Based on our findings, sporadic CS may be genetically heterogeneous rather than caused by defects in one common gene. The patients in this study all present with different clinical features, which could indicate that different genes are involved in the pathogenesis. Dworschak et al. (Dworschak, 2016) recently reported a de novo partial duplication of the long arm of chromosome 3 (3q26.32-q27.2) in a patient with CS and additional features. The diagnosis of each patient should be reconsidered in order to discover any missing signs. It is important to have a complete clinical diagnosis before trio testing, based on a complete clinical evaluation including MRI of the sacrum and the rectum. In this study we only had MRI of the patients. It would have been useful to have MRI of all family members to exclude incomplete penetrance, which is well known in MNX1 positive CS patients. By performing whole exome sequencing we have identified variants in three different genes that could be of potential interest for future research on sporadic Currarino Syndrome. These variants were further tested in three other sporadic CS patients as well as 14 familial MNX1 positive CS patients, but none of the patients had the same variants. The variants should in the future be tested in a larger cohort of patients with similar clinical signs. We also cannot exclude that environmental factors contribute to the development of this syndrome. The International Consortium on Anorectal Malformations found an association between ARM and family history of ARM, and epigenetic factors, in both the German and the Dutch cohort (Wijers, 2010).

5. Conclusion

We have suggested three candidate genes, ETV3L, ARID5A and NCAPD3, all of which are involved in processes of cell growth and differentiation. NCAPD3 is the most promising candidate gene, possibly being involved in the pathogenesis of CS; however, further investigations of all three genes in larger cohorts are required.

Acknowledgements

The sequencing service was provided by the Norwegian Sequencing Centre. We would like to thank all the patients and parents for participating in this study. Thanks to Eli Ormerod for performing the karyotyping, Øystein Mathias Sauar Olsen for his patience teaching us how to do FISH and Magnus Dehli Vigeland for his helpfulness with Filtus.

Conflict of interest

The authors have no conflict of interest.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.

References

Adzhubei, I.A., 2010. A method and server for predicting damaging missense mutations. Nature Methods.

Alamo, L., Meyrat, B.J., Meuwly, J.Y., Meuli, R.A. and Gudinchet, F., 2013. Anorectal Malformations: Finding the Pathway out of the Labyrinth. Radiographics 33, 491-512.

Amano, K., 2011. Arid5a cooperates with Sox9 to stimulate chondrocyte-specific transcription. Mol Biol Cell.

Bailey, P., 2016. Genomic analyses identify molecular subtypes of pancreatic cancer. Nature.

Belloni, E., 2000. Involvement of the HLXB9 homeobox gene in Currarino syndrome. Am J Hum Genet.

Crétolle, C., 2008. Spectrum of HLXB9 gene mutations in Currarino syndrome and genotype-phenotype correlation. Human Mutation 29(7), 903-10.

Dalgin, G., 2011. Zebrafish mnx1 controls cell fate choice in the developing endocrine pancreas. Development 138(21).

Dawkins, J., 2016. Reduced Expression of Histone Methyltransferases KMT2C and KMT2D Correlates with Improved Outcome in Pancreatic Ductal Adenocarcinoma. Cancer Res.

Dworschak, G., 2016. Comprehensive review of the duplication 3q syndrome and report of a patient with Currarino syndrome and de novo duplication 3q26.32-q27.2. Clin Genet.

Fajardo, F., 2012. Detecting false positive signals in exome sequencing. Hum Mutat.

Garcia-Barceló, M., 2009. MNX1 (HLXB9) mutations in Currarino patients. J Pediatr Surg.

Lynch, S., 2000. Autosomal dominant sacral agenesis; Currarino syndrome. Journal of Medical Genetics 37(8), 561-6.

Monclair, T., 2013. Currarino syndrome at Rikshospitalet 1961-2012. Tidsskrift for Den norske legeforening.

Ng, 2001. Predicting deleterious amino acid substitutions. Genome Res.

Ono, T., 2003. Differential contributions of condensin I and condensin II to mitotic chromosome architecture in vertebrate cells. Cell.

Patsialou, A., 2005. DNA-binding properties of ARID family proteins. Nucleic Acids Research.

Robinson, J.T., 2011. Integrative genomics viewer. Nature Biotechnology.

Stoll, C., 2007. Associated malformations in patients with anorectal anomalies. European Journal of Medical Genetics. doi: 10.1016/j.ejmg.2007.04.002.

Thompson, N., 2012. RNA profiling and chromatin immunoprecipitation-sequencing reveal that PTF1a stabilizes pancreas progenitor identity via the control of MNX1/HLXB9 and a network of other transcription factors. Mol Cell Biol.

Tsuda, T., 2011. PCSK5 and GDF11 expression in the hindgut region of mouse embryos with anorectal malformations. Eur J Pediatr Surg.

Vigeland, M.D., Gjotterud, K.S. and Selmer, K.K., 2016. FILTUS: a desktop GUI for fast and efficient detection of disease-causing variants, including a novel autozygosity detector. Bioinformatics 32, 1592-4.

Wijers, C., 2010. Research perspectives in the etiology of congenital anorectal malformations using data of the International Consortium on Anorectal Malformations: evidence for risk factors across different populations. Pediatr Surg Int.

Woolfe, A., 2005. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol.

Zu, S., 2011. Mutation analysis of the motor neuron and pancreas homeobox 1 (MNX1, former HLXB9) gene in Swedish patients with Currarino syndrome. J Pediatr Surg.

Fig. 1. Overview of the 6 Mb custom region (chr7:153,138,664-159,138,663) and neighboring RefSeq genes (UCSC Genome Browser) included in the whole exome sequencing. MNX1 is marked with a red circle.

Fig. 2. A de novo variant in intron 11 (c.1468+2T>C) in NCAPD3 (NM_015261.2) was found to cause an insertion of 95 bp consisting of sequence from intron 11, here illustrated by electropherogram (left) and schematic (right). The inserted sequence led to a premature stop codon (TGA) in position c.1468+46, here marked by a red arrow (left and right).

Table 1. Clinical description of sporadic Currarino Syndrome patients 1-3.

Feature | Patient 1 | Patient 2 | Patient 3
Age | 16 years old | 2 years old | 15 years old
Gender | Female (46,XX) | Male (46,XY) | Female (46,XX)
Anorectal malformations | Anal stenosis | Anal stenosis | None
Sacral anomalies | Sacrococcygeal agenesis | Sacral dysgenesis left side + coccygeal agenesis | Duplicated coccygeal bone
Presacral mass | Anterior meningocele + lipoma | Lipomyelocele | Teratoma
Urinary tract anomalies | Vesicoureteric reflux, recurrent urinary tract infections | No | No
Genital tract malformations | No | No | No
Spinal cord anomalies | No | Tethered cord | Tethered cord
Limb reduction defects | No | Absent toes left foot | No
Intellectual disability | No | No | No
Microcephalus | No | No | No
Facial dysmorphism | No | No | No
Deafness | No | No | No

Table 2. Results of exome data after filtering the sequence variants in the in-house SNV filtration program Filtus. Shown is the number of variants (number of genes) remaining in patients 1-3 after each filtration step in the de novo model.

Filter | Patient 1 | Patient 2 | Patient 3
No filter | 114 883 (17 608) | 114 242 (17 459) | 115 918 (17 471)
"Bad genes" | 108 051 (16 959) | 106 965 (16 795) | 108 485 (16 805)
In-house SNP database | 22 511 (3150) | 22 086 (3025) | 23 598 (3181)
DP>9 | 20 561 (2947) | 20 206 (2816) | 21 682 (2929)
QD>2 | 17 982 (2425) | 17 791 (2383) | 19 153 (2519)
Variants in regions with no known genes | 17 326 (2420) | 17 770 (2382) | 18 383 (2514)
Variants in dbSNP 137 | 2065 (1113) | 2238 (1091) | 2397 (1171)
de novo | 6 (3) | 4 (4) | 11 (11)

Table 3. Summary of the most relevant variants in patients 1-3 after filtering with the in-house filtration program Filtus.

Patient | Model of inheritance | Gene | cDNA position | Protein | Function | dbSNP 147 | Allele frequency (ExAC Browser) | Software predictions
1 | de novo | ETV3L | c.376G>A (exon 3) | p.Val126Ile | Missense | rs371804655 | 0.00001651 (2*/121170) | PolyPhen: benign; SIFT: tolerated
2 | dominant | ARID5A | c.164G>T (exon 3) | p.Arg55Leu | Missense | rs61748139 | 0.00115 (139/120838) | PolyPhen: possibly damaging; SIFT: tolerated
3 | de novo | NCAPD3 | c.1468+2T>C (intron 11) | p.S490Sfs*15 | Splicing | - | - | Skip of exon 11 is likely**

*South Asian population. **mRNA analysis showed an insertion of 95 bp of intron 11 and a premature stop codon in position c.1468+46.

Highlights:

- We performed whole exome sequencing of three trios with sporadic Currarino Syndrome
- We have included an extra 6 Mb region in the whole exome sequencing to be able to detect sequence variation in non-coding regions around the MNX1 gene
- We have whole exome sequenced DNA from the removed presacral mass from one of the patients
- To our knowledge no one has previously published whole exome sequencing results in Currarino Syndrome patients
- We have suggested some potential candidate genes for each trio