Comparing the utility of host and primary endosymbiont loci for predicting global invasive insect genetic structuring and migration patterns


Accepted Manuscript

Use of the Brown Marmorated Stinkbug’s primary symbiont for population genetic analyses

Alejandro Otero-Bravo, Zakee L. Sabree

PII: S1049-9644(17)30074-9
DOI: http://dx.doi.org/10.1016/j.biocontrol.2017.04.003
Reference: YBCON 3572

To appear in: Biological Control

Received Date: 7 January 2017
Revised Date: 28 March 2017
Accepted Date: 3 April 2017

Please cite this article as: Otero-Bravo, A., Sabree, Z.L., Use of the Brown Marmorated Stinkbug’s primary symbiont for population genetic analyses, Biological Control (2017), doi: http://dx.doi.org/10.1016/j.biocontrol.2017.04.003

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Title: Use of the Brown Marmorated Stinkbug’s primary symbiont for population genetic analyses

Alejandro Otero-Bravo1 and Zakee L. Sabree1*

1 Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA

*Corresponding Author: 318 W 12th Ave, Room 300, Columbus, OH, USA. [email protected]

Highlights

Invasive stinkbug and its bacterial symbiont show low nucleotide diversity.
Vertically inherited bacterial symbiont shows lower diversity than its host.
Host haplotypes are structured within symbiont haplotypes.

Abstract

Halyomorpha halys, commonly known as the Brown Marmorated Stinkbug, is a highly polyphagous invasive pest introduced from East Asia into North America and Europe. It harbors ‘Candidatus Pantoea carbekii’, an obligately-associated, vertically-inherited gamma-proteobacterial mutualist. We evaluated the use of this symbiont as a proxy for measuring host diversity, distribution, and phylogeography. Despite the symbiont’s accelerated molecular evolution, the symbiont genome shows relatively lower genetic diversity and structuring compared to the host mitochondrial genome in both native and invaded ranges. Therefore, we conclude that P. carbekii is not as effective as the host mitochondria for determining recent host population history and migration.

Keywords

Brown Marmorated Stinkbug, ‘Candidatus Pantoea carbekii’, vertically transmitted endosymbiont, invasive species, genetic diversity.

1. Introduction

The Brown Marmorated Stinkbug, Halyomorpha halys (Pentatomidae), (henceforth called BMSB) is an invasive pest native to eastern Asia that has been recently introduced into North America and Europe. BMSB is highly polyphagous, attacking a wide range of plants from up to 45 different families (Lee et al., 2013), including many economically important crops such as apple, soybean, and corn (Leskey et al., 2012).
Initially detected in North America in Allentown, PA in 1996 (Hoebeke and Carter, 2003), and in Europe in Zurich-Seefeld, Switzerland in 2004, it has since expanded rapidly, reaching 43 US states and two Canadian provinces in North America (StopBMSB, June 2016) and 8 countries in Europe, being widespread in Switzerland (CABI, 2016).

BMSB harbors a primary, obligately-associated, vertically-transmitted endosymbiont, ‘Candidatus Pantoea carbekii’ (henceforth called P. carbekii), that is the sole inhabitant of host midgut gastric invaginations called caeca (Bansal et al., 2014). P. carbekii is vertically transmitted from mother to offspring through symbiont-enriched gastric secretions posteriorly deposited on the eggs. As only a subset of the total maternally-associated P. carbekii population is transmitted to offspring, the symbiont passes through a population bottleneck each generation, as is commonly observed in insect endosymbionts, leading to reduced effective population size and magnified impacts of genetic drift (Wernegreen, 2015). On the other hand, given the extracellular nature and transfer of this symbiont, there is a possibility of some degree of horizontal transfer between individuals (Kikuchi et al., 2009). P. carbekii exhibits many traits typically observed in bacterial mutualists of insects: it has not been cultivated in vitro, it bears a relatively reduced genome (1.15 Mb) that retains many essential and nonessential amino acid and vitamin biosynthesis pathways (Bansal et al., 2014; Kenyon et al., 2015), and its presence enables the host to develop normally (Taylor et al., 2014). Other stinkbugs from the Pentatomidae have shown similar associations with extracellular gut symbionts (Otero-Bravo and Sabree, 2015), and display a monophyly of symbionts within the species, genus, and sometimes subfamily of the host indicative of co-speciation (Bistolas et al., 2014; Duron and Noël, 2016). Additionally, other genome-reduced bacterial symbionts of insects show accelerated rates of molecular evolution (Hosokawa et al., 2013, 2006; Kikuchi et al., 2009; Nikoh et al., 2011).

Mitochondrial loci are often used for tracing the movement and origins of introduced species due to their relatively rapid rates of molecular evolution compared to nuclear loci. Previous studies have used BMSB mitochondrial regions (COI, CYTB, ITS1, 12S+CR) to identify host diversity, population history, distribution, spread, and possible source populations in the three continents (Gariepy et al., 2013; Xu et al., 2014; Zhu et al., 2016). A different strategy that has not been exploited in BMSB is to use symbiont loci for the same purpose.
Symbionts, particularly parasitic species, have successfully been used to trace the source populations and movement of their hosts (Nieberding and Olivieri, 2007). While symbionts may or may not show phylogeographic patterns similar to those of their hosts (Espíndola et al., 2014), they can have different rates of molecular evolution than their hosts (Hafner et al., 1994) and in some cases show even higher resolution than host loci (Criscione et al., 2006; Funk et al., 2000). This can be due to higher structuring of the symbiont population within the hosts’ sub-population, different life history parameters such as generation time and effective population size, and the symbiont’s method of transmission between host individuals. P. carbekii has life history characteristics, such as stable vertical transmission and an effective population size different from that of its host, that make it a prime candidate for use as a proxy for understanding BMSB population dynamics.

We investigated whether the rate of molecular evolution of P. carbekii was accelerated as in other symbionts, which would further support its use as a marker for the host. Then, we evaluated the genetic diversity and distribution of P. carbekii and its relation to the host genetic diversity obtained with mitochondrial markers in populations across its introduced range, as well as a population in its native range identified as the most likely source of the American invasion (Xu et al., 2014). We hypothesized that the symbiont, due to a high rate of molecular evolution, would show greater genetic variation and geographic structuring relative to its host and allow a more detailed picture of its distribution and spread.

2. Methods

2.1 Accelerated rate of molecular evolution in P. carbekii

To compare the rate of sequence evolution between P. carbekii and its congenerics, we calculated the average pairwise identity of the entire 16S region for two datasets: a) only congenerics of P. carbekii (hereafter called ‘Pantoea’) and b) congenerics and members of the sister genus Erwinia (hereafter called ‘P+E’). We extracted all sequences from the SILVA database (Quast et al., 2013) belonging to the taxonomic classifications Pantoea and Erwinia with sequence length >1500 and sequence quality >90, preserving common gaps. We performed the Tajima relative rate test (Tajima, 1993), as implemented in the R package ‘pegas’, on the 16S sequences as well as the conserved genes rpoB and dnaE, using a variety of outgroups. Accession numbers of sequences used are shown in Table S3. We evaluated significance under an alpha value of 0.05 as well as a Bonferroni-corrected alpha value when comparing simultaneously across multiple samples.

2.2 Collection and DNA extraction

Individuals were sampled from three populations in the United States: California (38°33’20” N, 121°28’08” W), Ohio (40°48’33” N, 81°56’14” W), and Michigan (42°35’38” N, 86°6’13” W) (n=34) (see Bansal et al., 2014), one population from their native range in China (Langfang, Hebei province, n=18), and three populations in Switzerland (Canton Ticino, Lugano; Canton Basel, Basel; and Canton Zurich, Zurich, n=18). Samples were stored in 70% ethanol before being transported. Individuals were rinsed in 70% ethanol and dissected with sterile forceps to extract the V4 region of the midgut. DNA extraction was done with the Qiagen DNeasy Blood and Tissue kit according to the manufacturer’s instructions. For host loci, DNA was extracted from muscle tissue or taken from the same DNA extract used for the symbiont loci. A list of primers used in this study can be found in Table S1.
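The relative rate test named in Section 2.1 can be illustrated with a minimal sketch. This is not the ‘pegas’ code used in the study; it is an independent Python illustration of the 1D form of Tajima's test for three aligned sequences, plus a Bonferroni-adjusted threshold. The function names and the toy sequences are assumptions made for this example only.

```python
import math

def tajima_relative_rate(seq1, seq2, outgroup):
    """Tajima's 1D relative rate test on three aligned sequences.

    m1 counts sites where seq1 alone differs (seq2 == outgroup != seq1),
    m2 counts sites where seq2 alone differs (seq1 == outgroup != seq2).
    Under equal rates, (m1 - m2)^2 / (m1 + m2) follows chi-square, 1 d.f.
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    m1 = m2 = 0
    for a, b, o in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        if a not in valid or b not in valid or o not in valid:
            continue  # skip gaps and ambiguous bases
        if b == o and a != o:
            m1 += 1
        elif a == o and b != o:
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Upper-tail probability of a chi-square variable with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return m1, m2, chi2, p_value

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

# Toy sequences (hypothetical, not data from this study).
m1, m2, chi2, p_value = tajima_relative_rate("CATGACGTAC", "ACGTACGTAC", "ACGTACGTAC")
```

With 18 simultaneous comparisons, for instance, `bonferroni_alpha(0.05, 18)` gives a per-test threshold of roughly 0.0028.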
PCR was performed with GoTaq Green Master Mix from Promega, and amplicons were cleaned with the Zymo Research DNA Clean and Concentrator before Sanger sequencing in both directions. Quality trimming, alignment, and base calling were done in Geneious 8.1.8. Unique haplotypes were amplified and sequenced twice to account for PCR errors.

2.3 Loci variability in the invaded range

Five loci identified as putative pseudogenes from the annotation of the P. carbekii genome (∆ybgF, ∆ftsN, ∆speA, ∆transglycosylase C, ∆yigL) and a 345 bp region of the 16S rRNA (coordinates 649995 to 650339 from NZ_CP010907) were selected to identify variability in the symbiont populations in the invaded range. Primers were designed using Primer3 as implemented in Geneious 8.1.8. Thirty individuals from three locations in the United States (California, Ohio, Michigan) were sequenced for all loci. PCR conditions consisted of an initial denaturation at 95°C for 3 min, followed by 30 cycles of 95°C for 30 s, 50°C for 30 s, and 72°C for 30 s, and a final extension at 72°C for 2 min. Sequences were uploaded to GenBank with accession numbers KY379170-KY379176. Sequences were aligned to the previously sequenced genomes (NZ_CP010907 and NC_022547). Nucleotide diversity and haplotype diversity were compared where possible, and linkage disequilibrium and correlation between mutations were calculated using DnaSP 5.10 (Librado and Rozas, 2009).
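As a rough illustration of the two summary statistics compared in Section 2.3, the sketch below computes haplotype diversity (Hd) and nucleotide diversity (π) from a list of aligned sequences using the standard Nei-type estimators. It is a simplified stand-in, not DnaSP, and it assumes gap-free alignments of equal length; the function names and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))

def nucleotide_diversity(seqs):
    """pi = mean number of pairwise differences per site."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length

# Toy alignment (hypothetical): three copies of one haplotype, one variant.
toy = ["ACGT", "ACGT", "ACGT", "ACGA"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

For this toy alignment, one segregating site splits four sequences into two haplotypes, giving Hd = 0.5 and π = 0.125.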

2.4 Host-symbiont haplotype correlation

Two more markers were selected based on high diversity identified between the two sequenced genomes of P. carbekii: NC_022547 from Tsukuba, Japan and NZ_CP010907 from Wooster, Ohio, United States. Previous studies using host markers have identified these populations as separate with some gene flow (Xu et al., 2014; Zhu et al., 2016). The two regions chosen were the hypothetical protein CDS with similarity to a primosomal protein, or “primo”, and the gene for 2-oxoglutarate dehydrogenase subunit E1 (odhA). The primo region is a 232 bp region with 3 SNPs between the two sequenced strains, while odhA is a 408 bp region with 4 SNPs between the two strains, both significantly higher than the average of 1 SNP per kb identified in the whole genome (Kenyon et al., 2015). PCR conditions consisted of an initial denaturation at 94°C for 3 min, 11 touchdown cycles of 94°C for 30 s, 63°C for 30 s (decreasing by 1°C each cycle), and 72°C for 45 s, followed by 19 regular cycles of 94°C for 30 s, 53°C for 30 s, and 72°C for 45 s, and a final extension at 72°C for 5 min. Sequences were uploaded to GenBank with accession numbers KY379163-KY379169. Nucleotide diversity, haplotype diversity, and population diversity were compared where possible, and linkage between mutations was calculated with DnaSP v5.10.1 (Librado and Rozas, 2009).

2.5 Comparison of symbiont and host markers

Two mitochondrial loci, COII and 12SCR, used by Xu et al. (2014) were chosen to compare with symbiont loci. PCR conditions were as described in the original paper with the following modifications: the first step was replaced with a 4-cycle touchdown starting with an annealing temperature of 52°C and decreasing by 1°C each cycle, and the second step’s annealing temperature was increased to 49°C. Nucleotide diversity, haplotype diversity, and population diversity were compared where possible, and linkage between mutations was calculated with DnaSP.
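The touchdown programs described in Sections 2.4 and 2.5 can be written out cycle by cycle to check that the annealing temperatures line up. The following is a small sketch under that reading of the text; the helper names are hypothetical and only annealing temperatures are modelled.

```python
def touchdown_anneal_temps(start_c, n_cycles, step_c=1):
    """Annealing temperature (deg C) for each cycle of a touchdown phase."""
    return [start_c - i * step_c for i in range(n_cycles)]

def cycling_program(start_c, n_touchdown, regular_c, n_regular, step_c=1):
    """Per-cycle annealing temperatures: touchdown phase, then regular phase."""
    touchdown = touchdown_anneal_temps(start_c, n_touchdown, step_c)
    return touchdown + [regular_c] * n_regular

# Symbiont loci (Section 2.4): 11 touchdown cycles from 63 deg C, then 19 cycles at 53 deg C.
symbiont_program = cycling_program(63, 11, 53, 19)

# Host loci (Section 2.5): only the modified 4-cycle touchdown step is modelled,
# because the remaining cycle counts come from the original protocol.
host_touchdown = touchdown_anneal_temps(52, 4)
```

Under this reading, the symbiont touchdown phase ends at 53°C, the same temperature as the regular cycles, for 30 cycles in total. The host touchdown ends at 49°C, matching the raised annealing temperature of the second step.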
Haplotype networks and analysis of molecular variance (AMOVA) were done using Popart 1.7.2 (Leigh and Bryant, 2015).

3. Results

3.1 Acceleration in the rate of molecular evolution in P. carbekii

We obtained 482 16S sequences for the Pantoea dataset and 606 for the P+E dataset. For the six 16S sequences available on the SILVA database for P. carbekii, the average pairwise identity to all other sequences varied between 93.5 and 94%, while average pairwise identities between the other sequences ranged from 96.5 to 98% (see Supplementary Table S2). When using the second dataset, P+E, which included Erwinia, the average pairwise identity for P. carbekii’s sequences ranged between 93 and 94%, while that for all other sequences ranged from 96 to 98%.

When comparing P. carbekii to 18 other strains of the Pantoea+Erwinia clade using the relative rate test with E. coli as an outgroup, significant p-values were obtained even when using the Bonferroni correction, the largest being 0.000379 for Pantoea ananatis LMG 20103. When comparing to other strains of the Enterobacteria using Pseudomonas protegens Pf-5 as an outgroup, we found significance under the corrected alpha value only for Candidatus ‘Ishikawaella gelatinosa’, another stinkbug endosymbiont. Comparisons to another endosymbiont in the dataset, Buchnera aphidicola from Aphis glycines, were only significant under the uncorrected alpha level. When using Proteus vulgaris, a basal member of the Enterobacteriaceae, as an outgroup, significant p-values were obtained for all strains except Citrobacter rodentium under the uncorrected alpha value, but not for all strains under the corrected alpha value. With closer outgroups such as Yersinia pestis KIM10+ or Serratia marcescens, significance was found for all strains even using the corrected alpha value. Results from the relative rate test are found in Table S4. We found similar results when performing the test on the coding genes rpoB and dnaE, with the exception that comparisons to P. vulgaris were not significant, as opposed to the most distant outgroup, Aeromonas hydrophila. For comparison, the same test for Pantoea rodasii and Pantoea dispersa did not show significance for most comparisons even using the uncorrected alpha value (see Table S4).

3.2 Symbiont haplotype identification

We sequenced amplicons generated for six P. carbekii loci (regions within five putative pseudogenes: ∆ybgF, ∆ftsN, ∆speA, ∆transglycosylase C, ∆yigL; and a 345 bp region within the 16S rRNA) in three populations in the United States. Of these loci, only ∆ybgF showed any variability, with one segregating site dividing populations into two haplotypes. Parameters estimated for these populations using this region are shown in Table 1. Due to the lack of variability in the remaining markers, two more P. carbekii loci, odhA and primo, were used. The odhA, primo, and ∆ybgF loci were sequenced from multiple individuals obtained from the native (Chinese, CH, n=13) and invaded (American, Ohio, n=3) populations. Two haplotypes were identified, with 4 SNPs separating them across the three genes: one SNP in ∆ybgF, one SNP in the primo region, and two SNPs in the odhA region. In all cases, each gene had two versions, which were always correlated with the versions in the other genes (correlation coefficient R=1.000, Chi-Square test p < 0.0001 for all six pairwise comparisons).
Nucleotide diversity in the source population was 0.00191 when considering all three regions. Because the variants at the three loci were perfectly correlated, we designated these two haplotypes P1 and P2 and used only the ∆ybgF gene for further analysis. Haplotype distributions for ∆ybgF are shown in Fig. 1a-c.

3.3 Host-symbiont haplotype genotyping and correlation

Using the previous samples and additional ones, we identified host haplotypes using the markers described by Xu et al. (2014) and the symbiont haplotypes using ∆ybgF per individual. We identified a total of two symbiont haplotypes (P1-P2), three host 12S haplotypes (12S-A, 12S-B, 12S-C), and four host COII haplotypes (CO2-A, CO2-B, CO2-C, CO2-D), which combined yielded six host haplotypes (named Hh1 through Hh6, see Table 2). Both symbiont haplotypes were identified in the native population, while all individuals from America contained P1 and those from Europe contained P2. Only one host haplotype was found in the American population (Hh1), which was also present in the Chinese population in a high proportion (41%), while three host haplotypes were found in the European populations (Hh3, Hh4 and Hh6). The population from Canton Zurich, in the north of Switzerland, contained two host haplotypes, Hh3 and Hh6, with only Hh3 being present in the native population. The population from Ticino, in southern Switzerland, contained one haplotype, Hh4, which was also present in the native population. The Chinese population showed the greatest diversity, with 5 of the 6 haplotypes identified (not including Hh6). Haplotype and nucleotide diversity for all populations are summarized in Table 3. The haplotype network of the host genes is shown in Figure 2. The AMOVA for host genes yielded a ΦST of 0.45, indicating large genetic differentiation (p<0.001), and for symbiont populations yielded a ΦST of 0.83, indicating even larger genetic differentiation (p<0.001).

The symbiont P1 haplotype was detected in the Hh1 and Hh2 hosts, and the P2 haplotype was present in the Hh3-6 hosts. While the 2x5 contingency table does not have sufficient statistical power due to a low number of individuals (not all expected outcomes were above 5), the calculated correlation value is 1. The correlation can be observed in the host and symbiont haplotype networks shown in Figure 2.

Since the quality of the sequenced regions for the host genes was not uniform around the flanks, the sequence used was only 351 bp for 12S and 322 bp for COII (as opposed to 552 and 534 bp for the previously published sequences, respectively). When comparing the sequenced regions to the sequences obtained by Xu et al. (2014), the 43 unique haplotypes from these authors (named H1 through H43) were collapsed into 24 unique haplotypes. After collapsing these haplotypes, no collapsed group contained sequences collected from more than one of the three main regions sampled (China, Korea, Japan). Five of the six haplotypes identified in this study were identical to one of the haplotypes from the previous study, with Hh1 corresponding to either H2 or H23; Hh2 to either H1, H25, or H26; Hh3 to one of H7, H14, H21, H22; Hh4 to one of H3, H4, H5, H12, H16, H18, H19; and Hh6 to H35 or H40. Hh5 was not identical to any of the published sequences. The host haplotype found in America, Hh1, is identical to H2, the haplotype previously identified as the dominant haplotype in invasive American populations.
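The collapsing of published haplotypes described above follows from trimming: once sequences are cut to a shorter shared window, haplotypes that differed only in the removed flanks become indistinguishable. The sketch below illustrates that grouping step with invented data. The function name, window coordinates, and toy haplotypes are all hypothetical and are not the sequences analysed in this study.

```python
from collections import defaultdict

def collapse_haplotypes(haplotypes, start, end):
    """Group haplotype names whose sequences are identical within [start, end)."""
    groups = defaultdict(list)
    for name, seq in haplotypes.items():
        groups[seq[start:end]].append(name)
    # Sort names within groups, then groups themselves, for a stable result.
    return sorted(sorted(names) for names in groups.values())

# Toy full-length haplotypes (hypothetical): toy_A, toy_B and toy_C differ only
# at the first or last position, while toy_D differs inside the retained window.
toy_haplotypes = {
    "toy_A": "AACGTT",
    "toy_B": "TACGTT",
    "toy_C": "AACGTA",
    "toy_D": "AACCTT",
}
collapsed = collapse_haplotypes(toy_haplotypes, 1, 5)
```

Here four full-length haplotypes collapse into two groups, which is why a trimmed haplotype such as Hh1 can correspond to more than one published haplotype.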
The host haplotype recovered from the Canton Ticino, in the south of Switzerland, as well as from one individual from the native population, Hh4, is identical to haplotype H3, which has been previously found in this location and in the native population (Gariepy et al., 2015). The haplotype Hh3, found in most individuals of the population in the Canton Zurich, north of Switzerland, was not identical to any haplotype previously identified in this region, but was identical to several identified previously in China (H7, H14, H21, H22). The other haplotype found in this location, as a singleton, was Hh6, which was not identified in the source population and was identical to H35 and H40, both haplotypes identified from Japanese populations.

4. Discussion

4.1 P. carbekii has a high rate of molecular evolution like other insect symbionts

P. carbekii showed a high 16S divergence from other members of its genus and the sister genus Erwinia. The 16S divergence is higher between P. carbekii and its congenerics than the average difference between Pantoea and Erwinia species. However, phylogenetically, P. carbekii is firmly placed in the Pantoea + Erwinia clade using multilocus sequence analysis (Kenyon et al., 2015). Relative rate tests show that P. carbekii has a significantly different rate of molecular evolution from other members of Pantoea and the Enterobacteriaceae, even when comparing at multiple levels of relatedness. We found that choosing outgroups too distant from the focal taxa when using this test could result in all comparisons being significant or none at all, even when using stringent corrections for multiple comparisons. We evaluated different outgroups and found that P. carbekii has a different rate than its congenerics and other members of the Enterobacteriaceae, similar to other insect endosymbionts (Hosokawa et al., 2013, 2006; Kikuchi et al., 2009; Nikoh et al., 2011). We found that for the 16S gene as well as two coding regions, rpoB and dnaE, P. carbekii also has a different rate than I. capsulata and B. aphidicola, two insect symbionts with accelerated mutation rates. However, in these cases, both other symbionts have higher rates than P. carbekii. This also correlates with the fact that while P. carbekii’s genome is 1.15 Mb, I. capsulata’s is 0.75 Mb and B. aphidicola’s is 0.64 Mb. This is consistent with the hypothesis that longer, more stable relationships yield smaller genomes and simultaneously faster rates of molecular evolution (Hosokawa et al., 2016).

4.2 P. carbekii has lower diversity than H. halys mitochondria

We identified two symbiont haplotypes in 69 individuals spanning a native population and six invaded populations, while in 32 individuals we identified 6 host haplotypes. Haplotype and nucleotide diversity were higher for host loci than for symbiont loci in all but one case (the Californian population). Taken together, the results show that P. carbekii has lower genetic diversity and structure than its host H. halys. Only two haplotypes were identified for P. carbekii in both the native and the introduced populations. Six haplotypes were identified for the host in the same number of individuals and a similar length of sequence using mitochondrial loci. The invaded range exhibited reduced diversity, with two host haplotypes and one symbiont haplotype when considering samples sequenced for both organisms, and two haplotypes for the symbiont when including samples for which host haplotype identification was not possible.
This lack of diversity is consistent with an introduced population with few introduction events (Xu et al., 2014). The European range of invasion displays more variability than the American, which has been previously assessed multiple times (Cesari et al., 2015; Gariepy et al., 2015, 2013) and suggests the possibility of multiple introductions. Additionally, the presence of two host haplotypes in the invaded North American range was identified previously (Xu et al., 2014). In that case, one haplotype was present in both eastern and western US populations while the second was only present in eastern US populations. In our study, we were only able to sample host haplotypes in regions where the previous study found only one of the two haplotypes. Considering that the previous study used specimens sampled between 2006 and 2008, while our specimens were sampled between 2013 and 2015, we can see that although other nearby populations have a different haplotype, it has not migrated in sufficient numbers to be detected by our sampling.

4.3 Origin of alternate haplotype in North American range

Of the two identified P. carbekii haplotypes, P1 was previously sequenced as part of the P. carbekii sequencing project. An alternate haplotype, P2, differed from P1 by only one SNP in the ∆ybgF region, one SNP in the primo region, and two SNPs in the odhA region. When aligning the sequenced regions to the second sequenced genome of P. carbekii (AP012554), from Tsukuba, Japan, we see that the new SNPs are always present in the second genome. However, the Japan haplotype had additional SNPs in the sequenced regions, which means the polymorphism is likely old and represents part of the native diversity. The P2 haplotype was found in the native population of China, in the region of Beijing, which is theorized to be the source of the North American introduction. This provides further evidence of this area as the source of the invasion. However, more information on the distribution of the symbiont haplotypes in other areas may be needed. In our study, we detected the two mitochondrial host haplotypes reported by Xu et al. (2014) as present in North America, but one of them only in the source population. Both of these haplotypes were always simultaneously present with the symbiont haplotype P1 in the Beijing population. Therefore, the introduction of these two host haplotypes may not be enough to explain the distribution of symbiont haplotypes in North America, unless there was a horizontal symbiont transfer between hosts prior to the invasion.

4.4 Use of P. carbekii as a proxy for host population history

Expected elevated mutation rates and/or population structuring of bacterial symbionts of invasive insects make them attractive for predicting host migration routes and identifying source populations, and the utility of these characteristics has been assessed multiple times using mostly parasites of these invasive species (Criscione et al., 2006; Nieberding and Olivieri, 2007). Other vertically inherited symbionts have shown co-divergence with their hosts (Degnan et al., 2004; Hosokawa et al., 2012), and sometimes similar degrees of resolution at the intra-species level, as is the case for aphid symbionts (Funk et al., 2000). We report that P. carbekii exhibits elevated mutation rates, yet its genes provide no greater resolution for population structuring than those from the BMSB mitochondria.
Other symbionts have also shown lower genetic structure than their host. In the case studied by Anderson et al. (2004), hemipteran symbionts associated with a semi-carnivorous plant host displayed lower genetic diversity than the host. This was attributed to the breeding systems and dispersal capabilities of these organisms. Since H. halys relies on the consumption of maternal secretions after birth to acquire P. carbekii, and given the gregariousness of this insect, it is possible that one female’s depositions could infect a different female’s offspring. This scenario has been used to explain observed incomplete cophylogeny between other stinkbugs and their respective symbionts (Kikuchi et al., 2009). While the probability of this happening has not been directly assessed, small rates of migration of the symbiont between host individuals could help explain reduced genetic structuring. The lack of new mutations in the pseudogenes of P. carbekii shows that, even with the fast mutation rate of this symbiont and in regions where selection is relaxed, no new mutations have reached a sufficient proportion in the population to be detected. This can be due to the time scale evaluated, with the invasion of America estimated to have begun about 20 years ago. Additionally, the difference between symbiont haplotypes and mitochondrial haplotypes may be due to the fact that while they are both maternally inherited, there is a higher probability for the symbiont to undergo horizontal gene transfer.

5. Conclusion

The results shown here indicate that the use of P. carbekii loci is not as effective as the use of host mitochondrial loci to infer host population history, source populations, and spread, in spite of the symbiont containing the hallmarks of a potential proxy: high mutation rates, obligate mutualism, and vertical inheritance. Whether this is specific to P. carbekii or extends to other extracellular, externally transmitted symbionts, such as other stinkbug symbionts, remains to be determined.

Acknowledgements: We would like to thank Tim Haye for the donation of specimens from Europe and China.

Funding: This work was supported by the Ohio Agricultural Research Development Center SEEDS Research Enhancement-Interdisciplinary Grant, The Ohio State University, and the Colombian Administrative Department of Science, Technology and Innovation COLCIENCIAS.

Figures

Table 1. Haplotype diversity and nucleotide diversity across seven gene regions in P. carbekii's invaded range. Hd: Haplotype diversity; π: nucleotide diversity; n: number of samples; H: number of haplotypes detected.

Table 2. Combination of host haplotypes into final haplotypes.

Table 3. Haplotype diversity and nucleotide diversity for H. halys and P. carbekii. Hd: Haplotype diversity; π: nucleotide diversity; n: number of samples; H: number of haplotypes detected.

Figure 1. Sampling and haplotype distribution for H. halys and P. carbekii. (a-c) Map for P. carbekii haplotypes in (a) China, (b) Switzerland, and (c) the United States using the ∆ybgF gene. Each slice represents an individual and different colors represent different haplotypes. Correlation of host-symbiont haplotypes in (d) China, (e) Switzerland, and (f) the United States per individual, with each slice representing an individual, inner slices representing the symbiont haplotype and outer slices representing the host haplotype.

Figure 2. Haplotype networks for host and symbiont. Minimum spanning haplotype network of the ∆ybgF gene (right) and the COII + 12SCR regions (left). Size of the circles represents the number of individuals and tick marks represent substitutions between sequences. Colors represent locations sampled: CH = China; OH = Ohio, USA; ZU = Canton Zurich, Switzerland; TI = Canton Ticino, Switzerland; BA = Canton Basel. Boxes indicate haplotypes that coexist within individuals.

Anderson, B., Olivieri, I., Lourmas, M., Stewart, B.A., 2004. Comparative population genetic structure and local adaptation of two mutualists. Evolution (N. Y). 58, 1730– 1747.

Bansal, R., Michel, A., Sabree, Z., 2014. The Crypt-Dwelling Primary Bacterial Symbiont of the Polyphagous Pentatomid Pest Halyomorpha halys (Hemiptera: Pentatomidae). Environ. Entomol. 617–625.

Bistolas, K.S.I., Sakamoto, R.I., Fernandes, J.A.M., Goffredi, S.K., 2014. Symbiont polyphyly, co-evolution, and necessity in pentatomid stinkbugs from Costa Rica. Front. Microbiol. 5, 1–15. doi:10.3389/fmicb.2014.00349

Cesari, M., Maistrello, L., Ganzerli, F., Dioli, P., Rebecchi, L., Guidetti, R., 2015. A pest alien invasion in progress: potential pathways of origin of the brown marmorated stink bug Halyomorpha halys populations in Italy. J. Pest Sci. (2004). 88, 1–7. doi:10.1007/s10340-014-0634-y

Criscione, C.D., Cooper, B., Blouin, M.S., 2006. Parasite genotypes identify source populations of migratory fish more accurately than fish genotypes. Ecology 87, 823–828. doi:[823:PGISPO]2.0.CO;2

Degnan, P., Lazarus, A., Brock, C., Wernegreen, J., 2004. Host-Symbiont Stability and Fast Evolutionary Rates in an Ant-Bacterium Association: Cospeciation of Camponotus Species and Their Endosymbionts, Candidatus Blochmannia. Syst. Biol. 53, 95–110. doi:10.1080/10635150490264842

Duron, O., Noël, V., 2016. A wide diversity of Pantoea lineages are engaged in mutualistic symbiosis and cospeciation processes with stinkbugs. Environ. Microbiol. Rep. doi:10.1111/1758-2229.12432

Espíndola, A., Carstens, B.C., Alvarez, N., 2014. Comparative phylogeography of mutualists and the effect of the host on the genetic structure of its partners. Biol. J. Linn. Soc. 113, 1021–1035. doi:10.1111/bij.12393

Funk, D.J., Helbling, L., Wernegreen, J.J., Moran, N.A., 2000. Intraspecific phylogenetic congruence among multiple symbiont genomes. Proc. Biol. Sci. 267, 2517–21. doi:10.1098/rspb.2000.1314

Gariepy, T.D., Bruin, A., Haye, T., Milonas, P., Vétek, G., 2015. Occurrence and genetic diversity of new populations of Halyomorpha halys in Europe. J. Pest Sci. (2004). 88, 451–460. doi:10.1007/s10340-015-0672-0

Gariepy, T.D., Haye, T., Fraser, H., Zhang, J., 2013. Occurrence, genetic diversity, and potential pathways of entry of Halyomorpha halys in newly invaded areas of Canada and Switzerland. J. Pest Sci. (2004). 87, 17–28. doi:10.1007/s10340-013-0529-3

Hafner, M., Sudman, P., Villablanca, F., Spradling, T., Demastes, J., Nadler, S., 1994. Disparate rates of molecular evolution in cospeciating hosts and parasites. Science (80-. ). 265, 1087–1090. doi:10.1126/science.8066445

Hoebeke, E.R., Carter, M.E., 2003. Halyomorpha halys (Stål) (Heteroptera: Pentatomidae): a polyphagous plant pest from Asia newly detected in North America. Proc. Entomol. Soc. Washingt. 105, 225–237.

Hosokawa, T., Hironaka, M., Inadomi, K., Mukai, H., Nikoh, N., Fukatsu, T., 2013. Diverse Strategies for Vertical Symbiont Transmission among Subsocial Stinkbugs. PLoS One 8, 4–11. doi:10.1371/journal.pone.0065081

Hosokawa, T., Kikuchi, Y., Nikoh, N., Shimada, M., Fukatsu, T., 2006. Strict host-symbiont cospeciation and reductive genome evolution in insect gut bacteria. PLoS Biol. 4, 1841–1851. doi:10.1371/journal.pbio.0040337

Hosokawa, T., Matsuura, Y., Kikuchi, Y., Fukatsu, T., 2016. Recurrent evolution of gut symbiotic bacteria in pentatomid stinkbugs. Zool. Lett. 2, 24. doi:10.1186/s40851-016-0061-4

Hosokawa, T., Nikoh, N., Koga, R., Satô, M., Tanahashi, M., Meng, X.-Y., Fukatsu, T., 2012. Reductive genome evolution, host–symbiont co-speciation and uterine transmission of endosymbiotic bacteria in bat flies. ISME J. 6, 577–587. doi:10.1038/ismej.2011.125

Kenyon, L.J., Meulia, T., Sabree, Z.L., 2015. Habitat Visualization and Genomic Analysis of “Candidatus Pantoea carbekii,” the Primary Symbiont of the Brown Marmorated Stink Bug. Genome Biol. Evol. 7, 620–635. doi:10.1093/gbe/evv006

Kikuchi, Y., Hosokawa, T., Nikoh, N., Meng, X.-Y., Kamagata, Y., Fukatsu, T., 2009. Host-symbiont co-speciation and reductive genome evolution in gut symbiotic bacteria of acanthosomatid stinkbugs. BMC Biol. 7, 2. doi:10.1186/1741-7007-7-2

Lee, D.-H., Short, B.D., Joseph, S.V., Bergh, J.C., Leskey, T.C., 2013. Review of the biology, ecology, and management of Halyomorpha halys (Hemiptera: Pentatomidae) in China, Japan, and the Republic of Korea. Environ. Entomol. 42, 627–641.

Leigh, J.W., Bryant, D., 2015. popart: full-feature software for haplotype network construction. Methods Ecol. Evol. 6, 1110–1116. doi:10.1111/2041-210X.12410

Leskey, T.C., Hamilton, G.C., Nielsen, A.L., Polk, D.F., Rodriguez-Saona, C., Christopher Bergh, J., Ames Herbert, D., Kuhar, T.P., Pfeiffer, D., Dively, G.P., Hooks, C.R.R., Raupp, M.J., Shrewsbury, P.M., Krawczyk, G., Shearer, P.W., Whalen, J., Koplinka-Loehr, C., Myers, E., Inkley, D., Hoelmer, K.A., Lee, D.H., Wright, S.E., 2012. Pest status of the brown marmorated stink bug, Halyomorpha halys in the USA. Outlooks Pest Manag. 23, 218–226. doi:10.1564/23oct07

Librado, P., Rozas, J., 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25, 1451–2. doi:10.1093/bioinformatics/btp187

Nieberding, C.M., Olivieri, I., 2007. Parasites: proxies for host genealogy and ecology? Trends Ecol. Evol. 22, 156–165. doi:10.1016/j.tree.2006.11.012

Nikoh, N., Hosokawa, T., Oshima, K., Hattori, M., Fukatsu, T., 2011. Reductive evolution of bacterial genome in insect gut environment. Genome Biol. Evol. 3, 702–714. doi:10.1093/gbe/evr064

Otero-Bravo, A., Sabree, Z.L., 2015. Inside or out? Possible genomic consequences of extracellular transmission of crypt-dwelling stinkbug mutualists. Front. Ecol. Evol. 3, 1–7. doi:10.3389/fevo.2015.00064

Quast, C., Pruesse, E., Yilmaz, P., Gerken, J., Schweer, T., Yarza, P., Peplies, J., Glockner, F.O., 2013. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41, D590–D596. doi:10.1093/nar/gks1219

Tajima, F., 1993. Simple methods for testing the molecular evolutionary clock hypothesis. Genetics 135, 599–607.

Taylor, C.M., Coffey, P.L., DeLay, B.D., Dively, G.P., 2014. The importance of gut symbionts in the development of the brown marmorated stink bug, Halyomorpha halys (Stål). PLoS One 9. doi:10.1371/journal.pone.0090312

Wernegreen, J.J., 2015. Endosymbiont evolution: predictions from theory and surprises from genomes. Ann. N. Y. Acad. Sci. 1360, 16–35. doi:10.1111/nyas.12740

Xu, J., Fonseca, D.M., Hamilton, G.C., Hoelmer, K.A., Nielsen, A.L., 2014. Tracing the origin of US brown marmorated stink bugs, Halyomorpha halys. Biol. Invasions 16, 153–166. doi:10.1007/s10530-013-0510-3

Zhu, G.-P., Ye, Z., Du, J., Zhang, D.-L., Zhen, Y.-H., Zheng, C.-G., Zhao, L., Li, M., Bu, W.-J., 2016. Range wide molecular data and niche modeling revealed the Pleistocene history of a global invader (Halyomorpha halys). Sci. Rep. 6, 23192. doi:10.1038/srep23192

Web References

CABI, 2016. Invasive species compendium. Wallingford, UK: CAB International. www.cabi.org/isc

StopBMSB, 2016. Stop BMSB: Biology, ecology, and management of brown marmorated stink bug in specialty crops. www.stopbmsb.org

Figure 1

Figure 2

Location    Stat  ∆ybgF    ∆ftsN  ∆speA  ∆transglycosylase C  ∆yigL  16S
California  Hd    0.533    0      0      0                    0      0
            π     0.00896  0      0      0                    0      0
            n     10       10     10     10                   10     10
            H     2        1      1      1                    1      1
Ohio        Hd    0        0      0      0                    0      0
            π     0        0      0      0                    0      0
            n     10       10     10     10                   10     10
            H     1        1      1      1                    1      1
Michigan    Hd    0        0      0      0                    0      0
            π     0        0      0      0                    0      0
            n     10       10     10     10                   10     10
            H     1        1      1      1                    1      1

Table 1. Haplotype diversity and nucleotide diversity across seven gene regions in P. carbekii's invaded range. Hd: Haplotype diversity; π: nucleotide diversity; n: samples; H: number of haplotypes detected.
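As a rough illustration of how the Hd values in Table 1 relate to haplotype counts, the sketch below computes haplotype diversity with the standard unbiased estimator, Hd = n/(n-1) × (1 - Σp²), where p is the frequency of each haplotype. Using this estimator is an assumption here, since the table does not state the formula. The 6/4 split of the ten California ∆ybgF samples is also an assumption, not reported data: under this estimator it is the only two-haplotype split of ten samples that gives 0.533. The function name and haplotype labels are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq_freq)


# Assumed split (not reported in the table): 6 samples of one haplotype, 4 of another.
california_ybgF = ["hapA"] * 6 + ["hapB"] * 4
# A single haplotype in all ten samples, as in the Ohio and Michigan rows (H = 1).
ohio_ybgF = ["hapA"] * 10

hd_ca = haplotype_diversity(california_ybgF)
hd_oh = haplotype_diversity(ohio_ybgF)

print(round(hd_ca, 3))  # 0.533
print(round(hd_oh, 3))  # 0.0
```

Any population with a single haplotype (H = 1) gives Hd = 0 under this estimator, which is consistent with the zero entries that dominate the table.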

CO2  12S  Total
A    A    Hh1
A    B    Hh2
B    A    Hh3
B    B    Hh4
C    C    Hh5
D    B    Hh6

Identical previously identified haplotypes   Location of previously identified haplotypes
H2, H23                                      China, United States
H1, H25, H26                                 China, United States
H7, H14, H21, H22                            China
H3, H4, H5, H12, H16, H18, H19               China
H35, H40                                     Japan

Table 2. Combination of host haplotypes into final haplotypes.

Halyomorpha halys

                              12S                       CO2                       Concatenated
Range: region, location       Hd     π        n   H     Hd     π        n   H     Hd     π        n   H
Total                         0.522  0.0016   32  3     0.558  0.00197  32  4     0.783  0.00178  31  6
Native: China, Beijing        0.582  0.00182  18  3     0.386  0.00148  18  3     0.68   0.00155  18  5
Introduced: Europe, Zurich    0      0        6   1     0.333  0.00104  6   2     0.333  0.0005   6   2
Introduced: Europe, Ticino    0      0        4   1     0      0        4   1     0      0        4   1
Introduced: Europe, Basel     -      -        -   -     -      -        -   -     -      -        -   -
Introduced: America, OH       0      0        4   1     0      0        4   1     0      0        4   1
Introduced: America, MI       -      -        -   -     -      -        -   -     -      -        -   -
Introduced: America, CA       -      -        -   -     -      -        -   -     -      -        -   -

Pantoea carbekii

                              ybgF                      primo                     odhA                      Concatenated
Range: region, location       Hd     π        n   H     Hd     π        n   H     Hd     π        n   H     Hd     π        n   H
Total                         0.469  0.00317  69  2     0.315  0.0014   17  2     0.431  0.00211  24  2     0.325  0.00161  16  2
Native: China, Beijing        0.308  0.00185  17  2     0.363  0.00156  14  2     0.309  0.00151  17  2     0.385  0.00191  13  2
Introduced: Europe, Zurich    0      0        10  1     -      -        -   -     -      -        -   -     -      -        -   -
Introduced: Europe, Ticino    0      0        4   1     -      -        -   -     0      0        4   1     -      -        -   -
Introduced: Europe, Basel     0      0        4   1     -      -        -   -     -      -        -   -     -      -        -   -
Introduced: America, OH       0      0        14  1     0      0        3   1     0      0        3   1     0      0        3   1
Introduced: America, MI       0      0        10  1     -      -        -   -     -      -        -   -     -      -        -   -
Introduced: America, CA       0.533  0.00896  10  2     -      -        -   -     -      -        -   -     -      -        -   -

Table 3. Haplotype diversity and nucleotide diversity for host and symbiont markers. Hd: Haplotype diversity; π: nucleotide diversity; n: samples; H: Haplotypes detected.
