MinION as part of a biomedical rapidly deployable laboratory


Journal of Biotechnology 250 (2017) 16–22

Contents lists available at ScienceDirect

Journal of Biotechnology journal homepage: www.elsevier.com/locate/jbiotec

Mathias C. Walter a, Katrin Zwirglmaier a, Philipp Vette a, Scott A. Holowachuk b, Kilian Stoecker a, Gelimer H. Genzel a, Markus H. Antwerpen a,∗

a Bundeswehr Institute of Microbiology, Neuherbergstr. 11, 80937 Munich, Germany
b Defense Research and Development Canada, Suffield Research Centre, PO Box 4000 Stn Main, Medicine Hat, Alberta, T1A 8K6, Canada

Article info

Article history: Received 30 September 2016; received in revised form 1 December 2016; accepted 5 December 2016; available online 6 December 2016.

Keywords: MinION; Nanopore sequencing; Outbreak investigation

Abstract

Fast turnaround times are of utmost importance for biomedical reconnaissance, particularly regarding dangerous pathogens. Recent advances in sequencing technology and its devices allow sequencing within a short time frame outside stationary laboratories, close to the epicenter of an outbreak. In our study, we evaluated the portable sequencing device MinION as part of a rapidly deployable laboratory specialized in the identification of highly pathogenic agents. We tested the device in the course of a NATO live agent exercise in a deployable field laboratory in hot climate conditions. The samples were obtained from bioterrorism scenarios that formed part of the exercise and contained unknown bacterial agents. To simulate the conditions of a resource-limited remote deployment site, we operated the sequencer without internet access. Using a metagenomic approach, we were able to identify the causative agent in the analyzed samples. Furthermore, depending on the obtained data, we were able to perform molecular typing down to strain level. In our study we challenged the device and discuss advances as well as remaining limitations for sequencing biological samples outside of stationary laboratories. Nevertheless, massively parallel sequencing as a non-selective methodology yields important information and is able to support outbreak investigation, even in the field.

Crown Copyright © 2016 Published by Elsevier B.V. All rights reserved.

1. Introduction

The key to successful reconnaissance of unusual outbreaks is rapid identification of the causative agent. Current sequencing technologies in combination with bioinformatics allow DNA sequence determination of organisms that are unknown a priori. While benchtop-sized devices are becoming more and more popular and affordable for medical and microbiological institutions, sequencing outside of a laboratory became practical only recently with the development of the MinION™ device by Oxford Nanopore Technologies (ONT) (Mikheyev and Tin, 2014). Connected to a laptop via a USB port, this sequencer can be used site-independently, even in remote regions (or close to them), to obtain genomic sequences, thus providing essential information for tracing back the organisms and supporting identification and contact tracing of patients. In the latter case, short turnaround times and on-site results are of utmost importance for diagnosing bacterial or viral infections, especially with dangerous pathogens.

∗ Corresponding author: M.H. Antwerpen. http://dx.doi.org/10.1016/j.jbiotec.2016.12.006 0168-1656/Crown Copyright © 2016 Published by Elsevier B.V. All rights reserved.

In previous years, outbreak investigations have often been supported by whole genome sequencing (Antwerpen et al., 2015; Gieraltowski et al., 2016; Keim et al., 2015; Mellmann et al., 2011; Quick et al., 2016, 2015), as they provide useful information for development and research purposes (assays) but also combined with bioinformatics as a helpful tool to address epidemiological questions such as contact tracing. The MinION device has previously been tested in combination with benchtop devices like Illumina MiSeq (Gieraltowski et al., 2016; Quick et al., 2016) or IonTorrent PGM (Gieraltowski et al., 2016), showing the feasibility of the devices in general. Limitations or challenges were mostly accuracy and the volume of data output (Oikonomopoulos et al., 2016). Much effort has been put into increasing both, but for sequencing in the field with only limited access to infrastructure, this device has never been challenged. There are several differences between rapidly deployable and conventional stationary laboratories, which might have effects on the outcome of diagnostic results, particularly for highly sensitive molecular biology methods like sequencing. On the one hand, sample quality and quantity often differs extremely and limited access to equipment and consumables impedes evaluation processes and the use of the optimal nucleic acid extraction procedures. On

M.C. Walter et al. / Journal of Biotechnology 250 (2017) 16–22

the other hand, unusual laboratory infrastructure, such as lack of air conditioning and internet access, poses a challenge during the reconnaissance mission. In modern diagnostics, internet access is crucial for many applications, including sequencing using the MinION. While most steps from DNA extraction to library preparation and sequencing using MinION can be done offline, basecalling of the obtained raw data requires a stable broadband internet connection, as it is currently processed as a cloud-based application. Some progress has been made to solve this issue using open source algorithms (Boža et al., 2016; David et al., 2016). Once this technical bottleneck has been passed, several pipelines (Judge et al., 2016; Cao et al., 2016; Moore et al., 2014) for data analysis are available, which allow a broad range of different applications. For outbreak investigations, amplicon sequencing and metagenomic approaches are more important than closing genomes by hybrid assemblies, and both have already been successfully carried out in stationary laboratories. Target organisms sequenced and identified range from bacteria (Karlsson et al., 2015; Moore et al., 2014) to DNA viruses (Kilianski et al., 2015) as well as RNA viruses after a preceding reverse transcription into cDNA (Wallerman, 2015; Graf et al., 2016; Kilianski et al., 2016; Wang et al., 2015). If wet-lab and e-lab are suitably harmonized, they can support outbreak reconnaissance using this device. One of the first MinION-supported outbreak investigations was the fight against Ebola in West Africa (Quick et al., 2016). Currently, the Zika virus epidemic is also being investigated using this field-deployable device (“Zibra project”, 2016).

In this study we tested the MinION system as part of a military rapidly deployable laboratory specialized in diagnostics of highly pathogenic bacteria during a multinational NATO exercise in July 2016 at a training ground at the Counter Terrorism and Technology Centre (CTTC) situated at Suffield, Alberta, Canada. The continental climate with extreme temperatures, both seasonal and diurnal, is characteristic of this area and not comparable with the stable, climate-controlled conditions of a stationary laboratory with internet access. While investigating unknown samples obtained from a range of bioterrorism scenarios, we focused on three different sequencing approaches: i) confirmation of identification based on other molecular tests, ii) identification of an unknown biological agent where all other molecular tests had failed, iii) identification and subsequent molecular typing of a known bacterial sample in order to provide additional bioforensic information to the chain of command.


2. Material and methods

All laboratory work was performed in an inflatable tent, which is part of the rapidly deployable biomedical reconnaissance laboratory of the Bundeswehr Institute of Microbiology (Fig. 1; Wölfel et al., 2015).

2.1. Samples

The following samples were processed for the MinION runs:

1st run: Sample 685 was a bacterial lawn on an agar plate with an unknown bacterial agent. Ca. one third of the lawn was scraped off using a pipette tip and the cells were resuspended in 1.5 ml PBS.

2nd run: Sample 693 was a turbid liquid containing an unknown bacterial agent. Three ml of the liquid were spun down in 1.5 ml Eppendorf tubes for 3 min at 2000 x g using a microfuge (VWR, Germany). The cell pellet was resuspended in 200 µl PBS.

3rd run: The sample was a turbid liquid in a tube labelled “Francisella tularensis LVS”. Two ml of the liquid were spun down in 1.5 ml Eppendorf tubes for 3 min at 2000 x g using a microfuge (VWR, Germany). The cell pellet was resuspended in 200 µl PBS.

From each sample 200 µl were used for subsequent DNA isolation using the DNeasy Blood and Tissue kit (QIAGEN, Hilden, Germany).

2.2. DNA isolation and library preparation

According to the safety regulations, biological agents in the samples were inactivated inside a glovebox using Qiagen’s AL buffer, a heating step to 55 °C and 96% ethanol, and subsequently transferred to the molecular workbench for further processing. For DNA extraction, the DNeasy Tissue Kit (Qiagen, Hilden, Germany) was used according to the manufacturer’s protocol and the DNA was eluted in 100 µl Elution Buffer/ddH2O. Quantification was done using the fluorometric system Qubit 2.0 (Life Technologies, Darmstadt, Germany). Genomic DNA was sheared using Covaris g-TUBEs (Covaris, Brighton, UK) following the manufacturer’s recommendations in order to obtain fragments of an average length of 8 kb. Possible DNA damage was repaired using NEBNext FFPE RepairMix (New England Biolabs, Frankfurt a.M., Germany). After purification using AMPure XP beads (Beckman Coulter, Brea, California, USA), a dA-tailing/end-repair was performed (NEBNext Ultra II End-Repair/dA-tailing Module, New England Biolabs) and the DNA was once again purified using AMPure XP beads. Tether adapters were attached to the DNA double strands using NEB Blunt/TA Ligase Master Mix in combination with the Nanopore sequencing kit (SQK-NSK007) with designated R9 chemistry (July 2016). Tethered DNA was purified and concentrated using MyOne C1 streptavidin beads (Thermofisher, Waltham, Massachusetts, USA) and eluted in 25 µl ELB (elution buffer, ONT). DNA concentration was measured using Qubit 2.0. Total time requirements from sample reception to the finished library were ca. 60 min for sample processing within the glove box, 15 min for DNA extraction and 150 min for library preparation.

2.3. Sequencing

For sequencing, the MinION device MK1b (ONT, Oxford, UK) was used in combination with the workflow “NC 48Hr Sequencing Run FLO MIN104.py” within the software MinKNOW v51.3.55 (ONT, offline version, which does not require an internet connection during the initializing phase) running overnight on a laptop (16 GB RAM, 250 GB SSD, dual-core Intel i7-4600U with Hyper-Threading, Microsoft Windows 7, MinGW (“MinGW | Minimalist GNU for Windows,” 2016)).

2.4. Offline basecalling

To retrieve the nucleotide sequences from the raw signal data, usually Metrichor is used to process the fast5 files within the ONT cloud. As it requires an internet connection and produces a transfer volume with an upload to download ratio of about 1:10, it is not applicable in the field. Fortunately, ONT provides an offline basecaller, “nanonet”, based on recurrent neural networks, to participants of the Oxford Developers program. During our exercise, we used this nanonet basecaller to get 2D, template and complement reads in realtime during sequencing (nanonet2d --watch 300 --fastq). Due to its long processing time (to keep enough resources for MinKNOW and downstream processing, only two parallel threads were used) we kept it running overnight. On the following day we stopped nanonet and the subsequent piped downstream analyses and uploaded (via the hotel’s internet connection) a compressed subset of 1000 randomly selected 2D raw reads together with all basecalled reads (approximately 25 Mb per run) to our reach-back laboratory for further investigations.


Fig. 1. Setup of the laboratory tent (left) and MinION workbench (right) at the Suffield training site.

2.5. Realtime analysis, taxonomic assignment and visualization

2.7. Reach-back analyses

On the laptop controlling the connected MinION, basecalled 2D reads were continuously aligned against the locally available NCBI Representative Genomes database (May 2016, containing 235 archaeal and 4836 bacterial species including their taxonomic information; Tatusova et al., 2014) as well as against the control DNA contained in the ONT sequencing kit using blastn (“BLAST: Basic Local Alignment Search Tool,” 2016) with an E-value cutoff of 1e-50. The blast results in tabular output format were piped directly to a modified script of the software package Krona (Ondov et al., 2011), which creates a hierarchical and zoomable pie chart of the sequenced reads and updates it every 15 s. At the same time, the blast results were aggregated based on taxonomic information and printed to the console to show the overall analysis status.
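The aggregation step described above can be illustrated with a minimal Python sketch. It is not the modified Krona script used in the study. It assumes the standard 12-column blastn tabular output, and the function name `aggregate_hits`, the accession-to-species table `taxon_of` and all read and reference identifiers are hypothetical.

```python
from collections import Counter


def aggregate_hits(blast_lines, taxon_of, max_evalue=1e-50):
    """Count one hit per read and taxon from 12-column BLAST tabular lines.

    Assumed column order: qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    `taxon_of` is a hypothetical lookup from reference ID to species name.
    """
    counts = Counter()
    assigned_reads = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        read_id, ref_id, evalue = fields[0], fields[1], float(fields[10])
        if read_id in assigned_reads or evalue > max_evalue:
            continue  # keep only the first hit per read that passes the cutoff
        assigned_reads.add(read_id)
        counts[taxon_of.get(ref_id, "unassigned")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical example data; in a live setting the lines would be read
    # from the blastn output stream instead.
    taxon_of = {"ref_A": "Escherichia coli", "ref_B": "Lactococcus lactis"}
    lines = [
        "read1\tref_A\t90.0\t5000\t400\t100\t1\t5000\t100\t5100\t0.0\t8000",
        "read2\tref_B\t88.0\t2000\t200\t40\t1\t2000\t10\t2010\t1e-80\t3000",
    ]
    for taxon, n in aggregate_hits(lines, taxon_of).most_common():
        print(f"{n}\t{taxon}")
```

Keeping only the first passing hit per read mirrors the idea of reporting a single best assignment per read; a live pipeline would call such a function repeatedly as new blast output arrives.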

In the reach-back laboratory we used Metrichor v2.4.17 to upload and basecall all three raw data read sets. This was done after the end of the exercise. Then we used the most recent version of poretools from GitHub (as of Jun 21, 2016) to extract the 2D high quality and best (2D and 1D if no 2D was available) reads of the basecalled passed and failed reads. Afterwards, we evaluated the basecalling performance of the reads basecalled by both nanonet and Metrichor to ensure that offline basecalling provides a similar quality. Furthermore, we repeated the realtime analysis with the full set of best reads to identify all species present in the sample. We also repeated the taxonomic assignment at the strain level for all identified species, whether or not they are classified as pathogenic biological agents, and especially where no genomic data had been available on the laptop. For a higher sensitivity and shorter runtime we used GraphMap (Sović et al., 2016) instead of blastn to map the best reads to the genomes. Afterwards we used genomeCoverageBed from bedtools (Quinlan and Hall, 2010) to compute the overall genome coverage. Furthermore, the reads of the most abundant species per sequence run were extracted and assembled using canu (Phillippy et al., 2016). The assembled contigs were then circularized and blasted against the NCBI non-redundant nucleotide (nr) database (Tatusova et al., 2014) to verify the assigned organism name.
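The notion of overall genome coverage used for strain assignment can be sketched in a few lines of Python. This is an illustration only, not a reimplementation of genomeCoverageBed: it assumes alignments are given as 1-based inclusive (start, end) pairs on a single reference, and the names `merge_intervals`, `coverage_stats` and `best_strain` are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    merged = []
    # Minus-strand hits are reported with start > end, so normalise first.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def coverage_stats(intervals, genome_length):
    """Return (breadth, mean_depth) for alignments on one reference genome.

    breadth    = fraction of positions covered by at least one alignment
    mean_depth = total aligned bases divided by the genome length
    """
    aligned_bases = sum(abs(e - s) + 1 for s, e in intervals)
    covered_bases = sum(e - s + 1 for s, e in merge_intervals(intervals))
    return covered_bases / genome_length, aligned_bases / genome_length


def best_strain(hits_by_strain, genome_lengths):
    """Pick the strain whose genome has the greatest breadth of coverage."""
    return max(
        hits_by_strain,
        key=lambda name: coverage_stats(hits_by_strain[name], genome_lengths[name])[0],
    )


if __name__ == "__main__":
    # Hypothetical alignment coordinates on a 400 bp toy reference.
    hits = [(1, 100), (51, 150), (300, 201)]
    breadth, depth = coverage_stats(hits, 400)
    print(f"breadth={breadth:.3f} mean_depth={depth:.2f}")
```

Breadth and mean depth answer different questions: a genome can reach a respectable mean depth while still containing uncovered stretches, which is why both values are worth reporting.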

2.6. Analysis of genome sequences, typing and visualization

The NCBI Representative Genomes database usually contains either one representative genome or sometimes several type strains of a species (e.g. B. anthracis). For performance reasons, we used the blastn parameter -max_target_seqs 1, which reports only the first organism of many highly similar hits. However, due to the relatively high error rate of the reads (up to 10% incorrect nucleotides or INDELs), different strains of the same species could have been reported. Hence, at this stage it is only possible to reliably assign the organism’s name at the species level. Under the assumption that only one strain of a species is present in the sample, we aligned the reads of the most abundant species (if a biological agent) against each locally available complete genome of that species using blastn. We then used the blast hit coordinates to compute the overall genome coverage and obtained the strain name from the genome with the greatest coverage. For further molecular typing of some high risk agents, Python scripts were available for performing in silico Multiple-Locus Variable-number of tandem repeat Analysis (MLVA) as well as determination of canonical Single Nucleotide Polymorphisms (canSNPs) for subsequent comparison. For Francisella tularensis we had available MLVA 12 reference data based on 11 markers (Vogler et al., 2009) with one additional marker (Svensson et al., 2009) and a representative dataset of in silico determined Francisella genomes. For determination of the phylogenetic clade, SNP positions from the literature, comprising 102 SNPs, were checked and compared (Birdsell et al., 2014; Gyuranecz et al., 2012; Karlsson et al., 2013; Kuroda et al., 2012; Svensson et al., 2009).
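The principle behind in silico MLVA can be sketched as follows. This is a simplified illustration under stated assumptions, not the scripts used in the study: primers must match the consensus exactly, the flank length and repeat unit length per marker are taken as known, and all sequences, lengths and function names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_product_length(consensus, fwd_primer, rev_primer):
    """Length of the hypothetical PCR product, or None if a primer cannot bind.

    Exact matching is assumed, so any ambiguous base (e.g. N) inside a
    primer binding site makes the marker unevaluable.
    """
    start = consensus.find(fwd_primer)
    if start < 0:
        return None
    rev_site = revcomp(rev_primer)
    end = consensus.find(rev_site, start + len(fwd_primer))
    if end < 0:
        return None
    return end + len(rev_site) - start


def repeat_count(product_length, flank_length, unit_length):
    """Estimate the number of tandem repeats from the product length.

    Rounding to the nearest whole repeat absorbs small length errors as long
    as they stay below half a repeat unit.
    """
    return round((product_length - flank_length) / unit_length)


if __name__ == "__main__":
    # Hypothetical marker: 28 nt of flanking sequence, 5 repeats of a 6 nt unit.
    fwd, rev = "GATTACAGATTACA", "CCGGTTAACCGGTA"
    consensus = "TT" + fwd + "ACGTAC" * 5 + revcomp(rev) + "GG"
    length = in_silico_product_length(consensus, fwd, rev)
    print(length, repeat_count(length, 28, 6))
    # A 2 nt deletion still rounds to 5 repeats for a 6 nt unit ...
    print(repeat_count(length - 2, 28, 6))
    # ... but would shift the allele by one for a 2 nt repeat unit.
    print(repeat_count(length - 2, 28, 2), repeat_count(length, 28, 2))
```

The rounding step is the reason why length errors of a few nucleotides are tolerable for markers with repeat units of 6 bases or more, whereas the same error changes the called allele for very short repeat units.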

3. Results

3.1. Description of samples and scenarios

1st run: Sample 685 was a red agar plate, possibly blood agar, with a bacterial lawn with several antibiotic discs on it. Further utensils in the improvised lab as well as literature found at the site indicated that this sample was reflecting attempts to introduce plasmids with various antibiotic resistance genes into B. anthracis. We carried out a number of tests to corroborate this suspicion. Hand-held rapid tests and qPCR for B. anthracis were negative. Gram staining showed Gram-negative rods with few interspersed Gram-positive rods (contaminants). Based on this, our assumption was that B. anthracis had been replaced with a non-pathogenic surrogate strain for training purposes and the aim was to use MinION whole genome sequencing to confirm this, identify and genotype the strain. These results were confirmed in the debriefing.

2nd run: Sample 693 was a turbid liquid from a scenario that suggested an attempt to isolate Francisella tularensis from sheep feces. However, hand-held rapid tests and qPCR carried out in our lab for F. tularensis were negative, as were qPCR tests for Burkholderia mallei, Brucella abortus, E. coli EHEC STX1 and STX2, Yersinia pestis


and Coxiella burnetii. Gram staining showed a mix of Gram-negative rods and few Gram-positive cells. With no clear idea of what the sample contained, the aim of MinION sequencing in this case was to identify the unknown biological agent and genotype it. During the debriefing of the scenario the sample was described as diluted culture isolated from guinea pig feces and supposedly contaminated with Francisella tularensis.

3rd run: The sample for this run was a diluted, gamma-irradiated culture of F. tularensis that was given to us by CTTC staff, as previous attempts to isolate sufficient high quality DNA from other scenario samples that had been spiked with F. tularensis had failed. The aim of this MinION run was to explore the possibilities of genotyping known samples under field conditions.

3.2. Successful implementation of MinION in the field

During this exercise we were able to sequence successfully in our tent-based laboratory. Outdoor temperatures from 27 °C to 42 °C made it difficult for the device to hold its target temperature of 37 °C, as no active cooling is available. Downstream analysis did not show any significant differences between basecalling in the cloud using Metrichor and offline basecalling using nanonet on a local laptop. Basecalling as well as taxonomic assignments to species level therefore showed the same results. Due to the piped downstream analysis combined with visualization, results could be released overnight at the latest, whereas first results were already visible approximately 10 min after starting the sequencing process. We were able to perform 3 different runs differing in the amount of data obtained (5k–96k reads, Table 1). Based on the raw reads, basecalling and subsequent data evaluation were performed. Using our evaluation pipeline, the unknown species could be determined correctly for all runs (Fig. 2).

3.3. Typing and whole genome analysis

The genome assembly of the reads in the reach-back laboratory was successful only for the 1st run. It produced two circular contigs covering the chromosome and a plasmid of E. coli with a mean coverage of 54X. The blast against the NCBI nr database revealed 99% identity to the E. coli K-12 strain. A simple blast search, as already performed in the field laboratory, revealed no Shiga toxin-encoding gene. For the 2nd run no contigs could be assembled because there were too few 2D reads available. The re-identification of the species revealed L. lactis subsp. lactis KF147 as the most similar strain, in contrast to the strain predicted in the field, L. lactis subsp. lactis Il1403. The presence of Lactococcus is not unexpected, as it is a common bacterium in the intestine, and therefore in a sample that is washed from fecal matter. As Lactococcus is a fast-growing bacterium in contrast to Francisella, Lactococcus was much overrepresented and the identification of Francisella failed completely. In the 3rd run, only 19,437 2D reads could be basecalled. This amount was not sufficient to assemble a contiguous chromosome, but yielded 303 single contigs. After identification of the agent as F. tularensis, we performed molecular typing in order to determine its subspecies and phylogenetic assignment. F. tularensis is a Gram-negative rod and the causative agent of the zoonosis tularemia (Vogler et al., 2009). As it is considered a category A select agent with potential to be misused in bioterrorism (“Federal Select Agent Program − Select Agents and Toxins List,” 2016), its correct identification is essential. F. tularensis is a genetically monomorphic pathogen, whereas its subspecies F. tularensis holarctica, F. tularensis tularensis and F. tularensis novicida differ in their virulence for humans (Svensson et al., 2005; Vogler et al., 2009). Therefore, a


reliable assignment of an identified F. tularensis isolate to a phylogenetic branch is also of medical importance. First, we performed in silico canSNP typing and were able to identify all Francisella-specific canSNPs and assign the strain to the branch whose most prominent strain is F. tularensis holarctica LVS, a live vaccine strain. All determined positions were covered by at least one MinION read. A mapping of the reads to the F. tularensis subsp. holarctica LVS reference genome showed a mean coverage of 7X, though about 10,000 bases were not covered at all. Using in silico MLVA, only 9 of 12 markers could be evaluated. For markers M2, M22 and M26 no hypothetical PCR product could be estimated, as the PCR primers could not bind in silico because of too many ambiguous bases in the generated consensus sequence. Interestingly, the calculated lengths of all markers were identical or showed a difference of at most 2 nt (5 markers) compared to F. tularensis subsp. holarctica LVS. This difference was due to INDELs in the obtained MinION reads. As none of the Francisella MLVA markers has a repeat unit of less than 6 bases, comparison and assignment of the phylogenetic branch was possible.

4. Discussion

In the case of an unusual outbreak, rapid and reliable identification of the causative agent is needed to provide recommendations for therapy and prophylaxis or restriction of movements to the responsible authorities. Rapid sequence analysis can provide valuable support in these situations. Molecular analysis is the first method of choice for rapidly deployable laboratories, as cultivation of the selected agents cannot be performed and only limited resources are available.
A portable sequencing device could support molecular diagnostics in the field as a non-targeted method of identification, which is especially important if other molecular tests have failed because PCR primer binding sites are mutated or the causative agent cannot be identified by other means. With the MinION device, a sequencer has for the first time been developed that is feasible as a tool for reconnaissance missions abroad. 16S profiling (Kilianski et al., 2015) as well as whole genome sequencing with subsequent genotyping or metagenomics are now applications for identification or attribution, which might help in unusual events. We successfully identified a previously unknown bacterial agent by whole genome sequencing in a tent-based laboratory. Depending on the data output actually obtained, different levels of information could be reached: from simple identification in the case of a low-output sequencing run up to a level at which molecular typing of the strains based on canonical SNPs could be performed. In our study we could even perform in silico MLVA for F. tularensis, regardless of nucleotide differences and INDELs in the MinION reads. In this special case, these differences do not hinder a correct allele determination. In general, however, evaluation of in silico MLVA based on short VNTRs with repeat units of only 2 or 3 nucleotides would lead to misdetermination. A higher data output or a better read accuracy with respect to INDELs would enhance the sequence evaluation. The field-based setting provided a number of challenges that need to be taken into consideration for future studies.

4.1. Quality and amount of DNA and type of sample material

The greatest challenge in this setting was getting enough high quality DNA (1–1.5 µg) for library preparation. Due to the nature of the samples, i.e. potentially highly pathogenic organisms up to BSL3, all samples had to undergo initial processing and inactivation in a glove box, which precluded processing of large amounts


Table 1 Overview of obtained reads from the three different sequencing runs.

| | 1st run | 2nd run | 3rd run |
|---|---|---|---|
| Sequencing runtime | 16 h | 18 h | 31 h |
| Reads offline basecalled | 4381 | 1082 | 13,475 |
| Total reads | 96,800 | 5728 | 35,182 |
| Average read length | 5207 nt | 1970 nt | 594 nt |
| 2D reads | 52,441 | 1416 | 19,437 |
| Artificial reads | 394 (2%) | 819 (58%) | 17,144 (49%) |
| Control DNA reads | 60 (0.3%) | 114 (8%) | 199 (0.6%) |
| Reads of other detected species | | 19 (1%) | 372 (1%) |
| Most abundant species | Escherichia coli (97%) | Lactococcus lactis (33%) | Francisella tularensis (50%) |

of sample material. Furthermore, the samples we obtained from the scenarios were often rather dilute or impure (e.g. liquids). As this was only an exercise, no actual live BSL3 organisms were used in the scenarios. Instead, all material used contained either nonpathogenic surrogate strains (1st run, E. coli) or inactivated bacteria. Inactivation was based either on gamma irradiation or formalin fixation. In addition, some samples contained lyophilized organisms. While DNA extraction from gamma-irradiated material worked well (3rd run, Francisella), formalin-fixed lyophilized cells proved to be extremely problematic (several failed attempts, data not shown). The difficulties of DNA extraction from fixed bacterial cells have been described before (Hykin et al., 2015; Judge et al., 2016). In a real setting, the agents will not be fixed, so DNA extraction is likely to work better, but in contrast to procedures available in a stationary laboratory, limited yield is still a challenge for the library preparation. This could be addressed with the recently released ONT Rapid Sequencing Kit I, which requires only 200 ng of input DNA, simplifies the protocol to two steps and takes only 10 min, but produces 1D reads by sequencing only one strand of the double-stranded fragment. New devices in the pipeline of e.g. ONT, such as VolTRAX™, claim to simplify and shorten library preparation and ultimately sample extraction. Whether they perform better, particularly in reducing the loss of DNA during library preparation, remains to be tested. New upgrades of the procedure, such as the new low-input DNA kit or the new SpotON flowcells, address these problems further. The use of enrichment or depletion kits might enhance the output as well.

4.2. High temperatures

Temperatures in the lab tent frequently reached up to 35 °C. The MinION device has a heating unit to provide the required working temperature of 34–36 °C, but for cooling it is equipped only with a fan. This led to frequent overheating and necessitated the use of improvised cool packs. Conversely, at the other end of the temperature scale, low temperatures of 10–15 °C encountered at an earlier field trial with MinION (data not shown) presented no problem.

4.3. MinION on the road

On several occasions, we had to evacuate the tent and leave the training site due to weather warnings while a sequencing run was in progress. In these cases, the active MinION, attached to the laptop with MinKNOW running, was transported in a car over bumpy track roads back to the hotel to complete the run. Impressively, apart from some minor spillage of the sequencing fluid, the shaking caused no problems and there was no obvious difference in the amount and quality of data generated during the car ride.

4.4. Further downstream analyses

Further downstream analyses of interest depend mainly on data quality and data output. Molecular typing down to strain level, as performed initially for the 3rd run with F. tularensis, also depends strongly on the reconnaissance team having prepared scripts and tables at hand. In a broad-range scenario, it is also conceivable to send a molecular epidemiologist with bioinformatics skills on the mission as an additional team member. In practice, this would rarely be the case. Queries against simple genetic markers like possible acquired antibiotic resistance genes or virulence factors, on the other hand, can be performed easily with a simple blast request and the results are meaningful independent of the identified species. Calculated

Fig. 2. Krona pie charts showing the taxonomic distribution of the sequenced 2D reads from run 2 (left) and run 3 (right).


medication, even if the phenotype of the causative agent has not been determined, can be offered to the responsible physicians.

4.5. Software and computational resources

The bottleneck is still the basecalling of the acquired signals. While software such as nanonet, used here, works reliably even on local laptops, this process is resource-intensive and therefore normally performed in large cloud environments using several CPU cores in parallel. For rapidly deployable laboratories, scaling up the number of CPUs, e.g. by using low-power portable computers like Raspberry Pi or Odroid, would be one possibility. A different approach would be the implementation of an adapted software library for calculation of these algorithms on GPUs or FPGAs. In our opinion, this could massively reduce the computing time. Of course, a combined approach would provide the best performance. When basecalling has finished and the nucleic acid sequences have been determined successfully, subsequent analysis can easily be performed site-independently without the need for internet access or a direct connection to a stationary reach-back laboratory. Databases for 16S sequencing comprise only 2 GB, and databases of bacterial reference sequences can also be transported into the outbreak region, as storage space is no longer a limiting factor, but for fast sequence comparison in real time a solid state disk (SSD) is required. Combined with a small set of smart scripts, simple examination of the sequence and identification processes can be started and performed by persons without any knowledge of bioinformatics. A further improvement for independent sequencing and realtime analysis in the field would be database matching using faster algorithms like MinHash (Ondov et al., 2016) or GraphMap (Sović et al., 2016). Unfortunately, no Microsoft Windows binary is currently available for either tool, and they are not designed to accept data streams. Also, GraphMap requires a large amount of memory (RAM).
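To illustrate why MinHash-style matching is fast, the following Python sketch estimates the similarity of two sequences from small k-mer sketches instead of full alignments. It is a toy version written for this purpose, not the implementation of Ondov et al. (2016): the k-mer size, sketch size, hash function and function names are assumptions, and reverse-complement (canonical) k-mers are ignored for brevity.

```python
import hashlib


def kmer_hashes(seq, k):
    """Hash every k-mer of the sequence to a 64-bit integer."""
    return {
        int.from_bytes(
            hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest(), "big"
        )
        for i in range(len(seq) - k + 1)
    }


def sketch(seq, k=8, size=64):
    """Bottom sketch: the `size` smallest k-mer hashes of the sequence."""
    return set(sorted(kmer_hashes(seq, k))[:size])


def jaccard_estimate(sketch_a, sketch_b, size=64):
    """Estimate the Jaccard index of two k-mer sets from their sketches.

    Only the `size` smallest hashes of the merged sketches are inspected;
    the fraction found in both sketches approximates the true Jaccard index.
    """
    merged = sorted(sketch_a | sketch_b)[:size]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in sketch_a and h in sketch_b)
    return shared / len(merged)


if __name__ == "__main__":
    # Hypothetical toy sequences; real use would sketch reads and genomes.
    read = "ACGTTGCAAGGCTTAACCGGATCGATTGTAGCTAGCAAGGATCCA"
    variant = read[:20] + "G" + read[21:]
    print(jaccard_estimate(sketch(read), sketch(read)))
    print(jaccard_estimate(sketch(read), sketch(variant)))
```

Because reference genomes can be sketched once in advance, comparing a read against a whole database reduces to comparing short lists of integers, which suits a laptop with limited CPU and memory.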
Since the MinKNOW software is currently only available for Mac OS X and Windows, using a MacBook in the field would be a better option, also because the GPU-optimized version of the nanonet basecaller is currently not stable on Microsoft Windows. Meanwhile, this basecaller was released as open source on 27th September 2016 and is freely available (https://github.com/nanoporetech/nanonet). As an additional feature, the MinKNOW software (from version 1.02 onwards) supports local basecalling, but currently only for 1D reads. In cases where 1D reads are sufficient, this is definitely an improvement and simplifies the realtime analysis pipeline.

5. Conclusions

This was, to our knowledge, the first time MinION sequencing has been employed in a tent-based laboratory using a variety of different sample matrices and unknown biological agents. Investigations in such a scenario are often supported by rapidly deployable outbreak investigation teams of governmental or supra-national institutions. However, the apparent ease of use of the system only applies if people trained and skilled in molecular diagnostics and sequencing are part of the team. Therefore, for first responders or broad-range CBRNE teams this device is not a useful detection tool, as normally these people only have limited access to research facilities specialized in molecular diagnostics. Advancement of this technology should be continued to further refine its application in specialized outbreak/theatre of operation settings and to enhance public health/military responses as appropriate. In the hands of skilled people, this device is able to directly support outbreak investigations and to greatly shorten the turnaround time from sample to sequence. This offers the


possibility of early identification of antibiotic resistance and thus supports precise treatment of patients. As the MinION device is portable and robust, it also has the potential to advance the field of environmental microbiology. In the not too distant future we might see MinION sequencing in a range of remote regions, from jungles and hot springs to glaciers and ships, as it has already entered space (Burton et al., 2016).

Competing interests

M.H.A. is a member of the MinION Access Programme (MAP) and has received an additional sequencing device for this study.

Acknowledgements

We thank Stephan Motzkus for excellent technical assistance. In addition, we are grateful to the members of the CTTC biological training team for providing additional sample material and to ONT for admission to the MinION Access Programme as well as for instrument and software technical support. Finally, we would like to acknowledge the NATO Precise Response 2016 exercise, which enabled and supported the execution of this study and its findings. Last but not least, the authors thank the two anonymous reviewers for their insightful and helpful suggestions. This work was supported by the Medical Service of the German Ministry of Defense.

References

Antwerpen, M.H., Prior, K., Mellmann, A., Höppner, S., Splettstoesser, W.D., Harmsen, D., 2015. Rapid high resolution genotyping of Francisella tularensis by whole genome sequence comparison of annotated genes (MLST+). PLoS One 10, http://dx.doi.org/10.1371/journal.pone.0123298.
BLAST: Basic Local Alignment Search Tool [WWW Document], 2016. URL https://blast.ncbi.nlm.nih.gov/Blast.cgi (Accessed 9.28.2016).
Birdsell, D.N., Johansson, A., Öhrman, C., Kaufman, E., Molins, C., Pearson, T., Gyuranecz, M., Naumann, A., Vogler, A.J., Myrtennäs, K., Larsson, P., Forsman, M., Sjödin, A., Gillece, J.D., Schupp, J., Petersen, J.M., Keim, P., Wagner, D.M., 2014. Francisella tularensis subsp. tularensis group A.I, United States. Emerg. Infect. Dis. 20, 861–865, http://dx.doi.org/10.3201/eid2005.131559.
Boža, V., Brejová, B., Vinař, T., 2016. DeepNano: Deep Recurrent Neural Networks for Base Calling in MinION Nanopore Reads.
Burton, A.S., Federman, S., Izquierdo, F., Turner, D.J., Yu, G., Juul, S., Alexander, N., Stephenson, T.A., Mason, C., Somasekar, S., Botkin, D.J., Stryke, D., Castro-Wallace, S.L., Smith, D.J., Chiu, C.Y., Lupisella, M.L., Dworkin, J.P., John, K.K., McIntyre, A.B.R., Stahl, S.E., Rubins, K.H., 2016. Nanopore DNA sequencing and genome assembly on the international space station. bioRxiv.
Cao, M.D., Ganesamoorthy, D., Elliott, A., Zhang, H., Cooper, M., Coin, L., 2016. Streaming algorithms for identification of pathogens and antibiotic resistance potential from real-time MinION sequencing. bioRxiv, 019356–019356.
David, M., Dursi, L., Yao, D., Boutros, P.C., Simpson, J.T., 2016. Nanocall: an open source basecaller for oxford nanopore sequencing data. Bioinformatics, http://dx.doi.org/10.1093/bioinformatics/btw569, btw569–btw569.
Federal Select Agent Program - Select Agents and Toxins List [WWW Document], 2016. URL http://www.selectagents.gov/SelectAgentsandToxinsList.html (Accessed 9.28.2016).
Gieraltowski, L., Higa, J., Peralta, V., Green, A., Schwensohn, C., Rosen, H., Libby, T., Kissler, B., Marsden-Haug, N., Booth, H., Kimura, A., Grass, J., Bicknese, A., Tolar, B., Defibaugh-Chávez, S., Williams, I., Wise, M., Salmonella Heidelberg Investigation Team, 2016. National outbreak of multidrug resistant Salmonella Heidelberg infections linked to a single poultry company. PLoS One 11, http://dx.doi.org/10.1371/journal.pone.0162369.
Graf, E.H., Simmon, K.E., Tardif, K.D., Hymas, W., Flygare, S., Eilbeck, K., Yandell, M., Schlaberg, R., 2016. Unbiased detection of respiratory viruses by use of RNA sequencing-based metagenomics: a systematic comparison to a commercial PCR panel. J. Clin. Microbiol. 54, 1000–1007, http://dx.doi.org/10.1128/JCM.03060-15.
Gyuranecz, M., Birdsell, D.N., Splettstoesser, W., Seibold, E., Beckstrom-Sternberg, S.M., Makrai, L., Fodor, L., Fabbi, M., Vicari, N., Johansson, A., Busch, J.D., Vogler, A.J., Keim, P., Wagner, D.M., 2012. Phylogeography of Francisella tularensis subsp. holarctica, Europe. Emerg. Infect. Dis. 18, 290–293, http://dx.doi.org/10.3201/eid1802.111305.
Hykin, S.M., Bi, K., McGuire, J.A., 2015. Fixing formalin: a method to recover genomic-scale DNA sequence data from formalin-fixed museum specimens using high-throughput sequencing. PLoS One 10, http://dx.doi.org/10.1371/journal.pone.0141579.


Judge, K., Hunt, M., Reuter, S., Tracey, A., Quail, M.A., Parkhill, J., Peacock, S.J., 2016. Comparison of bacterial genome assembly software for MinION data. bioRxiv, 049213–049213.
Karlsson, E., Svensson, K., Lindgren, P., Byström, M., Sjödin, A., Forsman, M., Johansson, A., 2013. The phylogeographic pattern of Francisella tularensis in Sweden indicates a Scandinavian origin of Eurosiberian tularaemia. Environ. Microbiol. 15, 634–645, http://dx.doi.org/10.1111/1462-2920.12052.
Karlsson, E., Lärkeryd, A., Sjödin, A., Forsman, M., Stenberg, P., 2015. Scaffolding of a bacterial genome using MinION nanopore sequencing. Sci. Rep. 5, http://dx.doi.org/10.1038/srep11996.
Keim, P., Grunow, R., Vipond, R., Grass, G., Hoffmaster, A., Birdsell, D.N., Klee, S.R., Pullan, S., Antwerpen, M., Bayer, B.N., Latham, J., Wiggins, K., Hepp, C., Pearson, T., Brooks, T., Sahl, J., Wagner, D.M., 2015. Whole genome analysis of injectional anthrax identifies two disease clusters spanning more than 13 years. EBioMedicine 2, 1613–1618, http://dx.doi.org/10.1016/j.ebiom.2015.10.004.
Kilianski, A., Haas, J.L., Corriveau, E.J., Liem, A.T., Willis, K.L., Kadavy, D.R., Rosenzweig, C.N., Minot, S.S., 2015. Bacterial and viral identification and differentiation by amplicon sequencing on the MinION nanopore sequencer. GigaScience 4, http://dx.doi.org/10.1186/s13742-015-0051-z.
Kilianski, A., Roth, P.A., Liem, A.T., Hill, J.M., Willis, K.L., Rossmaier, R.D., Marinich, A.V., Maughan, M.N., Karavis, M.A., Kuhn, J.H., Honko, A.N., Rosenzweig, C.N., 2016. Use of unamplified RNA/cDNA-Hybrid nanopore sequencing for rapid detection and characterization of RNA viruses. Emerg. Infect. Dis. 22, 1448–1451, http://dx.doi.org/10.3201/eid2208.160270.
Kuroda, M., Sekizuka, T., Shinya, F., Takeuchi, F., Kanno, T., Sata, T., Asano, S., 2012. Detection of a possible bioterrorism agent, Francisella sp., in a clinical specimen by use of next-generation direct DNA sequencing. J. Clin. Microbiol. 50, 1810–1812, http://dx.doi.org/10.1128/JCM.06715-11.
Mellmann, A., Harmsen, D., Cummings, C.A., Zentz, E.B., Leopold, S.R., Rico, A., Prior, K., Szczepanowski, R., Ji, Y., Zhang, W., McLaughlin, S.F., Henkhaus, J.K., Leopold, B., Bielaszewska, M., Prager, R., Brzoska, P.M., Moore, R.L., Guenther, S., Rothberg, J.M., Karch, H., 2011. Prospective genomic characterization of the German enterohemorrhagic Escherichia coli O104:H4 outbreak by rapid next generation sequencing technology. PLoS One 6, http://dx.doi.org/10.1371/journal.pone.0022751.
Mikheyev, A.S., Tin, M.M.Y., 2014. A first look at the Oxford Nanopore MinION sequencer. Mol. Ecol. Resources 14, 1097–1102, http://dx.doi.org/10.1111/1755-0998.12324.
MinGW | Minimalist GNU for Windows [WWW Document], 2016. URL http://www.mingw.org (Accessed 9.28.2016).
Moore, J.E., Huang, J., Yu, P., Ma, C., Moore, P.J., Millar, B.C., Goldsmith, C.E., Xu, J., 2014. High diversity of bacterial pathogens and antibiotic resistance in salmonid fish farm pond water as determined by molecular identification employing 16S rDNA PCR, gene sequencing and total antibiotic susceptibility techniques. Ecotoxicol. Environ. Saf. 108, 281–286, http://dx.doi.org/10.1016/j.ecoenv.2014.05.022.
Oikonomopoulos, S., Wang, Y.C., Djambazian, H., Badescu, D., Ragoussis, J., 2016. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations. Sci. Rep. 6, http://dx.doi.org/10.1038/srep31602.
Ondov, B.D., Bergman, N.H., Phillippy, A.M., 2011. Interactive metagenomic visualization in a Web browser. BMC Bioinf. 12, 385–385.
Ondov, B.D., Treangen, T.J., Melsted, P., Mallonee, A.B., Bergman, N.H., Koren, S., Phillippy, A.M., 2016. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 17, 132–132.
Phillippy, A.M., Berlin, K., Walenz, B.P., Koren, S., Miller, J.R., 2016. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. bioRxiv.
Quick, J., Ashton, P., Calus, S., Chatt, C., Gossain, S., Hawker, J., Nair, S., Neal, K., Nye, K., Peters, T., De Pinna, E., Robinson, E., Struthers, K., Webber, M., Catto, A., Dallman, T.J., Hawkey, P., Loman, N.J., 2015. Rapid draft sequencing and real-time nanopore sequencing in a hospital outbreak of Salmonella. Genome Biol. 16, http://dx.doi.org/10.1186/s13059-015-0677-2.
Quick, J., Loman, N.J., Duraffour, S., Simpson, J.T., Severi, E., Cowley, L., Bore, J.A., Koundouno, R., Dudas, G., Mikhail, A., Ouédraogo, N., Afrough, B., Bah, A., Baum, J.H.J., Becker-Ziaja, B., Boettcher, J.P., Cabeza-Cabrerizo, M., Camino-Sánchez, Á., Carter, L.L., Doerrbecker, J., Enkirch, T., García-Dorival, I., Hetzelt, N., Hinzmann, J., Holm, T., Kafetzopoulou, L.E., Koropogui, M., Kosgey, A., Kuisma, E., Logue, C.H., Mazzarelli, A., Meisel, S., Mertens, M., Michel, J., Ngabo, D., Nitzsche, K., Pallasch, E., Patrono, L.V., Portmann, J., Repits, J.G., Rickett, N.Y., Sachse, A., Singethan, K., Vitoriano, I., Yemanaberhan, R.L., Zekeng, E.G., Racine, T., Bello, A., Sall, A.A., Faye, O., Faye, O., Magassouba, N., Williams, C.V., Amburgey, V., Winona, L., Davis, E., Gerlach, J., Washington, F., Monteil, V., Jourdain, M., Bererd, M., Camara, A., Somlare, H., Camara, A., Gerard, M., Bado, G., Baillet, B., Delaune, D., Nebie, K.Y., Diarra, A., Savane, Y., Pallawo, R.B., Gutierrez, G.J., Milhano, N., Roger, I., Williams, C.J., Yattara, F., Lewandowski, K., Taylor, J., Rachwal, P., Turner, D.J., Pollakis, G., Hiscox, J.A., Matthews, D.A., O’Shea, M.K., Johnston, A.M., Wilson, D., Hutley, E., Smit, E., Di Caro, A., Wölfel, R., Stoecker, K., Fleischmann, E., Gabriel, M., Weller, S.A., Koivogui, L., Diallo, B., Keïta, S., Rambaut, A., Formenty, P., et al., 2016. Real-time, portable genome sequencing for Ebola surveillance. Nature 530, 228–232, http://dx.doi.org/10.1038/nature16996.
Quinlan, A.R., Hall, I.M., 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842.
Sović, I., Šikić, M., Wilm, A., Fenlon, S.N., Chen, S., Nagarajan, N., 2016. Fast and sensitive mapping of nanopore sequencing reads with GraphMap. Nat. Commun. 7, 11307.
Svensson, K., Larsson, P., Johansson, D., Byström, M., Forsman, M., Johansson, A., 2005. Evolution of subspecies of Francisella tularensis. J. Bacteriol. 187, 3903–3908, http://dx.doi.org/10.1128/JB.187.11.3903-3908.2005.
Svensson, K., Granberg, M., Karlsson, L., Neubauerova, V., Forsman, M., Johansson, A., 2009. A real-time PCR array for hierarchical identification of Francisella isolates. PLoS One 4, http://dx.doi.org/10.1371/journal.pone.0008360.
Tatusova, T., Ciufo, S., Fedorov, B., O’Neill, K., Tolstoy, I., 2014. RefSeq microbial genomes database: new representation and annotation strategy. Nucleic Acids Res. 42, D553–D559, http://dx.doi.org/10.1093/nar/gkt1274.
Vogler, A.J., Birdsell, D., Price, L.B., Bowers, J.R., Beckstrom-Sternberg, S.M., Auerbach, R.K., Beckstrom-Sternberg, J.S., Johansson, A., Clare, A., Buchhagen, J.L., Petersen, J.M., Pearson, T., Vaissaire, J., Dempsey, M.P., Foxall, P., Engelthaler, D.M., Wagner, D.M., Keim, P., 2009. Phylogeography of Francisella tularensis: global expansion of a highly fit clone. J. Bacteriol. 191, 2474–2484, http://dx.doi.org/10.1128/JB.01786-08.
Wölfel, R., Stoecker, K., Fleischmann, E., Gramsamer, B., Wagner, M., Molkenthin, P., Di Caro, A., Günther, S., Ibrahim, S., Genzel, G.H., Ozin-Hofsäss, A.J., Formenty, P., Zöller, L., 2015. Mobile diagnostics in outbreak response, not only for Ebola: a blueprint for a modular and robust field laboratory. Euro surveillance: bulletin Européen sur les maladies transmissibles = Eur. Commun. Dis. Bull. 20, http://dx.doi.org/10.2807/1560-7917.ES.2015.20.44.30055.
Wallerman, O., 2015. Current status of nanopore sequencing using the MinION device - from full length cDNA sequencing to genome assembly improvements. EMBnet.journal 21, http://dx.doi.org/10.14806/ej.21.A.819.
Wang, J., Moore, N.E., Deng, Y.-M., Eccles, D.A., Hall, R.J., 2015. MinION nanopore sequencing of an influenza genome. Front. Microbiol. 6, http://dx.doi.org/10.3389/fmicb.2015.00766.
Zibra project [WWW Document], 2016. URL http://www.zibraproject.org (Accessed 9.28.2016).