Open science – combining open data and open source software: Medical image analysis with the Insight Toolkit


Medical Image Analysis 9 (2005) 503–506 www.elsevier.com/locate/media

Editorial


In 2003, the US National Institutes of Health, Biomedical Engineering Consortium focused their annual symposium on “Catalyzing Team Science.” The 2004 symposium concentrated on “Biomedical Informatics for Clinical Decision Support.” A trend in the health sciences is emerging that supports collaborative research and development with a strong focus on clinical practice, early diagnosis, and the discovery of new treatments. Economic forces and the realities of workflow management create natural barriers that impede collaboration among bench scientists who deal with cell cultures and pharmaceutical design. Even with great care, establishing and maintaining control over distributed clinical trials is a difficult task consuming significant human and financial resources. In many ways, despite its advantages, collaborative scientific research is often beyond the capacity of independent groups to establish and maintain.

Some notable exceptions that contradict this observation appear in those fields that naturally integrate with digital technologies. Genomics, medical informatics, and medical image analysis are areas where collaborative work has not only appeared, but has prospered. Because the research tools and the innate characteristics of the information lend themselves to digital storage and retrieval, these fields are empowered by their infrastructure to gravitate and coalesce into natural, often geographically distributed teams. Facilities such as e-mail, internet services, and shared data and software enable rapid communication and dissemination of important results among members of these research communities.

One such field is the community of software developers that investigates medical image analysis. This research area is underdeveloped, rich with unexplored and important questions, and approachable using digital technologies and applications of existing mathematical, statistical, and engineering techniques.
The community that surrounds this area is dominated by computer programmers and software developers who intrinsically communicate with a polyglot of data, algorithms, mathematics, a predominance of programming code, and only a smattering of English (or other spoken language). It is often necessary to examine the implementation of a computer program in order to fully understand a research result in this field. This fact has caused some to remark that “programmers speak to each other using 10% language, and 90% in code.”

1361-8415/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.media.2005.04.008

1. Open source for medical image analysis

This particular combination of challenging, important research questions approachable through digital analysis and a deployed scientific community already interconnected with computer communication networks is predisposed to adopt the principles and practices of open source software development. Open source communities are an example of a concept of scientific rendezvous, essentially a shared vision that promotes dialog throughout a field. It is most likely to be an open and welcoming group, admitting free dialog across many subjects, but since it primarily advances a small emerging but significant field, the topics will be narrowed and focused almost by definition. A scientific rendezvous is not likely to be closed or proprietary, nor is it likely to be a standard.

Scientific rendezvous sometimes emerge from funded research projects. Perhaps the most successful sponsored project that led to the formation of a community dialog and later to an entire industry was the development of HTML, paving the way for derivative products such as Netscape Communicator and Internet Explorer. The US National Library of Medicine has participated in multiple efforts of this nature, including GenBank, the public gene sequence repository. Also, the Visible Human Project™ has produced a pair of studies of human anatomy that have provided common ground from which to extend multidisciplinary research in the fields of anatomy, medical imaging, visualization, and other areas of computer science (Ackerman, 1998).

The pressures that encourage medical image analysis practitioners toward shared software are not sufficient to assure the establishment of successful open-source initiatives. Open source ideas encourage but do not guarantee collaboration and dissemination of ideas and information among software developers and algorithm designers. If software is developed with too narrow a focus, it may prove unsuitable for a wider audience. If programming tools are not robust and are prone to failure, they will be abandoned even by their own developers. The encouragement of a scientific rendezvous in medical image analysis requires the careful design and construction of usable, stable code to provide a sufficiently strong foundation for collaboration, interaction, and community development.

2. The need for public image analysis tools

There are pressing needs for automated techniques in image analysis. Medical processes such as early cancer detection, monitoring the progress of medical treatment, response to cardiovascular disease, and analysis of neurological disorders such as stroke all benefit from advanced imaging. Public exchange of software tools accelerates the development and improvement of medical care through imaging. This need creates a focus for the community and a basis for a scientific rendezvous.

After holding a joint NIH workshop (Ackerman, 2000), the National Library of Medicine (NLM), in partnership with the National Institute for Dental and Craniofacial Research, the National Eye Institute, the National Institute of Mental Health, the National Science Foundation, the National Institute for Neurological Disorders and Stroke, the National Institute on Deafness and other Communication Disorders, the DoD Telemedicine and Advanced Technology Research Center, and the National Cancer Institute, has supported the creation of the Insight Toolkit (ITK) (Yoo and Ackerman, 2005), an application programming interface (API) for high-dimensional image processing (see Fig. 2). This work has been the focus around which both research and government institutions have come together.

The initial emphasis of this effort is to provide public software tools for 3D segmentation and for deformable and rigid registration, capable of analyzing the head-and-neck anatomy of the Visible Human Project™ data. The eventual goal is to provide the cornerstone of a self-sustaining software community in 3D, 4D, and higher dimensional data analysis. Ultimately, we hope that this will be a public software resource that will serve as a foundation for future medical image understanding research.

3. Building a team

The idea of constructing libraries of image processing algorithms is not new. Every research team in this field has attempted it at one time or another, but only with marginal success at distributing their software among their colleagues. Even public sponsorship of imaging software tools has been tried repeatedly. Examples of previous medical image processing efforts include: NIH Image, MedX (Sensor Systems), Analyze (Mayo Clinic), 3DViewnix (UPenn), and others. Also, publicly available methods are often developed using Matlab and Mathematica, but these require a significant investment in a proprietary software product and are only layered on top of that software framework.

The Insight project is somewhat unique, or at least the first of its kind among NIH software development awards. It is distinguished by having included a significant component to foster good open source software development practices, source-forge systems, cross-compiling software systems, and elements for software comparison, testing, validation, distribution, and publishing (Schroeder et al., 2004). We sought a consortium of software developers, rather than a single award, in the hope that the natural process of building consensus would generate a broad, versatile software base for a wider audience. Since “design by committee” leads to disaster, we empowered a core group of software architects and systems integrators to make primary design choices.

The Insight Software Consortium (ISC) is working to deliver a software toolkit to improve and enable research in volume imaging for all areas of health care. The successful collaboration among such disparate groups is a measure of the success of this effort.
A partial list of the principal participants in the group includes: Bill Lorensen (GE Global Research), Will Schroeder (Kitware), Lydia Ng (Insightful), Stephen Aylward (UNC-CH), Jim Gee, Jay Udupa (UPenn), George Stetten (UPitt, and Carnegie Mellon), Paul Yushkevich (Cognitica), Celina Imielinska (Columbia), Kevin Cleary (Georgetown), C.F. Westin (Harvard), Daniel Rueckert (Imperial College London), Richard Robb (Mayo Clinic), Dimitris Metaxas (Rutgers), Yarden Livnat (SCI Institute), Vincent Magnotta (UIowa), Ross Whitaker (U. Utah) (see Fig. 1). This group continues to grow internationally, demonstrating increasing momentum for this concept and some proof of its value.

4. Open data + open source = open science

A non-profit association, the ISC was incorporated in early 2004. The formation of this group is an indication of the future of this effort. In time, the ISC may become the trustees of the open source software and serve as a focal point for the community.

One of the founding tenets of the ISC is the notion of reproducible scientific results. A significant strength of an open repository of software tools is that if it is matched with a comparable digital repository of experimental data, results can be reported not only through the archival literature, but also with experimental findings that can be repeated by users and peers. The concept of disseminating working implementations should help to accelerate algorithm development, amplify the expertise of the limited software resources of small laboratories, and enforce scientific integrity among active researchers.

The ISC and the NLM are attempting to construct open data collections to support the wider image analysis research community. These efforts are nascent, and the results of these pilot and prototype studies are beyond the scope of this discussion. However, the future releases of ITK can be empowered with a broad array of shared data to accompany the tools. When tied to shared and distributable software tools, the combination of public data and public software will open doors for new research.

Fig. 1. A growing community of ITK public software developers: NLM, Carnegie Mellon, Cognitica, Columbia, GE Global Research, Georgetown, Harvard Brigham and Women's Hospital, Imperial College London, Insightful, Kitware, Mayo Clinic, Rutgers, SCI Institute, University of Iowa, University of North Carolina, UPenn, University of Pittsburgh, and the University of Utah.

Fig. 2. A community of sponsors of the Insight Software Consortium – The National Library of Medicine, the National Institute for Dental and Craniofacial Research, the National Eye Institute, the National Institute of Mental Health, the National Science Foundation, the National Institute for Neurological Disorders and Stroke, the National Institute on Deafness and other Communication Disorders, the DoD Telemedicine Advanced Technology Research Center, and the National Cancer Institute.

5. Contents of this special issue

At the time of this publication, the project has produced its sixth release of ITK (2.0). NLM is committed to supporting this effort, and regular software releases are intended. All software is publicly available in source-code form. It builds and runs on Windows™ systems as well as on a variety of Unix™ systems including Solaris™, Linux, SGI Irix™, and MacOS-X, all with a wide variety of compilers. Interested software developers should visit the Insight home page (URL: http://www.itk.org). An open mailing list, insight-users, is currently active and open to all subscribers.

Support tools including CMake (Martin and Hoffman, 2003; Hoffman and Martin, 2003) (a cross-platform build utility), Cable (King and Schroeder, 2003; Martin et al., 2002) (a system to support multiple language bindings), and DART (the dashboard regression testing support infrastructure) are all part of the public software associated with this project and available as source code. A companion textbook (Yoo, 2004) describing the algorithmic methods and a software guide (Ibanez et al., 2003) to assist in using the implementations were produced as part of a comprehensive documentation effort, and both are currently available.

This special issue of Medical Image Analysis presents a cross-section of the work that has been enabled by this exceptional initiative. The issue covers a variety of ITK-implemented techniques for multimodal registration, segmentation, algorithm validation, software integration, and medical image analysis education. We have not attempted to describe the full breadth of ITK in this editorial, believing instead that the articles in this issue speak more eloquently than would be possible here. It has been a rare honor to participate in the Insight program and a privilege to serve as guest editors for this special issue of Medical Image Analysis.


References

Ackerman, M.J., 1998. The Visible Human Project. Proc. IEEE 86 (3), 504–511.
Ackerman, M.J., Yoo, T.S., Jenkins, D., 2000. The visible human project: from data to knowledge. In: Lemke, H.U., et al. (Eds.), Computer Assisted Radiology and Surgery (Proceedings of CARS2000). Elsevier, Amsterdam, pp. 11–16.
Hoffman, W., Martin, K., 2003. The CMake build manager. Dr. Dobb's Journal, January.
Ibanez, L., Schroeder, W., Ng, L., Cates, J., 2003. The ITK Software Guide. Kitware, Inc., ISBN 1-930934-10-6.
King, B., Schroeder, W., 2003. Automated wrapping of complex C++ code. C/C++ Users Journal, January.
Martin, K., Hoffman, B., 2003. Mastering CMake: A Cross-Platform Build System. Kitware Inc.
Martin, K., Geveci, B., Hoffman, W., 2002. Creating libraries for multiple programming languages. Dr. Dobb's Journal, February.
Schroeder, W., Ibanez, L., Martin, K., 2004. Software process: The key to developing robust, reusable and maintainable open-source software. In: Proceedings of ISBI Nano To Macro 2004 Conference.
Yoo, T.S. (Ed.), 2004. Insight into Images: Principles and Practice for Segmentation, Registration, and Image Analysis. AK Peters, Wellesley, MA.
Yoo, T.S., Ackerman, M.J., 2005. Open Source Software for Medical Image Processing and Visualization. In: Metaxas, D. (Ed.), Special Issue of Communications of the ACM on Medical Image Modeling 48 (2), 55–59.

Guest Editors
Terry S. Yoo
Dimitris N. Metaxas

Available online 19 September 2005