Gently introducing the wonderful world of R

Book Reviews

Getting Started with R: an Introduction for Biologists by Andrew P. Beckerman and Owen L. Petchey, Oxford University Press, 2012. US$37.99/£19.99 (160 pp.) ISBN 978-0-19-960162-2

Graeme D. Ruxton
School of Biological Sciences, University of St Andrews, St Andrews KY16 9TH, UK

From beginnings as recently as 1997, the statistical software package R has really conquered biology: I surveyed the most recent issues of the journals Ecology, Evolution, Ecology Letters, and Journal of Evolutionary Biology and found reference to R three times as often as the next most commonly cited software package, SAS. This conquest is not an accident: R really is wonderful. It can help you perform a vast array of statistical techniques and produce exceptionally flexible publication-quality graphics to go alongside your statistics. But what makes it particularly wonderful is its command-line driven interface, which means that at the end of your analyses you have a script (a list of the commands you used) that, as the authors of this book explain, is a ‘permanent, repeatable, annotated, shareable, cross-platform record of your analyses.’ That means that when, in six months’ time, a journal referee asks for a small modification to your analyses, this truly should be a minor undertaking: you simply load up your old script, familiarise yourself with what you did before (easy with the annotation), make the little tweak, and ask R to run the analyses again. You can do this on any computer provided you have your data-set and this script. R is free to download without restriction and works on any operating system you’ll encounter.

This command-line interface is part of what makes R wonderful, but also what can make it intimidating to get started on. Blowing away any feeling of intimidation is what this book is about. There are scientists who feel they should try R but don’t know how they can find the time to learn how to use it, and scientists who have tried but given up in frustration. This is truly the book for them. The reduction in intimidation starts from the look of the book: it is tiny, just over 100 not-remotely-densely-packed pages. You think, ‘I could read this book in an afternoon’, and you can.
More importantly, if you clear two days in your diary (and you can), then you can work through this book carefully, trying the little exercises and working in R. At the end of those two days you will not be completely clear on how to do every complex procedure in R, but you will feel that you already have a strong foundation of familiarity with R and the justified belief that your confidence and ability with it will improve quickly. The authors achieve this through a clear and approachable style. The writing is very friendly; consider, for example, this quote: ‘If you’re still stuck, email one of us. Really.’ R is so flexible that there are often many ways to achieve the same ends, but in truth most of us find a way that works for us and stick to it. This is another of the book’s strengths; it does not aim to be exhaustive, and presents one simple and effective way to do the most daunting thing of all in R: importing your data from a spreadsheet.

Corresponding author: Ruxton, G.D. ([email protected]).

A final strength of this book is that it does not try to do too much: its aim is to get you comfortable with R, not to teach you new statistical methodologies. However, in an understated way, it does have some important things to say about the value of graphical representation of data and about testing how well your data fits the assumptions of a given analysis before getting too concerned about the results of that analysis. These are issues that we all appreciate on some level, but it does no harm to have them reinforced. My future analyses will be improved through absorbing the philosophy of data exploration presented in this book.

I share the authors’ enthusiasm for the R script. One currently under-exploited opportunity this offers us is a way to greatly enrich scientific publication. Journals are already encouraging authors to lodge datasets in some freely accessible store. I see great benefits to an annotated R script containing all a paper’s statistical analyses and figures being provided as an electronic appendix to published papers. If the data is available too, then when you read a paper and are not sure exactly what analysis was done, or what the effect would be of a small alteration to the analysis, it should be the work of only a few minutes to explore this yourself. Similar benefits would accrue in the reviewing process.

As I said earlier, there is an initial challenge in using R compared to menu-driven packages. This leaves those of us looking to teach statistics to biology undergraduates in a quandary.
Some would argue that students are best served by learning the fundamentals of statistics using a menu-driven package like MINITAB, then moving on to R (and more complex analyses) when their confidence with statistical fundamentals is higher. Others would argue that, because R is so utterly superior to the menu-driven alternatives, the sooner students get to experience its joys the better it is for them, and so they should learn statistics in R from day one. The layout of this book suggests another approach to me. R lets you draw wonderful figures. My suggestion is that students should be introduced to R as a figure-drawing package initially, and once their confidence in handling it has risen they can then turn to applying that confidence to statistical analyses. My experience is that students derive a lot of satisfaction from producing colourful, aesthetically-pleasing figures, and the enormous flexibility of R gives their creative imaginations tremendous scope. I think further pedagogical benefits of this approach would be to cement the idea of looking at your data graphically before launching into statistical analyses, and to allow us to emphasise to students the fundamentals of a well-designed scientific figure. The book would make the ideal text for a short course on data management and presentation: it truly packs an amazing amount of wisdom and wit between slim covers.

0169-5347/$ (see front matter)
http://dx.doi.org/10.1016/j.tree.2012.06.007
Trends in Ecology and Evolution, October 2012, Vol. 27, No. 10

The fossil record of development

Embryos in Deep Time: The Rock Record of Biological Development by Marcelo Sánchez, University of California Press, 2012. US$39.95, hbk (265 pp.) ISBN 978-0-520-27193-7

John A. Cunningham
School of Earth Sciences, University of Bristol, Bristol, BS8 1RJ, UK

Evolutionary developmental biology, or ‘evo-devo’, is the study of how development has evolved and how modifications of development affect evolutionary change [1,2]. Although recent advances in developmental genetics have revolutionised this field, fossils also have a vital role. They provide insight into the sequence of changes involved in the assembly of new body plans or structures, and also document the historical patterns of anatomy and ontogeny that models of developmental evolution aim to explain. As almost all evolutionary history took place in species that are now extinct, evo-devo workers ignore this evidence at their peril.

In Embryos in Deep Time, Marcelo Sánchez eloquently outlines the myriad ways in which palaeontology can inform about development in deep time; he does not restrict himself to embryology, as the title of the book might suggest. One chapter describes the evidence that comes from fossilised embryonic and juvenile vertebrates, which can provide important information about ancient life-history strategies. The amount of data available is perhaps surprising: there are hundreds of scientific papers describing such fossils in the literature. Another chapter details the ontogenetic information that can be gleaned from bones and teeth. These preserve a record of the life of the animal and can be used to infer traits such as the age at sexual maturity and growth rate of extinct animals.

A fine example of what can be achieved when palaeontology is combined with developmental genetics comes from Sánchez’ own work on vertebral numbers in amniotes (tetrapod vertebrates with a terrestrially adapted egg) [3]. The number of vertebrae corresponds directly to segmentation, whereas the subdivision of the vertebral column into distinct regions correlates with Hox-gene boundaries. Surveying the vertebral number of extinct as well as extant amniotes showed that segmentation and Hox-gene reorganisation act independently, at least at higher taxonomic levels. It also revealed that the conservative pattern seen in living mammals extends right back to the early evolution of the synapsids during the Palaeozoic. By contrast, reptiles have had a much more plastic vertebral number since early in their evolution.

Corresponding author: Cunningham, J.A. ([email protected]).

The inclusion of information from stem-group taxa, as in Sánchez’ study, is an area where fossils can provide key insights. Stem-group fossils document the acquisition of the characters that are diagnostic of the living members of a particular group. In the case of mammals, the last common ancestor of the living groups evolved some 220 million years ago, yet the stem group extends back to the divergence from reptiles up to 100 million years earlier. The inclusion of these fossils means that it is possible to begin to understand ancient developmental mechanisms and transformations that occurred as the mammalian body plan was being established. Sánchez describes other examples of the importance of such ‘missing links’, including Friedman’s work on the origin of asymmetry in flatfish [4], which adorns the cover of the book and is surely set to become a classic.

Sánchez is well placed to provide such a synthesis, given his expertise in integrating fossils into the study of developmental evolution. He has written an engaging personal book that largely sticks to the vertebrate fossil record, and often to his own work or that of his close colleagues. The flip side is that there is no attempt to be comprehensive; famous examples, even among the vertebrates, are not included (work by Shubin and colleagues on the evolution of tetrapod limbs [5] springs to mind). There is a single chapter on invertebrates, although Sánchez steers clear of a discussion of the origin of the animal phyla. This is perhaps one of the prime areas where fossil data, including preserved embryos, can provide new insights into developmental evolution [6,7]. Only animals are covered, meaning that there is no discussion of plants, which represent another fertile ground for developmental palaeontology [8], not least because the entire developmental sequence can sometimes be preserved.

Ultimately, Sánchez’ focus on the vertebrates is the strength of the book as it means that the reader is entertained by his personal take on numerous examples from charismatic organisms. This makes for a captivating account of what the fossil record can say about development. The existence of other groups where the developmental evidence is just as good as in the vertebrates (and, in a few cases, possibly even better) offers up opportunities for additional books in a series on Embryos in Deep Time.

References
1 Raff, R.A. (1996) The Shape of Life: Genes, Development, and the Evolution of Animal Form, University of Chicago Press