Journal of Ethnopharmacology 139 (2012) 695–697
Commentary
Reply to the commentary: “Regression residual vs. Bayesian analysis of medicinal floras”
Marco Leonti a,∗, Stefano Cabras b, Maria Eugenia Castellanos c, Caroline S. Weckerle d
a Dipartimento Farmaco Chimico Tecnologico, Università di Cagliari, Facoltà di Farmacia, Via Ospedale 72, 09124 Cagliari, CA, Italy
b Department of Mathematics and Informatics, Università di Cagliari, Via Ospedale 72, 09124 Cagliari, CA, Italy
c Department of Statistics, O.R. Rey Juan Carlos University C/Tulipan, SN 28938 Madrid, Spain
d Institute of Systematic Botany, University of Zürich, Zollikerstrasse 107, CH-8008 Zürich, Switzerland
Article info
Article history: Received 9 September 2011; Accepted 16 September 2011; Available online 29 September 2011
∗ Corresponding author. Tel.: +39 0706758712; fax: +39 0706758553. E-mail addresses: [email protected], [email protected] (M. Leonti).
0378-8741/$ – see front matter © 2011 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.jep.2011.09.023
Since our paper introducing the Bayesian approach for the analysis of over- and underused taxonomic groups (Weckerle et al., 2011) primarily addressed the flaws in Bennett and Husby’s (2008) Binomial method, Moerman’s advocacy for regression analysis comes somewhat unexpectedly. Moerman’s critical commentary is in large part an anecdotally referenced historical account of regression analysis. However, the fact that regression analysis and the Binomial method have been used repeatedly (including by ourselves: cf. Leonti et al., 2003, 2009) does not in itself testify to their infallibility (cf. Leonti, 2011).
That the medicinal flora is not a random selection was already made clear by Dioscorides (cf. Matthioli, 1568; Berendes, 1902). Although they did not rely on the concept of higher plant taxa, classical and Renaissance medicinal plant experts and pharmacognosists were well aware that some “kind of plants” were more heavily relied upon as medicine than others (cf. Tschirch, 1910). We agree with Moerman that quantitatively comparing the medicinal flora with the overall flora and testing specific taxa for over- and underuse is a “useful and interesting way to proceed”, and we therefore addressed the questions posed by Moerman (1979) by introducing a more scientifically robust Bayesian method (cf. Weckerle et al., 2011). Here we try to address the methodological issues and disagreements raised in Moerman’s commentary.
The central point addressed by our critics concerns the statistical model used to represent the available data. In fact, Bennett and Husby (2008) used the same statistical model as we do, namely the Binomial one, but with a different approach towards estimation. The difference is that percentages play a central role in their approach, being the maximum likelihood estimators of the true proportion of uses. We avoided this approach and opted for the Bayesian method, in acknowledgement of the widely recognized questions and answers (and the rejoinder) raised in the article by Gelman (2008), which we encourage all those interested in this discussion to consult. In fact, in the rejoinder (Section 2) Gelman points out that the “challenge comes in constructing realistic models and in assessing their fit”. This is exactly our point in criticizing the regression analysis approach. References on standard Bayesian analysis can be found in the textbook by Gelman et al. (2003).
On one particular point we (Weckerle et al., 2011) and Moerman seem to agree: regression analysis privileges large families. But while Moerman argues that there is “a very good reason to privilege large families because that’s where medicinal plants are”, we consider such circular reasoning non-scientific. The question here is “where are the medicinal plants?” and not “where are the medicinal plants in the large families?” To answer the question whether a medicinal flora departs from a random selection, all families have to be considered equally, independent of their size. The Bayesian analysis neither “privileges” small plant families nor “ignores” large ones, and it is not based on percentage calculation either.
Why does the regression analysis privilege large families? “This is because the residual of a relatively small plant family (e.g. n = 10)” may hypothetically achieve a value < |10|, “while a relatively large plant family (e.g. n = 100) may get a residual” < |100|. Moerman has spotted it very well: it is not “maximally” 10 or 100 but larger than −10 and smaller than +10, or larger than −100 and smaller than +100. This does not change anything in our argumentation. The problem is that regression analysis lumps all plant families together as if small families could achieve residuals of the same order of magnitude as large families. Figs. 1 and 2 show the association between regression residuals and family size based on our dataset from Campania and highlight again that the result of regression analysis is an artifact of family size and of the regression approach.
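The size effect described above can be illustrated with a minimal sketch. The four families and their counts are invented for illustration (they are not the Campania data), and the helper names `ols_fit` and `residual_table` are hypothetical. The regression is assumed here to be an ordinary least-squares fit of the number of medicinal species on family size.

```python
# Hypothetical flora: (family, total species n, medicinal species m).
# The numbers are invented for illustration; they are not the Campania data.
families = [
    ("A", 10, 8),
    ("B", 10, 0),
    ("C", 100, 45),
    ("D", 100, 15),
]


def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_table(data):
    """Map each family to (proportion medicinal, regression residual)."""
    sizes = [n for _, n, _ in data]
    counts = [m for _, _, m in data]
    intercept, slope = ols_fit(sizes, counts)
    return {
        name: (m / n, m - (intercept + slope * n))
        for name, n, m in data
    }


table = residual_table(families)
for name, (proportion, residual) in table.items():
    print(f"{name}: proportion={proportion:.2f} residual={residual:+.1f}")

# Rank the families once by residual and once by proportion of medicinal species.
top_by_residual = max(table, key=lambda name: table[name][1])
top_by_proportion = max(table, key=lambda name: table[name][0])
print("largest residual:", top_by_residual)
print("largest proportion:", top_by_proportion)
```

With these invented numbers, family C (45% medicinal species, n = 100) obtains the largest residual (+15), whereas family A (80% medicinal species, n = 10) reaches only +4, because its residual is limited by its size.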
Fig. 1.
Furthermore, the lack of inferential data analysis renders the interpretation of regression analysis very subjective and leaves it open to arbitrary argumentation. Moerman exemplifies this himself by suggesting that anyone interested in the smaller families might eliminate the larger families in any way one prefers (calling them outliers) and repeat the regression residual analysis [sic!]. The problem is that large families or outliers cannot be mathematically defined, which inevitably leads to arbitrariness. Thus, the theory proposing a global pattern of medicinal plant knowledge, put forward by Moerman and colleagues (1999) and based on results obtained with regression analysis, needs further verification with a more robust method such as the Bayesian one. This approach more accurately represents a random mechanism for “the result of generations of human beings studying nature”, as argued by Moerman.
Why is it important to treat all families equally? Plant families, like all other supraspecific ranks in the Linnean hierarchy, are a matter of convenience and today often a matter of consensus among leading researchers in the field (e.g. history of APG I–III). Only the circumscription of clades (a collection of lineages descended from a common ancestor), as currently favored by phylogenetic systematists, has a sound scientific basis. The assignment of a certain rank to
such clades is not fixed. Thus, the size and circumscription of specific families and higher taxonomic ranks are changeable. While this is not “the result of some statistical trick”, it will nevertheless heavily influence regression analysis. For instance, in our regression analysis of the Popoluca medicinal flora the Fabaceae (s.l.) obtained the third highest residual (cf. Leonti et al., 2003). Using a different possible taxonomic classification, i.e. separating the Fabaceae into Caesalpiniaceae, Fabaceae and Mimosaceae, would have moved all three families into the midfield of the residual ranking. With the Bayesian method, however, the results would not have been affected by such a taxonomic variation because it is a more robust approach, which generates more coherent results.
A survey of medicinal plants is never a census; it is only ever a sample. During the first author’s fieldwork with the Popoluca in Mexico, he interviewed 72 specialists who named 614 medicinal species, and “new” medicinal plants were still being shown to him by the last few healers interviewed (Leonti et al., 2003). We argue that exhaustive fieldwork would lead to a medicinal flora congruent with the total flora, representing a considerable proportion of personal (individual) knowledge and beliefs and hence including data without cultural consensus. However, since no field worker interviews the whole population (in the case of the Popoluca, ca. 50,000 people), such data are always a more or less representative sample. Because of this, the analysis has to rely on statistical approaches that are based on probabilities, such as the Bayesian approach proposed in the paper under discussion.
Our approach was not limited to plant families only but also included the higher taxonomic groups: Pteridophytes, Gymnosperms, early diverging Angiosperms, Monocots, early diverging Eudicots, Rosids, and Asterids (cf. Weckerle et al., 2011).
Bayesian analysis at this taxonomic level showed that the probability of a common proportion of medicinal plants is very low and that this hypothesis is to be rejected, because two of the 95% credible intervals for the proportion of uses of taxonomic groups do not include the common proportion of medicinal species (Monocots are underused and early diverging Eudicots overused; see graphical abstract). Therefore, the hypothesis of a common proportion underlying the regression analysis does not hold. The regression analysis is based on the assumption that medicinal plants are distributed in equal proportions across taxonomic groups, with the calculated regression coefficient supposedly representing this common proportion.
Neither regression analysis nor Bayesian analysis of medicinal floras serves, by itself, as a tool for bioprospecting purposes. However, a combination of the Bayesian method with a chi-square analysis of the association between use categories and taxonomic groups may lead to interesting ethnopharmacological hypotheses (Weckerle et al., 2011), in the attempt to differentiate between medicinal plant selection driven by pharmacological properties on the one hand and cognitive features, ecological factors and cultural history on the other (cf. Leonti, 2011).
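The comparison of 95% credible intervals with the common proportion of medicinal species, as described above, can be sketched as follows. This is only an illustration under stated assumptions: the group names and counts are invented, the uniform Beta(1, 1) prior is a convenient choice and not necessarily the prior used in Weckerle et al. (2011), and the interval is approximated by Monte Carlo sampling rather than computed exactly. The function names `credible_interval` and `classify` are hypothetical.

```python
import random

# Hypothetical counts per taxonomic group: (total species, medicinal species).
groups = {
    "X": (200, 10),
    "Y": (50, 30),
    "Z": (300, 65),
}


def credible_interval(n, m, level=0.95, draws=20000, seed=0):
    """Approximate an equal-tailed credible interval for a proportion.

    Assumes a uniform Beta(1, 1) prior, so that the posterior of the
    proportion is Beta(m + 1, n - m + 1). The prior is an illustrative choice.
    """
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(m + 1, n - m + 1) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


def classify(data):
    """Compare each group's interval with the common (overall) proportion."""
    total_n = sum(n for n, _ in data.values())
    total_m = sum(m for _, m in data.values())
    common_proportion = total_m / total_n
    labels = {}
    for name, (n, m) in data.items():
        lower, upper = credible_interval(n, m)
        if lower > common_proportion:
            label = "overused"
        elif upper < common_proportion:
            label = "underused"
        else:
            label = "as expected"
        labels[name] = (lower, upper, label)
    return common_proportion, labels


common, result = classify(groups)
print(f"common proportion: {common:.3f}")
for name, (lower, upper, label) in result.items():
    print(f"{name}: [{lower:.3f}, {upper:.3f}] {label}")
```

In this sketch a group is labelled over- or underused only when its whole interval lies above or below the common proportion, so the width of the interval reflects the amount of data available for small and large groups alike.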
Fig. 2.
References
Bennett, B.C., Husby, C.E., 2008. Patterns of medicinal plant use: an examination of the Ecuadorian Shuar medicinal flora using contingency table and binomial analysis. Journal of Ethnopharmacology 116, 422–430.
Berendes, J., 1902. Des Pedanios Dioskurides aus Anazarbos Arzneimittellehre in fünf Büchern. F. Enke, Stuttgart.
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B., 2003. Bayesian Data Analysis, Second ed. Chapman & Hall-CRC, London.
Gelman, A., 2008. Objections to Bayesian statistics. Bayesian Analysis 3, 445–450.
Leonti, M., Ramirez, R.F., Sticher, O., Heinrich, M., 2003. Medicinal flora of the Popoluca, México: a botanico-systematical perspective. Economic Botany 57, 218–230.
Leonti, M., Casu, L., Sanna, F., Bonsignore, L., 2009. A comparison of medicinal plant use in Sardinia and Sicily – De Materia Medica revisited? Journal of Ethnopharmacology 121, 255–267.
Leonti, M., 2011. The future is written: impact of scripts on the cognition, selection, knowledge and transmission of medicinal plant use and its implications for ethnobotany and ethnopharmacology. Journal of Ethnopharmacology 134, 542–555.
Matthioli, A., 1967. MDLXVIII, (1568). I Discorsi di M. Pietro Andrea Matthioli. Sanese, Medico Cesareo, et del Serenissimo Principe Ferdinando Archiduca d’Austria &c. Nelli Sei Libri Di Pedacio Dioscoride Anazarbeo della Materia Medicale. Vincenzo Valgrisi, Venezia. Anastatic reproduction in 6 volumes, Roma.
Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of native American medical ethnobotany. Journal of Ethnopharmacology 1, 111–119.
Moerman, D.E., Pemberton, R.W., Kiefer, D., Berlin, B., 1999. A comparative analysis of five medicinal floras. Journal of Ethnobiology 19, 49–67.
Tschirch, A., 1910. Handbuch der Pharmakognosie. Allgemeine Pharmakognosie, Erster Band II. Abteilung. Chr. Herm. Tauchnitz, Leipzig.
Weckerle, C.S., Cabras, S., Castellanos, M.E., Leonti, M., 2011. Quantitative methods in ethnobotany and ethnopharmacology: considering the overall flora – hypothesis testing for over- and underused plant families with the Bayesian approach. Journal of Ethnopharmacology 137, 837–843.