Maximum-likelihood estimation of xylem vessel length distributions

Maximum-likelihood estimation of xylem vessel length distributions

Accepted Manuscript Maximum-likelihood estimation of xylem vessel length distributions Roman M. Link , Bernhard Schuldt , Brendan Choat , Steven Jans...

1MB Sizes 0 Downloads 41 Views

Accepted Manuscript

Maximum-likelihood estimation of xylem vessel length distributions Roman M. Link , Bernhard Schuldt , Brendan Choat , Steven Jansen , Alexander R. Cobb PII: DOI: Reference:

S0022-5193(18)30367-9 10.1016/j.jtbi.2018.07.036 YJTBI 9561

To appear in:

Journal of Theoretical Biology

Received date: Revised date: Accepted date:

11 March 2017 26 July 2018 27 July 2018

Please cite this article as: Roman M. Link , Bernhard Schuldt , Brendan Choat , Steven Jansen , Alexander R. Cobb , Maximum-likelihood estimation of xylem vessel length distributions, Journal of Theoretical Biology (2018), doi: 10.1016/j.jtbi.2018.07.036

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN US

CR IP T

Highlights • Mathematical equations for vessel length estimation from silicon injection data • Analysis of overall and size-biased vessel length estimators in the literature • Maximum-likelihood estimators for two different sampling schemes • R scripts to easily estimate overall and size-biased vessel length distributions

1

ACCEPTED MANUSCRIPT Maximum-likelihood estimation of xylem vessel length distributions Revised resubmission of Journal of Theoretical Biology MS# JTB-D-17-01009R1 Roman M. Linka,b, Bernhard Schuldta,b, Brendan Choatc, Steven Jansend, and Alexander R. Cobb e,1,2

c

d e

1

2

CR IP T

b

Plant Ecology, Albrecht von Haller Institute of Plant Sciences, University of Göttingen, 37073 Göttingen, Germany University of Würzburg, Julius-von-Sachs-Institute of Biological Sciences, Ecophysiology and Vegetation Ecology, Julius-von-Sachs-Platz 3, 97082 Würzburg, Germany Hawkesbury Institute for the Environment, University of Western Sydney, Richmond, New South Wales, Australia Institute of Systematic Botany and Ecology, Ulm University, 89081 Ulm, Germany Organismic and Evolutionary Biology, Harvard University, 02138 Cambridge, Massachusetts, USA Present address: Center for Environmental Sensing and Modeling, Singapore-MIT Alliance for Research and Technology, 138602 Singapore, Singapore Corresponding author: [email protected]

AN US

a

R. Link: [email protected] B. Schuldt: [email protected] B. Choat: [email protected] Alex Cobb: [email protected]

Running title:

Estimating vessel length distributions

Word count:

Main body total

8252

Abstract

321

Introduction

1971

PT

ED

M

Email addresses:

2784

Results

987

AC

CE

Material and methods

Discussion

2510

Acknowledgements

88

Competing interests

6

Tables

5

Figures

3

Keywords: Hydraulic architecture, paint perfusion, silicon injection, xylem structure, wood anatomy. 2

ACCEPTED MANUSCRIPT Abstract Vessel length is an important functional trait for plant hydraulics, because it determines the ratio of flow resistances posed by lumen and pit membranes and hence controls xylem hydraulic efficiency. The most commonly applied methods to estimate vessel lengths are based on the injection of silicon or paint into cut-off stem segments. The number of stained

CR IP T

vessels in a series of cross-sections in increasing distance from the injection point is then counted. The resulting infusion profiles are used to estimate the vessel length distribution using one of several statistical algorithms. However, the basis of these algorithms has not

AN US

been systematically analysed using probability theory.

We derive a general mathematical expression for the expected shape of the infusion profile for a given vessel length distribution, provide analytic solutions for five candidate distributions (exponential, Erlang(2), gamma, Weibull, and log-normal), and present

M

maximum likelihood estimators for the parameters of these distributions including

ED

implementations in R based on two potential sampling schemes (counting all injected vessels or counting the injected and empty vessels in a random subset of each cross-section). We then

CE

experiments.

PT

explore the performance of these estimators relative to other methods with Monte Carlo

Our analysis demonstrates that most published methods estimate the conditional length

AC

distribution of vessels that cross an injection point, which is a size-biased version of the overall length distribution in the stem. We show the mathematical relationship between these distributions and provide methods to estimate either of them. According to our simulation experiments, vessel length distribution was best described by the more flexible models, especially the Weibull distribution. In simulations, the estimators were able to recover the parameters of the vessel length distribution if its functional form was known, achieving an 3

ACCEPTED MANUSCRIPT overlap of 90% or more between the true and predicted length distribution when counting no more than 500 injected vessels in 10 cross-sections. This sample size nowadays can easily be

AC

CE

PT

ED

M

AN US

CR IP T

reached with the help of automated image analysis.

4

ACCEPTED MANUSCRIPT Introduction The resistance water faces during flow through the xylem conduits of plants arises from the resistance of the conduit lumen and of pit membranes between conduits. The lumen resistivity, or pressure gradient required to drive a rate of volumetric flow through the xylem lumens in a piece of xylem of a given cross-sectional area and length, can be reasonably well

CR IP T

described as a function of the lumen diameter according to the Hagen-Poiseuille equation, possibly with a correction for perforation plates (Christman and Sperry, 2010). The resistivity contributed by pit membranes is more challenging to quantify as it depends on aspects of pit membrane structure and connectivity between vessels that are difficult to measure directly

AN US

(Loepfe et al., 2007; Choat et al., 2008; Martínez-Vilalta et al., 2012).

However, as the ratio of lumen to inter-vessel pit resistivity is determined by the vessel length distribution, vessel length is an important control variable for the flow resistance of the

M

hydraulic pathway (Comstock and Sperry, 2000; Wheeler et al., 2005; Cai and Tyree, 2014).

ED

Notably, while the contribution of pits to overall flow resistance could be easily minimized by increasing vessel length, empirical data suggest that inter-vessel pits commonly make up 50%

PT

or more of the total resistivity (Sperry et al., 2005; Choat et al., 2006; Jansen et al., 2009). The reason why vessels generally seem to be too short to compensate for the resistance of pit

CE

membranes has been explained by constraints on vessel allometry posed by trade-offs

AC

between hydraulic efficiency and safety from drought-induced embolism (Wheeler et al., 2005; Gleason et al., 2016). Two possible mechanisms have been suggested to explain how safety from embolism could constrain vessel lengths. First, longer vessels have a larger surface area, which, according to the rare pit hypothesis, may be correlated to the fraction of the area covered by pit membranes, and therefore the probability of embolism spreading through the xylem by air seeding through an unusually large pore in a pit membrane 5

ACCEPTED MANUSCRIPT (Christman et al., 2009, 2012; but see also Scholz et al., 2013). Second, higher average vessel length leads to an increased spread of embolism via reduced embolism containment, and thus may increase the effect of an embolized vessel on the total conductivity of the flow pathway (Comstock and Sperry, 2000). Measurements of xylem vulnerability to embolism may also be affected by artefacts that depend on vessel length (Choat et al., 2010; Torres-Ruiz et al.,

CR IP T

2017; cf. Hacke et al., 2015). For these reasons, accurate estimates of vessel length distributions are of central importance in studies investigating hydraulic trade-offs.

While the importance of vessel length for studies of the conductive system of angiosperms is

AN US

clear, in practice it is difficult to obtain accurate measurements due to the peculiar nature of vessels. As the lengths and diameters of angiosperm vessels usually differ by several orders of magnitude, it is not possible to directly measure vessel lengths from longitudinal sections, as is done for tracheids (Ladell, 1959; Wilkins and Bamber, 1983). Neither is it possible to

M

determine vessel lengths from macerations, another common method to determine tracheid

ED

lengths (Franklin, 1945). For these reasons, vessel length measurements usually rely on indirect measurement methods that involve cutting a stem segment and infusing the cut-open

PT

vessel fragments with a substance that moves freely throughout the lumen, but is held back by pit membranes. The relationship between the counts of infused vessels in a series of

CE

consecutive cross-sections and the distance of these cross-sections from the injection point

AC

(henceforth called infusion profile) can then be used to estimate vessel length distributions given a set of simplifying assumptions about the shape of the vessel length distribution and the positioning of vessels along the stem axis. While there is an alternative approach based on air injection (Zimmermann and Jeje, 1981; Cohen et al., 2003; Pan et al., 2015), it will not be the focus of this paper (see discussion).

6

ACCEPTED MANUSCRIPT The main objective of the present study was to derive a general mathematical relationship between vessel length distribution and the shape of the infusion profile given the assumptions about wood structure underlying the most commonly used algorithms for vessel length estimation. We discuss the theoretical implications of our findings for the different methods used to estimate vessel length distributions in the literature, and provide maximum likelihood

CR IP T

methods to estimate vessel lengths. Our approach allows us to estimate both the overall distribution of lengths of vessels in stems, and also the distribution of lengths of vessels passing through a plane in the stem. We show that the latter is the distribution estimated by most methods in the literature, and is precisely related to the overall distribution of vessel

AN US

lengths as a size-biased version of that distribution (Cox, 1969).

Based on the general equation for the infusion profile, we develop analytic solutions for the infusion profiles that arise assuming that the vessel length distributions have one of five

M

functional forms. We provide R code that fits the models corresponding to these distributions

ED

onto infusion profiles from experimental data to obtain maximum likelihood estimates of the parameters of the length distribution. Finally, using Monte Carlo experiments, we explore the

PT

performance of our estimators in recovering the parameters of vessel length distributions

CE

provided that common assumptions are satisfied.

Survey of vessel length distribution estimators

AC

Botanists of the late 19th century realized that vessels do not span entire plant stems, and performed experiments to estimate vessel length. These experiments focused on determining maximum vessel lengths by injecting plant stems with substances that pass through the lumen but are held back by pit membranes, including dyed spermaceti (Caspary 1862), mercury (Strasburger 1891, 1893; Ewart 1906), suspensions of iron chloride (Adler 1892), India ink 7

ACCEPTED MANUSCRIPT and other water-based pigments (Rivett, 1920; Handley, 1936), molten paraffin wax (Handley, 1936), or pressurized air (Bennett et al. 1927, Greenidge, 1952). The first authors that explicitly tried to determine the distribution of vessel lengths were Skene and Balodis (1968), analysing paint injection profiles from Eucalyptus stems. They developed a method for the estimation of vessel length distributions based on a finite-difference approximation to

CR IP T

the second derivative of the infusion profile. Under the assumptions that vessels run parallel to the stem axis and are unbranched, the average density of vessel ends is uniform along the stem, and vessel length is independent of position along the stem, their method produced an approximation to the overall vessel length distribution in the form of uniform densities within

AN US

discrete bins. However, parts of the methodology applied remain unclear since the authors fitted a non-specified continuous function onto the infusion profile and calculated the vessel length distribution from this fit instead of applying their algorithm directly to the raw data.

M

Milburn and Covey-Crump (1971) re-evaluated the data from Skene and Balodis (1968) and

ED

criticized their use of a continuous, unimodal length distribution, arguing that the dataset would be better described by a multimodal distribution. They derived the correct equation for

PT

the expected infusion profile (accounting for size bias) based on the approach used by Cox (1969) to estimate length distributions of textile fibres. However, they subsequently argued

CE

that the data of Skene and Balodis (1968) are better described by three discrete populations of

AC

vessels of constant length, and derived a simple method to determine the lengths of these populations based on piece-wise linear regression, which they illustrated by an easy-to-grasp graphical approach. This graphical method estimates a conditional distribution of vessel lengths for the vessels that cross the injection point. As the probability of cutting a vessel is proportional to its length, the distribution of lengths of vessels passing through a plane in the stem can be described as a „size-biased‟ version of the overall distribution of lengths of 8

ACCEPTED MANUSCRIPT vessels in the stem based on the definition of a size-biased distribution in the statistical literature (Cox, 1969). Depending on the biological question and the focus of a study, either the size-biased or the overall vessel length distribution may be more useful. Building upon the graphical approach of Milburn and Covey-Crump (1971), Zimmermann and Jeje (1981) developed an algorithm that attempts to estimate vessel length distributions

CR IP T

for a number of discrete bins, the so-called double-difference (DD) algorithm, and used it to analyse data both from latex paint injection and air flow measurements. Essentially, the DD algorithm is based on a finite-difference approximation to the second derivative of the

AN US

infusion profile, but unlike the approach of Skene and Balodis (1968), their equation introduces another factor, the distance from the infusion profile, and thus estimates the sizebiased vessel length distribution. It should be noted that the authors of the DD algorithm were aware that it estimates the length distribution of the vessels at the cutting plane, which differs

M

from the unconditional length distribution (Zimmermann and Potter, 1982).

ED

Subsequent publications on vessel length distributions mostly referred to the DD algorithm of Zimmermann and Jeje (1981), and focused on trying to overcome its shortcomings such as the

PT

frequently observed negative “probabilities”. This behaviour has repeatedly been attributed to

CE

the “non-random” distribution of vessel ends (Zimmermann and Jeje, 1981; Ewers and Fisher, 1989; Tyree, 1993). However, as noted by Cai and Tyree (2014), this arises from a

AC

misunderstanding of the concept of “randomness”, as the negative “probabilities” observed precisely result from random variation in vessel length and position. Cohen et al. (2003) was the first publication to derive a theoretical model for vessel length distributions from hypotheses on vessel development and wood structure, and also the first to estimate its parameters based on parametric curve fitting onto the infusion profile. Similar to Zimmermann and Jeje (1981), they measured changes in air conductivity after successive 9

ACCEPTED MANUSCRIPT shortening of stem segments. Cohen et al. (2003) assumed the distances of vessel ends from the cut surface follow an exponential decay, and calculated the resulting vessel length distribution based on a differential formulation of the DD algorithm, yielding a Gamma distribution with a shape parameter of 2 (Erlang(2)-distribution). The parameters of this model were estimated by fitting a log-linear model to the infusion profile. The size-biased

CR IP T

nature of the vessel length distributions estimated with the method of Cohen et al. (2003) was recognized by Nijsse (2004), who pointed out that an exponential vessel length distribution arises if every vessel has an equal probability of terminating at any length. The curve fitting approach of Cohen et al. (2003) was subsequently used to analyse infusion profiles from

AN US

silicon injection (Sperry et al., 2005; Wheeler et al., 2005) in the contexts of studies focusing on hydraulic trade-offs.

As the exponential model of infused vessels was found too inflexible to accommodate many

M

of the observed infusion profiles, Hacke et al. (2007; corrected in Christman et al., 2009)

ED

proposed to model infusion profiles with a more flexible stretched exponential model (i.e., by fitting the survivor function of the Weibull distribution onto the infusion profile). They

PT

provided a solution for the resulting length distribution based on the differential DD

CE

algorithm, which will be discussed in more detail later. A rather different perspective on vessel length distributions is provided by Oberle et al.

AC

(2016), who tried to entirely avoid estimating the length of uncut vessels based on the length distribution of cut-off vessels by injecting silicon directly from the apical end of stems. Based on these infusion profiles, they provide a thorough analysis of vessel length distributions from a hierarchical Bayesian perspective. Their analysis involves a correction for subsampling and effects of secondary growth, tests of differences between vessel length distributions for a set of biologically sensible contrasts, and tests of meaningful mechanistic hypotheses about 10

ACCEPTED MANUSCRIPT vessel formation (by testing whether the distribution observed significantly differs from an exponential null model). Their method is likely to give reliable results for many purposes, with the limitation that it is only applicable for the length of vessels situated at the apical ends of stems, which in part might not yet be fully developed.

CR IP T

Material and methods Assumptions and conventions

Most studies of length distributions of xylem conduits have focused on vessels. To reflect this, throughout this publication we use the term „vessel length‟, although our approaches also

AN US

could be applied to infusion profiles of tracheids. A list of all abbreviations, terms and symbols is given in Table 1.

We start from similar assumptions to those used in the original papers on vessel length

M

distributions (Skene and Balodis, 1968; Milburn and Covey-Crump, 1971; Zimmermann and

ED

Jeje, 1981): a) vessels run parallel to the stem axis and do not branch, b) the average number of basal ends of vessels per stem length is uniform along the stem axis (that is, positions of

PT

basal ends along the stem are described by a homogeneous Poisson point process), c) vessel lengths are mutually independent, identically distributed, and independent of position along

CE

the stem, and d) vessel counts and length measurements are performed without errors. The

AC

limitations imposed by these assumptions are discussed further in the Discussion (“Limitations”). While for our methods to be valid it is not necessary to assume that vessels are arranged end on end in longitudinal files as e.g. in Milburn and Covey-Crump (1971), our approach also results in valid estimates under these stronger assumptions. The most obvious way to generate vessel counts for an infusion profile is to exhaustively count all injected vessels in each cross section. However, in stems with many vessels this may 11

ACCEPTED MANUSCRIPT not be feasible. We therefore consider two different sampling schemes: (1) counts of all injected vessels in each cross section (Figure 1a), and (2) counts of all infused and all empty vessels within a random subset of vessels in each cross section (Figure 1b). For the subsampling scheme (2), an additional assumption is necessary, namely that the stem

CR IP T

segments do not branch.

Expected infusion profile

In the following, starting from the probability density function (pdf) f(x) of vessel length x, which in practical applications is unknown, we will derive an equation for the expected value

AN US

of the proportion of stained vessels in an infusion profile. First, consider that if the average number of vessel ends per stem length is uniform along the stem axis (homogeneous Poisson process), not all vessels will have the same probability of crossing the cut surface. Instead, the

M

probability that a specific vessel will cross the infusion point will be directly proportional to

expressed as: ( ) ∫

( )

( )

,

(1)

PT

( )

ED

its length (compare Figure 1). The pdf of the size-biased vessel length g(x) can therefore be

where the integral in the denominator is a normalization constant so that g(x) integrates to 1,

CE

and is equal to the average vessel length µ. Similar size-biased distributions arise, for

AC

example, in the sampling of textile fibre length (Cox, 1969) and of discontinuity traces in rocks (Priest and Hudson, 1981), and in line transect sampling (Drummer and McDonald, 1987).

The length Z of a cut-off vessel fragment injected in an experiment is equal to the length of the vessel X times the fraction C of the vessel downstream of the cut. As the positions of vessels along the stem axis are assumed to be described by a homogeneous Poisson process, 12

ACCEPTED MANUSCRIPT each vessel at the cutting plane can be cut anywhere along its length with equal probability. Thus, the fraction C = Z / X of the vessel that is downstream of the cut has a uniform distribution C ~ U(0, 1), with probability density a(c) of 1 if c is on the interval [0, 1], and 0 otherwise. Because the downstream fraction of the vessel C and the length of a cut vessel X

( )

( ) ( )



CR IP T

are independent random variables, their product is given by (2)

for positive vessel lengths x. The probability density a(c) of the injected fraction C is non-zero only on [0, 1], and therefore a(z/x) is nonzero only for x ≥ z, so after substituting from Eqn.

( )

( )



AN US

(1), the pdf b(z) of lengths of injected vessel fragments is: ( )

(3)

This distribution also occurs in the analysis of recurrent processes, where it is known as the backwards recurrence time distribution (Cox, 1962), and is used to reconstruct the distribution

M

of intervals between events from sampling at random points in time.

ED

Based on the pdf of injected fragment lengths b(z), it is possible to calculate the expected

( )

( )



PT

proportion p of injected vessels with a length of z or longer: ∫

( )

(4)

CE

Accordingly, if there are N infused vessels at the cutting plane, the expected number n of

AC

infused vessels observed at a distance z is: ( )



( )

(5)

An analogous result for the length distribution of textile fibres was found by Cox (1969), and was related to vessel length distributions by Milburn and Covey-Crump (1971). By solving Eqn. (5) for the vessel length distribution f, the vessel length distribution can be expressed as a function of the expected infusion profile. 13

ACCEPTED MANUSCRIPT

( )

(6)

By substituting Eqn. (1) in Eqn. (6), the size-biased vessel length distribution can be related to the ideal infusion profile as: ( )

,

(7)

CR IP T

which is the differential form of the DD algorithm of Zimmermann and Jeje (1981) as used by Cohen et al. (2003). Therefore, the differential form of the DD algorithm estimates the sizebiased conditional length distribution of vessels given that they cross the injection point. Accordingly, the majority of publications on vessel length distributions describe the size-

AN US

biased vessel length distribution. As shown by Cox (1969), the expected value of the sizebiased distribution is related to the population mean of vessel length by: [ ( )]

(

),

(8)

M

where σ2 is the variance of the overall vessel length distribution. Accordingly, the difference between the expected values of the vessel length distribution and its size-biased version

ED

depends on the coefficient of variation (σ/μ) of the overall length distribution.

PT

The equations above are correct for any valid vessel length distribution. In order to facilitate length estimates, expressions for g(x), b(z) and p(z) were derived for a set of commonly used

AC

CE

positive continuous probability distributions, and are provided in Table S1.1 in Supplement 1.

Random variation in vessel counts Due to the randomness in the placement of vessels along the stem in real injection experiments, the observed proportion of injected vessels at a distance zi will vary stochastically around its expected value p(zi). We now discuss estimation of the parameters of vessel length distributions from stochastic counts of either (1) all injected vessels in each 14

ACCEPTED MANUSCRIPT cross-section or (2) injected and empty vessels among a subset of the vessels in each cross section (Figure 1).

Estimating vessel length from counts of all injected vessels (exhaustive sampling) The probability that a vessel reaches at least a distance z given that it is also present at the

CR IP T

injection point is given by p(z) (Table S1.1). If vessel lengths and positions are independent and there are no counting errors, the number of vessels n out of a total sample of N vessels present at the infusion point that pass a cross-section at distance z can be described with a binomial distribution. The probability of the observation of any single count n can then be

( ))

( |

( ))

(

AN US

calculated as:

( )( ( )) (

( ))

(9)

If more than one cross-section of the same stem is considered, the counts of injected vessels

M

in adjacent cross-sections cannot be assumed to be independent, because all vessels that end

ED

before a certain cross-section will also be absent in all subsequent cross-sections (see Fig. 1). For this reason, observations in datasets obtained by exhaustive sampling will not be scattered

PT

above and below the expected value in a random fashion. Instead, the infusion profile

CE

observed is constrained to decrease monotonically (see Figure 2a, c). This dependence structure can be accommodated by expressing the counts of injected vessels

AC

ni observed at a distance zi as the outcome of a binomial distribution with a probability of success conditional on the observed counts ni-1 at position zi-1: |

where

(

(

(

|

|

)),

(10)

) is the conditional probability that a vessel reaches at least until zi

given that it surpassed zi-1. By the definition of conditional probability, this probability is given by: 15

ACCEPTED MANUSCRIPT

(

|

( )

)

(

.

(11)

)

The joint likelihood of a set of parameters Θ describing the vessel length distribution can then be expressed as the product of the conditional probabilities of the observed number of injected vessels at each of the ncut cuts: ∏

(

( |

)

(

).

(12)

)

CR IP T

( | )

At the infusion point (i = 1), ni-1 is the number of vessels at the infusion point N0, and p(zi-1, Θ) is 1, because the injected vessels are all present at the infusion point. Therefore, the first element of the product term in Eqn. (12) can be factored out as the unconditional likelihood of

( | )

(

(

|

AN US

the first observation: )) ∏

) (

) (

(

))



)

(

(

),

(13)

)

)(

(

)

(

(

) (

)

)

(

)

(14)

)

ED

(

M

which can be written out as: ( | )

(

( |

It is possible to obtain a maximum likelihood estimate ̂ of the parameters of the vessel

PT

length distribution by maximizing Eqn. (14) with respect to Θ:

CE

̂

) (

AC

((

(

(

)

)

) (

(

))



)

)

(

)(

( (

)

) (

)

(15)

For computational reasons, it is better to obtain the maximum likelihood parameters by maximizing the log-likelihood ̂

(

( (

|

(

)))



( ( |

( (

)

)))

)

(16) 16

ACCEPTED MANUSCRIPT It is sometimes difficult to measure N0, the number of vessels crossing the injection point, as the sample at this location will be entirely submerged in the staining agent. A better count can often be obtained by shaving the injected end of the stem, but it is also possible to obtain estimates of the parameters of the length distribution by omitting the likelihood of the first

̂

( ∑

(

( ( |

)

(

)))

)

CR IP T

observation and minimizing the negative conditional log-likelihood: (17)

Our simulation experiments suggest that analogous to the exact and conditional maximum likelihood estimators in models of time series, ̂ and ̂ can be assumed to be asymptotically

AN US

equal. Numerical optimization methods can be used to estimate ̂ or ̂ . An example for an implementation using R package bbmle (Bolker and R Development Core Team, 2017) can

M

be found in the supplementary material S1.

Estimating vessel length from subsets of cross-sections (subsampling)

ED

If the number of vessels in each cross section is very large so that only a small fraction of the total number of vessels in the cross-section can be feasibly counted, it is reasonable to assume

PT

that the random deviations from the expected infusion profiles between cuts are

CE

approximately independent from each other. If the counts in adjacent cross-sections are independent and there are no counting errors, the number of injected vessels n(zi) out of a

AC

total number of Ni vessels in a subset of a cross-section at position zi can be assumed to follow a binomial distribution, with deviations scattered randomly around the expected infusion profile (Figure 2b, d): (

( ))

(18)

17

ACCEPTED MANUSCRIPT The ( | )

joint

likelihood



of (

( |

the ))



dataset

can

( ) (

then

) (

be (

expressed

))

as: (19)

Note that the subsampling estimator requires that the vessels present in each cross-section arose from the same process, in the sense of having the same length distribution, and the same

CR IP T

mean density of vessels ends per length of stem (assumptions b and c, Assumptions and Conventions). These assumptions imply that the expected number of vessels in a cross section is uniform along the stem.

A maximum likelihood estimate of the parameters of the vessel length distribution ̂ can be

̂

(( ) (

(∑

) (

AN US

obtained by maximizing the log-likelihood with respect to Θ: (

))

))

(20)

Again, numerical optimization methods can be used to find the parameter set ̂ that

M

maximizes Eqn. (20) (see supplementary material S1 for implementations in the R

ED

programming language).

PT

Simulation of infusion profiles

In order to assess the performance of the maximum likelihood vessel length estimators

CE

described above, we used R version 3.4.4 (R Core Team, 2017) to perform a series of Monte

AC

Carlo simulation experiments for the two possible forms of experimental setups (exhaustive sampling and subsampling). For this purpose, wood samples were simulated by generating a set of artificial vessels that were each assigned a randomly generated basal end position z0 and a length x sampled from one of the five probability distributions in Table S1.1 (exponential, Erlang(2), Weibull, gamma and lognormal distribution). Vessel end positions were sampled from a uniform distribution on [-L, L], where z = 0 is the position of the cutting plane and L is 18

ACCEPTED MANUSCRIPT the 0.999999th quantile of the underlying vessel length distribution (i.e., F(z ≥ L) = 0.999999). Thus, it was assured that only one in a million vessels was excluded from the sample because of a length exceeding the sampling interval. The total number of simulated vessels in the simulations was set to 2 × nvess × L / µ, thus guaranteeing that the expected number of vessels present in each cross-section was nvess. In the case of exhaustive sampling, infusion profiles

CR IP T

were simulated by counting the ni injected vessels at each of ncut distances zi by summing the number of vessels with z0 ≤ 0 that also satisfied z0 + x ≥ zi. To test the performance of the subsampling estimator, for each simulation with nvess ≥ 500, a simple random sample of nvess/10 vessels was taken from all vessels (injected and uninjected) in each cross-section. The

AN US

number of injected vessels ni in these subsamples was determined as the number of vessels that simultaneously satisfied z0 ≤ 0 and z0 + x ≥ zi.

The cut positions were chosen to follow an exponential spacing (Sperry et al., 2005): (

)

)

M

(

(20)

ED

Where zmax was set to the 98th percentile of the sample of cut-off vessel fragments and zmin

PT

was arbitrarily chosen to be 3 mm.

Simulated infusion profiles were generated for both counting methods using a set of 100

CE

different parameter combinations for each of the five possible distributions, using 10 different

AC

numbers of cuts ncut and 10 different numbers of vessels nvess (6 for the subsampling estimator), leading to a total of 80,000 simulated infusion profiles, of which each was fitted with 7 different candidate models. Code for the simulations is available on GitHub at https://github.com/r-link/vessel_length_simulations, and a summary of the settings used in the Monte Carlo experiments is given in Table 2.

19

ACCEPTED MANUSCRIPT Model fitting In realistic applications, the functional form of the vessel length distribution is unknown. To assess the accuracy of estimates based on different models in the absence of information about the true model, non-linear binomial regression models for all five vessel length distributions considered in this paper were fitted with the R package bbmle (Bolker and R

CR IP T

Development Core Team, 2017) as described in the supplementary material S1, using both exhaustive and subsampling schemes. Additionally, the vessel length distribution was estimated by fitting a log-linear (Cohen et al., 2003) and a stretched exponential model (Christman et al., 2009) onto the infusion profile using least squares nonlinear regression via

AN US

R function nls(). Average vessel lengths were then calculated from the model output as described by these authors. As these methods estimate the size-biased length distribution and hence cannot be compared directly to our estimates of the overall length distribution, their

M

estimates were only included to illustrate the magnitude of size bias in realistic applications.

ED

Relative performance of models

PT

In order to compare the relative performance of the different models fitted onto the infusion profile in terms of bias, precision and accuracy, a set of evaluation statistics was calculated.

CE

As the simulations were performed for a range of different average vessel lengths, bias and accuracy were normalized by mean vessel length. The bias in the estimates of mean vessel

AC

length was quantified by the relative mean signed error of the estimates (

, mean signed

error expressed in relation to the true mean vessel length). ∑

̂

(21)

The precision of the average vessel length estimates was quantified as the standard deviation of the relative estimates 20

ACCEPTED MANUSCRIPT



∑ (

̂

( ̂)

) .

(22)

In order to obtain a measure of the overall accuracy of the vessel length distribution estimates, the overlapping coefficient (OVL, Inman and Bradley Jr., 1988) was calculated as the amount of overlap between the density functions of the true vessel length distribution f(x) and the

CR IP T

estimated distribution ̂( ) by numeric integration of the smaller of the two densities at each vessel length: ∫

( ̂( ) ( ))

(23)

The OVL is a fraction that ranges between 1 (full overlap between true and estimated

true and predicted length distribution.

AN US

distribution) and 0 (no overlap) and can be used to quantify the degree of agreement between

For meaningful comparison between exhaustive sampling and subsampling, we contrasted ,

and OVL) between the simulations with

M

average accuracy of our length estimates (

ED

nvess equal to 1000, 1500, 2500 and 5000 for exhaustive sampling and nvess equal to 100, 150, 250 and 500 for subsampling, thus characterizing the average loss of accuracy that occurs

PT

when subsampling only 10% of the vessels in each cross section. Additionally, the coverage probability of the estimates of average vessel length was quantified as the proportion of

CE

models that contained the true population average of vessel length within their 95% likelihood

AC

profile-based confidence intervals (see supplementary material S1).

Results

Simulation example The relationship between the unconditional and size-biased vessel length distribution is illustrated by a simulated vessel length dataset based on 250 injected vessels drawn from an 21

ACCEPTED MANUSCRIPT exponential length distribution with an average vessel length of 10 cm in Figure 3. The simulated infusion profile (Figure 3a) is well fit both by the log-linear least squares model of Cohen et al. (2003) and the conditional maximum likelihood estimate based on Eqn. (17) using the corresponding equation for the infusion profile from Table 2. The observed length distribution of the cut-off injected vessel fragments (Figure 3b) and the size-biased length

CR IP T

distribution of the vessels crossing the initial cutting plane (Figure 3c) are in accordance with the theoretical probability densities, and the length distributions estimated with the conditional maximum likelihood estimator are very similar to the theoretical distributions (Figure 3a-c). The method of Cohen et al. (2003) provides a close estimate of the size-biased

AN US

vessel length distribution (Figure 3c), the average of which exceeds the average of the unconditional length distribution by almost a factor of 2 (Figure 3d).

M

Accuracy of average vessel length estimates

For both the exhaustive sampling and subsampling scheme, the values of

averaged

ED

over the simulations with different parameter sets and values of ncut and nvess were close to zero and the standard deviation of the estimates was lowest when fitting the correct model

PT

onto the infusion profiles (Table 3). This includes the exponential distribution in case of the

CE

Weibull, and the exponential and Erlang(2)-distribution in case of the gamma distribution, as the former constitute special cases of the latter for both sets of distributions. Fitting incorrect

AC

models induced bias in the estimates and reduced their precision, especially in the cases of the less flexible one-parametric exponential and Erlang(2) models. Overall, using the subsampling-based algorithm on 1/10th of each subsection only marginally decreased precision when using the correct distribution. When applied to an unknown distribution, the subsampling estimator on average achieved lower bias and better precision, even though it required only 1/10th the vessel counts of exhaustive sampling. In both cases, when predicting 22

ACCEPTED MANUSCRIPT an unknown vessel length distribution, the Weibull distribution yielded on average the lowest bias (-6% of the true mean for full counts and -4% for subsamples) and the best precision (

of 13% of the true mean for full counts and 11% for subsamples). Estimates of the size-

biased distribution from curve-fitting procedures provided in the literature were biased

the unconditional distribution.

Overlap between true and predicted length distributions

CR IP T

upwards by on average 75% (Cohen et al., 2003) and 70% (Christman et al., 2009) relative to

AN US

The average overlapping coefficient OVL was over 95% in all cases for exhaustive sampling and still over 90% in all cases when only subsampling 1/10th of the vessels if the correct model was fitted (Table 4). If an incorrect model was fitted, the overlap varied considerably between distributions, and was generally higher when using the subsampling estimator on a

M

subset of the data. Overall, it was lower for the less flexible one-parametric distributions, and

ED

higher for the estimates based on the subsampling procedure than for the estimates based on data from all vessels. The overall accuracy in absence of information about the shape of the

PT

true vessel length distribution was highest for the models based on the Weibull distribution, which performed relatively well on all vessel length distributions besides the log-normal, and

CE

achieved an average overlap of 90.7% in the simulations based on all vessels and 90.0% in the

AC

simulations based on subsamples.

Coverage probabilities The coverage probabilities for the different models when fitting the correct vessel length distribution and averaging over all sets of ncut, nvess and distribution parameters suggest that the conditional maximum likelihood estimator for the entire number of vessels in each cross23

ACCEPTED MANUSCRIPT section has desirable coverage properties with coverages around the nominal level of 95% (Table 5). In contrast, there is evidence that the estimator for subsamples of vessels has anticonservative coverage properties with coverage probabilities of between 81.1 and 83.1%.

Influence of number of cuts and vessels

CR IP T

Both for models based on complete counts and for those based on subsamples, the bias in the estimates of average vessel length largely depended on the choice of the model, and was nearly independent of ncut and nvess (except for random noise at low nvess; Figure S2.1 and S2.2 in supplementary material S2). However, the precision in the estimates increased with ncut and

AN US

nvess (results not shown), resulting in higher values of OVL (Figure S2.3, S2.4). Note that as the subsampling procedure involved stems with on average 10 × nvess vessels in each crosssection, the accuracy in the figures in the supplementary material at the same values of nvess

M

cannot be directly compared between estimators.

ED

When analysing the data from a realistically low number of 10 different cross-sections with a known vessel length distribution in exhaustive sampling with 1,500 vessels crossing the

PT

injection point, the amount of overlap was above 95% for all distributions. When analysing a subsample of 150 of the vessels of that stem in each cross-section and using the subsampling

CE

estimator, it was still possible to achieve an average overlap of over 90% for all but the

AC

Gamma distribution (OVL = 0.882). For the estimates based on exhaustive sampling, the coverage probability of the models remained constant at about 95% when fitting the correct model, and declined rapidly both with ncut and nvess when fitting an incorrect model (Figure S2.5). As the Erlang(2) distribution is a special case of the gamma distribution, and the exponential distribution both of the gamma and the Weibull distribution, the corresponding models had high coverage 24

ACCEPTED MANUSCRIPT probabilities on infusion profiles generated from these distributions. In the case of the estimates of subsamples of cross-sections, the coverage probabilities were sensitive to both ncut and, to a lesser degree, nvess, resulting in anti-conservative estimates at higher values of

Discussion Size-bias in published vessel length estimates

CR IP T

both variables also when the correct model was used (Figure S2.6).

We found that the most commonly applied methods estimate the length distribution of vessels crossing an injection point and hence estimate the size-biased vessel length distribution. The

AN US

only exceptions are the work from Milburn and Covey-Crump (1971), who approximated the unconditional vessel length distribution by a finite-difference approach, and the work of Oberle et al. (2016), who avoided this problem by applying a modified methodology. The

M

average vessel length estimated from a size-biased sample of vessels is higher than the mean of the unconditional distribution. The difference depends on the coefficient of variation of the

ED

unconditional vessel length distribution (Eqn. (8)).

PT

Accounting for size bias in previously published methods – and thus obtaining estimates of the unconditional length distribution – is mostly straightforward. In case of the DD algorithm,

CE

it is sufficient to remove the factor of x2 from Eqn. (A-2) of Zimmermann and Jeje (1981) and

AC

normalize the result to one in order to obtain the unbiased result of Skene and Balodis (1968). For the method of Cohen et al. (2003), the unconditional vessel length distribution can be estimated by dividing their expression for pdf of vessel lengths by xk, which results in an exponential distribution, a solution already provided by Nijsse (2004). If, as hypothesized by Cohen et al. (2003), the shape of the infusion profile really follows an exponential decay (implying an exponential vessel length distribution; see Table S1.1), the 25

ACCEPTED MANUSCRIPT averages of the unconditional and size biased distributions differ by a factor of two (see Eqn. (8)). This is reflected in our simulation experiments, where a relative mean signed error of around 100% was found for both literature methods when vessel lengths followed an exponential distribution (see Table 3). If vessel lengths follow an Erlang(2)-distribution – the size-biased length distribution in Cohen et al. (2003) – its average length can easily be

CR IP T

estimated by fitting the corresponding equation for the infusion profile from Table S1.1.

The situation is more complicated in the case of the stretched exponential model of Hacke et al. (2007) and Christman et al. (2009). Hacke et al. (2007) noted that fitting this model onto

AN US

an infusion profile results in improper pdfs for vessel lengths with negative probability densities under a certain minimum length when the shape parameter of the fitted Weibull survivor function is larger than 1. The authors tried to correct this behaviour by normalizing the positive part of the length distribution and discarding the part below the minimum length.

M

However, this is problematic as the resulting distribution no longer corresponds to the

ED

equation fitted to the infusion profile. Accordingly, this method can be assumed to introduce a

PT

second form of bias additional to the size-related bias and should hence be handled with care.

Simulation experiments

CE

According to our simulation experiments, the proposed estimators produce unbiased estimates

AC

of vessel lengths if standard assumptions are satisfied and the functional form of the vessel length distribution is known (Table 3), both when the entire number of infused vessels in each cross-section can be counted, and when it is only possible to analyse subsamples. The precision of estimates was higher when more vessels were counted per cross-section (higher nvess) and when more cross-sections were analysed (higher ncut) (Figure S2.3, S2.4). In our simulations, average overlap with the true distribution of over 90% was achieved with counts 26

ACCEPTED MANUSCRIPT of 500 vessels or more from only 10 cross-sections when the functional form of the distribution was known. Counts of hundreds of vessels now can be easily obtained with the help of automatic image analysis programmes such as the freely available ImageJ (Schneider et al., 2012). In real experiments, the functional form of the vessel length distribution will generally be

CR IP T

unknown. Under these conditions, the choice of the most appropriate model becomes the largest source of uncertainty in the estimates. The more flexible two-parameter distributions perform relatively well when applied on datasets created with other distributions (Table 3,

AN US

Table 4). As the Erlang(2)-distribution is a special case of the gamma distribution and the exponential distribution a special case of both Weibull and gamma distribution, the good performance of the corresponding models on these distributions is not surprising. Overall, the Weibull model showed the highest flexibility in describing datasets generated from different

M

distributions. Uncertainty about the functional form of vessel length distributions is an issue

ED

that cannot be solved easily, as discussed further below (“Limitations”). When sampling only a subset of 10% of the vessels in each cross-section, the subsampling

PT

procedure on average had a marginally lower precision when using the right functional form

CE

of the distribution, and achieved a lower bias, higher precision and higher overlapping coefficient when this functional form was unknown (Table 3, Table 4). This illustrates the

AC

value of the subsampling procedure as an alternative to the more labour-intensive exhaustive sampling approach. The excellent coverage properties of the average length estimates obtained with the exhaustive maximum likelihood estimator (Table 5, Figure S2.5) over a large range of values of ncut and nvess, different length distributions, and different parameter sets indicate that the underlying statistical model accurately reflects the processes that generate the simulated 27

ACCEPTED MANUSCRIPT infusion profiles. In contrast, the estimator based on subsamples proved to have anticonservative coverage properties, i.e., the true mean vessel length did not fall into its 95% confidence intervals in more than 5% of the cases, especially at higher values of ncut. The likely reason is that the assumption of independence between the observations in different cross-sections does not hold because the subsets are each sampled from the same pool of

CR IP T

vessels. This is corroborated by the decrease in coverage probabilities with higher ncut (which should exacerbate the effect of dependence between counts in consecutive cross-sections), as well as by the observation that coverage improved considerably when increasing the pool of vessels from which the subsets where sampled from on average 10 × nvess to 100 × nvess

AN US

(results not shown). However, for values of ncut that can reasonably be achieved in real infusion experiments (i.e., 15 or less), the coverage probability is still around 90% for the correct models (Figure S2.6), which indicates that also when only subsamples of the cross-

M

sections can be analysed it is possible to obtain relatively reliable estimates.

ED

Limitations

Ultimately, the accuracy of the assumptions underlying estimators of vessel length

PT

distributions from infusion profiles can only be determined through detailed anatomical study

CE

of xylem structure. The technical challenges in analysing xylem networks remain formidable, though recent advances in analysis of serial sections and X-ray tomography show great

AC

promise (Huggett and Tomlinson, 2010; Wason et al., 2017). While it is reasonable to assume that vessels are approximately parallel to the stem axis and do not branch, it is less obvious whether vessel end positions are really independent of vessel length and position in the stem. Indeed, several studies have found a decrease in vessel length along the flow path in stems similar to the tapering of vessel diameters (e.g. Zimmermann and Potter, 1982, Jacobsen et al., 2018). Additionally, in some species, vessel ends are clustered around nodes and leaf 28

ACCEPTED MANUSCRIPT insertions (Wiebe et al., 1984; Salleo et al., 1984; André et al., 1999; Rančić et al., 2010). However, for many species, the mentioned assumptions are likely to be approximately met, especially when injection experiments are performed on large, relatively untapered stem segments produced by secondary growth (e.g., Skene and Balodis, 1968; Zimmermann and Jeje, 1981).

CR IP T

The subsampling estimator is limited by its additional assumption of unbranched segments. Also, in practice the easiest subsampling method is to analyse all vessels in a subset of the cross-sectional area, leading to stronger dependence between counts in consecutive cuts than

AN US

in our simulations, where the subset was obtained as a simple random sample of the vessels in the cross-section. Additionally, depending on the staining and visualization method, it may be complicated to obtain counts of the empty vessels.

Oberle et al. (2016) accounted for the issue of estimating vessel length distributions in the

M

presence of branching and secondary growth by explicitly modelling the increase of vessel

ED

count with secondary growth as an overdispersed Poisson process. This is a sensible solution to the above-mentioned issues, though it comes at the price of additional assumptions. When

PT

working with smaller branch samples, it might be preferable to instead count the entire

CE

number of stained vessels by image processing procedures and work with the estimator based on conditional likelihood (Eqn. (13)).

AC

Our analysis assumes that vessels are counted without error. In real measurement settings, detection problems are likely to result in counting errors, which is especially troublesome in case of the model based on the conditional likelihood. In this estimator, the log-likelihood is not defined when the number of vessel increases with distance from the injection point. It might therefore be advisable to extend the models to account for observation errors, or work with the subsampling procedure when counting errors are obvious. 29

ACCEPTED MANUSCRIPT The largest limitation of our approach is uncertainty about the functional form of the vessel length distribution. Our simulation experiments suggest that more flexible models for the vessel length distribution, especially the Weibull distribution, characterize the vessel length distribution relatively well even when applied to data generated from a different length distribution. However, it is questionable whether the lengths of all vessels should be

CR IP T

envisioned as independent draws from a single population with a unimodal length distribution. In many cases the length distribution may be better characterized as comprising different populations of vessels with different average lengths, e.g. in the case of early- and latewood vessels of ring-porous species (Cochard and Tyree, 1990). A simple way to account

AN US

for these differences would be to analyse counts of early- and latewood vessels separately. Another possible approach would be to extend our method to describe the vessel length

M

distribution as a finite mixture of probability distributions with different shapes.

ED

Implications for plant hydraulic theory

Our results suggest that most published estimates of vessel length distributions refer to the

PT

length distribution of vessels cut by a cross-section. This size-biased distribution is relevant when relating vessel length to variables that are derived from the analysis of microscopic

CE

images of cross-sections, e.g. vessel diameter, and when discussing the potential for vessel-

AC

length-related artefacts in hydraulic measurements on stem segments. Nijsse (2004) argued that the size-biased length distribution can be interpreted as a “hydraulic vessel length distribution” since it is weighted by the length of the vessels and hence their contribution to the hydraulic pathway. The idea of hydraulically weighted vessel lengths has its merits, though the contribution of a vessel to the hydraulic pathway also depends on other factors such as its diameter, the hydraulic resistivity of pit membranes and the degree of 30

ACCEPTED MANUSCRIPT connectivity to adjacent vessels, which may all vary along its course through the stem (Akachuku, 1987; Balaz et al, 2016). In studies of hydraulic trade-offs and the constraints they pose on vessel allometry, it might be more advantageous to focus on unbiased vessel length distributions, as the lengths of all vessels equally reflect the result of the optimization of xylem structure with regard to stability

CR IP T

and efficiency. It thus might make sense to re-analyse datasets from studies focusing on hydraulic trade-offs in the light of size-related bias in vessel lengths.

As all properties of vessels obtained from cross-sections inevitably are measured on a size-

AN US

biased sample of vessels, it is important to note that their distributions will be subject to a size bias as well unless those variables are independent of vessel length. This is especially interesting with respect to vessel diameters, which on the level of single vessels are likely to be correlated with vessel lengths (Hacke et al., 2006; Cai et al., 2010; Lens et al., 2011;

M

Martín et al., 2013; but see Jacobsen et al., 2012). In order to understand how hydraulic trade-

ED

offs constrain the dimensions of vessels it might therefore be promising to extend our

PT

approach to estimate a bivariate distribution of vessel lengths and diameters.

CE

Recommendations for future vessel length measurements Recently, Pan et al. (2015) recommended the use of the air injection technique from Cohen et

AC

al. (2003) to estimate conduit lengths. While it is easy to extend our methods to accommodate air injection data by modifying our estimator to handle continuous responses, for two reasons we decided to focus on the silicon injection method (Wheeler et al., 2005, Hacke et al., 2007), even though it might be more time consuming: First, as the conductivity of a vessel for air scales with the 4th power of its diameter, the variability in vessel diameters induces a large additional source of uncertainty in the air injection method. Second, and more importantly, 31

ACCEPTED MANUSCRIPT when vessel length and diameter are correlated, variance in diameters will lead to biased length estimates. Practical constraints put a limit on the number of cross-sections and vessels that can be analysed in an infusion experiment. Automatic image analysis tools such as ImageJ (Schneider et al., 2012) permit a rapid analysis of a large number of injected vessels, but the

CR IP T

preparation and analysis of microscopic slides remains time consuming. Therefore, it may be best to count as many vessels as possible in 10 to 15 cross-sections, which according to our simulations should be sufficient to achieve an acceptable level of accuracy. Other things being

AN US

equal, higher precision is possible in a stem with more vessels in each cross-section.

In stem segments that contain leaf insertions, nodes or branchings, it is advisable to perform the injection from the acropetal end to avoid injecting silicone into vessels that subsequently exit in a branch. Additionally, as the subsampling procedure will lead to biased estimates if

M

the total number of vessels per cross-section changes, in these cases it is recommendable to

ED

either explicitly account for the effect of subsampling (see Oberle et al., 2016) or work with

CE

Conclusions

PT

the conditional maximum likelihood estimator.

Despite its importance for understanding the function of plant‟s conductive tissue, xylem

AC

vessel length remains “the neglected dimension” as it was termed by Comstock and Sperry (2000). Starting from the assumptions stated in previous publications about vessel length estimation, we derive maximum likelihood estimators for vessel length for two different sampling schemes, and provide an analysis of the most widely used vessel length estimators in the literature based on probability theory. As our analysis shows, it is important to distinguish between the unconditional distribution of vessel lengths in the stem and the 32

ACCEPTED MANUSCRIPT distribution of lengths of vessels cut by a plane in the stem, which is termed a size-biased distribution in statistics. We hope to stimulate future studies on the role of vessel length in plant hydraulics, which could build on the R code for our simulations and estimators (on GitHub at https://github.com/r-link/vessel_length_simulations). Further studies on vessel length estimators could explore more flexible vessel length distributions, accommodate

CR IP T

counting and measurement errors, relax assumptions, or focus on the bivariate distribution of vessel length and vessel diameters. Additionally, it would be promising to extend our approach to a hierarchical modelling framework that accounts for different levels of variability in the parameters of the infusion profile, and allows for testing biologically

AN US

meaningful differences in parameters between species.

Acknowledgements

M

We thank John Sperry, Missy Holbrook and Moritz Doll for useful discussions. We also thank two anonymous reviewers, whose comments significantly improved the manuscript. Financial

ED

support by the German Research Foundation granted to B. Schuldt is gratefully acknowledged. A. Cobb was supported by the National Research Foundation (NRF), Prime

PT

Minister‟s Office, Singapore under its Campus for Research Excellence and Technological

CE

Enterprise (CREATE) programme. The Center for Environmental Sensing and Modeling (CENSAM) is an interdisciplinary research group (IRG) of the Singapore MIT Alliance for

AC

Research and Technology (SMART) centre.

Competing interests statement The authors declare no competing interests.

33

ACCEPTED MANUSCRIPT References Adler, A., 1892. Untersuchungen über die Längenausdehnung der Gefäßräume. Inauguraldissertation, Jena. Akachuku, A.E., 1987. A study of lumen diameter variation along the longitudinal axis of wood vessels in Quercus rubra using cinematography. International Association of Wood Anatomists Journal, 8, 41-45.

CR IP T

André, J.P., Catesson, A.M., Liberman, M., 1999. Characters and origin of vessels with heterogenous structure in leaf and flower abscission zones. Canadian Journal of Botany, 77, 253-261.

Balaz, M., Jupa, R., Jansen, S., Cobb, A., Gloser, V., 2016. Partitioning of vessel resistivity in three liana species. Tree Physiology 36(12), 1498-1507.

AN US

Bennett, J.P., Anderssen, F.G., Milad, Y., 1927. Methods of obtaining tracheal sap from woody plants. New Phytologist 26, 316-323.

Bolker, B., R Development Core Team, 2017. bbmle: Tools for general maximum likelihood

estimation.

R

project.org/package=bbmle.

package

version

1.0.20.

http://CRAN.R-

M

Cai, J., Tyree, M.T., 2014. Measuring vessel length in vascular plants: can we divine the truth? History, theory, methods, and contrasting models. Trees 28, 643-655.

ED

Cai, J., Zhang, S. and Tyree, M.T., 2010. A computational algorithm addressing how vessel length might depend on vessel diameter. Plant, Cell & Environment 33(7), 1234-1238. Caspary, R., 1862. Über die Gefässbündel der Pflanzen: vorläufige Mittheilung.

PT

Monatsberichte der Königlich preussischen Akademie der Wissenschaften 448-483. Choat, B., Brodie, T.W., Cobb, A.R., Zwieniecki, M.A., Holbrook, N.M., 2006. Direct

CE

measurements of intervessel pit membrane hydraulic resistance in two angiosperm tree species. American Journal of Botany 93, 993-1000.

AC

Choat, B., Cobb, A.R., Jansen, S., 2008. Structure and function of bordered pits: new discoveries and impacts on whole-plant hydraulic function. New Phytologist 177, 608– 625.

Choat, B., Drayton, W.M., Brodersen, C., Matthews, M.A., Shackel, K.A., Wada, H., McElrone, A.J., 2010. Measurement of vulnerability to water stress‐induced cavitation in grapevine: a comparison of four techniques applied to a long‐vesseled species. Plant, Cell & Environment 33, 1502-1512. 34

ACCEPTED MANUSCRIPT Christman, M.A., Sperry, J.S., 2010. Single-vessel flow measurements indicate scalariform perforation plates confer higher flow resistance than previously estimated. Plant, Cell & Environment, 33(3), 431–443. Christman, M.A., Sperry, J.S., Adler, F.R., 2009. Testing the „rare pit‟ hypothesis for xylem cavitation resistance in three species of Acer. New Phytologist 182: 664-674. Christman, M.A., Sperry, J.S. and Smith, D.D., 2012. Rare pits, large vessels and extreme vulnerability to cavitation in a ring‐porous tree species. New Phytologist 193(3): 713-

CR IP T

720.

Cochard, H., Tyree, M.T., 1990. Xylem dysfunction in Quercus: vessel sizes, tyloses, cavitation and seasonal changes in embolism. Tree Physiology 6(4): 393–407.

Cohen, S., Bennink, J., Tyree, M.T., 2003. Air method measurements of apple vessel length distributions with improved apparatus and theory. Journal of Experimental Botany 54,

AN US

1889-1897.

Comstock, J.P., Sperry, J.S., 2000. Theoretical considerations of optimal conduit length for water transport in vascular plants. New Phytologist 148, 195-218. Cox, D.R., 1962. Renewal theory (Vol. 1). London: Methuen.

M

Cox, D.R., 1969. Some sampling problems in technology. In: Johnson N.L., Smith H. Jr. (Eds.), New Developments in Survey Sampling. New York, New York, ISA: Wiley-

ED

Interscience. 506-527.

Drummer, T.D., McDonald, L.L., 1987. Size bias in line transect sampling. Biometrics 13, 13-21.

PT

Ewart, A.J., 1906. The ascent of water in trees I. Proceedings of the Royal Society of London, series B: Biological Sciences 198, 41-85.

CE

Ewers, F.W., Fisher, J.B., 1989. Techniques for measuring vessel lengths and diameters in stems of woody plants. American Journal of Botany 76, 645-656.

AC

Franklin, G.L., 1945. Preparation of thin sections of synthetic resins and wood-resin composites, and a new macerating method for wood. Nature 155, 51.

Gleason, S.M., Westoby, M., Jansen, S., Choat, B., Hacke, U.G., Pratt, R.B., Bhaskar, R., Brodribb, T.J., Bucci, S.J., Cao, K.-F., Cochard, H., Delzon, S., Domec, J.-C., Fan, Z.-X., Feild, T.S., Jacobsen, A.L., Johnson, D.M., Lens, F., Maherali, H., Martínez-Vilalta, J., Mayr, S., McCulloh, K.A., Mencuccini, M., Mitchell, P.J., Morris, H., Nardini, A., Pittermann, J., Plavcová, L., Schreiber, S.G., Sperry, J.S., Wright, I.J. and Zanne, A.E., 2016. Weak tradeoff between xylem safety and xylem35

ACCEPTED MANUSCRIPT specific hydraulic efficiency across the world's woody plant species. New Phytologist 209, 123-136. Greenidge, K.N.H., 1952. An approach to the study of vessel length in hardwood species. American Journal of Botany 39, 570-574. Hacke, U.G., Sperry, J.S., Wheeler, J.K., Castro, L., 2006. Scaling of angiosperm xylem structure with safety and efficiency. Tree Physiology 26, 689–701. Hacke, U.G., Sperry, J.S., Feild, T.S., Sano, Y., Sikkema, E.H., Pittermann, J. 2007.

CR IP T

Water transport in vesselless angiosperms: conducting efficiency and cavitation safety. International Journal of Plant Sciences 168, 1113-1126.

Hacke, U. G., Venturas, M. D., MacKinnon, E. D., Jacobsen, A. L., Sperry, J. S., Pratt, R. B., 2015. The standard centrifuge method accurately measures vulnerability curves of long-vesselled olive stems. New Phytologist 205(1), 116–127.

AN US

Handley, W.R.C., 1936. Some observations on the problem of vessel length determinations in woody dicotyledons. New Phytologist 35, 456-471.

Huggett, B., Tomlinson, P. B. 2010. Aspects of vessel dimensions in the aerial roots of epiphytic Araceae. International Journal of Plant Sciences 171(4), 362–369.

M

Inman, H. F., Bradley Jr., E.L., 1989. The overlapping coefficient as a measure of agreement between probability distributions and point estimation of the overlap of two normal densities. Communications in Statistics - Theory and Methods 18, 3851-3874.

ED

Jacobsen, A.L., Pratt, R.B., Tobin, M.F., Hacke, U.G., Ewers, F.W., 2012. A global

1591.

PT

analysis of xylem vessel length in woody plants. American Journal of Botany 99, 1583-

Jacobsen, A. L., Valdovinos-Ayala, J., Rodriguez-Zaccaro, F. D., Hill-Crim, M. A.,

CE

Percolla, M. I., Venturas, M. D. 2018. Intra-organismal variation in the structure of plant vascular transport tissues in poplar trees. Trees. doi:10.1007/s00468-018-1714-z.

AC

Jansen, S., Choat, B., Pletsers, A., 2009. Morphological variation of intervessel pit membranes and implications to xylem function in angiosperms. American Journal of Botany 96, 409–419.

Ladell, J.L., 1959. A new method of measuring tracheid length. Forestry 32, 124–125. Lens, F., Sperry, J.S., Christman, M.A., Choat, B., Rabaey, D., Jansen, S., 2011. Testing hypotheses that link wood anatomy to cavitation resistance and hydraulic conductivity in the genus Acer. New Phytologist 190: 709–723.

36

ACCEPTED MANUSCRIPT Loepfe, L., Martínez-Vilalta, J., Pinol, J., Mencuccini, M., 2007. The relevance of xylem network structure for plant hydraulic efficiency and safety. Journal of Theoretical Biology 247, 788–803. Martín, J.A., Solla, A., Ruiz-Villar, M., Gil, L., 2013. Vessel length and conductivity of Ulmus branches: Ontogenetic changes and relation to resistance to Dutch elm disease. Trees - Structure and Function 27: 1239–1248. Martínez-Vilalta, J., Mencuccini, M., Alvarez, X., Camacho, J., Loepfe, L., Piñol, J.,

CR IP T

2012. Spatial distribution and packing of xylem conduits. American Journal of Botany 99, 1189–1196.

Milburn, J.A., Covey‐Crump, P.A.K., 1971. A simple method for the determination of conduit length and distribution in stems. New Phytologist 70, 427-434.

Nijsse, J., 2004. On the mechanism of xylem vessel length regulation. Plant Physiology 134,

AN US

32-34.

Oberle, B., Ogle, K., Zuluaga, J.C.P., Sweeney, J., Zanne, A.E., 2016. A Bayesian model for xylem vessel length accommodates subsampling and reveals skewed distributions in species that dominate seasonal habitats. Journal of Plant Hydraulics 3, 003.

M

Pan, R., Geng, J., Cai, J., Tyree, M.T., 2015. A comparison of two methods for measuring vessel length in woody plants. Plant, Cell & Environment 38, 2519-2526. Priest, S.D.,

ED

Hudson, J.A., 1981. Estimation of discontinuity spacing and trace length using scanline surveys. International Journal of Rock Mechanics and Mining Sciences & Geomechanics Abstracts 18, 183-197.

PT

R Core Team, 2017. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

CE

Rančić, D., Quarrie, S.P., Radošević, R., Terzić, M., Pećinar, I., Stikić, R., Jansen, S., 2010. The application of various anatomical techniques for studying the hydraulic

AC

network in tomato fruit pedicels. Protoplasma 246, 25-31. Rivett, M., 1920. The anatomy of Rhododendron ponticum L. and Ilex aquifolium L. in reference to specific conductivity. Annals of Botany 34, 525-550.

Salleo, S., Gullo, M.L., Siracusano, L., 1984. Distribution of vessel ends in stems of some diffuse-and ring-porous trees: the nodal regions as „safety zones‟ of the water conducting system. Annals of Botany, 54, 543-552. Schneider, C.A., Rasband, W.S., Eliceiri, K.W., 2012. NIH Image to ImageJ: 25 years of image analysis. Nature Methods 9: 671–675. 37

ACCEPTED MANUSCRIPT Scholz, A., Rabaey, D., Stein, A., Cochard, H., Smets, E., Jansen, S., 2013. The evolution and function of vessel and pit characters with respect to cavitation resistance across 10 Prunus species. Tree Physiology 33, 684-694. Skene, D.S., Balodis, V., 1968. A study of vessel length in Eucalyptus obliqua L'Herit. Journal of Experimental Botany 19, 825-830. Sperry, J.S., Hacke, U.G., Wheeler, J.K., 2005. Comparative analysis of end wall resistivity in xylem conduits. Plant, Cell & Environment 28, 456-465.

CR IP T

Strasburger, E., 1891. Histologische Beiträge (III): Ueber den Bau und die Verrichtungen der Leitungsbahnen in den Pflanzen. Gustav Fischer, Jena.

Strasburger, E., 1893. Histologische Beiträge (V): Ueber das Saftsteigen. Ueber die Wirkungssphäre der Kerne und die Zellgröße. Gustav Fischer, Jena.

Torres‐Ruiz, J.M., Cochard, H., Choat, B., Jansen, S., López, R., Tomášková, I., Padilla-

AN US

Díaz, C., Badel, E., Burlett, R., King, A., Lenoir, N., 2017. Xylem resistance to embolism: presenting a simple diagnostic test for the open vessel artefact. New Phytologist 215, 489-49

Tyree, M.T., 1993. Theory of vessel-length determination: the problem of nonrandom vessel

M

ends. Canadian Journal of Botany 71, 297-302.

Wason, J. W., Huggett, B. A., Brodersen, C. R. 2017. MicroCT imaging as a tool to study

ED

vessel endings in situ. American Journal of Botany 104(9), 1424–1430. Wheeler J.K., Sperry J.S., Hacke U.G., Hoang N., 2005. Inter-vessel pitting and cavitation in woody Rosaceae and other vesseled plants: a basis for a safety versus efficiency

PT

trade-off in xylem transport. Plant, Cell and Environment 28, 800-812. Wiebe, H.H., Greer, R.L., Alfen, N.K., 1984. Frequency and grouping of vessel endings in

CE

alfalfa (Medicago sativa) shoots. New Phytologist 97, 583-590. Wilkins, A.P., Bamber, R.K., 1983. A comparison between Ladell's wood section method

AC

and the macerated wood method for tracheid length determination. International Association of Wood Anatomists Journal 4, 245-247.

Zimmermann, M.H., Jeje A.A., 1981. Vessel-length distribution in stems of some American woody-plants. Canadian Journal of Botany 59, 1882-1892.

Zimmermann, M.H., Potter, D., 1982. Vessel-length distribution in branches, stem and roots of Acer rubrum. International Association of Wood Anatomists Bulletin 3: 103– 109.

38

ACCEPTED MANUSCRIPT Table 1: List of terms, symbols and abbreviations related to vessel length distributions. Following conventions in probability theory, a capital italic letter is used for a random variable and the corresponding lowercase italic letter for a particular value of that variable. For example, X is vessel length, a random variable, and x is the length of a particular vessel. Description probability density function cumulative distribution function vessel length pdf of vessel lengths pdf of size biased vessel lengths fraction of vessel infused pdf of fraction of injected vessel downstream of injection point pdf of cut-off vessel lengths cdf of vessel lengths 1 – F(x): survivor function of vessel lengths distance from infusion point position of cut i observed number of infused vessels at zᵢ

N₀ Nᵢ p(z) (z)

number of infused vessels at the infusion point total number of vessels in a subset of a cross-section probability that an infused vessel reaches z or further observed proportion of infused vessels reaching z or further true average vessel length maximum likelihood estimate of the average vessel length variance of the vessel length distribution parameters of the vessel length distribution maximum likelihood estimate of the parameters of the length distribution number of cross-sections limits of uniform distribution used for simulating infusion profiles total number of vessels (1) or number of vessels counted per cross-section (2) mean signed error of the relative estimates (normalized by true µ) standard deviation of the relative estimates (normalized by true µ) overlapping coefficient

ncut L

AN US

M

ED

AC

nvess MSER SDR OVL

PT

̂

CE

̂ σ²

CR IP T

Symbol pdf cdf x f g c a b F S z zᵢ nᵢ

39

ACCEPTED MANUSCRIPT Table 2: Parameter combinations for the simulation experiments with the corresponding numbers of settings. Sampling scheme: type of data generation (counting entire cross sections or subsamples); distribution: vessel length generating distribution; model: model fit on the data using either the corresponding equation for p(z) from Table S1.1 and the estimator corresponding to the sampling scheme, or the methods from Cohen et al. (2003) or Christman

CR IP T

et al. (2009); numbers of cross-sections and vessels considered within cross-sections (asterisk * indicates exhaustive sampling only); and parameter sets for the different distributions (means were equally spaced, dispersion parameters with exponentially increasing spacing).

AN US

Settings Exhaustive sampling vs. subsampling expon., Erlang(2), Weibull, gamma, log-norm. expon., Erlang(2), Weibull, gamma, log-norm., Cohen, Christman 5, 10, 15, 20, 25, 30, 40, 50, 75, 100 100, 150, 200, 300, 500, 750*, 1000*, 1500*, 2500*, 5000* μ: 1–30 (100, equidist.) expon.: Erlang(2): μ: 1–30 (100, equidist.) μ: 1–30 (10, equidist.) k: 0.6–6 (10, exp. spaced) Weibull: μ: 1–30 (10, equidist.) k: 0.6–6 (10, exp. spaced) gamma: log-norm.: μ: 1–30 (10, equidist.) SDlog: 0.2–1.2 (10, exp. spaced)

n 2 5 7 10 10 100 100 10×10 10×10 10×10

AC

CE

PT

ED

M

Parameter Sampling scheme Distribution Model Number of cuts Number of vessels Parameter sets

40

ACCEPTED MANUSCRIPT Table 3: Mean signed relative errors ± standard deviation for the estimates of average vessel length for all combinations of simulated vessel length distribution and model type, averaged over all different values of ncut and all parameter settings (n=100 for each). For exhaustive sampling, averages were calculated only for the simulations with nvess equal to 1000, 1500, 2500 and 5000, for subsampling, it was averaged over nvess equal to 100, 150, 250 and 500

CR IP T

(see Figure S2.1 and S2.2 for full results). The MSER quantifies the bias of the estimates relative to the true population mean of the simulated distribution. The SD of the relative estimates (identical to the SD of the MSER) is a measure of the precision of the estimates normalized by the population mean.

Sampling

True

AN US

Assumed model

Expon.

Erlang(2)

Gamma

Weibull

Log-norm.

Cohen

Christman

Expon.

0.00 ± 0.03

0.35 ± 0.04

-0.00 ± 0.10

0.00 ± 0.07

0.25 ± 0.07

1.00 ± 0.09

1.00 ± 0.07

Erlang(2)

-0.25 ± 0.02

-0.00 ± 0.03

0.00 ± 0.06

-0.05 ± 0.05

0.10 ± 0.05

0.35 ± 0.05

0.50 ± 0.05

Gamma

-0.26 ± 0.26

0.00 ± 0.36

-0.00 ± 0.08

-0.02 ± 0.08

0.12 ± 0.16

0.34 ± 0.67

0.51 ± 0.51

Weibull

-0.15 ± 0.46

0.15 ± 0.65

-0.06 ± 0.25

0.00 ± 0.06

0.17 ± 0.19

0.67 ± 1.34

0.71 ± 0.86

Log-norm.

-0.03 ± 0.47

0.33 ± 0.68

-0.37 ± 0.38

-0.24 ± 0.18

0.00 ± 0.06

1.42 ± 1.85

0.80 ± 0.74

All Subsamples

Expon.

-0.14 ± 0.34

0.17 ± 0.47

-0.09 ± 0.26

-0.06 ± 0.14

0.13 ± 0.15

0.76 ± 1.14

0.70 ± 0.59

0.00 ± 0.05

0.27 ± 0.07

0.00 ± 0.15

0.01 ± 0.12

0.16 ± 0.12

1.00 ± 0.19

1.00 ± 0.13

-0.20 ± 0.04

0.00 ± 0.05

0.00 ± 0.10

-0.03 ± 0.09

0.06 ± 0.09

0.35 ± 0.12

0.50 ± 0.09

PT

Erlang(2)

ED

Exhaustive

M

Distribution

-0.19 ± 0.21

0.01 ± 0.28

0.00 ± 0.11

-0.00 ± 0.10

0.08 ± 0.14

0.35 ± 0.67

0.51 ± 0.51

Weibull

-0.10 ± 0.35

0.13 ± 0.47

-0.04 ± 0.20

0.00 ± 0.10

0.11 ± 0.16

0.67 ± 1.32

0.71 ± 0.87

Log-norm.

-0.05 ± 0.33

0.21 ± 0.46

-0.26 ± 0.33

-0.15 ± 0.15

0.00 ± 0.10

1.39 ± 1.84

0.81 ± 0.76

All

-0.11 ± 0.25

0.13 ± 0.34

-0.06 ± 0.22

-0.03 ± 0.13

0.08 ± 0.13

0.75 ± 1.13

0.70 ± 0.60

AC

CE

Gamma

41

ACCEPTED MANUSCRIPT Table 4: Average overlapping coefficient (OVL) of the true and estimated vessel length distribution for all combinations of the simulated vessel length distribution and model type, averaged over all different values of ncut and all parameter settings. For exhaustive sampling, averages were calculated only for the simulations with nvess equal to 1000, 1500, 2500 and 5000, for subsampling, it was averaged over nvess equal to 100, 150, 250 and 500 (see Figure

CR IP T

S2.3 and S2.4 for full results). The OVL measures the fraction of overlap between the true distribution used for simulations and the estimated length distribution. Assumed model Erlang(2)

Gamma

Weibull

Log-norm.

Cohen

Christman

Expon.

99.122

77.030

95.181

96.587

77.095

62.076

63.165

Erlang(2)

77.225

98.870

95.953

93.298

85.829

58.627

74.351

Gamma

62.156

69.747

95.266

91.624

86.044

44.529

73.039

Weibull

60.522

64.453

84.964

95.842

80.537

44.470

70.229

Log-norm.

63.238

64.924

60.797

73.326

96.161

49.363

67.113

All

72.453

75.005

86.432

90.135

85.133

51.813

69.580

Expon.

98.576

78.534

92.166

94.024

79.881

55.340

63.026

Erlang(2)

78.861

98.079

92.607

91.771

85.831

48.888

74.213

Gamma

64.831

71.865

90.719

89.388

84.224

37.038

72.890

Weibull

63.365

67.065

84.074

91.246

79.503

37.868

70.040

66.075

68.421

67.842

77.266

92.109

41.270

66.974

74.342

76.793

85.481

88.739

84.309

44.081

69.429

distribution

Exhaustive

Subsample s

Log-norm.

AC

CE

PT

All

AN US

Expon.

M

True

ED

Sampling

42

ACCEPTED MANUSCRIPT Table 5: Coverage probabilities for the correct models, averaged over all different values of ncut and nvess and all parameter settings (see Figure S2.5 and S2.6 for more details).

Exponential 0.951 0.819

Erlang(2) 0.954 0.820

Model Gamma 0.954 0.820

Weibull 0.951 0.823

Log-normal 0.947 0.817

AC

CE

PT

ED

M

AN US

CR IP T

Sampling Exhaustive Subsamples

43

ACCEPTED MANUSCRIPT Figure legends Figure 1: Schematic longitudinal sections of wood illustrating the two possible sampling methods. The left portion of the figures represents the (unknown) cut-off fraction of the wood sample, the right portion represents the part injected with silicon. Red rectangles: siliconinjected vessels; blue rectangles: empty vessels; continuous vertical line: injection point;

CR IP T

dotted vertical lines: positions of cross-sections, red and blue crosses: vessel counts in each cross-section. a) Sampling scheme (1): injected and empty vessels in random subsamples of each cross-section; b) Sampling scheme (2): complete counts only of injected vessels over the

AN US

full cross-section.

Figure 2: Shape of the different infusion profiles resulting from the two sampling schemes. The graphs above show five simulated infusion profiles using data from 10 cross-sections

M

with 50 vessels each from a gamma distribution with a mean of 10 cm and a shape parameter of 5 and the expected value of the infusion profile (black), below are the normalized

ED

deviations from the expected value (deviations divided by standard error). a) Sampling scheme (1): infusion profile monotonously decreasing, deviations from expected values auto-

PT

correlated. b) Sampling scheme (2): random deviations of simulated infusion profiles from

CE

expected value. c) and d) show the normalized deviations from the expected values (absolute deviations divided by theoretical SD from binomial distribution) for the sampling schemes (1)

AC

and (2), respectively.

Figure 3: Simulation run based on an exponential vessel length distribution with an average of 10, counting all vessels in 15 cross-sections beginning with 250 vessels at the injection point. a) Simulated infusion profile overlaid with the theoretical infusion profile from Table S1.1 with the true population mean of the distribution used for simulation (black), the model 44

ACCEPTED MANUSCRIPT predictions using the conditional maximum likelihood estimator from Eqn. (17) (blue), and the predictions of the log-linear model of Cohen et al. (2003) (red). b) Observed lengths of the injected vessel fragments above the infusion point overlaid with the theoretical and estimated densities of b(z). c) Size-biased lengths of the vessels at the infusion point, overlaid with the true and estimated densities of g(z), and the estimated distribution based on Cohen et

CR IP T

al. (2003). Vertical lines: true and estimated average lengths. d) Uncut vessel length distribution overlaid with the theoretical and predicted densities of f(z), the estimated

AC

CE

PT

ED

M

AN US

distribution based on Cohen et al. (2003) and the corresponding average values.

45

CR IP T

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN US

Figure 1

46

CR IP T

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN US

Figure 2

47

CE

PT

ED

M

AN US

CR IP T

ACCEPTED MANUSCRIPT

AC

Figure 3

48