Image and Vision Computing 52 (2016) 88–96
Towards a mean body for apparel design

J. Domingo a,*, A. Simó b, M.V. Ibáñez b, E. Dura c, G. Ayala d, S. Alemany e

a Dpt. of Informatics, School of Engineering, Avda. de la Universidad, s/n. 46100 Burjasot, Valencia, Spain
b Dpt. of Mathematics-IMAC, University Jaume I, Castellón, Spain
c Dpt. of Informatics, University of Valencia, Valencia, Spain
d Dpt. of Statistics and Operation Research, University of Valencia, Spain
e Institute of Biomechanics, Polytechnic University of Valencia, Spain
ARTICLE INFO

Article history: Received 19 September 2015. Received in revised form 18 February 2016. Accepted 22 April 2016. Available online 31 May 2016.

Keywords: Mean set; Confidence set; Apparel design; Anthropometric survey

ABSTRACT

This paper focuses on shape average with applications to the apparel industry. Apparel industry uses a consensus sizing system; its major concern is to fit most of the population into it. Since anthropometric measures do not grow linearly, it is important to find prototypes to accurately represent each size. This is done using random compact mean sets, obtained from a cloud of 3D points given by a scanner and applying to the sample a previous definition of mean set. Additionally, two approaches to define confidence sets are introduced. The methodology is applied to data obtained from a real anthropometric survey. © 2016 Elsevier B.V. All rights reserved.
1. Introduction

Shape analysis is an important topic in many scientific fields such as Biology, Archaeology, Medicine, Geology and, in recent decades, also Computer Vision. Many image processing tasks need some way to average different shapes. Three major approaches can be identified in Shape Analysis, based on how the object's shape is treated in mathematical terms [34]. Shapes can be treated as sequences of points (landmarks), as compact sets in $\mathbb{R}^m$, or they can be described using functions representing their contour. The main aim of this work is to obtain averages of shapes together with their confidence sets by considering the shape of any object as a compact set in $\mathbb{R}^3$. Finally, these averages and their associated confidence sets will be used to define prototypes for the apparel industry. We have to note that the theory of random compact sets provides a more general framework. For example, a simple random set is obtained as an unordered collection of random points $X = \{x_1, \ldots, x_k\}$. These $k$ points can be interpreted as landmarks if a specific order is prescribed for them.
This paper has been recommended for acceptance by Vassilis Athitsos.
* Corresponding author: J. Domingo.
http://dx.doi.org/10.1016/j.imavis.2016.04.016 0262-8856/© 2016 Elsevier B.V. All rights reserved.
Anthropometric data provide fundamental information to the apparel industry. Designers and pattern makers would like to have mannequins that represent the main anthropometry of a basic size, which can then be scaled proportionally to cover most of the population. The primary anthropometric information used by clothing designers consists of tables that list the mean values of the main anthropometric measurements for each size. Most of this information has been developed from the designers' own experience or has been based on a beauty canon that is far from the real shape [18]. Nowadays, the technical design of a garment is still a handcraft job that requires several trial-and-error tests to achieve garment patterns with good fit and style. The starting point for a pattern maker developing a new garment is a basic pattern block that has key features similar to those of the new garment. This basic block is generated from a set of 15 to 20 body measurements representing an ideal canon of body proportions for the standard size established by the company [21]. The set of body measurements of the standard size is scaled to other sizes, creating the 'sizing tables', which are the anthropometric references that companies and pattern makers use to develop new garments and the range of sizes [3]. Each company has its own sizing tables, which are confidential and not shared within the clothing sector. Using the patterns of the basic block, a prototype of the garment is manufactured in order to check it with 'life models'. They are subjects
with body dimensions close to the values of the standard size. During the fitting tests, the prototype is manually adjusted, including references to develop the updated patterns. Depending on the garment complexity, these manual trials have to be repeated between two and four times to achieve the definitive patterns, which is a significant economic burden for companies [4]. In addition, the body dimensions of the 'sizing tables' used as a reference by pattern makers are based on old data refined after many years of practice. Therefore, newly developed garments are not fitted to the body proportions of the real population. There is a lack of standards in the clothing industry representing the body dimensions and shapes of the real population worldwide segmented by sizes. In recent years, emerging technologies for body scanning and users' fit problems, together with mass production of clothing, have promoted new sizing surveys to update the anthropometric data of the population in different countries [28]. So far, most of the research studies performed [1,10,36] focus on the analysis of these data for their application in apparel design [11,19,31,35]. The extended use of computer-aided design promotes the development of many tools that use information from anthropometric measurements to create virtual 3D models of humans that can be added to any virtual environment or workspace [16]. There are two types of 3D human body models: digital human models (DHM) and avatars. DHM are parametric body models, which simulate body proportions, postures, reach ranges and motions [9,37] based on anthropometric data and regressions. In the case of avatars, the main objective is to provide a realistic visual representation. They give priority to an aesthetic appearance, leading to unreal 3D body shapes. Besides, deformation methods that are not based on statistical distributions are used.
In this context, different authors have analyzed human shape variability using a landmark or dense surface-mesh representation of 3D human bodies. A set of 3D points are taken on the body surface, chosen either by their significance (anatomical landmarks) or by their positions (for instance, nodes on a projected 2D mesh). Their coordinates are organized as a vector of data where each component is taken as a variable. A principal component analysis (PCA) is applied to these data, retaining only the components which account for a certain amount of the total variability [6,29,39]. Some statistical summaries of these PCA-transformed data can be taken as descriptors that can efficiently represent the human body shape and size at different levels of detail; this is an area still under research [37]. In contrast to these approaches, in this work we present a new statistical methodology to define prototypes based on 3D point clouds corresponding to 3D scans. Using these 3D datasets provided by the scanner, we will obtain a 3D binary image, i.e., a 3D shape. The sample of 3D binary images can be considered as a random sample of a random compact set in R3 . From this point of view, we propose to average them using the concept of mean set. Additionally, confidence sets, which are regions containing the corresponding mean with a certain level of confidence, are also calculated. A random compact set is a natural probabilistic model for shapes. The formal definition and examples can be found in [26,34] and [13], among others. There is no single definition of mean set and different definitions can be found in the literature, like the Aumann mean [2,5], the Vorob’ev mean [33,38] and the Baddeley–Molchanov mean [7]. Of these definitions, the Baddeley–Molchanov mean is, perhaps, the most flexible because different results can be obtained by using different metrics in such a way that the distance is chosen by taking into account a specific application. 
Additionally, the Baddeley–Molchanov mean can be defined for general compact sets, whereas the Aumann mean is only suitable for convex and compact sets. Furthermore, the Vorob’ev mean is applicable to nonconvex sets but is not suitable for random sets with null volume.
In our application, we are dealing with non-convex sets, so the Aumann mean should not be applied. The Baddeley–Molchanov and Vorob'ev means can be used because our sets have a positive volume. Initially both definitions were applied, but the in-depth study was performed with the Baddeley–Molchanov mean. In the 2D case [15] the Vorob'ev mean provided very poor results. A preliminary evaluation for this 3D case shows a better performance than in the 2D case, but with slightly coarser shapes than the Baddeley–Molchanov approach. The Baddeley–Molchanov mean is therefore the best option for our problem. As is well known, the Vorob'ev mean is sensitive to misregistration or displacement of thin features, and this can easily happen in the problem that concerns us. In our opinion, this is why the Vorob'ev means obtained are coarser than the Baddeley–Molchanov means. The proposed method will be applied to the 3D anthropometric survey of the Spanish female population. The outline of the paper is as follows. The definition of random compact set and the Baddeley–Molchanov mean are briefly reviewed in Section 2. Confidence sets for the mean sets are discussed in Section 3. Section 4 contains the results of applying our methodology to the anthropometric database of Spanish women. The paper ends with some conclusions and further work in Section 5.
2. Mean sets

Let $\mathbb{R}^3$ be the 3D Euclidean space and let $\mathcal{K}$ be the collection of all non-empty compact subsets of $\mathbb{R}^3$. A random compact set, $V$, is defined as a measurable function from a probability space $(Y, S, P)$ into $(\mathcal{K}, \mathcal{B}(\mathcal{K}))$, where $\mathcal{B}(\mathcal{K})$ is the Borel σ-algebra of $\mathcal{K}$ generated by the myopic topology. The myopic topology on $\mathcal{K}$ has the sub-base that consists of

$$\mathcal{K}^F = \{K \in \mathcal{K} : K \cap F = \emptyset\},\ F \in \mathcal{F}, \qquad \text{and} \qquad \mathcal{K}_G = \{K \in \mathcal{K} : K \cap G \neq \emptyset\},\ G \in \mathcal{G},$$

where $\mathcal{F}$ and $\mathcal{G}$ denote the families of all closed and open subsets of $\mathbb{R}^3$, respectively. The formal definition with theoretical properties and applications of this concept can be found in [13,26,34] and [27], among others.

From now on, let us denote by $V_i$ the shape corresponding to the $i$-th woman (in fact, her torso), with $i = 1, \ldots, n$, which will be considered as a realization of a random compact set $V$ in $\mathbb{R}^3$. Unlike the expectation of a real-valued random variable, whose definition is unique, random sets have different features, and particular definitions of expectation highlight particular features that are important in the chosen context (see [27]). That is why different definitions of the mean set of a random compact set can be found in the literature; three of them are particularly relevant: the Aumann mean [2,5], the Vorob'ev mean [33] and the Baddeley–Molchanov mean [7]. Each of these definitions of mean set is based on the average of a certain random function associated with the random set. The Aumann mean is based on the support function of the set, the Vorob'ev mean on the coverage function and the Baddeley–Molchanov mean on a distance function. Other recent definitions of mean set can be found in [32] and in [22].

2.1. Baddeley–Molchanov mean set

First, some basic notation will be introduced. Let $V$ be a random compact set on $\mathbb{R}^3$ and $\mathcal{K}$ the space of non-empty compact subsets
of $\mathbb{R}^3$. Let $d : \mathbb{R}^3 \times \mathcal{K} \to \mathbb{R}^+$ be a distance function. This distance function can be applied to each realization $V(w)$, $w \in Y$, of the random compact set and, as a result, $d(\cdot, V)$ is a random distance function. Let $D$ be a metric (or pseudo-metric) on the family of distance functions. For instance, the uniform metric between distance functions:

$$D(d(\cdot, K), d(\cdot, L)) = \sup_{x \in \mathbb{R}^3} |d(x, K) - d(x, L)|,$$

or the $L^2$ metric:

$$D(d(\cdot, K), d(\cdot, L)) = \left( \int_{\mathbb{R}^3} (d(x, K) - d(x, L))^2 \, dx \right)^{1/2}.$$

Finally, let $D_W$ be the restriction of $D$ to $W$, a certain compact set (window) [27]. We will assume that $d(x, V)$ is integrable for all $x$ and define the mean distance function $d^*(x) = E\,d(x, V)$. Let $V^*(t) = \{x \in W : d^*(x) \le t\}$ with $t \ge 0$ and let $d_t(x) = d(x, V^*(t))$. The Baddeley–Molchanov mean of $V$, $E_{BM} V$, is defined as the thresholded set $V^*(t_{opt})$, where the 'optimal' threshold $t_{opt}$ is chosen to minimize the $D_W$-distance between $d_t$ and the mean distance function of $V$, i.e.,

$$t_{opt} = \arg\min_t D_W(d_t, d^*).$$
This is the original definition of the Baddeley–Molchanov mean set. Given $\{V_i\}_{i=1,\ldots,n}$, a random sample of the random set $V$, i.e., a collection of independent random sets identically distributed as $V$, the empirical estimation of $E_{BM} V$, $\hat{E}_{BM} V$, is obtained from the empirical distance average defined as

$$\bar{d}_n(x) = \frac{1}{n} \sum_{i=1}^{n} d(x, V_i). \qquad (1)$$

Following the theoretical definition, this empirical distance function should be thresholded at various levels and the sample mean should be defined as the thresholded set at the 'optimal' threshold given by the metric $D_W$. Different plausible metrics were proposed in [7], where the $L^2$ distance was particularly suggested. However, in all the cases, the procedure is complex and computationally quite expensive. Alternatively, a different empirical approximation to the Baddeley–Molchanov mean set was proposed by Lewis et al. [24]. They suggested using another 'discrepancy criterion' to estimate the optimal threshold, $\hat{t}_{opt}$, as:

$$\hat{t}_{opt} = \arg\min_t \left| \frac{1}{n} \sum_{i=1}^{n} m(V_i) - m(\bar{V}_n(t)) \right|,$$

where $\bar{V}_n(t) = \{x \in W : \bar{d}_n(x) \le t\}$ and $m(V)$ denotes the volume of $V$. The estimated optimal threshold $\hat{t}_{opt}$ is then the value for which the volume of $\bar{V}_n(\hat{t}_{opt})$ is closest to the mean volume observed. This value can be easily determined from the histogram of $\bar{d}_n$. They call this procedure volume matching. We have to note that this approach is especially suitable in the context of the apparel industry. The human body is a three-dimensional object and we work with its volume because it provides 3D information. The advantage of 3D body representation in the apparel industry is addressed in many papers and books; see, e.g., [14]. We will use it in our application.

As mentioned previously, one of the main advantages of the Baddeley–Molchanov mean is that it depends on the chosen distance function, $d$. Thus, we can obtain different kinds of mean sets using the most appropriate distance function for each application. Baddeley and Molchanov [7] give a list of examples with different distance functions. We will use

$$d(x, V) = \inf\{\|x - y\| : y \in V\},$$

where $\|\cdot\|$ denotes the Euclidean distance in $\mathbb{R}^3$, i.e., we choose the metric distance function. We chose it because it is one of the simplest and most widely used and because of its relation with the Hausdorff metric (Eq. (4)). A discussion about conditions for choosing one metric or another can be found in Section 3 of [7].
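The volume-matching estimate described in Section 2.1 can be illustrated with a minimal sketch. It assumes that every shape is given as a boolean voxel array on a common grid, that the grid itself plays the role of the window W, and that volumes are measured in voxel counts. The function names are hypothetical, and the brute-force distance computation is only meant for small grids; a Euclidean distance transform would be the usual choice for real scans.

```python
import numpy as np

def distance_function(mask):
    """d(x, V): Euclidean distance from every voxel x of the grid to the set V.

    Brute force (all grid voxels against all voxels of the set), so only
    suitable for small illustrative grids.
    """
    grid = np.indices(mask.shape).reshape(mask.ndim, -1).T.astype(float)  # (G, dim)
    pts = np.argwhere(mask).astype(float)                                 # (P, dim)
    diff = grid[:, None, :] - pts[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1).reshape(mask.shape)

def bm_mean_volume_matching(masks):
    """Volume-matching approximation to the Baddeley-Molchanov mean.

    Returns the thresholded set {x : d_bar(x) <= t_opt} and t_opt, where
    t_opt makes the volume of the thresholded set as close as possible to
    the mean volume of the sample.
    """
    d_bar = np.mean([distance_function(m) for m in masks], axis=0)  # Eq. (1)
    target = np.mean([m.sum() for m in masks])                      # mean volume
    levels, counts = np.unique(d_bar, return_counts=True)
    volumes = np.cumsum(counts)          # volume of {d_bar <= t} for each level t
    t_opt = levels[np.argmin(np.abs(volumes - target))]
    return d_bar <= t_opt, t_opt
```

Reading the candidate thresholds from the sorted values of the averaged distance function plays the same role as the histogram of $\bar{d}_n$ mentioned in the text: the cumulative counts give the volume of each thresholded set directly.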
3. Confidence sets

As stated above, mean sets define prototypes that represent each class in the sizing systems. It is also of interest for the apparel industry to represent the variation of prototypes around this mean set by means of confidence regions or set intervals that generalize the confidence intervals widely used in statistics. Whereas the definition of a confidence interval is clear and unique in a Euclidean space, this is not the case in the space of closed sets, where different approaches to define confidence intervals can be introduced. In this section we introduce two different approaches for deriving confidence sets for the Baddeley–Molchanov mean of $V$, $E_{BM} V$. Both approaches give regions containing the corresponding mean with a required level of confidence. In the first (Subsection 3.1) we look for regions in the space $\mathcal{K}$ of the non-empty compact sets. This follows the approach used by González-Rodríguez et al. [20] for obtaining confidence regions for the mean of a fuzzy random variable. The second approach is based on the ideas proposed by Seri and Choirat [30] and Jankowski and Stanberry [23]. According to these papers, an upper confidence set $U$ is defined as a subset of $W$ verifying $P(E_{BM} V \subseteq U) \ge 1 - a$. This approach is considered in Subsection 3.2. To distinguish both approaches we will talk about "confidence regions" and "confidence sets", respectively.

3.1. First approach. Confidence regions

This first approach follows the ideas stated by González-Rodríguez et al. [20] for obtaining confidence regions for the mean of a fuzzy random variable. It is well known that, given a real-valued random variable $X$ with mean $\mu$ and finite variance, a $(1 - a) \times 100\%$ confidence interval for $\mu$ can be determined as $CI = [\bar{X} - d, \bar{X} + d]$, where $\bar{X}$ is the sample mean of a random sample of $n$ independent variables, $X_1, \ldots, X_n$, with the same distribution as $X$, and where $d = d(X_1, \ldots, X_n)$ is such that $P(\mu \in CI) = 1 - a$. Therefore, conventional confidence intervals for the mean $\mu$ can equivalently be seen as balls with respect to the Euclidean distance, centered at the sample mean $\bar{X}$ and with a suitable radius $d$ which can be computed by bootstrapping.

In order to deduce these balls for the Baddeley–Molchanov mean we have to find an appropriate metric on the space $\mathcal{K}$ of all the non-empty compact sets. The map $K \mapsto d(\cdot, K)$ embeds (in the topological sense) the family $\mathcal{K}$ into the space of distance functions. Using this natural embedding we can define distances between sets using the metric $D$, or its restriction $D_W$ [7]:

$$q(K, L) = D_W(d(\cdot, K), d(\cdot, L)). \qquad (2)$$

The mean distance function $\bar{d}_n$ is the Fréchet mean in the space of functions if the metric chosen is the $L^2$ metric. In this particular case the Baddeley–Molchanov mean could be regarded as something similar to an extrinsic mean, like the extrinsic mean on Riemannian manifold shape spaces defined by Bhattacharya and Patrangenaru [8]. As a result, the natural metric to define confidence regions on the space $\mathcal{K}$ would be:

$$q_2(K, L) = \left( \int_W (d(x, K) - d(x, L))^2 \, dx \right)^{1/2}. \qquad (3)$$
Our mean set is defined using the approximation to the Baddeley–Molchanov mean set proposed by Lewis et al. [24], which relies on volume matching rather than on $L^2$ distance matching. This version of the mean set therefore does not guarantee that its distance function is the closest (in the metric) to the mean distance function. Other metrics could be used, like the uniform metric, which yields the Hausdorff distance between compact sets:

$$q_H(K, L) = \sup_{x \in W} |d(x, K) - d(x, L)|. \qquad (4)$$

In this second case we have to note also that the myopic topology on the family $\mathcal{K}$ of non-empty compact sets is metrizable by the Hausdorff metric [27]. The previous definition is equivalent to

$$q_H(K, L) = \inf\{d > 0 : L \subset K \oplus B(0, d),\ K \subset L \oplus B(0, d)\}, \qquad (5)$$

where $U \oplus V = \bigcup_{v \in V} (U + v)$ and $B(0, d)$ is the ball centered at the origin with radius $d$. Both metrics will be used in our application.

Given the sample $\{V_1, \ldots, V_n\}$ and given $a \in (0, 1)$, the chosen significance level, the procedure to build the confidence region can be schematized as follows:

1. Let $\{V_1, \ldots, V_n\}$ be a random sample of the random set $V$. Let $\bar{V} = \hat{E}_{BM} V$ be the Baddeley–Molchanov mean estimated with this random sample.
2. Obtain $B$ bootstrap samples $\{V^{*b}_1, \ldots, V^{*b}_n\}$, $b = 1, \ldots, B$, from the original random sample $\{V_1, \ldots, V_n\}$. For each resample, compute its corresponding Baddeley–Molchanov mean, and let this be $\bar{V}^{*b}$.
3. Compute the distances between the sample mean and each bootstrap sample mean, i.e., calculate $d^*_b = q(\bar{V}^{*b}, \bar{V})$ for $b = 1, \ldots, B$.
4. Choose $d$ as a $(1 - a)$ quantile of the sample $d^*_1, \ldots, d^*_B$.

As said previously, both metrics $q_2$ and $q_H$ (Eqs. (3) and (4)) will be used. Although the use of the Hausdorff metric is less natural, it has the advantage that it allows the confidence ball to be shown graphically. The confidence ball with level of confidence $1 - a$, $CB_{1-a}$, with respect to the Hausdorff metric is defined as

$$CB_{1-a} = \{A \in \mathcal{K} : q_H(A, \bar{V}) \le d\}. \qquad (6)$$

By taking into account Eq. (5), this confidence region is given by

$$CB_{1-a} = \{A \in \mathcal{K} : \bar{V} \ominus B(0, d) \subset cl_{B(0,d)}(A),\ A \subset \bar{V} \oplus B(0, d)\}, \qquad (7)$$

where $cl_B(A)$ denotes the closing of $A$ by a structuring element $B$, defined as $cl_B(A) = (A \oplus B) \ominus B$, and $A \ominus B = \bigcap_{b \in B} (A - b)$ is the erosion of $A$ by $B$. A similar explicit expression for the confidence region is not available when using the $L^2$ metric; however, we point out that the Hausdorff metric has the disadvantage that the "upper limit" of this confidence region is given by a dilation of the average set and, as a result, it does not provide any indication of local variability.

In the case of the Hausdorff distance the upper limit will be illustrated using the dilation $\bar{V} \oplus B(0, d)$ just given. The lower limit could be approximated by the set $\bar{V} \ominus B(0, d)$. Note that the condition does not involve the original set $A$ but its closing. Therefore, the lower limit given by $\bar{V} \ominus B(0, d)$ is conservative but convenient. In the case of the $L^2$ metric only an approximated upper limit will be illustrated. For this illustration we can use the union of all sets in our data set that are included in the confidence region. Because of property D6 in [7], $d(x, A \cup B) = \min(d(x, A), d(x, B))$, and as a result the union of sets included in the confidence region will also be included in it.

3.2. Second approach. Confidence sets

As was stated before, in this second approach an upper confidence set $U$ for the Baddeley–Molchanov mean set is defined as a subset of $W$ verifying:

$$P(E_{BM} V \subseteq U) \ge 1 - a.$$

Let us see now how we can obtain it. As introduced in Section 2, let $\{V_i\}_{i=1,\ldots,n}$ be a random sample distributed as $V$. Let $d : \mathbb{R}^3 \times \mathcal{K} \to \mathbb{R}^+$ be a distance function, with $d^*(x) = E\,d(x, V)$ as the mean distance function and $\bar{d}_n(x) = \frac{1}{n} \sum_{i=1}^{n} d(x, V_i)$ as the empirical mean distance function. Note that $\{d(x, V_i)\}_{i=1,\ldots,n}$ are independent and identically distributed random processes, and by the central limit theorem:

$$\sqrt{n}\,(\bar{d}_n(x) - d^*(x)) \xrightarrow{L} Z(x) \quad \forall x \in W, \qquad (8)$$

where $Z(x)$ is a centered Gaussian random field with covariance $cov(Z(x), Z(y)) = cov(d(x, V), d(y, V))$. In practice, the distance function is evaluated on a discrete grid of points $\{x_j \in W\}_{j=1,\ldots,p}$. Then, as the supremum is a continuous function on $\mathbb{R}^p$, by the continuous mapping theorem [25],

$$\sup_{j=1,\ldots,p} \sqrt{n}\,(\bar{d}_n(x_j) - d^*(x_j)) \xrightarrow{L} \sup_{j=1,\ldots,p} Z(x_j) := Z^{max}, \qquad (9)$$

where $\xrightarrow{L}$ denotes, as usual, weak convergence (convergence in law). Following the ideas stated in [30], let $q$ be the $(1 - a)$-quantile of the distribution of $Z^{max}$ as defined in Eq. (9); then

$$U = \left\{ x \in W : \bar{d}_n(x) - \frac{q}{\sqrt{n}} \le \hat{t}_{opt} \right\}, \qquad (10)$$

with $\hat{t}_{opt}$ defined as in Section 2.1, is an approximate $(1 - a) \times 100\%$ upper confidence set for $E_{BM} V$, i.e., $P(E_{BM} V \subseteq U) \ge 1 - a$.

We will approximate $q$ by the value $k_p$ such that $1 - a = P(Z^{max}_p \le k_p)$, where $Z^{max}_p$ is the maximum of $Z$ over the $p$ grid points. Let us see the whole procedure.

• Obtain a subset of points $\{x_j \in W\}_{j=1,\ldots,p}$.
• For each $j \in \{1, \ldots, p\}$, obtain a $p$-point discretized version of the empirical mean distance function, $\bar{d}(x_j) = \frac{1}{n} \sum_{i=1}^{n} d(x_j, V_i)$.
• The mean and the variance of these functions are calculated and used to derive the empirical distribution of $\hat{Z}^{max}_p$.
• The cumulative distribution function of $\hat{Z}^{max}_p$ is approximated by simulation from the obtained empirical distribution. Approximate $q$ by the $(1 - a)$-quantile of this cumulative distribution.

As was mentioned in Section 2.1, we use the metric distance function in our work, i.e., $d(x, V) = \inf\{\|x - y\| : y \in V\}$, where $\|\cdot\|$ denotes the Euclidean distance in $\mathbb{R}^3$.

3.3. Simulation study

The aim of this section is to assess the performance of the previously introduced confidence sets by means of a simulation study. Consistency conditions for the Baddeley–Molchanov mean are given in Section 7 of [7]. Because we are using a slightly modified estimator, consistency is not assured; it is given up in exchange for a more practical estimator. With respect to the first case (confidence regions), there are theoretical consistency results that justify bootstrap confidence intervals for sample means in Euclidean spaces, but these results are not available in our context, so simulation studies are, at this moment, the only way to assess its performance. In the second case (confidence sets) the simulation study tells us the goodness of the approximations in the theoretical proof of Section 3.2.

Due to the computational complexity of this method, a very simple random set has been chosen for the simulation study: $V = B(0, R) \subset \mathbb{R}^3$, the ball centered at the origin with radius $R$, where $R$ follows a uniform probability distribution in the interval $[a, b]$, $b > a > 0$. It is trivial to see that the theoretical mean distance function of $V$ is $d^*(x) = E(\|x\| - R)_+$, which means that $V^*(t)$ will be a ball for each $t \ge 0$. As a result the "volume matching" approximation to the mean set is $EV = B(0, r)$, where $r = \sqrt[3]{E(R^3)} = \sqrt[3]{(b + a)(b^2 + a^2)/4}$. A total of 500 original samples of size $n = 50$ and $n = 100$ of this random set $V$ are drawn, i.e., $500n$ simulations of a uniform distribution in the interval $[a, b]$: $S_i = \{V_{i1}, \ldots, V_{in}\}$, $i = 1, \ldots, 500$, and the corresponding Baddeley–Molchanov means $\bar{V}_1, \ldots, \bar{V}_{500}$ are obtained. Two pairs of values for $a$ and $b$ (the parameters of the distribution of $R$) will be tested: $(a, b) = (5, 40)$ and $(a, b) = (15, 40)$.
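For this random-ball setup, the bootstrap confidence region of Section 3.1 under the Hausdorff metric reduces to one-dimensional computations, which makes a quick check possible. The sketch below rests on stated assumptions: balls are treated in the continuum (no voxelization), the window is taken to be large enough, the volume-matched mean of concentric balls is the ball whose radius is the cube root of the mean of the cubed radii, and the Hausdorff distance between two concentric balls is the absolute difference of their radii. Function names, the seed, and the default of 200 replicates (instead of the 500 used in the study) are illustrative choices, not part of the original procedure.

```python
import math
import random

def volume_matching_radius(radii):
    """Radius of the volume-matched mean of balls B(0, R_i): (mean R_i^3)^(1/3)."""
    return (sum(r ** 3 for r in radii) / len(radii)) ** (1.0 / 3.0)

def true_radius(a, b):
    """r = (E R^3)^(1/3) for R uniform on [a, b], i.e. ((b+a)(b^2+a^2)/4)^(1/3)."""
    return ((b + a) * (b * b + a * a) / 4.0) ** (1.0 / 3.0)

def hausdorff_bootstrap_radius(radii, n_boot, level, rng):
    """Steps 1-4 of the bootstrap procedure, specialised to concentric balls."""
    r_hat = volume_matching_radius(radii)                 # step 1: sample mean set
    dists = []
    for _ in range(n_boot):                               # step 2: resample
        resample = rng.choices(radii, k=len(radii))
        # step 3: Hausdorff distance between concentric balls = |r1 - r2|
        dists.append(abs(volume_matching_radius(resample) - r_hat))
    dists.sort()
    k = min(n_boot, max(1, math.ceil(level * n_boot))) - 1
    return r_hat, dists[k]                                # step 4: empirical quantile

def coverage(a, b, n, n_rep=200, n_boot=100, level=0.95, seed=0):
    """Observed proportion of regions that contain the theoretical mean ball."""
    rng = random.Random(seed)
    r_true = true_radius(a, b)
    hits = 0
    for _ in range(n_rep):
        radii = [rng.uniform(a, b) for _ in range(n)]
        r_hat, delta = hausdorff_bootstrap_radius(radii, n_boot, level, rng)
        hits += abs(r_true - r_hat) <= delta
    return hits / n_rep
```

Calling `coverage(5, 40, 50)` returns an observed coverage proportion that can be compared with the nominal 95% level; the exact figure depends on the seed and on the number of replicates.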
In order to evaluate the actual performance of the bootstrap confidence regions, $B = 100$ bootstrap samples are taken from each sample $S_i$ and the corresponding bootstrap confidence regions at a 95% confidence level (nominal coverage) are constructed: $CB^1_{0.95}, \ldots, CB^{500}_{0.95}$ or, equivalently, the radii $d_1, \ldots, d_{500}$ are obtained. In order to evaluate the performance of the second kind of confidence sets, the corresponding confidence sets at a 95% confidence level, $U_1, \ldots, U_{500}$, are constructed following the procedure stated in Section 3.2. The observed coverage proportion of the theoretical mean set in such confidence regions is calculated as

$$p = \frac{card\{CB^i_{0.95} : EV \in CB^i_{0.95},\ i = 1, \ldots, 500\}}{500} \qquad (11)$$

for the first case and as

$$p = \frac{card\{U_i : EV \subset U_i,\ i = 1, \ldots, 500\}}{500} \qquad (12)$$

for the second. Two distances are used in the construction of the confidence regions, the $L^2$-induced metric and the Hausdorff metric of Eqs. (3) and (5), respectively. Table 1 summarizes the numerical outputs of our simulation study. The values in this table correspond to the observed coverage proportions of $EV$ (Eq. (11)) for different values of $n$, $a$ and $b$ and for both metrics: the $L^2$-induced metric and the Hausdorff metric. Additionally, the last column shows the observed coverage proportion of $EV$ for the obtained confidence sets (Eq. (12)). The results of the simulation study show that the first method achieves good observed coverage proportions. In the second case we can observe a clear over-coverage, because this kind of confidence set is conservative by definition ($P(E_{BM} V \subseteq U) \ge 1 - a$).

Table 1
Simulation results showing observed coverage proportions for a nominal coverage of 95%.

| Sample size | Parameters | Conf. region, L2 metric | Conf. region, Hausdorff metric | Conf. set |
|---|---|---|---|---|
| n = 50 | (a, b) = (5, 40) | 95.20% | 94.20% | 100% |
| n = 50 | (a, b) = (15, 40) | 93.60% | 93.40% | 100% |
| n = 100 | (a, b) = (5, 40) | 95.40% | 94.80% | 100% |
| n = 100 | (a, b) = (15, 40) | 94.60% | 93.60% | 100% |

4. Application to the apparel industry

In 2006 the Spanish Ministry of Health conducted a national 3D anthropometric survey of the female population. The aim of this survey was to help apparel designers by providing real and consistent measurements to help standardize the sizing system. The ultimate purpose was to increase the protection of consumers and to help in the treatment of eating disorders. A sample of 10,415 Spanish females ranging from 18 to 70 years old was randomly selected from the official Postcode Address File. All of them were measured using the Vitus Smart 3D body scanner from Human Solutions, a non-intrusive laser system which performs a sweep of the body. Several cameras capture images, and associated software provided by the scanner manufacturers detects the brightest points and uses them to make a triangulation that provides information about the 3D spatial location of about 200,000 points on the body surface. These points are grouped into triangles forming a mesh which is stored as a Stereo-Lithography format (.stl) file. These are the raw data from which we start.

From the sample of 10,415 Spanish females, 4786 were selected. Some cases were excluded for various reasons, namely 71 because they were pregnant, 97 who said they were breast feeding at the time, 368 who had undergone some type of cosmetic surgery (breast augmentation, liposuction, breast reduction, etc.), 3,601 younger than 20 years old and 445 older than 65. Also, 1,047 scans had to be rejected due to problems with the software during the scanning process or with the automatic procedure (to be described in Section 4.1) to isolate the torso.

4.1. From 3D points to a 3D image
From the scanner, the coordinates of a collection of points located on the surface of each scanned woman are obtained. To turn each cloud of points into a 3D binary image, we propose to run through the vertical axis of the body (z-axis), dividing it into thin slices. The points that belong to each slice are enclosed by their convex hull, which is then filled. Notice that this method is neither a global 3D convex hull, which would yield little more than a cylinder or cone trunk, nor a perfect fit since human body sections are not always convex. Nevertheless, any method to reconstruct the shape more accurately would have needed many ad-hoc adjustments and would have been too prone to
J. Domingo, et al. / Image and Vision Computing 52 (2016) 88–96
93
Table 2 Groups made to calculate mean shapes according to subject height and chest perimeter. The content of the table is the number of available cases in our database for each of the 18 groups. First column (G) is an arbitrary name assigned to each bust group. Second column (P) is the perimeter interval in cm. Height intervals are also in cm. Height (cm) G
P
Height1
Height2
Height3
Bust code
Bust (cm)
<162
[162, 174]
>174
Total
Bust1 Bust2 Bust3 Bust4 Bust5 Bust6 Total
[74,82] [82,90] [90,98] [98,106] [106,118] [118,143]
212 870 902 634 363 79 3060
85 585 542 257 134 37 1640
[1] 28 38 12 [5] [2] 86
298 1483 1482 903 502 118 4786
error with such a wide spectrum of cases. We think that the use of height-section convex hulls is enough for the purpose of this work.
In order to obtain accurate prototypes, sizing systems traditionally classify a specific population into homogeneous subgroups based on body dimensions [12], bust and height being the most commonly used. In this work, instead of the whole body, we use a region of interest that comprises the torso of the women; this region is isolated from each scan, and prototypes are built to fit it. To isolate the torso, parameters such as the height of the armpits from the floor, the height of the crotch, the lateral limits of the left and right hip and the height of these points, the height of the bust and the perimeter of the chest girth are calculated using projections. The region of interest is the union of two parallelepipeds, one for the lower torso, from the crotch up to the lower waist, and one from there to the neck. Both include the total depth (the X coordinate), but the lower one is wider (larger extension in the Y coordinate), spanning from left to right the most prominent points of the hip, whereas the upper box is limited in width by the left and right armpit Y coordinates. Thus, the arms are left outside but the whole torso is included. See Fig. 1.
Fig. 1. Boxes to isolate the volume of interest.
Once the 3D matrix of voxels is available, all the shapes to be compared or averaged must be aligned in a meaningful way. Absolute coordinates cannot be taken as a reliable guide, since the placement and posture of the person can change between subjects. The solution we have used is to translate each shape so that the origin of coordinates lies at its center of mass, and to rotate it so that its principal axes of inertia coincide with the canonical coordinate axes. Volumes are assumed to be homogeneous (we do not need the real center of mass or inertia, just those of the binary shape), and the inertia matrix is calculated and diagonalized. The diagonalization yields a change of basis that is taken as the 3D rotation to be applied to each voxel of the shape. Also, the minimal enclosing parallelepiped whose faces are parallel to the coordinate planes (enclosing box) is calculated after the rotation. This is used to find the minimal box that encloses all the shapes, which in turn is used to find the dimensions of the matrix that will hold the mean shape.
4.2. Results
The main aim of this work is to obtain prototypes for the different sizes, so, as an illustration, we have divided our data set into different groups, obtaining a prototype and confidence sets for each of them. The European Committee for Standardization [17] defines a system with 9 sizes for the bust circumference (bust ranges: 74–82, 82–90, 90–98, 98–106, 106–118, 118–131, 131–143, 143–155 and 155–167 cm) and another 9 sizes for the height (height intervals: 154–158, 158–162, 162–166, 166–170, 170–174, 174–178, 178–182, 182–186 and 186–190 cm). The sample set should therefore be segmented into the 81 groups resulting from the combination of both measurements. Nevertheless, as a large number of groups were completely empty or contained just a few samples, these were reorganized into a smaller number of groups, namely 18, resulting in greater sample sizes. Even so, three of the groups, with five or fewer samples each, had to be eliminated too. Table 2 shows the bust and height measurements of the proposed groups together with the number of women in each group (eliminated groups are enclosed in brackets). For a better understanding of the results, each bust range has been labeled bust1, bust2, and so on, giving a total of 6 ranges as detailed in the first column of Table 2. The heights have also been labeled: height1 for heights less than 162 cm, height2 for heights ranging from 162 to 174 cm and height3 for heights larger than or equal to 174 cm, as illustrated in the first row of Table 2. Following the notation introduced in Section 2, let (V1, . . . , Vn) be the sample of 3D binary images of the torsos of the women included in one of these groups, n being the group size. Each Vi will be considered as a realization of a random compact set V in R3. These 3D binary images were averaged using the methodology described in Section 2.
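The grouping step can be illustrated with a minimal sketch. The height cut points (162 and 174 cm) and the minimum group size (more than five samples) come from the text above; the bust cut points used here are the nine ranges of the standard [17], not the six merged ranges of Table 2, and the names `size_label` and `group_sample` are hypothetical. Treating each range as closed on the left is also an assumption.

```python
from bisect import bisect_right
from collections import Counter

# Height classes from the text: <162, [162, 174), >=174 cm.
HEIGHT_EDGES = [162.0, 174.0]
# Inner boundaries of the nine bust ranges of the standard (cm).
# Assumption: the six merged bust ranges of Table 2 would be passed instead.
EN_BUST_EDGES = [82.0, 90.0, 98.0, 106.0, 118.0, 131.0, 143.0, 155.0]


def size_label(value_cm, edges, prefix):
    """Return e.g. 'height2' for the interval that contains value_cm.

    Intervals are closed on the left; values beyond the outer ranges
    fall into the first or last class.
    """
    return f"{prefix}{bisect_right(edges, value_cm) + 1}"


def group_sample(measurements, bust_edges=EN_BUST_EDGES, min_size=6):
    """Count women per (bust, height) group and split off small groups.

    measurements: iterable of (bust_cm, height_cm) pairs.
    Returns (kept, dropped), two dicts mapping group label to count;
    groups with fewer than min_size samples are dropped.
    """
    counts = Counter(
        (size_label(b, bust_edges, "bust"), size_label(h, HEIGHT_EDGES, "height"))
        for b, h in measurements
    )
    kept = {g: n for g, n in counts.items() if n >= min_size}
    dropped = {g: n for g, n in counts.items() if n < min_size}
    return kept, dropped


if __name__ == "__main__":
    data = [(80.0, 160.0)] * 6 + [(95.0, 175.0)] * 2
    kept, dropped = group_sample(data)
    print("kept:", kept)
    print("dropped:", dropped)
```

Passing the edges as a parameter keeps the sketch independent of the particular merging of ranges chosen for Table 2.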
Once the mean set is obtained, we consider confidence sets for each 3D mean set using the methods described in Section 3.
4.3. Baddeley–Molchanov means: results
As stated before, the main goal of this work is to provide prototypes for different size ranges. The groups created to calculate the mean shape according to height and bust perimeter are summarized in Table 2. Visually, it can be observed that the prototypes represent the average shape of a woman quite well for most of the groups. As an illustration, Fig. 2 shows the mean shapes for six of the groups. The others can be seen in the Appendix (Additional Material) to this paper.
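As a rough illustration of how a distance-average mean in the sense of Baddeley and Molchanov [7] can be computed for binary shapes, the sketch below averages the distance functions of the sample and thresholds the result. It rests on several assumptions that Section 2 may settle differently: the Euclidean distance function, a squared L2 discrepancy summed over a finite grid, brute-force distance computation (practical only for tiny grids), and non-empty shapes. The function names are hypothetical.

```python
import math


def distance_function(shape, grid):
    """Euclidean distance from every grid point to the nearest point of shape."""
    return [min(math.dist(x, p) for p in shape) for x in grid]


def distance_average(sample, grid):
    """Distance-average mean set of a sample of non-empty point sets.

    1. Average the distance functions of the sample over the grid.
    2. For each threshold eps, form Y(eps) = {x : mean distance <= eps}.
    3. Keep the Y(eps) whose own distance function is closest, in squared
       L2 discrepancy over the grid, to the averaged distance function.
    """
    n = len(sample)
    fields = [distance_function(shape, grid) for shape in sample]
    mean_field = [sum(f[i] for f in fields) / n for i in range(len(grid))]

    best_set, best_cost = None, math.inf
    for eps in sorted(set(mean_field)):
        candidate = {x for x, m in zip(grid, mean_field) if m <= eps}
        cand_field = distance_function(candidate, grid)
        cost = sum((c - m) ** 2 for c, m in zip(cand_field, mean_field))
        if cost < best_cost:
            best_set, best_cost = candidate, cost
    return best_set


if __name__ == "__main__":
    # One-dimensional toy example: two shifted segments on an integer grid.
    grid = [(x,) for x in range(11)]
    a = {(3,), (4,), (5,)}
    b = {(5,), (6,), (7,)}
    print(sorted(distance_average([a, b], grid)))
```

In the toy example the two segments are shifted copies of each other, and the selected threshold yields the segment lying halfway between them, which is the behavior one expects from a mean shape.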
Fig. 2. BM mean sets of different groups. First row: bust1/height1 (left), bust2/height2 (center) and bust2/height3 (right). Second row: bust4/height1 (left), bust5/height2 (center) and bust6/height2 (right).
4.4. Confidence sets: results
Besides having a prototype of the body, it is also of interest for the apparel industry to provide a confidence set (or confidence region) in which the true mean shape (3D prototype) for a specific size group lies. Confidence sets provide a guideline to estimate the maximum deformation of the mean set the designer is expected to deal with; this is done by giving statistically justified bounds for that particular group.
Two different approaches have been considered in Section 3 to define confidence sets. Firstly, we present the results obtained for the methodology introduced in Section 3.1. Table 3 shows the bootstrap limits of the confidence regions, given as the radii of balls centered at the corresponding sample mean. Fig. 3 shows the confidence regions obtained by using the Hausdorff metric (Eq. (7)). This figure shows superimposed projections on each coordinate plane of the mean set and of the lower and upper limits of the confidence region obtained with a confidence of 0.95 for three of the 14 groups in which the number of available cases allowed a significant result. The same kind of representation for all groups is shown in the Appendix (Additional Material). The confidence region shown (upper limit and lower limit) provides variations around the mean for each given group (a given height/bust combination). The real mean for such a group will be contained between the upper and lower limits with a confidence of 0.95.
Table 3. d-values obtained with the bootstrap method, expressed in mm, for each group when working with the Hausdorff metric and the L2 metric. Dashes indicate insufficient sample to calculate the confidence set.
        d (mm), Hausdorff metric        d (mm), L2 metric
        Height1  Height2  Height3       Height1  Height2  Height3
Bust1   3.53     5.59     –             1.09     1.39     –
Bust2   2.50     2.50     9.35          0.80     0.88     1.86
Bust3   2.50     3.53     10.31         0.81     0.91     1.75
Bust4   3.53     5.00     –             0.92     1.14     –
Bust5   5.00     6.12     –             1.12     1.42     –
Bust6   10.10    14.41    –             1.76     2.09     –
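A radius such as the d-values of Table 3 can be obtained with a percentile bootstrap. The sketch below is generic and is not taken from Section 3.1: it assumes that the radius is the empirical quantile of the distance between each bootstrap mean and the sample mean, and it takes the averaging rule (`mean_fn`) and the metric (`dist_fn`) as hypothetical user-supplied functions, so it works equally for sets with the Hausdorff or L2 metric and for plain numbers.

```python
import math
import random


def bootstrap_radius(sample, mean_fn, dist_fn, n_boot=1000, level=0.95, seed=0):
    """Percentile-bootstrap radius of a confidence ball around the sample mean.

    sample  : list of observations (sets, arrays, numbers, ...)
    mean_fn : maps a list of observations to their mean (assumed given)
    dist_fn : metric between two means (assumed given)
    Returns (center, radius), where radius is the empirical `level`
    quantile of dist_fn(bootstrap mean, sample mean).
    """
    rng = random.Random(seed)
    center = mean_fn(sample)
    dists = []
    for _ in range(n_boot):
        # Resample with replacement, keeping the original sample size.
        resample = [rng.choice(sample) for _ in sample]
        dists.append(dist_fn(mean_fn(resample), center))
    dists.sort()
    # Index of the empirical quantile, clamped to the valid range.
    k = min(len(dists) - 1, max(0, math.ceil(level * len(dists)) - 1))
    return center, dists[k]


if __name__ == "__main__":
    # Toy usage with real numbers standing in for shapes.
    heights = [158.0, 160.5, 161.0, 159.2, 160.0, 157.5]
    center, radius = bootstrap_radius(
        heights,
        mean_fn=lambda xs: sum(xs) / len(xs),
        dist_fn=lambda a, b: abs(a - b),
    )
    print(center, radius)
```

Fixing the seed makes the radius reproducible, which is convenient when the same groups are re-analysed with different metrics.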
Fig. 3. Projections on the coordinate axes of the mean, the lower and the upper 0.95-confidence region built using the Hausdorff metric for the bust1/height2, bust5/height1 and bust6/height2 groups.
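For finite voxel sets, the Hausdorff distance underlying these regions can be evaluated directly from its standard definition, as in the following brute-force sketch. The `voxel_mm` parameter, which converts voxel units to millimetres, is a hypothetical addition, and the quadratic cost makes this suitable only for small sets.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b, voxel_mm=1.0):
    """Hausdorff distance between two non-empty finite point sets.

    Points are coordinate tuples in voxel units; voxel_mm (an assumed,
    isotropic voxel size) scales the result to millimetres.
    """
    return voxel_mm * max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    mean_shape = {(0, 0, 0), (1, 0, 0)}
    other_shape = {(0, 0, 0), (1, 0, 0), (1, 2, 0)}
    print(hausdorff(mean_shape, other_shape, voxel_mm=2.5))
```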
Fig. 4 illustrates the confidence regions obtained by using the L2 metric. For this illustration we can use the union of all sets in our data set that are included in the confidence region. The figure shows superimposed projections on each coordinate plane of the mean set and of the upper limit of the confidence region obtained with a confidence of 0.95 for the same groups. In this case the results are quite similar to those of the Hausdorff metric if we look at the separation between the surfaces of the mean and the upper confidence region. Nevertheless, with the L2 metric slight variations arise between different areas of the torso, which is an interesting point for apparel designers. The imperfections around the neck area are due to problems with the method for automatic torso extraction; they should be corrected by manual review of the cases if this area is of particular interest for clothes designers.
Fig. 4. Projections on the coordinate axes of the mean and the upper 0.95-confidence region built using the L2 metric for the bust1/height2, bust5/height1 and bust6/height2 groups.
5. Conclusions
Clothing fit is an important issue in the apparel industry. A suitable fit for a size range ensures that garment design will be more successful. To achieve this, manufacturers need to design prototypes that take into account their national body features and sizing systems, as well as some variation around the prototypes to properly guide the garment design. In this paper we have proposed a feasible approach based on mean sets to generate prototypes that better synthesize the average body shape for a range of sizes, which is what the designer is ultimately looking for. Furthermore, new approaches to calculating confidence sets have been introduced. The introduction of confidence sets for the mean set not only makes it possible to guide the design of the clothes for a specific size range, but also allows statistics learned from the population to be used to generate a population for virtual studies (computer-aided design and computer graphics). Such a population can be controlled by the user, either to statistically match an existing population, to increase sample size or to generate extreme cases (individuals not included in the confidence sets).
Acknowledgments
This paper has been partially supported by the following grants: TIN2009-14392-C02-01, TIN2009-14392-C02-02 and DPI2013-45742-R from the Spanish Ministry of Economy and Competitiveness with FEDER funds.
References
[1] S. Alemany, J.C. González, B. Nácher, C. Soriano, C. Arnáiz, H. Heras, Anthropometric survey of the Spanish female population aimed at the apparel industry, in: Proceedings of the 2010 Intl. Conference on 3D Body Scanning Technologies, Lugano, Switzerland, 2010, pp. 307–315. ISBN 978-3-033-02714-5.
[2] Z. Artstein, R.A. Vitale, A strong law of large numbers for random compact sets, Ann. Probab. 3 (1975) 879–882.
[3] S.P. Ashdown, L. Dunne, A study of automated custom fit: readiness of the technology for the apparel industry, Cloth. Text. Res. J. 24 (2) (2006) 121–136. http://ctr.sagepub.com/content/24/2/121.abstract. http://dx.doi.org/10.1177/0887302X0602400206.
[4] S.P. Ashdown, E.K. O’Connell, Comparison of test protocols for judging the fit of mature women’s apparel, Cloth. Text. Res. J. 24 (2) (2006) 137–146. http://ctr.sagepub.com/content/24/2/137.abstract. http://dx.doi.org/10.1177/0887302X0602400207.
[5] R.J. Aumann, Integrals of set-valued functions, J. Math. Anal. Appl. 12 (1965) 1–12.
[6] Z.B. Azouz, M. Rioux, C. Shu, R. Lepage, Characterizing human shape variation using 3D anthropometric data, Vis. Comput. 22 (2006) 302–314.
[7] A.J. Baddeley, I.S. Molchanov, Averaging of random sets based on their distance functions, J. Math. Imaging Vis. 8 (1998) 79–92.
[8] R. Bhattacharya, V. Patrangenaru, Large sample theory of intrinsic and extrinsic sample means on manifolds, Ann. Stat. 31 (1) (2003) 1–29.
[9] P. Blanchonette, Jack human modelling tool: a review, Technical Report AR-014-672, Air Operations Division, Commonwealth of Australia, 2010.
[10] S. Carrier, M.E. Faust, A proposal for a new size label to assist consumers in finding well-fitting women’s clothing, especially pants. An analysis of size USA female data and women’s ready-to-wear pants for North American companies, Text. Res. J. 79 (16) (2009) 1446–1458.
[11] C.C.L. Wang, Parameterization and parametric design of mannequins, Comput. Aided Des. 37 (2005) 83–98.
[12] M.J. Chung, H.F. Lin, M.J.J. Wang, The development of sizing systems for Taiwanese elementary- and high-school students, Int. J. Ind. Ergon. 37 (2007) 707–716.
[13] N.A.C. Cressie, Statistics for Spatial Data, revised ed., John Wiley and Sons, New York, 1993.
[14] A. De Raeve, M. De Smedt, H. Bossaer, Mass customization, business model for the future of fashion industry, in: 3rd Global Fashion International Conference, 2012, pp. 1–17.
[15] J. Domingo, M.V. Ibáñez, A. Simó, E. Dura, G. Ayala, S. Alemany, Modeling of female human body shapes for apparel design based on cross mean sets, Expert Syst. Appl. 41 (14) (2014) 6224–6234. ISSN 0957-4174. http://www.sciencedirect.com/science/article/pii/S0957417414002115. http://dx.doi.org/10.1016/j.eswa.2014.04.014.
[16] V.G. Duffy, Handbook of Digital Human Modeling: Research for Applied Ergonomics and Human Factors Engineering, Human Factors and Ergonomics, CRC Press, 2009. ISBN 9780805856460.
[17] European Committee for Standardization, European Standard EN 13402-2: Size System of Clothing. Primary and Secondary Dimensions, 2002.
[18] J. Fan, W. Yu, L. Hunter, Clothing Appearance and Fit: Science and Technology, Woodhead Publishing in Textiles, CRC Press, 2004.
[19] M. Gillies, D. Ballin, B.C. Csáji, Efficient clothing fitting from data, J. WSCG 12 (1–3) (2004) 129–136.
[20] G. González-Rodríguez, W. Trutschnig, A. Colubi, Confidence regions for the mean of a fuzzy random variable, in: J.P. Carvalho, D. Dubois, U. Kaymak, J.M. da Costa Sousa (Eds.), IFSA-EUSFLAT, Lisbon, Portugal, 2009, pp. 1433–1438.
[21] J.V. Griffey, S.P. Ashdown, Development of an automated process for the creation of a basic skirt block pattern from 3D body scan data, Cloth. Text. Res. J. 24 (2) (2006) 112–120. http://ctr.sagepub.com/content/24/2/112.abstract. http://dx.doi.org/10.1177/0887302X0602400205.
[22] H. Jankowski, L. Stanberry, Expectations of random sets and their boundaries using oriented distance functions, J. Math. Imaging Vis. 36 (2010) 291–303. ISSN 0924-9907. http://dx.doi.org/10.1007/s10851-009-0186-6.
[23] H.K. Jankowski, L.I. Stanberry, Confidence regions for means of random sets using oriented distance functions, Scand. J. Stat. 39 (2) (2012) 340–357.
[24] T. Lewis, R. Owens, A. Baddeley, Averaging feature maps, Pattern Recogn. 32 (1999) 1615–1630.
[25] H.B. Mann, A. Wald, On stochastic limit and order relationships, Ann. Math. Stat. 14 (3) (1943) 217–226.
[26] G. Matheron, Random Sets and Integral Geometry, Wiley, London, 1975.
[27] I. Molchanov, Expectations of random sets, in: J. Gani, C.C. Heyde, P. Jagers, T.G. Kurtz (Eds.), Theory of Random Sets, Probability and its Applications, Springer, 2005, pp. 145–194.
[28] A. Rissiek, R. Trieb, iSize: implementation of international anthropometric survey results for worldwide sizing and fit optimization in the apparel industry, in: Proceedings of the 2010 Intl. Conference on 3D Body Scanning Technologies, Lugano, Switzerland, 2010, pp. 269–281. ISBN 978-3-033-02714-5.
[29] H. Seo, N. Magnenat-Thalmann, An automatic modeling of human bodies from sizing parameters, in: Proceedings of I3D ’03, the 2003 Intl. Symposium on Interactive 3D Graphics, ACM, New York, NY, US, 2003, pp. 19–26.
[30] R. Seri, C. Choirat, Confidence sets for the Aumann mean of a random closed set, in: Computational Science and Its Applications ICCSA 2004, Lecture Notes in Computer Science, vol. 3045, Springer Verlag, 2004, pp. 298–307.
[31] K.P. Simmons, C. Istook, P. Devarajan, Female figure identification technique (FFIT) for apparel. Part I: describing the female shapes, J. Text. Apparel Technol. Manag. 4 (1) (2008).
[32] A. Simó, E. de Ves, G. Ayala, Resuming shapes with applications, J. Math. Imaging Vis. 20 (3) (2004) 209–222. http://dx.doi.org/10.1023/B:JMIV.0000024039.27561.b9.
[33] D. Stoyan, H. Stoyan, Fractals, Random Shapes and Point Fields. Methods of Geometrical Statistics, Wiley, 1994.
[34] D. Stoyan, W.S. Kendall, J. Mecke, Stochastic Geometry and its Applications, second ed., Wiley, Berlin, 1995.
[35] H. Sul, T.J. Kang, Regeneration of 3D body scan data using semi-implicit particle-based method, Int. J. Cloth. Sci. Technol. 22 (4) (2010) 248–271.
[36] UK National Sizing Survey, Sizemic: UK National Sizing Survey Information Document, Technical Report, UK government and UK retailers consortium, 2004. http://www.size.org/SizeUKInformationV8.pdf.
[37] P. van der Meulen, A. Seidl, Ramsis – the leading CAD tool for ergonomic analysis of vehicles, in: Proceedings of HCI (12)’07, 2007, pp. 1008–1017.
[38] O.Yu. Vorob’ev, Srednemernoje Modelirovanie (Mean-Measure Modelling), 1984.
[39] P. Xi, H. Guo, C. Shu, Human body shape prediction and analysis using predictive clustering tree, in: Proceedings of the 2011 IEEE International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission, May 2011, pp. 196–203.