A geometric approach to robotic unfolding of garments

Robotics and Autonomous Systems 75 (2016) 233–243


Dimitra Triantafyllou a,b,∗, Ioannis Mariolis a, Andreas Kargakos a, Sotiris Malassiotis a, Nikos Aspragathos b

a Centre of Research and Technology Hellas, 6th km Xarilaou Thermi, 57001, Thessaloniki, Greece
b Department of Mechanical Engineering and Aeronautics, University of Patras, 26504, Rio, Greece

Highlights

• A geometric approach for robotic unfolding of garments is proposed.
• Folds and hemline edges are detected and used as initial grasp points.
• Unfolding is completed through template matching with foldable templates.
• Garment classification is also achieved through template matching.
• Experiments on a variety of garments prove the method's robustness.

Article info

Article history: Received 12 March 2015; Received in revised form 6 September 2015; Accepted 25 September 2015; Available online 23 October 2015

Keywords: Garment manipulation; Shape analysis; Unfolding

Abstract

This work presents a novel approach to the autonomous unfolding of garments by means of a dual-arm robotic manipulator. The proposed approach is based on the observation that a garment can be brought to an approximately planar configuration if it is held by two points on its outline. This step facilitates the detection of another set of points that, when grasped, allow the garment to unfold naturally. A robust method was developed for successively detecting such boundary points on images of garments hanging from a single point. The manipulated garment is then laid on a flat surface and matched to a set of foldable templates using shape analysis techniques. Using the established correspondences with the template's landmark points, the garment is re-grasped by two points such that it unfolds naturally into a spread-out configuration. The adopted framework has been experimentally evaluated using a dual industrial manipulator and a variety of garments. The produced results indicate the feasibility and robustness of the proposed approach. © 2015 Elsevier B.V. All rights reserved.

1. Introduction

Despite recent advances in robotics research, the autonomous handling of deformable objects such as real garments remains very challenging. One of the most difficult tasks is bringing a garment into a spread-out configuration, i.e. unfolding. Even in the highly automated garment manufacturing industry, this is the only step where humans are employed. Only a few recent studies address autonomous garment unfolding. Most of them rely on heuristic techniques or ad hoc rules, or are restricted to

∗ Corresponding author at: Centre of Research and Technology Hellas, 6th km Xarilaou Thermi, 57001, Thessaloniki, Greece. E-mail addresses: [email protected] (D. Triantafyllou), [email protected] (I. Mariolis), [email protected] (A. Kargakos), [email protected] (S. Malassiotis), [email protected] (N. Aspragathos).
http://dx.doi.org/10.1016/j.robot.2015.09.025

simple garments such as towels. In summary, unfolding is achieved in the literature by means of one of the following heuristics:

(a) Selection of grasping points such that, when the garment is held by two hands, it is unfolded by gravity without the need for any intermediate rehandling. In the simple case of towel unfolding, neighbouring corner points are adequate [1], whereas for other garments more sophisticated techniques are required to infer the current configuration from images and thus select the appropriate grasping points [2–7].
(b) A two-step approach where the garment is initially held by two outline points, resulting in a flat configuration on a table, and is then picked up from a set of points to completely unfold it [8–10].
(c) Instead of grasping two outline points, iterative re-grasping of the lowest hanging point also results in a flat configuration, thus facilitating unfolding [11,12].
(d) A few techniques try to unfold the garment without lifting it from the table (i.e. spreading it), by pulling it [13] or by an origami-like unfolding process [14].


The first group of methods imitates humans performing the task [1,4,5,7]. Nevertheless, tracking the configuration of the garment (i.e. mapping points of the hanging garment to those of the unfolded one) is necessary to localize grasping points, which is a very challenging task. The third approach is the simplest and requires minimal vision, but has the limitation that a very large working space is needed for handling real garments [11,12]. The fourth approach requires that the garment is merely wrinkled [13] or folded but flat [14].

For the above reasons our approach uses the second heuristic [8–10], adopting the assumption that holding a garment by two points on its outline can bring it into an approximately planar configuration. Although the garment remains folded, its flat configuration facilitates modelling and manipulation in order to locate suitable re-grasping points for unfolding. The proposed approach, similarly to [8], breaks down the unfolding task into three subsequent subtasks: (a) rehandling, where a method for extracting outline grasping points of a hanging garment is applied, (b) modelling, where the folded garment is modelled using unfolded templates that are matched to its contour, and (c) shaping, where, based on the extracted model, suitable re-grasping points are selected for unfolding.

A robust method for extracting outline grasping points is proposed based on depth information. This is in contrast to [8,9], where outline detection is based on the detection of shadows in the intensity image, making the method sensitive to the environment. Moreover, instead of employing ad hoc rules as in [9] to analyse the resulting configuration, we propose a generic template matching technique that works with folded shapes, selecting the template presenting the best match to the half-folded garment. This allows us to easily include a large variety of garments by just introducing the corresponding template that approximates their shape outline. It also enables the selection of the re-grasping points for unfolding based on the entire model of the folded garment and not only on distinct features like those employed in [10], increasing confidence in the results. Compared with methods using the lowest-hanging-point heuristic [7,11,12], our approach allows efficient handling of real-sized garments by robots with limited workspaces.

This paper is organized as follows. Section 2 presents related work, while in Section 3 the proposed approach for robotic unfolding is analysed in detail. In Section 4 the experimental results are demonstrated and discussed, and the paper concludes with Section 5.

2. Related work

The first category of unfolding techniques mentioned in the previous section involves grasping the garment by two points that lead to natural unfolding of the clothing article by gravity, without any intermediate rehandling. An example of such work is described in [1], where Maitin-Shepard et al. address robotic unfolding of towels by extracting geometric cues from stereo images in order to detect corner points for grasping. In their work, border points are detected on the hanging cloth using curvature features. Then, candidate corner points are selected for grasping using a RANSAC-based algorithm that fits corners to the estimated border points. The selected corners are grasped in order to bring the manipulated towel to a spread-out configuration before folding.
Moreover, Yamazaki [2,3] simultaneously selects two grasp points that lead to the natural unfolding of a garment randomly placed on a table. The garment's shape is described using hem elements extracted from a range image, while the grasp points are selected based on global shape similarity with a list of training data. Other methods in the first category use pose or configuration inference techniques to achieve unfolding. Kita et al. [4] consider a mass–spring model to simulate a T-shirt's

silhouette when hanging from a point on its outline. The simulated models are fitted to the image of a real hanging garment, allowing a desired point to be localized and subsequently grasped. Li et al. [5] reconstruct a smooth 3D model of the garment and, using a feature extraction and matching scheme, match the real article to the most similar model in a database. Since its configuration is then known, the robot can bring the garment to an unfolded state by iteratively grasping it from predefined points according to its type [6]. The type of the garment, as in [4], is known a priori. Instead of fitting a 3D model, Doumanoglou et al. [7] use Random Decision Forests for garment recognition and Hough Forests for the estimation and grasping of two predefined points according to the garment's type. The garment's lowest hanging point is used as the initial grasping point in order to limit the number of possible configurations. Their approach relies on the acquisition of a huge training set of garments under various configurations, annotated with desirable grasping point positions (hanging corner points).

The second category of unfolding methods, which is the most similar to our approach, includes the work of Hamajima et al. [8,9] and Kaneko et al. [10]. In their work, the unfolding task is divided into three subtasks: rehandling, classifying, and shaping. During rehandling [8] the garment is grasped by two hemline points in order to facilitate classification and shaping. The detection of hemlines is based on the appearance of shadows in the vicinity of the garment's outline while hanging. As a contingency plan, in case no hemline is detected, the lowest hanging point is selected. In that study, three garment types (towel, shirt and pants) are considered, whereas the classification and shaping subtasks are not addressed. As reported in that work, the proposed approach is not robust to changes in the environmental conditions. The classification subtask is addressed in [9] by means of geometric features extracted from the rehandled garment and the application of ad hoc rules. The same three garment types are considered and a 90.6% average success rate is reported for classification. However, the presented classification approach uses manually defined models for the possible configurations of the three garment types after rehandling, and a different set of features is extracted for each model. Thus, their approach is difficult to generalize to different garment types or sensor setups. Kaneko et al. [10] adopt the rehandling and classifying approaches of [8] and [9] and propose a method for performing the shaping task by selecting appropriate grasping points for bringing the garment to an unfolded configuration. The method is based on the detection of corners and 'stick-out' regions extracted using a line-approximated image. The line-approximated image is estimated by differentiating three images of the garment acquired under different orientations of the illumination, in order to accentuate the edges of the overlapping parts. A 96% average success rate is reported for the manual unfolding of the rehandled garment using the estimated grasping points. However, this success rate is achieved after repeating the proposed procedure in case of failure, until the garment is finally unfolded.

The third category of unfolding techniques uses a heuristic which is based on the selection of the lowest hanging point and was introduced by Osawa et al. [11].
Namely, the garment is hung from an arbitrary point and the lowest hanging point is grasped. If the procedure is repeated, the garment ends up in a spread-out configuration held by two arms. The manipulated garments are then classified into the most similar type by evaluating the covariance between their images and a set of templates. The same heuristic is used by Cusumano-Towner et al. [12] for bringing children's clothes into a desired configuration. Their method uses a Hidden Markov Model in order to track the garment's configuration during handling, with the goal of recognizing the specific article by matching its outline with existing templates. The lowest-point heuristic becomes problematic with large


Fig. 1. An overview of the proposed approach on robotic unfolding of real garments.

garments, since the robots' limited workspace might not allow the necessary manipulations.

The fourth category attempts unfolding without lifting the garment from the table. In Willimon et al. [13], visual features such as corners, the peak region and the continuity of the cloth are used in order to interact with a towel and unfold it while it is lying on a table. The method has only been tested using 3D simulation software. Triantafyllou et al. [14] address the problem of unfolding a piece of fabric lying on a table using a single manipulator. The study considers only convex pieces of cloth, whereas fold detection is based on the extraction of edges and corners of the fabric. The studies presented above are restricted to simple articles of clothing such as towels.

Apart from the aforementioned methods that refer to the unfolding task, other relevant approaches have also been developed. In [15] Ramisa et al. introduce a new measure of wrinkledness computed from the distribution of normal directions over local neighbourhoods. The areas that maximize this measure constitute good grasping candidates for a garment lying on a table. In [16] Yamazaki et al. detect clothing articles in a domestic environment based on wrinkle features detected by analysing the response of Gabor filters with an SVM. In [17] a 3D descriptor called FINDDD is proposed for recognizing cloth wrinkles and detecting reliable grasping points on a garment located on a table. Furthermore, in [18] Ramisa et al. propose a method based on a Bag of Visual Words approach that detects predefined parts of garments which are suitable for grasping. Although these approaches refer to robotic handling of garments and can be used as components of the unfolding task, they do not describe an end-to-end unfolding procedure.

3. The unfolding procedure

The use of a dual manipulator allows holding a garment by two points on its outline, bringing it to a half-folded, approximately planar configuration. The first task is the detection of outline points, which is not trivial. The sequel is based on the assumption that the garment may be well approximated by a zero-thickness planar manifold that is quasi non-stretchable. This assumption is reasonably accurate for most everyday clothing articles, as also argued in [19].

3.1. Rehandling: extracting outline grasping points of a hanging garment

The rehandling subtask aims at detecting points on the outline of the garment (Fig. 1(b)). In particular, using depth images, two outline points are successively detected and grasped by the robot in order to bring the garment into a half-folded, approximately planar configuration (Fig. 1(c)) and thus facilitate the unfolding task. The images are acquired using a range sensor (Asus XtionPro) placed on the torso of the robot, approximately 1 m away from the item. We assume that the item has been autonomously grasped from a table by the robot and brought in front of it (this step is not described in this paper for brevity).
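For illustration, a minimal Python sketch of the kind of depth-based foreground segmentation this setup implies: the hanging garment is isolated by keeping only a band of depths around the expected item distance. The band limits and the millimetre units are assumptions made for this sketch; the paper does not specify the segmentation details.

import numpy as np

def garment_mask(depth_mm, near=600.0, far=1400.0):
    """Boolean foreground mask for a garment hanging roughly 1 m in front
    of the sensor: keep only pixels inside an assumed depth band."""
    depth_mm = np.asarray(depth_mm, dtype=float)
    valid = depth_mm > 0                      # zero depth marks missing data
    return valid & (depth_mm > near) & (depth_mm < far)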

Fig. 2. Parts of the outline of a garment in a random hanging configuration: (a) junctions formed at the lowest part of the folds, (b) edges with approximately horizontal orientation.

Observing a hanging piece of clothing, we noticed two different types of local features that may be robustly detected on its outline: (i) junctions formed at the lowest part of folds (Fig. 2(a)), (ii) edges with approximately horizontal orientation (Fig. 2(b)). This approach was motivated by the fact that vertical discontinuities are usually difficult to detect and grasp, since they are visually similar to sharp folds. In [1], Maitin-Shepard et al. use an elaborate approach to detect them which, however, is computationally expensive, requires rotating the garment very slowly, and was applied only to towels. In [8], shadow detection is used for the detection of vertical or horizontal hemlines; this, however, is very sensitive to the illumination of the scene and the reflectance of the item, causing frequent failures.

In the presented method, the first step for the detection of outline points is the extraction of all the edges and junctions from the range image of the hanging garment, which are then analysed to select those corresponding to the outline. In the next subsections we describe: (1) the detection of the edges and junctions of a hanging garment, (2) the extraction of outline points based on the detected features, (3) the graspability verification of these points, and (4) the procedure used to select the robot's grasping point on the garment's outline.

3.1.1. Edge and junction detection on the hanging garment

Edge detection is based on the analysis of depth discontinuities on the garment. In particular, the Canny detector [20] is used for the extraction of edge contours on the depth image, whereas contour simplification [21] is used for segmenting contours into line segments.

Fig. 3. Extraction of edges and junctions: (a) a garment in a random hanging configuration, (b) the extracted edges using the detailed representation, (c) the detected junctions, (d) the extracted edges on the hanging garment.

Fig. 4. Fold detection: (a) a fold, (b) the fold's edges, (c) the semi-planes defined by e1, (d) the layers formed around the fold.

Simplification is applied in two steps. In the first step a coarse simplification (one with a loose threshold) is used to segment the contour into approximately linear segments, i.e. to find inflection points on the contour. In the second step a more refined simplification is applied to the contour corresponding to each segment determined in the first step. The latter allows us to detect junctions more accurately, since edges are usually curved (Fig. 3(c)). Junctions are defined by the intersection of two or more edges in their detailed representation (Fig. 3(c)). Due to the presence of gaps in edge detection, some junctions may be missed. To avoid this, we first detect such gaps and then search within a certain range (up to 2 cm) around the ends of these open edges to determine whether a part of another edge is detected (Fig. 3(c)).
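The two-step simplification can be sketched with a plain Douglas–Peucker pass that returns vertex indices, run once with a loose threshold and then again within each coarse segment. The threshold values below are illustrative assumptions; the paper does not report the ones it uses.

import numpy as np

def _dp(pts, lo, hi, eps, keep):
    # Recursive Douglas-Peucker: keep the interior point farthest from the
    # chord pts[lo]-pts[hi] if it deviates from the chord by more than eps.
    if hi - lo < 2:
        return
    a, b = pts[lo], pts[hi]
    dx, dy = b - a
    norm = np.hypot(dx, dy) + 1e-12
    d = np.abs(dx * (pts[lo + 1:hi, 1] - a[1])
               - dy * (pts[lo + 1:hi, 0] - a[0])) / norm
    k = int(d.argmax())
    if d[k] > eps:
        k += lo + 1
        _dp(pts, lo, k, eps, keep)
        keep.add(k)
        _dp(pts, k, hi, eps, keep)

def simplify(pts, eps):
    # Indices of a polyline simplification of an open contour.
    keep = {0, len(pts) - 1}
    _dp(pts, 0, len(pts) - 1, eps, keep)
    return sorted(keep)

def two_step_simplify(contour, eps_coarse=15.0, eps_fine=3.0):
    # Coarse pass finds approximately linear segments (inflection points);
    # a finer pass is then applied independently inside each segment.
    pts = np.asarray(contour, dtype=float)
    coarse = simplify(pts, eps_coarse)
    refined = []
    for i, j in zip(coarse[:-1], coarse[1:]):
        refined.extend(i + k for k in simplify(pts[i:j + 1], eps_fine)[:-1])
    refined.append(coarse[-1])
    return refined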

3.1.2. Extraction of outline points

Once the edges and junctions of the garment are detected, outline points can be extracted based on the aforementioned types of features. In the case of horizontal edges the detection is straightforward. In the case of folds, however, we have to verify whether the junction is the result of a part of the garment folding on itself. Thus, each junction Pf must satisfy the following conditions (the layer-ordering test of condition (4) is sketched at the end of this subsection):

(1) Pf should be a junction of three edges e1, e2 and e3 (see Fig. 4(b));
(2) at least one of the edges (e1) heads upwards, towards the grasping point (in case more than one edge heads upwards, e1 is the one whose orientation is closest to the vertical axis), while at least one of the other edges heads downwards (e3);
(3) all three edges belong to the same semi-plane S1 defined on the depth image by e1 (see Fig. 4(c));
(4) three layers positioned at different depths are detected between the edges of the junction. The layer partly covering S2 (the semi-plane that does not include the three edges) should be the most distant layer (layer L1 in Fig. 4(d)); the lower layer of S1 should be the middle layer (layer L2 in Fig. 4(d)); and the upper layer of S1 should be the closest layer (layer L3 in Fig. 4(d)).

The orientation of junction edges is estimated using the coarse line segments detected in the previous stage, yielding more robust results than considering only small areas around the junctions. Contrariwise, for the estimation of the layers' ordering, the area under consideration is limited to a window around the junction (20 × 20 pixels), since the layers might exhibit fluctuations in their depth. Based on the existing edges, the area is divided into subareas; because of noise these may exceed the expected number of three, but rarely do. For the three largest subareas, the average depths and centroids are calculated and mapped to the three layers formed around the junction. With the above simple heuristic we are able to robustly detect folds even in the presence of noise in the depth images. Using depth data provides invariance to the illumination conditions and the material of the garment.
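A minimal sketch of conditions (2) and (4) follows; conditions (1) and (3) are assumed to have been checked when the junction and its semi-plane were extracted. The image convention (y grows downwards) and the 45° vertical tolerance are assumptions made for illustration.

import numpy as np

def is_fold(e_up, e_down, depth_s2, depth_low_s1, depth_up_s1,
            vertical_tol_deg=45.0):
    """Check conditions (2) and (4) for one junction.

    e_up, e_down : unit 2D direction vectors of the candidate upward and
                   downward edges, in image coordinates (y points down).
    depth_s2     : mean depth of the layer partly covering S2 (layer L1).
    depth_low_s1 : mean depth of the lower layer of S1 (layer L2).
    depth_up_s1  : mean depth of the upper layer of S1 (layer L3).
    """
    # condition (2): e_up heads upwards, close to the vertical axis,
    # while e_down heads downwards
    angle_from_vertical = np.degrees(np.arctan2(abs(e_up[0]), -e_up[1]))
    heads_up = e_up[1] < 0 and angle_from_vertical < vertical_tol_deg
    heads_down = e_down[1] > 0
    # condition (4): L1 is the most distant layer, L2 the middle one,
    # L3 the closest one
    layers_ordered = depth_s2 > depth_low_s1 > depth_up_s1
    return heads_up and heads_down and layers_ordered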


3.1.3. Graspability verification

A primary criterion for the selection of an outline point as a grasping point is its graspability, i.e. whether a collision-free grasp exists. A fast test is to check for the presence of a layer behind the considered grasping point that is very close to it (less than 3 cm). If no such layer exists, an exhaustive collision test between a virtual gripper and the acquired point cloud is performed (Fig. 5(b)). For efficiency, we test 4 equidistant points near the grasp candidate for several orientations using octrees [22] (Fig. 5(a)).

Fig. 5. Grasping procedure: (a) grasp candidates near a fold, (b) simulated grasp.

Fig. 6. Simplified gripper model.

In the case of a fold, the candidates lie at the lowest edge of the fold near the junction, while in the ''horizontal'' edge case the candidates are spread along it. If more than one point is graspable, the one closest to the fold is preferred, while for ''horizontal'' edges the choice is random. In order to accelerate the procedure, the grasping candidates are examined in parallel. During the collision test, various orientations of the gripper are tested. The search is exhaustive and the step used for each axis rotation is 18° (the origin of the coordinate system is the grasping candidate's point). Since there may be several collision-free orientations, we rank them based on the following criteria: (1) maximizing the minimum distance between the piece of fabric inside the gripper's fingers and the fingers, (2) maximizing the ratio between the 3D mesh points in the grasp area of the gripper (Fig. 6) and the rest of the area between the gripper's fingers. The first measure helps align the gripper, as much as possible, with the area of the garment near the grasping point. In this way, the number of cases where the gripper could displace the garment before completing the grasp is substantially decreased. The second measure prevents a larger area of the garment, besides the grasping area, from being trapped within the gripper's fingers. So, the cost function for the collision-free orientations is:

cost = 1 − 0.75 · md/(gap/2) − 0.25 · ga/fp    (1)

where md is the minimum distance between the points of the garment's 3D mesh inside the gripper's fingers and the closest finger, gap is the distance between the gripper's fingers when they are open, ga is the number of the mesh's points inside the grasp area of the fingers, and fp is the number of the mesh's points within the fingers. The weights of the two terms have been experimentally determined and indicate the relative importance of the first one, since decreasing its value could lead to grasp failure.
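Eq. (1) translates directly into a ranking function for the collision-free orientations. The sketch below assumes the four quantities have already been extracted from the simulated grasp; the finger-opening value is an illustrative assumption, not the one used in the paper.

def grasp_cost(md, gap, ga, fp):
    """Cost of a collision-free gripper orientation, Eq. (1); lower is
    better.  md: minimum fabric-to-finger distance, gap: finger opening,
    ga: mesh points inside the grasp area, fp: mesh points between the
    fingers (assumed non-zero, since the grasp encloses fabric)."""
    return 1.0 - 0.75 * md / (gap / 2.0) - 0.25 * ga / float(fp)

def best_orientation(candidates, gap=0.04):
    """Pick the lowest-cost orientation among (orientation, md, ga, fp)
    tuples; gap (metres) is a property of the gripper."""
    return min(candidates, key=lambda c: grasp_cost(c[1], gap, c[2], c[3]))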

3.1.4. Grasping point selection procedure

Although a single image of the hanging garment could be adequate for outline point detection, aggregating results over more viewpoints of the garment has demonstrated improved robustness, since erroneous folds can easily be discarded. In addition, more points that facilitate robotic grasping are examined and ranked based on the criteria introduced in the previous section. The proposed workflow is as follows. First, the garment is picked up from an arbitrary point (Fig. 1(a)). The manipulator's wrist is brought to a pose that allows rotation of the hanging garment around the vertical axis. The garment is then rotated up to 360° while several depth images are acquired (80–90 images), and the associated point clouds are reconstructed and merged into a single model used for collision detection. For each depth image the edges, and subsequently the junctions and folds, of the garment are detected (Fig. 1(b)) using the rules defined in the previous subsections. Each fold is then back-projected in 3D and, if it coincides with previously detected folds, a score corresponding to the number of images in which it has been detected is accumulated (this accumulation scheme is sketched at the end of this subsection). The rotation procedure terminates either when a full rotation is completed, or when a fold appears in more than 6 images. The second case occurs frequently and accelerates the unfolding procedure. If no grasping point has been selected by the completion of the garment's rotation, the fold with the highest score is selected, as long as it appears in more than 3 images. The candidate's graspability is then verified. If no good fold candidate is found, the hemline with the lowest deviation from the horizontal axis is selected. Again, if a graspable point is not found, the garment is rotated until a graspable hemline is found. The selected point is then grasped by the robot and the rotation procedure is iterated once more to detect a second point on the outline of the garment. In order to avoid grasping the garment near the first grasp point, this time the algorithm searches for candidate points in the lowest 2/3 of the hanging garment.

Using the detection results, the garment is grasped by two outline points, yielding a half-folded configuration, as shown in Fig. 1(c). The distance between the robot's two hands is set equal to the Euclidean distance between the two grasping points on the hanging garment (the first detected grasping point, from which the garment hangs during the repeated detection procedure, and the second grasping point detected during that procedure). Although the garment is not completely stretched between these two points, because of the curvature of the garment's edges while hanging, the distance is considered acceptable for the next part of the unfolding method, which is modelling. In case some deformation is still present after laying the garment on the table, it can be dealt with by the modelling algorithm, which is not sensitive to local deformations, since it is based on partial matching.
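The accumulation scheme above can be summarized as follows; the 3 cm coincidence radius used to decide that two back-projected folds are the same is an assumption, while the view-count thresholds follow the text.

import numpy as np

class FoldTracker:
    """Accumulate fold detections over the rotation views."""

    def __init__(self, radius=0.03, early_stop=6, min_views=3):
        self.radius = radius          # coincidence radius in metres (assumed)
        self.early_stop = early_stop  # stop rotating: fold seen in > 6 images
        self.min_views = min_views    # accept after full turn: > 3 images
        self.folds = []               # list of [xyz, score] entries

    def add(self, xyz):
        """Register one detection; return True if rotation may stop early."""
        xyz = np.asarray(xyz, dtype=float)
        for fold in self.folds:
            if np.linalg.norm(fold[0] - xyz) < self.radius:
                fold[1] += 1
                return fold[1] > self.early_stop
        self.folds.append([xyz, 1])
        return False

    def best(self):
        """After a full rotation: the highest-scoring fold, if it was seen
        in enough images; otherwise None (fall back to hemlines)."""
        good = [f for f in self.folds if f[1] > self.min_views]
        return max(good, key=lambda f: f[1])[0] if good else None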
3.2. Modelling: polygon model estimation

Once the garment has been brought to a half-folded configuration, it is laid on a flat surface (Fig. 1(d)) by means of dual-arm constrained motion planning, for further analysis (subtask b) and manipulation (subtask c). First, an RGB image of the folded garment is acquired and its contour is approximated by a polygon. The large variety of colour and texture of real garments makes the detection of junctions and edges on a hanging garment extremely challenging and error prone. Using a depth sensor instead overcomes these problems. However, when the garment is lying approximately flat on a table, the accuracy of the depth sensor is not sufficient


Fig. 7. Block diagram of the adopted approach performing shape matching between a folded garment and unfolded templates.

for discriminating between the surface of the table and the garment, unless only extremely thick fabrics are considered. Therefore, colour edge detection is employed for extracting the garment's contour from the table. Depth information is only used for separating the table from the rest of the background before extracting the contour.

Since the garment is folded on the working table, we propose to identify a folding axis that, when applied to a template (e.g. an unfolded T-shirt), results in a folded polygon resembling the shape of the real garment's boundary contour. Then, the garment's boundary contour can be matched to the template's folded polygon and a model can be fitted to the garment. However, since the garment type is unknown, apart from the folding axis, the appropriate template has to be identified among a set of available templates (T-shirt, trousers, towel, etc.). Details of the above may be found in [23]. What follows is a brief description of the adopted approach. At first, partial matching [24] between the folded garment and the unfolded templates is applied (Fig. 7(a)), in order to generate hypotheses on the location of the folding axis on each template (Fig. 7(b)). The generated hypotheses are validated according to their estimated position with respect to the templates, and a smaller set of valid hypotheses proceeds to testing. During testing, the hypotheses for each template are used to virtually fold it around the estimated axes (Fig. 7(c)). Then, the virtually folded contours are matched to the garment's contour using the inner-distance shape context [25] (Fig. 7(d)) and the hypothesis presenting the best match is selected (Fig. 7(e)). Thus, both the garment type and the axis location are estimated during matching. The matching results of the best hypothesis are also used for establishing correspondences between the garment's contour and the template.

In this work, we propose to employ these correspondences in order to fit a polygon model to the folded garment. The fitted polygon model is parametrized using a set S of landmark points lp that correspond to the corners of the original template's polygon, as described by (2):

S = [ lp_1(x)  lp_1(y)  ...  lp_m(x)  lp_m(y) ].    (2)

The original polygon model, denoted henceforth as the Unfolded Polygon Model (UPM), is based solely on outline points, presenting no interior structure. In that aspect it is similar to the Polygon Model introduced in [26] (used there for folding). Let the original UPM be denoted by S_0. Then, using the estimated folding axis location, S_0 can be augmented to model the virtually folded template, as described by S_folded = [ Φ | S_0 ], where Φ denotes the parameters specifying the line about which the polygon defined by S_0 is to be folded. The point correspondences estimated by the inner-distance shape context during matching are used to fit the augmented polygon model to the contour of the folded garment.

The method of [23] is based on the assumption that the folding axis is part of the folded garment's simplified contour; it successively examines each side of that contour and generates hypotheses about the axis location on the template. In our case, however, the garment has been grasped by two outline points. Thus, it is safe to assume for most garment types that the folding axis is defined by these two grasping points. Therefore, there is no need to examine all sides of the folded contour, which translates to fewer tests of erroneous hypotheses. Hence, by estimating a priori the location of the folding axis on the folded garment, the computational cost of the method is reduced, whereas the efficiency of the folding axis localization on the template is increased. Furthermore, since only contour information is employed, the model estimation is robust to illumination changes and can deal with the large colour and texture variability of real garments.
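The geometric core of "virtually folding" a template (Fig. 7(c)) is a reflection of the template vertices lying on one side of the hypothesized axis. A minimal sketch follows; hypothesis generation and the inner-distance matching of [24,25] are omitted.

import numpy as np

def fold_polygon(vertices, p0, p1):
    """Virtually fold a template polygon about the axis through p0 and p1:
    vertices on one side of the line are mirrored onto the other side."""
    v = np.asarray(vertices, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    p1 = np.asarray(p1, dtype=float)
    d = (p1 - p0) / np.linalg.norm(p1 - p0)   # unit direction of the axis
    n = np.array([-d[1], d[0]])               # unit normal to the axis
    s = (v - p0) @ n                          # signed distances to the axis
    folded = v.copy()
    over = s > 0                              # the side that folds over
    folded[over] -= 2.0 * s[over][:, None] * n  # reflect across the axis
    return folded

Matching each virtually folded contour to the observed contour and keeping the best hypothesis then yields both the garment type and the axis location, as described above.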


Fig. 8. Synthetic templates of unfolded garments. Red circles denote landmark points automatically extracted by simplification. GPPs are grasping point pairs manually selected on each template. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

3.3. Shaping: selection of re-grasping points for unfolding

In order to bring the folded garment to a spread-out configuration, suitable re-grasping points on its outline should be selected. Thus, on each unfolded template a set Sr of suitable pairs of grasping points is defined beforehand (e.g. for a T-shirt they may be the two points on the neckline). The pairs are sorted in preference order, i.e. preferring those that result in the most natural unfolded configuration of the garment. In Fig. 8 the unfolded templates of four garment types are illustrated. The landmark points are denoted by red dots on the templates. Next to each template, a list with the corresponding re-grasping point pairs is provided. The assignment of the left and right robot arms to each point is independent of the ordering of the grasping-point pairs. It depends solely on the position and orientation of the garment in relation to the robot, where the most important concern is to avoid collision of the arms. Thus, the table is separated into two semi-planes and simple planning is performed to decide which point is closest to the semi-plane of the left robot arm and which point is closest to the right-arm semi-plane.

For each pair in Sr, its location on the folded garment is estimated using the correspondences between the estimated UPM and the contour of the folded garment (see the sketch below). The first pair that is accessible for grasping is selected and the garment is re-grasped (Fig. 1(e)).1 The assessment of each pair's accessibility is based on the model's estimate of which garment areas overlap due to folding and are also kinematically reachable. Hence, this approach does not require examination of the area inside the garment's contour, which would be very challenging due to the large colour and texture variation exhibited by real garments.
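The selection logic reduces to walking the preference-ordered list Sr and testing accessibility. The sketch below uses hypothetical data structures (a landmark-to-contour map produced by the matching step and an accessibility predicate) purely for illustration.

def select_regrasp_pair(grasp_pairs, landmark_to_contour, accessible):
    """Return the first re-grasping point pair of Sr whose template
    landmarks map, through the matching correspondences, to contour
    points of the folded garment that can actually be grasped.

    grasp_pairs         : [(landmark_id, landmark_id), ...] in preference order
    landmark_to_contour : dict landmark_id -> (x, y) contour point
    accessible          : predicate on a contour point (overlap due to
                          folding, kinematic reachability)
    """
    for a, b in grasp_pairs:
        pa = landmark_to_contour.get(a)
        pb = landmark_to_contour.get(b)
        if pa is not None and pb is not None and accessible(pa) and accessible(pb):
            return pa, pb
    return None  # no admissible pair; the garment must be re-handled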

1 A specially designed gripper is used, allowing soft collision with the table [27].

Fig. 9. The robotic system. The RGB-D sensors used for the unfolding are highlighted with yellow boxes.

4. Results

All robotic manipulations were conducted using a dual-arm robot composed of two M1400 Yaskawa arms mounted on a rotating base (Fig. 9). The depth images of the hanging garments were acquired by an Asus XtionPro depth sensor placed between the arms at a fixed height, whereas the RGB images of the laid-down garments were acquired by a second XtionPro sensor mounted on one of the arms. Grasping was based on custom-made grippers [27]. We exhaustively evaluated both the individual steps of the algorithm and the full unfolding chain, from a heap to a finally unfolded item.

4.1. Rehandling evaluation

For the evaluation of the outline point extraction subtask, we ran 50 tests that each included the detection of 2 grasping points; in other words, 100 different configurations of various types of clothing were tested (5 long-sleeved shirts, 5 T-shirts, 4 shorts, 4 trousers, 3 towels and 2 skirts, with 4 or 5 configurations of each garment). In all the tests, the first grasping candidate was a fold, and only in cases where the folds were not graspable did the algorithm resort to selecting horizontal edges as outline candidates. The experimental results are summarized in Tables 1–3. The correct detection of the garments' folds and edges and their graspability are the two factors analysed in the tables.

Table 1
Folds' detection and graspability results.

                    Detection
Graspability        True positive    False positive
Graspable           75               1
Not graspable       22               2

Table 2
Horizontal edges' detection and graspability results.

                    Detection
Graspability        True positive    False positive
Graspable           22               2
Not graspable       9                0

Table 3
Overall results of the final grasping points.

            Correct    False
Folds       75         1
Edges       22         2
Overall     97         3

Table 1 presents the results for the detected folds. In particular, in 75 of the 100 detected folds, the detection was correct and the folds were graspable, which are the circumstances needed for a successful grasp. In 22 cases the folds were detected correctly but were not graspable, and in 2 cases the incorrectly detected folds were not graspable, in this way avoiding grasping the garment by a wrong point. In 1 case, categorized as a false graspable fold, the point where the fold was formed was correctly recognized, but since its lower edge was merged with another edge during edge detection, the final grasping point was incorrect. To sum up, in the 100 tested configurations, 76 folds were detected and grasped, and only 1 of them ended in an incorrect grasp.

In Table 2 the results for the edges' detection and graspability are reported. The total number of detected edges is 33 and not 24, as might be expected (100 configurations − 76 folds = 24 edges), since some of the edges were not graspable, leading to a further search for graspable edges. In 31 cases the edges were detected correctly, but 9 of them were not graspable. The 2 false cases, classified as edges of the garment, corresponded to edges formed by the big pockets of shorts and pants.

The overall results in Table 3 show that 97 out of 100 final grasp points were correct (a 97% success rate). The grasped folds were 75, while the edges were 22. Only 1 of the falsely detected points corresponded to a fold, while 2 edges were mistaken for edges of the garments' outline. As observed, a considerable number of detected points was rejected due to graspability issues. However, these issues occurred only with thick garments, such as trousers, thick long-sleeved shirts and certain shorts with big pockets. Contrariwise, very thin and soft garments might present problems in fold detection, since the formed depth differences are close to the range sensor's tolerance. Nevertheless, horizontal edges located on the outline of the hanging garment provide a solution in these cases.

In 77 tests, the folds were detected without completing a whole rotation of the garment around its axis. The early detection reduced the average time of fold detection from 25.4 to 13.2 s. Moreover, the average time for the graspability check was 5.6 s. The average time for a complete manipulation resulting in a half-folded configuration is 126.2 s. The minimum time needed during the experiments was 48 s, while the maximum was 185 s. In the first case, graspable folds were detected in the very first images of the rotated garment, while in the second case the robot completed the garment's rotation for both hands and the initially detected folds were not graspable. Nevertheless, most of the running time (about 70%) is consumed by the robot movement itself, which is slow for safety reasons and to avoid excessive swinging of the item.

4.2. Modelling and shaping evaluation

For the evaluation of the polygon model estimation and re-grasping subtask, 33 half-folded configurations of four types of clothing were tested (2 T-shirts, 2 shorts, 2 towels and 1 skirt,

with 4 or 5 configurations for each garment), using one reference template for each type. The templates were manually generated as simple polygons defining generic shapes resembling the contours of the garments of each type. The exact ratios between the sides of the template polygons are arbitrary; the only concern was not to have templates that exactly match any of the real garments used in the experiments. The garments were manually placed on the table, but they were not flattened, in order to make the test realistic. The synthetic templates used as reference, along with the manually defined key points used for re-grasping (GPPs), are presented in Fig. 8, whereas examples of the half-folded garments used for testing are illustrated in Fig. 10. Trousers and long-sleeved shirts were not considered, since grasping them by two outline points usually results in multiple folds. We plan to address this in future work by means of a flattening step. The evaluation concerns only the template matching results and the detection of the correct re-grasping points, not the robotic grasping.

During the experiments the type of the manipulated garment was considered unknown. Thus, for each half-folded garment all templates were examined, and the one producing the best matching results was selected for classifying the garment into the corresponding type and for constructing the polygon model. A summary of the experimental results is presented in Table 4.

Table 4
Folding axis localization results for 7 garments, using 33 half-folded configurations.

Type       CC        CRPD      NMM
Shorts     90.0%     75.0%     1
T-shirt    100.0%    80.0%     2
Skirt      80.0%     80.0%     0
Towel      100.0%    90.0%     0
Overall    93.9%     81.8%     3

In Table 4, CC denotes correct classification of the garment type. Out of the 33 tested cases only 2 misclassifications occurred, yielding a correct classification rate of about 94%. In the third column, CRPD denotes correct re-grasping point detection, which depends entirely on correct polygon model estimation. In our experiments, the polygon model was successfully estimated in every case in which the folding axis was accurately located on the template. In the fourth column, NMM denotes the number of mismatches that resulted in verifying erroneous hypotheses on the folding axis location, failing to detect admissible re-grasping points. All other failures correspond to cases where, even though the correct hypothesis was verified, low accuracy in axis localization was obtained, mainly due to severe deformations of the garment. Notice that the overall CRPD is about 82%, which is satisfactory considering the excessive garment deformation caused by hanging, which remains to some extent even after laying the garment on the table (Fig. 10).

As demonstrated by the experimental evaluation, the use of a single generic template for each garment type can provide adequate accuracy for successfully unfolding a large variety of folded garments, with about 94% correct classification and about 82% correct re-grasping point detection. It could be the case, however, that increasing the number of employed templates would decrease the correct classification rate a little, but still increase correct re-grasping point detection, which is our main goal, by increasing matching accuracy for the correctly classified cases. As the number of templates increases, classification of the garment type becomes more challenging, and assigning the category of the closest match may no longer be the optimal strategy. Perhaps, in that case, a majority voting approach should be preferred for estimating the garment type, using the closest match within the selected garment type to select the matching template.
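The majority-voting variant suggested above is not implemented in the paper; a hypothetical sketch of what it could look like, assuming each template match returns a (type, template, cost) triple with lower cost meaning a better match:

from collections import Counter

def classify_by_vote(matches, k=5):
    """Hypothetical majority-voting classifier over template matches:
    vote on the garment type among the k best matches, then keep the
    best-matching template of the winning type."""
    top = sorted(matches, key=lambda m: m[2])[:k]            # k best matches
    winner = Counter(m[0] for m in top).most_common(1)[0][0]
    return min((m for m in top if m[0] == winner), key=lambda m: m[2])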
In the case of T-shirts, no misclassifications occurred in the experimental evaluation.


Fig. 10. Example images of folded garments used in the experimental evaluation. The synthetic templates used for matching are presented in the first image of each row for comparison. Red circles on the example images denote the outline points that were used for grasping the garments before they were placed on the table. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

However, T-shirts presented the highest number of mismatches. This implies that they could benefit from the use of additional templates, which is left for investigation in future work.

4.3. Unfolding process evaluation

The full unfolding chain was tested by means of the robotic platform. During the experiments the robot grasped the garments, which were lying in a random configuration on a table, from an arbitrary point, proceeded to the rehandling, modelling and shaping procedures, as described in Sections 3.1–3.3 respectively, and finally grasped the points suggested by shaping and picked the spread-out garment up in the air (a video demonstrating the whole unfolding process can be found in geometric unfolding video). In order to evaluate the full unfolding chain, the experiment was conducted 50 times. In particular, 4 different types of garment were used, including 3 towels, 3 pairs of shorts, 3 T-shirts and 1 skirt. In 40 cases the robot managed to unfold the garment successfully. In 2 cases the rehandling method failed to suggest correct outline points; in 2 other cases the suggested points were correct, but the gripper grasped two layers of the hanging garment, thus ruining the input configuration for the modelling task; and in 6 cases the suggested re-grasping points were incorrect. Concerning the last 6 cases, 3 of the failures were caused by misclassification of the garment type during modelling, while the other 3 occurred due to a deviation between the real location of the folding axis on the garment and the axis calculated during the modelling procedure. Successful and unsuccessful examples of the unfolding procedure are depicted in Fig. 11. The mean time needed for the full unfolding chain is 5.1 min; again, most of the running time is consumed by the robot movement itself, which is slow for safety reasons.

In all conducted experiments the method's limitations were respected, i.e. trousers and long-sleeved shirts were excluded. In

addition, garments that ended up with two big folds, thus breaking our single-fold hypothesis, were not taken into account (Fig. 12(a)). The percentage of garments that resulted in such a configuration is 9% (5 out of the 55 tests), and it mostly concerns towels, due to their oblong shape. Furthermore, we have excluded cases where the garment is flipped over itself (Fig. 12(b)).

5. Conclusions

A new approach for the autonomous robotic unfolding of real garments has been presented. The reported results indicate that this approach can be successfully applied for unfolding a variety of garments handled by a dual manipulator. The proposed approach offers several advantages over existing techniques. Unlike some other approaches, it is not significantly restricted by the size of real garments. This advantage is achieved not only because of the robot's setup, but mainly because the method does not depend on particular grasping points, such as the lowest hanging point. Therefore, the selected grasping points are not restricted to a specified work area and can be anywhere on the garment. Furthermore, the use of RGB-D sensors makes the developed vision algorithms very robust both to the appearance of the handled garment and to the illumination of the scene.

The choice of a geometric methodology for the unfolding task has certain advantages over a machine learning approach. Avoiding big datasets, which demand time-consuming training, facilitates the generalization of the approach. Thus, a new type of garment can easily be added and handled by our method, meeting the demands of the garment manufacturing industry, which presents a large variety of garment shapes. In the latter case the exact geometry of the manipulated items may be considered known, thus facilitating recognition. Existing techniques using 3D simulation or machine learning are more cumbersome to extend in this way.


Fig. 11. Examples of garments at the end of shaping procedure. The suggested grasp points are marked with a red dot while the blue labels denote the part of the garment to be grasped. The first two rows include successful examples whereas the third row comprises three unsuccessful examples: (1) the type of garment is misclassified, (2) there is a big deviation between the real axis and the detected one, (3) the suggested points lead to two layer grasping thus they are rejected. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 12. Challenging garment configurations: (a) a towel with two big folds, (b) the garment flipped over itself, (c) a pair of trousers with two folds, (d) a shirt with two folds.


Nevertheless, our approach also presents certain limitations that we wish to address in the future. As with any geometric approach, and contrary to machine learning methods, there are assumptions that, when violated, cause the method to fail to unfold the garment. The assumption that a single fold will result after the initial handling is not always true, as we discovered in our experiments, since items such as trousers or shirts exhibit significant buckling. An extension of our approach to detect more folds is not trivial, but it is possible. Finally, pieces of clothing that cannot be approximated as planar objects, such as unbuttoned shirts, cannot be addressed by our method, since in such cases its basic assumption is violated.

Since the adopted model estimation method can in principle deal with multiple folds, further testing including trousers and shirts will be performed as future work. In that case, however, the assumption that the grasping points form a folding axis may not be valid. Thus, a strategy for analysing the hanging configuration and determining whether a single axis is formed should be developed. Moreover, some compensatory actions, such as using the edge of the working surface when placing the garment on it, should be considered for dealing with excessive deformations.

Acknowledgement

This work was supported by the EU FP7 project CloPeMa No 288553.

References

[1] J. Maitin-Shepard, M. Cusumano-Towner, J. Lei, P. Abbeel, Cloth grasp point detection based on multiple-view geometric cues with application to robotic towel folding, in: ICRA, 2010, pp. 2308–2315.
[2] K. Yamazaki, Grasping point selection on an item of crumpled clothing based on relational shape description, in: IROS, 2014, pp. 3123–3128.
[3] K. Yamazaki, A method of grasp point selection from an item of clothing using hem element relations, Adv. Robot. 29 (1) (2015).
[4] Y. Kita, T. Ueshiba, E.S. Neo, N. Kita, Clothes state recognition using 3D observed data, in: ICRA, 2009, pp. 480–485.
[5] Y. Li, Y. Wang, M. Case, S.-F. Chang, P.K. Allen, Real-time estimation of deformable objects using a volumetric approach, in: IROS, 2014, pp. 987–993.
[6] Y. Li, D. Xu, Y. Yue, Y. Wang, S.-F. Chang, E. Grinspun, P.K. Allen, Regrasping and unfolding of garments using predictive thin shell modelling, in: ICRA, 2015.
[7] A. Doumanoglou, A. Kargakos, T.-K. Kim, S. Malassiotis, Autonomous active recognition and unfolding of clothes using random decision forests and probabilistic planning, in: ICRA, 2014, pp. 987–993.
[8] K. Hamajima, M. Kakikura, Planning strategy for task of unfolding clothes, Robot. Auton. Syst. 32 (2000) 145–152.
[9] K. Hamajima, M. Kakikura, Planning strategy for task of unfolding clothes – classification of clothes, J. Robot. Mechatron. 12 (5) (2000) 577–584.
[10] M. Kaneko, M. Kakikura, Study on handling clothes – task planning of deformation for unfolding laundry, J. Robot. Mechatron. 15 (4) (2003).
[11] F. Osawa, H. Seki, Y. Kamiya, Unfolding of massive laundry and classification types, J. Adv. Comput. Intell. Intell. Inform. 11 (2007) 457–463.
[12] M. Cusumano-Towner, A. Singh, S. Miller, J.F. O'Brien, P. Abbeel, Bringing clothing into desired configurations with limited perception, in: ICRA, 2011, pp. 3893–3900.
[13] B. Willimon, S. Birchfield, I.D. Walker, Model for unfolding laundry using interactive perception, in: IROS, IEEE Press, 2011, pp. 4871–4876.
[14] D. Triantafyllou, N.A. Aspragathos, A vision system for the unfolding of highly non-rigid objects on a table by one manipulator, in: ICIRA, 2011, pp. 509–519.
[15] A. Ramisa, G. Alenya, F. Moreno-Noguer, C. Torras, Determining where to grasp cloth using depth information, Artif. Intell. Appl. 232 (2011) 199–207.
[16] K. Yamazaki, M. Inaba, A cloth detection method based on image wrinkle feature for daily assistive robots, Mach. Vis. Appl. (2009) 366–369.
[17] A. Ramisa, G. Alenya, F. Moreno-Noguer, C. Torras, FINDDD: A fast 3D descriptor to characterize textiles for robot manipulation, in: IROS, 2013, pp. 824–830.
[18] A. Ramisa, G. Alenya, F. Moreno-Noguer, C. Torras, Learning RGB-D descriptors of garment parts for informed robot grasping, Eng. Appl. Artif. Intell. 35 (2014) 246–258.
[19] S. Miller, J. van den Berg, M. Fritz, T. Darrell, K. Goldberg, P. Abbeel, A geometric approach to robotic laundry folding, Int. J. Robot. Res. 31 (2) (2012) 249–267.
[20] J. Canny, A computational approach to edge detection, IEEE Trans. Pattern Anal. Mach. Intell. 8 (6) (1986) 679–698.
[21] J. Hershberger, J. Snoeyink, An O(n log n) implementation of the Douglas–Peucker algorithm for line simplification, in: Proceedings of the 10th Annual Symposium on Computational Geometry, SCG'94, ACM, New York, 1994, pp. 383–384.
[22] http://docs.pointclouds.org/1.7.1/groupoctree.html
[23] I. Mariolis, S. Malassiotis, Matching folded garments to unfolded templates using robust shape analysis techniques, in: Lecture Notes in Computer Science, vol. 8048 (Part 2), 2013, pp. 193–200.
[24] H. Riemenschneider, M. Donoser, H. Bischof, Using partial edge contour matches for efficient object category localization, in: Proceedings of the 11th European Conference on Computer Vision: Part V, ECCV'10, Springer-Verlag, Berlin, Heidelberg, 2010, pp. 29–42.
[25] H. Ling, D.W. Jacobs, Shape classification using the inner-distance, IEEE Trans. Pattern Anal. Mach. Intell. 29 (2007) 286–299.
[26] S. Miller, M. Fritz, T. Darrell, P. Abbeel, Parametrized shape models for clothing, in: ICRA, 2011, pp. 4861–4868.
[27] T.-H.-L. Le, M. Jilich, A. Landini, M. Zoppi, D. Zlatanov, R. Molfino, On the development of a specialized flexible gripper for garment handling, J. Autom. Control Eng. 1 (3) (2013).

Dimitra Triantafyllou received her Diploma degree in Electrical and Computer Engineering from the Aristotle University of Thessaloniki in 2007 and her M.Sc. in Production Systems from the Technical University of Crete in 2009. She is currently a Ph.D. candidate in the Mechanical Engineering and Aeronautics Department of the University of Patras and a research assistant in the Information Technologies Institute at the Centre of Research and Technology Hellas.

Ioannis Mariolis received the Diploma degree and the Ph.D. degree in Electrical and Computer Engineering, in 2002 and 2009 respectively, both from the University of Patras (UoP), Greece. During 2010 and 2011 he worked as a post-doctoral research assistant in the Medical Physics Laboratory at the School of Medicine (UoP). Since February 2012 he has been working in the Information Technologies Institute at the Centre of Research and Technology Hellas as a post-doctoral research fellow. His main research interests include statistical signal processing, machine vision, medical image analysis and pattern recognition.


Andreas Kargakos graduated from the Electrical and Computer Engineering Department of the Aristotle University of Thessaloniki (A.U.TH.). In 2010–2011 he worked for the research project P.A.N.D.O.R.A. (Program for the Advancement of Non-Directed Operated Robotic Agents) of the Electrical and Computer Engineering Department (A.U.TH.). He has been a research assistant in the Information Technologies Institute of the Centre for Research and Technology Hellas since 2013, currently working on the European project CloPeMa (Clothes Perception and Manipulation). His research interests include robotics and artificial intelligence.

Dr. Sotiris Malassiotis received the Diploma and Ph.D. degrees in Electrical Engineering from the Aristotle University of Thessaloniki in 1993 and 1998, respectively. From 1994 to 1997 he conducted research in the Information Processing Laboratory of the Aristotle University of Thessaloniki. He is currently a senior researcher (associate professor equivalent) in the Information Technologies Institute, Thessaloniki, and leader of the cognitive systems and robotics group. He has participated in more than 15 European and national research projects. He is the author of 30 journal publications and more than 60 papers in international conferences and book chapters. His research interests include 3D image analysis, machine learning, complex and adaptive systems, and computer graphics.

Professor Nikos A. Aspragathos leads the Robotics Group in the Mechanical Engineering and Aeronautics Department, University of Patras, Greece. His main research interests are robotics, intelligent motion planning and control for static and mobile robots and for dexterous manipulation of rigid and non-rigid objects, knowledge-based design, industrial automation, and computer graphics. He is a reviewer for about 40 journals and more than 30 conferences, and a member of the editorial boards of the Mechatronics journal, Robotica and ISRN Robotics. He has published more than 60 papers in journals and more than 130 in conference proceedings. He has been, and is currently, involved in research projects funded by Greek and European Union sources.