Microscopic vision modeling method by direct mapping analysis for micro-gripping system with stereo light microscope


Micron 83 (2016) 93–109


Yuezong Wang∗, Zhizhong Zhao, Junshuai Wang
College of Mechanical Engineering and Applied Electronics Technology, Beijing University of Technology, Beijing 100124, China

Article history: Received 14 December 2015; received in revised form 22 January 2016; accepted 22 January 2016; available online 2 February 2016.

Keywords: Micro-gripping system; Stereo light microscope; Microscopic vision; Micromanipulation

Abstract

We present a novel, high-precision microscopic vision modeling method that can be used for 3D data reconstruction in a micro-gripping system with a stereo light microscope. The method consists of four parts: image distortion correction, disparity distortion correction, an initial vision model and a residual compensation model. First, a method of image distortion correction is proposed. The image data required by image distortion correction come from stereo images of a calibration sample. The geometric features of image distortions can be predicted through the shape deformation of lines constructed by grid points in the stereo images; linear and polynomial fitting methods are applied to correct the image distortions. Second, the shape deformation features of the disparity distribution are discussed, and a method of disparity distortion correction is proposed; a polynomial fitting method is applied to correct the disparity distortion. Third, a microscopic vision model is derived, which consists of two models, i.e., an initial vision model and a residual compensation model. We derive the initial vision model by analyzing the direct mapping relationship between object and image points; the residual compensation model is derived based on the residual analysis of the initial vision model. The results show that with a maximum reconstruction distance of 4.1 mm in the X direction, 2.9 mm in the Y direction and 2.25 mm in the Z direction, our model achieves a precision of 0.01 mm in the X and Y directions and 0.015 mm in the Z direction. A comparison of our model with the traditional pinhole camera model shows that the two models have similar reconstruction precision for X coordinates; however, the traditional pinhole camera model has lower precision for Y and Z coordinates than our model. The method proposed in this paper is very helpful for micro-gripping systems based on SLM microscopic vision. © 2016 Elsevier Ltd. All rights reserved.

1. Introduction

As precise optical instruments, stereo light microscopes (SLMs) are classified into two types: CMO (common main objective) and Greenough. The former consists of two optical sub-systems that share one common main objective lens. The optical axes of the sub-systems are approximately parallel; hence, the CMO-style SLM is more suitable for precision measurement. With two CCD cameras mounted on the image planes of an SLM, a microscopic binocular vision system can be set up, called an SLM microscopic vision system in this paper. SLM microscopic vision systems have been applied widely in microscopic fields, such as micro-surgery, micro-manipulation, micro-assembly and micro-measurement (Kim and Bovik, 1990; Windecker et al., 1997; Eckert and Grigat, 2001; Yuezong et al., 2003; Hasler et al., 2012; Maodong et al., 2014). A micro-gripping system by means of a stereo

∗ Corresponding author. Fax: +86 10 67396993. E-mail address: [email protected] (Y. Wang).
http://dx.doi.org/10.1016/j.micron.2016.01.005
0968-4328/© 2016 Elsevier Ltd. All rights reserved.

light microscope usually consists of three parts: an SLM microscopic vision system, a micro gripper and a driving component. It can automatically identify small objects with the microscopic vision system and handle them with the micro gripper. The micro gripper can approach small objects and grip them in the large workspace of the SLM. The SLM has many properties suitable for a micro-gripping system (Danuser, 1999), such as a large workspace, non-contact optical measurement and capturing images of 3D views in real time. However, the SLM also has disadvantages in applications. A microscopic vision system with an SLM is a typical binocular stereo vision system. Depth information extraction has to be accomplished indirectly by solving the stereo correspondence problem: the disparity between two homologous image features corresponding to one object feature has to be measured, and the image coordinates of homologous image features are input into a microscopic vision model to reconstruct the world coordinates of the corresponding object features. We can find all the matching points by a matching algorithm. Generally, there are many mismatched and erroneously matched points in the results of stereo matching, which lead to large reconstruction errors. Therefore,


stereo correspondence is still a challenge in the research of SLM microscopic vision. Two kinds of positions (i.e., the tip of the micro gripper and the gripping position) should be reconstructed, and they are usually represented by several image feature points. The extraction of the image coordinates of these feature points generally involves two steps. First, the micro gripper and the small objects are accurately identified by shape matching algorithms. Second, an affine transformation is used to calculate the image coordinates of the feature points. Data reconstruction in a micro-gripping system involves few objects, and can overcome the disadvantages of traditional matching algorithms to ensure reconstruction precision.

Microscopic vision modeling is one of the most basic studies on SLM micro-gripping systems. The traditional pinhole camera model (TPCM) has mature techniques, such as calibration and distortion correction methods. The TPCM is based on perspective projection and is widely applied in computer vision. It assumes that the beams reflected by an arbitrary object point must pass through the optical center of the objective lens, and an intersection between the beam and the image plane is regarded as an image point. This is an assumption of linear imaging, which differs from the real imaging process of an objective lens; hence, lens distortion correction is an important tool to restore the real imaging process of the lens. Our system is based on a CMO-style SLM. When the TPCM is directly applied to reconstruct world coordinates in our system, it leads to significant reconstruction errors, which mainly come from the following two sources. First, the imaging process of macro computer vision is a mapping from large 3D space objects to small 2D image objects; instead, the imaging process of microscopic vision based on a CMO-style SLM is a mapping from small 3D space objects to large 2D image objects. Second, the TPCM assumes that beams must pass through the optical center of the objective lens. However, a CMO-style SLM has three optical axes: the optical axis of the common main objective and the optical axes of the left and right optical sub-systems. Hence, the imaging process of SLM microscopic vision does not meet the assumptions of the TPCM. The TPCM is more suitable for describing the imaging process of macro computer vision and cannot be directly used as a microscopic vision model for the SLM.

Many studies on SLM microscopic vision have been proposed over the last decades. A binocular visual system based on an SLM proposed by Kim and Bovik (1990) was used to reconstruct the 3D shape of small objects (e.g., a piece of potato and vascular tissue). Kim and Bovik (1990) derived a simplified vision model without rigorous calibration and distortion correction. The model involves three parameters: the baseline distance, the parallactic angle and one global magnification for the stereo images; for the other parameters, manufacturer-specified values were introduced. Nevertheless, this research was significant for the development of SLM microscopic vision systems. Danuser (1999) derived a weak perspective model; his research is representative of CMO-style SLM calibration. He proposed a weak perspective term for the vision model to describe the perspective distortion, and different lens distortions (e.g., scale distortion, shear distortion, paraxial distortion and CMO distortion) were incorporated into his model. Lee et al. (2001) developed a micro-assembly system based on an SLM to assemble micro parts.
The system was used to identify objects in stereo images and reconstruct the world coordinates of object points. Sano et al. (1998) and Yamamoto and Sano (2002) developed a visual feedback microinjection system based on an SLM. Their systems were used to reconstruct the world coordinates of the needle and other targets; the reconstruction data were transferred to a driving module which controlled the motion of the needle. Larsson et al. (2004) used an SLM microscopic vision system to measure 3D displacement fields of arbitrarily shaped objects. The system can be applied to sub-mm sized objects of arbitrary shape and can measure small deformation fields. The expected in-plane errors are shown to be less than 0.1 μm, and the corresponding out-of-plane errors are approximately four times as much.

Recently, more novel studies on SLM microscopic vision have been presented. Wei et al. (2011) and Yongying et al. (2011) studied the microscopic stereo imaging principles of the SLM and proposed a method of micro stereo occlusion correction for microfluidic chip detection; the depth information of the occluded part can be recovered effectively by their method. Rogerio et al. (2012) reported a system based on an SLM for detecting unintentional collisions between surgical tools and the retina using the visual feedback provided by the SLM. Using stereo images, proximity between surgical tools and the retinal surface was detected when their relative stereo disparity was small; for this purpose, they developed a system composed of two modules. Guangjun et al. (2012, 2013) and Weixian et al. (2014) proposed a microscopic stereo vision model based on the adaptive positioning of the camera coordinate frame, and derived an affine calibration algorithm. Their work was used to measure ball screw backlash and gear back-kick backlash. This calibration algorithm with a free planar reference consists of three steps: first, derive the extrinsic parameters based on their invariable definition in the pinhole and affine models; second, calculate the intrinsic parameters through the homography matrix; third, refine all the model parameters by global optimization with the previous closed-form solutions as the initial values.

In summary, two kinds of research on SLM vision systems have been conducted. First, simplified vision models were derived, such as the models proposed by Kim and Bovik (1990) and Sano et al. (1998); these models did not require any complex parameter calibration, but have slightly lower reconstruction precision. Second, complicated vision models were derived, such as those of Danuser (1999) and Guangjun et al. (2012). The model proposed by Danuser (1999) achieved lateral and axial neighborhood accuracies of 0.1% and 1–2%, respectively, at a magnification of 1.6× and a standard measuring distance of 0.05 mm. The model proposed by Guangjun et al. (2012) achieved a neighborhood accuracy of 0.12% at a magnification of 3.024× and a standard measuring distance of 0.3125 mm. These models were applied in small-distance reconstruction and could generally achieve high precision on a small scale. Moreover, the complicated models generally used many parameters to fit lens distortion, and their calibration was too complicated for convenient use. For our SLM at a magnification of 0.7×, the workspace spans 4 mm × 3 mm × 2.6 mm (length × width × height). The reconstruction precision of a microscopic vision model is evaluated over this workspace, as higher precision throughout the whole workspace ensures better reconstruction performance.

In this paper, a micro-gripping system is developed for gripping small or microscopic objects based on SLM microscopic vision. Copper wires with a diameter of 100 μm are the research objects. 3D coordinate reconstruction of the micro gripper tip and the gripping position is a key issue in the study of the micro-gripping system. This paper focuses only on the vision modeling process. We present a novel, high-precision vision modeling method consisting of four parts: image distortion correction, disparity distortion correction, an initial vision model and a residual compensation model.
Reconstruction errors in an SLM micro-gripping system come from different error sources, such as the shortcomings of the vision model, lens distortion, the positioning accuracy of the driving component, etc. The residuals of the reconstruction data describe the effects of all the error sources, and a residual compensation model is used to fit the reconstruction errors. Our model has better adaptability because the residual compensation model can be applied to any kind of SLM and ensures high reconstruction precision.

This paper is organized as follows. In Section 2, the micro-gripping system is designed; the system structure and research emphases are introduced. In Sections 3 and 4, the methods of image alignment, image distortion correction and disparity distortion correction are derived. In Section 5, the initial vision model and the


Fig. 1. Side and top views of the micro-gripping scene with wires and micro gripper. (a) Side view. (b) Top view.

Fig. 2. Setup of the micro-gripping system with SLM microscopic vision. An image distortion correction method is applied to the initial left image Li and right image Ri. (A) SLM vision system with left camera A1 and right camera A2. (B) Driving component. (C) Mechanical component, which can rotate through 360° and is used to adjust the position of small objects. (D) Photo of the system. (E) Micro gripper.

residual compensation model are derived, and the model parameters are estimated. In Section 6, the results are discussed.

2. Micro-gripping system design based on SLM

Our system is designed for gripping copper wires with a diameter of 100 μm. Fig. 1 shows two images of the gripping scene: Fig. 1(a) and (b) shows the side view and top view, respectively. A small object carries several copper wires, which are regularly fixed on its top. The system drives a micro gripper to approach one of the copper wires and grip it at a specified position (i.e., the gripping position). Then the micro gripper moves toward the top of a pad and releases the copper wire. The same gripping process is carried out for the other copper wires. Before the micro gripper approaches the gripping position, the relative distance between the tip of the micro gripper and the gripping position must be calculated by a microscopic vision model. First, we choose two corresponding feature points in the stereo images located on the tip of the micro gripper: a shape matching algorithm is used to locate the micro gripper in the stereo images, and we then extract the image coordinates of the feature points by an affine transformation. Second, a shape matching algorithm is used to search the stereo images for the small objects; an ROI containing all the copper wires is calculated based on the position of the small objects in the images, and the image coordinates of the gripping position are given.

Third, a microscopic vision model outputs the world coordinates of the tip of the micro gripper and the gripping position based on the image coordinates of the feature points. Fourth, the micro-gripping system drives the micro gripper to approach the gripping position, grip the copper wire, drag it and release it on the top of the pad.

Fig. 2 is the schematic diagram of the micro-gripping system. The microscopic vision system consists of one SLM (A) and two cameras (A1 and A2). A 3-DOF driving component (B) is used to adjust the position of the micro gripper. A mechanism (C) can be rotated through 360° and is used to adjust the position of the copper wires: the copper wires are rotated by a fixed angle, such as 120°, and reach the position near the micro gripper in order, so that all copper wires can be gripped. The symbols Li and Ri represent the stereo images of the object space scene. In order to calculate the image coordinates of the tip of the micro gripper and of the gripping position, a shape matching algorithm is applied to images Li and Ri. The image coordinates are corrected by the image distortion and disparity distortion correction methods, and are input to a microscopic vision model which outputs the world coordinates (X0, Y0, Z0) of the tip of the micro gripper and (X1, Y1, Z1) of the gripping position. These world coordinates are used to calculate the relative distance between the tip and the gripping position, and they are transferred to the driving component which controls the movement of the micro gripper. Studies on the system shown in Fig. 2 involve data reconstruction, stereo matching algorithms, gripping strategy, etc. In this paper, we focus


Fig. 3. A calibration sample and its stereo image pair. (a) Index definition of grid points in calibration plate. Grid points are divided into M rows and N columns. The symbol A11 has index number [1,1] which means the first row and the first column. Grid point AM1 has index number [M,1] which means a grid point located in row M and column 1. And A1N has index number [1,N] which means a grid point located in row 1 and column N. (b) The left image of calibration plate. (c) The right image of calibration plate.

on microscopic vision modeling; the other topics will be studied in future work.

3. Image distortion correction

In order to analyze image distortion, we design a calibration sample, as shown in Fig. 3(a). The calibration sample consists of a matrix of M × N round patterns and is fabricated by a micromachining technique with a machining precision of ±0.25 μm. The center point of each round pattern is defined as a grid point. The spacing between two adjacent grid points is 0.2 mm, and the diameter of each round pattern is 0.1 mm. The results show that a 27 × 29 matrix of grid points can cover the whole field of view. We obtain the image coordinates of the grid points by Halcon software; the precision of image coordinate extraction is sub-pixel. Fig. 3(b) and (c) shows the stereo images of the calibration sample, in which adjacent grid points are connected by line segments. Each grid point corresponds to a serial number [i,j] which denotes its location in the plane of the calibration sample, as shown in Fig. 3(a). An M × N matrix consists of M × N grid points, where N (or M) is the number of grid points in each row (column). The symbol Aij is used to represent a grid point with row number i and column number j. If adjacent grid points are connected by line segments, a row of grid points and their connection lines are called a row connection unit (RCU); in the same way, a column of grid points and their connection lines are called a column connection unit (CCU). The calibration sample with M RCUs and N CCUs is placed in object space such that its RCUs (or CCUs) are as parallel as possible to the w1 (or w2) axis of the image coordinate frame. The coordinate vector of grid point [i,j] is denoted by w^{view,[i,j]} = (w1^{view,[i,j]}, w2^{view,[i,j]})^T, where the superscript view = l or r represents the left or right image, and [i,j] is the serial number of the grid point. The geometric deformation of the RCUs and CCUs cannot be observed clearly because the images have a large resolution. Hence, we carry out an affine transformation of the image coordinates of the grid points. The transformation of an RCU is given by

$$\mathbf{w}_{1,[i,j]}^{view} = \mathbf{w}^{view,[i,j]} - \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \mathbf{w}^{view,[i,1]} + \begin{pmatrix} 0 \\ \lambda i \end{pmatrix} \tag{1}$$

where w^{view}_{1,[i,j]} = (w1, w2a)^T denotes the transformed image coordinate vector with respect to w^{view,[i,j]}, and λ is a constant that sets the display spacing. The image coordinates of the CCUs are transformed as follows:

$$\mathbf{w}_{2,[i,j]}^{view} = \mathbf{w}^{view,[i,j]} - \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \mathbf{w}^{view,[1,j]} + \begin{pmatrix} \lambda j \\ 0 \end{pmatrix} \tag{2}$$

where w^{view}_{2,[i,j]} = (w1a, w2)^T denotes the transformed image coordinate vector with respect to w^{view,[i,j]}.

Fig. 4(a) and (b) shows the relationship between w2a and w1 in the left and right images. Each RCU can approximately be regarded as a straight line, and the interval distance between two adjacent RCUs increases gradually with increasing w1. Fig. 4(c) and (d) shows the relationship between w2 and w1a in the left and right images. Each CCU is not a straight line; however, the interval distance between two adjacent CCUs remains unchanged with increasing w2. The results show that image distortions do exist in the stereo images, but the geometric deformations of the RCUs and CCUs are different. The goal of image distortion correction is to correct the geometric shapes of the RCUs (or CCUs) to be as parallel as possible, with equal interval distances between adjacent RCUs (or CCUs).

Image distortion correction in the w1 direction focuses on adjusting the parallelism and interval distance of the RCUs. The parameter set {(k^{view}_{i,row}, b^{view}_{i,row}) | i ∈ [1, M]} represents a set of fitted row lines. If row lines 1 and i are parallel, their slopes are equal, i.e., k^{view}_{i,row} = k^{view}_{1,row}. However, they are not parallel because of SLM image distortion. Therefore, we must correct them to be parallel straight lines aligned with equal spacing. A new set {(k^{view}_{i,row,0}) | i ∈ [1, M]} is obtained based on the constraint b^{view}_{i,row} = 0.
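As an illustration, the following sketch applies the transformations of Eqs. (1) and (2) to an extracted grid. It is a minimal example assuming the grid coordinates are stored in a NumPy array of shape (M, N, 2); `lam` stands for the display spacing constant and is a hypothetical name.

```python
import numpy as np

# Sketch of the RCU/CCU visualization transforms of Eqs. (1)-(2).
# `pts` holds grid-point image coordinates, shape (M, N, 2), where
# pts[i, j] = (w1, w2) of grid point [i+1, j+1]; `lam` is a display
# spacing constant (hypothetical name, not from the paper).
def transform_rcu(pts, lam=30.0):
    """Eq. (1): remove each row's initial w2 offset and stack rows."""
    out = pts.astype(float).copy()
    for i in range(pts.shape[0]):
        out[i, :, 1] = pts[i, :, 1] - pts[i, 0, 1] + lam * (i + 1)
    return out  # rows now plot as (w1, w2a)

def transform_ccu(pts, lam=30.0):
    """Eq. (2): remove each column's initial w1 offset and stack columns."""
    out = pts.astype(float).copy()
    for j in range(pts.shape[1]):
        out[:, j, 0] = pts[:, j, 0] - pts[0, j, 0] + lam * (j + 1)
    return out  # columns now plot as (w1a, w2)
```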


Fig. 4. Distribution of transformed grid points based on Eqs. (1) and (2). (a) Rows of grid points in left image. (b) Rows of grid points in right image. (c) Columns of grid points in left image. (d) Columns of grid points in right image.

Fig. 5. An example of the spacing between virtual lines based on b^{view}_{i,row} = 0. l1–l5 represent five virtual lines passing through the origin of the image. If we select line l1 as a baseline and set w1 = v2, L2→1–L5→1 represent the spacings between line l1 and the other four lines.

This new set is a set of row straight lines passing through the origin of the image coordinate system. We use a new symbol w2a to describe the image coordinate w2 of grid points located on the straight lines passing through the origin. We give an example with five straight lines, as shown in Fig. 5. Straight lines l1–l5 correspond to the parameter set {(k^{view}_{i,row,0}) | i ∈ [1, 5]}. If row line l1 is regarded as a baseline, we calculate the spacing between line l1 and the other lines l2–l5. Note that we assign a special value v2 to the image coordinate w1, i.e., w1 = v2. We denote the space interval between lines l1 and l2 by the symbol L2→1, and similar expressions (i.e., L3→1–L5→1) are used for the space intervals between line l1 and lines l3, l4, l5.

The results show that the space intervals L2→1–L5→1 slowly increase or decrease in order. Furthermore, as illustrated in Fig. 5, the space intervals L2→1–L5→1 are related to the image coordinates w1 and w2: the space interval increases with increasing w1, and is positively or negatively related to w2. To make the row lines basically parallel to each other, we define a function F(w1^{view,[i,j]}) to correct these row lines. The function F(w1^{view,[i,j]}) compensates the image coordinate w2^{view,[i,j]} and makes the other row lines l2–l5 maximally approach line l1. The new image coordinates with respect to (w1^{view,[i,j]}, w2^{view,[i,j]}) are denoted by (w1^{view_1,[i,j]}, w2^{view_1,[i,j]}), which can be calculated by the following empirical formula:

$$\begin{pmatrix} w_1^{view_1,[i,j]} \\ w_2^{view_1,[i,j]} \end{pmatrix} = \begin{pmatrix} w_1^{view,[i,j]} \\ w_2^{view,[i,j]}\,[1 + F(w_1^{view,[i,j]})] \end{pmatrix} \tag{3}$$

where F(w1^{view,[i,j]}) is a general expression with respect to w1^{view,[i,j]}. Fig. 4(a) and (b) shows an obvious proportional relationship between F and w1^{view,[i,j]}, i.e., F(w1^{view,[i,j]}) = γ^{view}_{row} · w1^{view,[i,j]}.

Similar to image distortion correction in the w1 direction, image distortion correction in the w2 direction focuses on adjusting the parallelism and interval distance of the CCUs. However, CCUs are curved lines, and should first be modified to straight lines by a linear fitting method. The linear equation of one CCU is described by w1 = k·w2 + b, where the parameters k and b are given by the least squares method. The left and right images each contain N CCUs, and the parameters of the CCUs are denoted by k^{view}_{j,column} and b^{view}_{j,column}, which construct the set {(k^{view}_{j,column}, b^{view}_{j,column}) | j ∈ [1, N]}.
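The per-CCU line fit is a plain least-squares problem; the sketch below shows one way to obtain (k, b) for a single CCU, assuming its points are collected in an array (names are illustrative, not from the paper).

```python
import numpy as np

# Sketch of the CCU straightening step: each column connection unit is
# fit with a line w1 = k*w2 + b by least squares, giving the parameter
# set {(k_j, b_j)} used in Eq. (4). `ccu` is an (M, 2) array of (w1, w2).
def fit_ccu_line(ccu):
    k, b = np.polyfit(ccu[:, 1], ccu[:, 0], deg=1)  # w1 as a function of w2
    return k, b
```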


Fig. 6. Results of image distortion correction. (a) Distribution of each row of corrected and transformed grid points in left image. (b) Distribution of each row of corrected and transformed grid points in right image. (c) Distribution of each column of corrected and transformed grid points in left image. (d) Distribution of each column of corrected and transformed grid points in right image.

The linear fitting error of grid point [i,j] is then calculated as follows:

$$E_{column}^{[i,j]} = (k_{j,column}^{view}\, w_2^{view_1,[i,j]} + b_{j,column}^{view}) - w_1^{view_1,[i,j]} \tag{4}$$

Fig. 4(c) and (d) shows an obvious nonlinear relationship between E^{[i,j]}_{column} and w2^{view_1,[i,j]}. We apply the following second-order polynomial to fit this nonlinear relationship:

$$E_{column}^{[i,j]} = a_{column}^{view}\,(w_2^{view_1,[i,j]})^2 + b_{column}^{view}\, w_2^{view_1,[i,j]} + c_{column}^{view} \tag{5}$$

where a^{view}_{column}, b^{view}_{column} and c^{view}_{column} are parameters, which can be obtained by solving a system of M × N equations. The new image coordinates based on Eq. (5) are denoted by (w1^{view_2,[i,j]}, w2^{view_2,[i,j]}):

$$\begin{pmatrix} w_1^{view_2,[i,j]} \\ w_2^{view_2,[i,j]} \end{pmatrix} = \begin{pmatrix} w_1^{view_1,[i,j]} + a_{column}^{view}(w_2^{view_1,[i,j]})^2 + b_{column}^{view}\, w_2^{view_1,[i,j]} + c_{column}^{view} \\ w_2^{view_1,[i,j]} \end{pmatrix} \tag{6}$$

As mentioned above, eight parameters are applied to fit the image distortions. The symbol Θ is used to denote them, i.e., Θ = (γ^{view}_{row}, a^{view}_{column}, b^{view}_{column}, c^{view}_{column})^T with view = l, r. The parameter sequence Θ is estimated by nonlinear optimization methods; the results are listed in Table 1. 783 grid points of 27 RCUs and 29 CCUs are used in the parameter estimation experiments. The image distortions in Fig. 3(b) and (c) are corrected based on Θ.

Table 1. Results of parameter estimation used in image distortion correction.

| Parameter (left image) | Value | Parameter (right image) | Value |
| γ^l | 1.61 × 10⁻⁶ | γ^r | −9.96 × 10⁻⁶ |
| a^l | 1.73 × 10⁻⁶ | a^r | −1.54 × 10⁻⁶ |
| b^l | −3.12 × 10⁻³ | b^r | 2.78 × 10⁻³ |
| c^l | 0.94 | c^r | −0.85 |
| λ^l | 1.30 × 10⁻⁷ | λ^r | 4.63 × 10⁻⁷ |

Fig. 6 shows the results of image distortion correction: the geometric deformations of the RCUs and CCUs are corrected efficiently.
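To make the correction pipeline concrete, the sketch below applies Eqs. (3)–(6) to one view, assuming the fitted parameters of Table 1 are available; the function and argument names are illustrative, not from the paper.

```python
import numpy as np

# Sketch of the two-step image distortion correction of Eqs. (3)-(6)
# for one view. `w` has shape (..., 2) holding (w1, w2) coordinates;
# gamma_row and the column polynomial coefficients mirror Table 1.
def correct_image_distortion(w, gamma_row, a_col, b_col, c_col):
    w1, w2 = w[..., 0], w[..., 1]
    w2_c = w2 * (1.0 + gamma_row * w1)                    # Eq. (3), F(w1) = gamma*w1
    w1_c = w1 + (a_col * w2_c**2 + b_col * w2_c + c_col)  # Eqs. (5)-(6)
    return np.stack([w1_c, w2_c], axis=-1)

# Example with the left-image values of Table 1:
# corrected = correct_image_distortion(w_l, 1.61e-6, 1.73e-6, -3.12e-3, 0.94)
```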

4. Image alignment and disparity distortion correction

An important function of SLM microscopic vision is the estimation of depth (i.e., the Z coordinate in the world coordinate frame) based on disparity. Disparity is denoted by the symbol D in this paper.


The image coordinate vectors of image points in the left and right image coordinate frames are denoted by (w1^l, w2^l)^T and (w1^r, w2^r)^T, respectively. An object point P corresponds to two matching image points: p^l(w1^l, w2^l) in the left image and p^r(w1^r, w2^r) in the right image. The disparity between points p^l and p^r is defined as D = w1^r − w1^l. We create a disparity coordinate system defined by the vertical axis D and the lateral axes w1 and w2, where w1 and w2 are the axes of the left image coordinate frame.

Research on image alignment and disparity distortion correction needs a large amount of image data; the image capturing solution is given in Section 5.2. A small plane calibration sample with high-precision grids is designed to output image data, which consists of 49 round patterns arranged as a 7 × 7 matrix. The serial number of a grid point is denoted by the 2D serial number [i,j] (i.e., row number i and column number j). The plane calibration sample is fixed on the driving component and moves along the X, Y and Z directions, respectively. It stops at several specified locations along the X, Y and Z directions, called image capturing locations. Stereo images of the calibration sample are captured at these locations, and they constitute the X, Y and Z-axis stereo image sequences, respectively. Each grid point in the course of moving traces two movement paths, located in object and image space, respectively.

Experiments are carried out to observe the relationship between disparity and the w1 (w2) coordinate. The X and Y-axis stereo image sequences are captured with an interval distance of 0.5 mm, and five grid points in each sequence are observed. Fig. 7(a) and (b) shows the results with respect to the X and Y-axis stereo image sequences. Fig. 7(a) shows an obvious quadratic parabola relationship between disparity and the w1 coordinate; Fig. 7(b) shows an approximately linear relationship between disparity and the w2 coordinate. In order to derive a microscopic vision model with better linearity, we assume that the disparity surface corresponding to an object plane is still a plane. However, Fig. 7 shows that the practical disparity surface of an object plane is a curved surface. Disparity distortion is thus proved to occur, and it influences the linearity of the microscopic vision model; hence, it should be corrected. The results show that disparity distortion is largely determined by the w1 coordinate, so our method of disparity distortion correction only focuses on the quadratic parabola relationship between disparity and w1.

The estimation of disparity is based on the epipolar line constraint (i.e., w2^l ≈ w2^r). However, relative rotation and translation between the left and right images have been proved to exist. The left image is regarded as the reference, and the rotation angle α and the translation offset distance δw2 in the w2 direction should be calculated for the right image. Our method of disparity distortion correction consists of three steps. First, the rotation angle α is calculated, and the right image is rotated by −α. Second, the translation offset distance δw2 is calculated, and the right image is translated by δw2. Third, disparity distortion is corrected based on the results of image alignment.

The movement paths of grid points in the X-axis stereo image sequence are approximately regarded as straight lines, with the advancing direction of each grid point along the w1 direction. The image coordinate vector of an image point is denoted by (w1^{view_2,n,[i,j]}, w2^{view_2,n,[i,j]})^T, where the superscript view = l or r indicates the left or right image. Hence, for the image point corresponding to grid point [i,j], its movement path is described by a series of vectors (w1^{view_2,n,[i,j]}, w2^{view_2,n,[i,j]})^T, where the superscript n indicates image capturing location n, n ∈ [1, N]. The rotation angle α can be estimated from the movement paths of the image points corresponding to the grid points. The least squares method is applied to fit the movement paths in the images and obtain a series of slopes {k_X^{view,[i,j]} | i ∈ [1,7], j ∈ [1,7]}, where X indicates the X-axis stereo image sequence. k_X^{view,[i,j]} is given by

$$k_X^{view,[i,j]} = \frac{\displaystyle\sum_{n=1}^{N}\,[w_1^{view_2,n,[i,j]} - \mathrm{avg}(w_1^{view_2,[i,j]})]\,[w_2^{view_2,n,[i,j]} - \mathrm{avg}(w_2^{view_2,[i,j]})]}{\displaystyle\sum_{n=1}^{N}\,[w_1^{view_2,n,[i,j]} - \mathrm{avg}(w_1^{view_2,[i,j]})]^2} \tag{7}$$

where avg(w1^{view_2,[i,j]}) and avg(w2^{view_2,[i,j]}) represent the mean values of w1^{view_2,n,[i,j]} and w2^{view_2,n,[i,j]}, respectively. The angle α_X^{view,[i,j]} is calculated from k_X^{view,[i,j]} as α_X^{view,[i,j]} = arctan(k_X^{view,[i,j]}). The rotation angle α is estimated using the following mean value:

$$\alpha = \frac{1}{N_r N_c}\sum_{j=1}^{N_c}\sum_{i=1}^{N_r}\,(\alpha_X^{l,[i,j]} - \alpha_X^{r,[i,j]}) \tag{8}$$
where Nr (or Nc) is the number of grid points in one row (column). The right image is rotated by −α. The new image coordinate vector is denoted by (w1^{view_3}, w2^{view_3})^T and is given by

$$\begin{cases} \begin{pmatrix} w_1^{l_3} \\ w_2^{l_3} \end{pmatrix} = \begin{pmatrix} w_1^{l_2} \\ w_2^{l_2} \end{pmatrix} \\[2ex] \begin{pmatrix} w_1^{r_3} \\ w_2^{r_3} \end{pmatrix} = \begin{pmatrix} w_1^{r_2}\cos\alpha - w_2^{r_2}\sin\alpha \\ w_1^{r_2}\sin\alpha + w_2^{r_2}\cos\alpha \end{pmatrix} \end{cases} \tag{9}$$
The offset distance δw2 is calculated based on (w1^{view_3}, w2^{view_3})^T and image data from the X, Y and Z-axis stereo image sequences. For the image point corresponding to image capturing location n of grid point [i,j], the offset distance is defined as δw2^{n,[i,j]} = w2^{l_3,n,[i,j]} − w2^{r_3,n,[i,j]}. The X, Y and Z-axis stereo image sequences consist of 49N, 49M and 49L grid points, respectively. The mean values of δw2^{n,[i,j]} from the different image sequences are denoted by δw2_X, δw2_Y and δw2_Z, respectively, and calculated as follows:

$$\begin{cases} \delta_{w_2}^{X} = \Big[\sum_{n=1}^{N}\sum_{j=1}^{N_c}\sum_{i=1}^{N_r} (w_2^{l_3,n,[i,j]} - w_2^{r_3,n,[i,j]})\Big]\big/(N_c N_r N) \\[1ex] \delta_{w_2}^{Y} = \Big[\sum_{n=1}^{M}\sum_{j=1}^{N_c}\sum_{i=1}^{N_r} (w_2^{l_3,n,[i,j]} - w_2^{r_3,n,[i,j]})\Big]\big/(N_c N_r M) \\[1ex] \delta_{w_2}^{Z} = \Big[\sum_{n=1}^{L}\sum_{j=1}^{N_c}\sum_{i=1}^{N_r} (w_2^{l_3,n,[i,j]} - w_2^{r_3,n,[i,j]})\Big]\big/(N_c N_r L) \end{cases} \tag{10}$$

The final offset distance δw2 is given by the mean of δw2_X, δw2_Y and δw2_Z:

$$\delta_{w_2} = \frac{\delta_{w_2}^{X} + \delta_{w_2}^{Y} + \delta_{w_2}^{Z}}{3} \tag{11}$$

All the image points in the right images are moved by δw2 along the w2 direction. The new image coordinate vector is denoted by (w1^{view_4}, w2^{view_4})^T and is calculated as follows:

$$\begin{cases} \begin{pmatrix} w_1^{l_4,[i,j]} \\ w_2^{l_4,[i,j]} \end{pmatrix} = \begin{pmatrix} w_1^{l_3,[i,j]} \\ w_2^{l_3,[i,j]} \end{pmatrix} \\[2ex] \begin{pmatrix} w_1^{r_4,[i,j]} \\ w_2^{r_4,[i,j]} \end{pmatrix} = \begin{pmatrix} w_1^{r_3,[i,j]} \\ w_2^{r_3,[i,j]} + \delta_{w_2} \end{pmatrix} \end{cases} \tag{12}$$

Disparity distortion is corrected based on the vector (w1^{view_4}, w2^{view_4})^T and the X-axis stereo image sequence. The initial disparity of grid point [i,j] is written as D0^{[i,j]} = w1^{r_4,[i,j]} − w1^{l_4,[i,j]}. The calibration sample consists of 49 grid points, which correspond to 49 movement paths in object and image space, respectively.
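A minimal sketch of the alignment of Eqs. (7)–(12) follows, assuming the image positions of each grid point's movement path are stacked in NumPy arrays; all names are illustrative, not from the paper.

```python
import numpy as np

# Sketch of the image alignment of Eqs. (7)-(12): estimate the rotation
# angle from movement-path slopes, rotate the right-image points, and
# shift them by delta_w2. `paths_l`/`paths_r` are assumed to have shape
# (Nr, Nc, N, 2): the N image positions of each grid point.
def path_slope(path):
    """Eq. (7): least-squares slope of one movement path of shape (N, 2)."""
    d1 = path[:, 0] - path[:, 0].mean()
    d2 = path[:, 1] - path[:, 1].mean()
    return np.sum(d1 * d2) / np.sum(d1 ** 2)

def rotation_angle(paths_l, paths_r):
    """Eq. (8): mean difference of path angles between the two views."""
    n = paths_l.shape[2]
    a_l = [np.arctan(path_slope(p)) for p in paths_l.reshape(-1, n, 2)]
    a_r = [np.arctan(path_slope(p)) for p in paths_r.reshape(-1, n, 2)]
    return float(np.mean(np.array(a_l) - np.array(a_r)))

def align_right_points(w_r, alpha, delta_w2):
    """Eqs. (9) and (12): rotate right-image points and offset w2."""
    c, s = np.cos(alpha), np.sin(alpha)
    w1 = w_r[..., 0] * c - w_r[..., 1] * s
    w2 = w_r[..., 0] * s + w_r[..., 1] * c + delta_w2
    return np.stack([w1, w2], axis=-1)
```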


Fig. 7. Relationship between distorted disparity and image coordinates. (a) Relationship between distorted disparity and the w1 coordinate. (b) Relationship between distorted disparity and the w2 coordinate.

Table 2. Results of parameter estimation used in disparity distortion correction.

| Parameter | Value |
| α (°) | −0.1443 ± 0.0086 |
| δw2 (pixel) | 2.0731 ± 0.4551 |
| a^D | 6.5365 × 10⁻⁶ ± 2.0717 × 10⁻⁷ |
| b^D | −1.23 × 10⁻² ± 6.1869 × 10⁻⁴ |
| c^D (pixel) | 5.7803 ± 0.0408 |

First, a linear fitting is applied to the relationship between disparity D and the w1 coordinate. Then, the residual of the linear fitting is calculated. At last, a polynomial fitting is applied to fit the residuals, and on this basis the formula to compensate the residual is derived.

A linear fitting is applied to the relationship between disparity D and the w1 coordinate for each grid point. The straight-line equation of grid point [i,j] is written as D0^{[i,j]} = k^{D,[i,j]} w1^{l_4,[i,j]} + b^{D,[i,j]}, where the slope k^{D,[i,j]} and intercept b^{D,[i,j]} are calculated by the least squares method:

$$k^{D,[i,j]} = \frac{\displaystyle\sum_{n=1}^{N} [w_1^{l_4,n,[i,j]} - \mathrm{avg}(w_1^{l_4,[i,j]})]\,[D_0^{n,[i,j]} - \mathrm{avg}(D_0^{[i,j]})]}{\displaystyle\sum_{n=1}^{N} [w_1^{l_4,n,[i,j]} - \mathrm{avg}(w_1^{l_4,[i,j]})]^2} \tag{13}$$

where avg(w1^{l_4,[i,j]}) and avg(D0^{[i,j]}) are the mean values of w1^{l_4,n,[i,j]} and D0^{n,[i,j]}, respectively. The set {(k^{D,[i,j]}, b^{D,[i,j]}) | i ∈ [1,7], j ∈ [1,7]} can be calculated based on Eq. (13). The linear fitting residual corresponding to image capturing location n of grid point [i,j] is denoted by E^{[i,j],n,D} and is given by

$$E^{[i,j],n,D} = (k^{D,[i,j]}\, w_1^{l_4,n,[i,j]} + b^{D,[i,j]}) - D_0^{n,[i,j]} \tag{14}$$

A quadratic polynomial fitting is applied to fit the relationship between the residual E^D and the w1 coordinate. The quadratic polynomial is defined as follows:

$$E^D = a^D (w_1^{l_4})^2 + b^D\, w_1^{l_4} + c^D \tag{15}$$

where a^D, b^D and c^D are the parameters of the polynomial. According to the number of residuals E^{[i,j],n,D}, 49 × N equations are obtained, and the least squares method is applied to them. The corrected disparity D can then be calculated as follows:

$$D = D_0 - [a^D (w_1^{l_4})^2 + b^D\, w_1^{l_4} + c^D] \tag{16}$$

where D0 is the initial disparity, D0 = w1^{r_4} − w1^{l_4}. The results of parameter estimation are listed in Table 2, and disparity distortion is corrected based on these parameters. Similar to the experiments in Fig. 7, five grid points in the X and Y-axis stereo image sequences are observed. Fig. 8(a) displays the relationship between disparity and the w1 coordinate, and Fig. 8(b) displays the relationship between disparity and the w2 coordinate. Fig. 8(a) shows that the initial quadratic parabola relationship in Fig. 7(a) is corrected effectively; Fig. 8(b) shows that the linear relationship between disparity and the w2 coordinate remains unchanged.
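As a usage sketch, the disparity correction of Eq. (16) with the Table 2 values can be written as follows; the function and variable names are illustrative, not from the paper.

```python
# Sketch of the disparity correction of Eq. (16) using the Table 2 values.
A_D, B_D, C_D = 6.5365e-6, -1.23e-2, 5.7803  # aD, bD, cD

def corrected_disparity(w1_l, w1_r):
    """w1_l, w1_r: aligned w1 coordinates (Eq. (12)) of a matched pair."""
    d0 = w1_r - w1_l                              # initial disparity D0
    return d0 - (A_D * w1_l ** 2 + B_D * w1_l + C_D)
```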

5. Microscopic vision modeling method

5.1. Initial vision model

A mapping relationship between the world and image coordinate frames is shown in Fig. 9. Seven coordinate frames are defined: the world coordinate frame (WCF) XYZ, the left camera coordinate frame (LCCF) w1^l w2^l z, the right camera coordinate frame (RCCF) w1^r w2^r z, the left image coordinate frame (LICF) w1^l w2^l, the right image coordinate frame (RICF) w1^r w2^r, the left image world coordinate frame (LIWCF) x^l y^l and the right image world coordinate frame (RIWCF) x^r y^r. The WCF is located in the driving component, and its axes coincide with the corresponding axes of the driving component. The LICF and RICF are located in the left and right images, respectively. The projections of the coordinate plane XOY onto the LICF and RICF construct the LIWCF and RIWCF. The world coordinate vector of an object point Q is denoted by (XQ, YQ, ZQ)^T. Point Q corresponds to two image points: Q^l in the left image and Q^r in the right image. The image coordinate vectors of points Q^l and Q^r are denoted by w^l_Q = (w^l_{1,Q}, w^l_{2,Q})^T and w^r_Q = (w^r_{1,Q}, w^r_{2,Q})^T, respectively. Note that these are not the initial image coordinates, but the new image coordinates output by Eq. (12); in order to keep the symbols consistent with Fig. 9, we omit the superscript "4" used in Eq. (12). The image coordinates in the following sections are defined based on w^l_Q and w^r_Q. The mapping relationship between the world and image coordinates can be described by the following general equation:



$$\begin{pmatrix} X_Q \\ Y_Q \\ Z_Q \end{pmatrix} = \begin{pmatrix} F_X(\Theta_X, \mathbf{w}_Q^l, \mathbf{w}_Q^r) \\ F_Y(\Theta_Y, \mathbf{w}_Q^l, \mathbf{w}_Q^r) \\ F_Z(\Theta_Z, \mathbf{w}_Q^l, \mathbf{w}_Q^r) \end{pmatrix} \tag{17}$$

where Θ_X, Θ_Y and Θ_Z are parameter sequences. Our goal is to derive the mathematical expressions of F_X, F_Y and F_Z, and to calculate all the parameters in Θ_X, Θ_Y and Θ_Z. The vision modeling method presented in this paper consists of two steps. First, an initial vision model (IVM) is derived by analyzing the mapping relationship between the WCF and the image coordinate frames (LICF and RICF), as shown in Fig. 9. The initial world coordinates of point Q are reconstructed by the initial vision model; however, they generally have larger errors. Second, we derive a residual compensation model (RCM) by analyzing the residuals of the initial world coordinates. The RCM outputs reconstruction data with high precision.

Fig. 9(a) shows an ideal mapping relationship between the WCF and the image frames. The planes XOY, w1^l w2^l and w1^r w2^r are parallel; the axes X, w1^l and w1^r are parallel, too. The axes x^l and x^r are parallel to the axes


Fig. 8. Relationship between corrected disparity and image coordinates. (a) Relationship between corrected disparity and w1 coordinate. (b) Relationship between corrected disparity and w2 coordinate.

Fig. 9. Mapping between the world frame and the image frames. (a) The status of theoretical alignment between the world frame and the image frames. (b) The positions of the image points shift synchronously from Gl to El to Fl in the left image frame and from Gr to Er to Fr in the right image frame when an object point moves along the Z axis from position G to E to F in the world frame. (c) The status of actual alignment between the world frame and the image frames.

w1^l and w1^r, respectively. If points A and B are located on the axes X and Y, respectively, their corresponding projection points are A^l and B^l in the LIWCF. There is an obvious linear mapping relationship between |OA| and |o^l A^l| (or |o^r A^r|), and a similar linear mapping relationship holds between |OB| and |o^l B^l| (or |o^r B^r|). If point Q is located on the plane XOY, the linear mapping relationship between its world and image coordinates can be described as follows:

$$\begin{pmatrix} X_Q \\ Y_Q \end{pmatrix} = \begin{pmatrix} E_{XL}\, w_{1,Q}^{l} \\ E_{YL}\, w_{2,Q}^{l} \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} X_Q \\ Y_Q \end{pmatrix} = \begin{pmatrix} E_{XR}\, w_{1,Q}^{r} \\ E_{YR}\, w_{2,Q}^{r} \end{pmatrix} \tag{18}$$

where E_XL, E_XR, E_YL and E_YR are parameters. These parameters indicate the standard length in the WCF to which a pixel in the left or right image corresponds: E_XL (E_YL) is the standard length of a single pixel of the left image along the X (Y) direction, and E_XR (E_YR) is the standard length of a single pixel of the right image along the X (Y) direction. However, the world coordinate Z_Q is not calculated by Eq. (18). Fig. 9(b) shows the change of the image coordinates of the image points with the change of Z_Q. We assume that point Q moves along the path G → E → F in the Z direction. The positions of the image points Q^l and Q^r change synchronously: point Q^l generally moves along the path G^l → E^l → F^l, and point Q^r generally moves along the path G^r → E^r → F^r. In fact, the position variation of the image points Q^l and Q^r is described by their disparity, which is denoted by D_Q = w^r_{1,Q} − w^l_{1,Q}. The disparity of the image points indicates the variation of depth in the SLM vision system. Points Q^l and Q^r generally move in opposite directions: if point Q^l moves from right to left, point Q^r moves from left to right. Under the above assumption, the world coordinates X_Q and Y_Q remain unchanged when point Q moves along the path G → E → F. However, the measured

values of X_Q and Y_Q output by Eq. (18) actually change; contrary results are obtained. Therefore, we should modify Eq. (18) further. As noted above, the motion directions of the image points Q^l and Q^r are opposite. Hence, we use the quantity (w1^l + w1^r)/2 to replace w1^l or w1^r in Eq. (18); by this method, we reduce the influence of the movement of the image points on the reconstruction of the world coordinates X and Y. The left and right images have been aligned along the w2 direction, as shown in Section 4, so they satisfy the condition w2^l ≈ w2^r. The variations of w2^l and w2^r are synchronous when point Q moves along the path G → E → F: their values increase or decrease together. Hence, we also use the quantity (w2^l + w2^r)/2 to replace w2^l or w2^r in Eq. (18). The world coordinate Z_Q changes synchronously when point Q moves along the path G → E → F, and the disparity of the image points of point Q changes synchronously, too. We conclude that the world coordinate Z_Q is dependent on the disparity; therefore, we directly use the disparity to predict the world coordinate Z_Q. We carry out an experiment to observe the relationship between disparity and world coordinate. The image capturing solution in Section 5.2 is used to output all basic data. A Z-axis stereo image sequence with 7 stereo image pairs is obtained with an interval distance of 0.3 mm, each pair yielding 49 disparity data. If the disparity of the image points in the first stereo image pair corresponding to grid point [i,j] is regarded as a reference, the disparity increment is defined as the difference between the disparity of the image points in the other stereo image pairs and the reference. The number of disparity increment data is 343. Fig. 10 shows the relationship between the disparity increment and ΔZa, where ΔZa is the increment of the world coordinate Z_Q. We can see that the disparity increment has an obvious linear relationship with ΔZa.


Fig. 10. Relationship between the disparity increment and ΔZa.

Based on the above analysis, Eq. (18) can be rewritten as follows:

$$\begin{pmatrix} X_Q \\ Y_Q \\ Z_Q \end{pmatrix} = \begin{pmatrix} E_X\,(w_{1,Q}^{l} + w_{1,Q}^{r})/2 \\ E_Y\,(w_{2,Q}^{l} + w_{2,Q}^{r})/2 \\ E_Z\, D_Q \end{pmatrix} \tag{19}$$
where E_X, E_Y and E_Z are the standard lengths of a single pixel in the X, Y and Z directions, respectively. D_Q is the disparity output by Eq. (16), i.e., D_Q = D_Q^0 − [a^D (w^l_{1,Q})^2 + b^D w^l_{1,Q} + c^D], where D_Q^0 = w^r_{1,Q} − w^l_{1,Q}.

Fig. 9(a) displays only an ideal mapping relationship between the WCF and the image frames. In reality there is rotation between the plane XOY and the image planes, as shown in Fig. 9(c), and the rotation angles can lead to significant reconstruction errors. Therefore, Eq. (19) should be modified further. Our micro-gripping system only handles microscopic objects over small displacements with a maximum on the order of several millimeters, and the micro manipulation is limited to a space of several hundreds of microns. Therefore, we can conclude that the rotation angles are small, which is very helpful for simplifying the derivation based on Eq. (19). Eq. (19) can be rewritten as follows:

$$\begin{pmatrix} X_Q \\ Y_Q \\ Z_Q \end{pmatrix} = \begin{pmatrix} E_X\,(w_{1,Q}^{l} + w_{1,Q}^{r})/2 + \delta_X(\Theta_X, X_Q, Y_Q, C_Q) \\ E_Y\,(w_{2,Q}^{l} + w_{2,Q}^{r})/2 + \delta_Y(\Theta_Y, X_Q, Y_Q, C_Q) \\ E_Z\, D_Q + \delta_Z(\Theta_Z, X_Q, Y_Q, C_Q) \end{pmatrix} \tag{20}$$
where δ_X, δ_Y and δ_Z are compensation quantities for the X, Y and Z coordinates, respectively, and Θ_X, Θ_Y and Θ_Z represent parameter sequences. δ_X, δ_Y, δ_Z, Θ_X, Θ_Y and Θ_Z will be calculated by the following residual compensation model.
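Before turning to the residual model, a minimal sketch of the initial vision model of Eq. (19), with the disparity correction of Eq. (16) folded in, is given below; the function and parameter names are illustrative, not from the paper.

```python
import numpy as np

# Sketch of the initial vision model of Eq. (19). E_X, E_Y, E_Z are the
# calibrated per-pixel standard lengths (Section 5.4); a_d, b_d, c_d are
# the disparity-correction parameters of Table 2; w_l and w_r are the
# aligned (w1, w2) coordinates of a matched point pair.
def initial_vision_model(w_l, w_r, E_X, E_Y, E_Z, a_d, b_d, c_d):
    d0 = w_r[0] - w_l[0]                               # initial disparity
    d = d0 - (a_d * w_l[0] ** 2 + b_d * w_l[0] + c_d)  # Eq. (16)
    return np.array([E_X * (w_l[0] + w_r[0]) / 2.0,    # X
                     E_Y * (w_l[1] + w_r[1]) / 2.0,    # Y
                     E_Z * d])                         # Z
```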

5.2. Residual analysis of reconstruction data

As an initial vision model, Eq. (19) outputs initial world coordinates, which generally contain significant reconstruction errors. Based on Eq. (20), we derive a residual compensation model to compensate for these errors and improve reconstruction precision. The residual compensation model is based on the analysis of the residuals of the reconstruction data. It requires three types of coordinates: the actual world coordinates (as true values), the measured world coordinates (as measured values) and their image coordinates. The measured world coordinates are reconstructed by Eq. (19) from the image coordinates. The residual distribution of the world coordinates is calculated based on the true and the measured values.

In order to obtain the above three types of coordinates, we design an image capturing solution, as shown in Fig. 11, and use a calibration sample consisting of a matrix of 7 × 7 round patterns. The spacing between two adjacent grid points is 0.3 mm, and the diameter of each round pattern is 0.15 mm. Fig. 11(a) shows the structure of the calibration sample. The calibration sample is mounted on the driving component and moves in discrete translations. The SLM has a focus plane, as shown in Fig. 11(b), and a valid depth of field of about ±1.13 mm at a magnification of 0.7× in this paper; clear images are captured when objects are within the valid depth of field. The calibration sample moves in the X, Y and Z directions, respectively, in equal-interval movements, as shown in Fig. 11(c). For example, if P_X1 represents a starting location in the X direction and the other locations are denoted by P_X2, P_X3, ..., P_XN, the calibration sample moves from location P_X1 to P_X2, then from P_X2 to P_X3, and at last reaches the final location P_XN. It stays for a moment at the locations P_X1, P_X2, ..., P_XN, where stereo images of the calibration sample are captured; these are the image capturing locations. The distance from location P_X1 to P_XN is called the maximum length of image capturing in the X direction. The image capturing solutions in the Y and Z directions are similar to that in the X direction. At last, the X, Y and Z-axis stereo image sequences are obtained. The symbols sd_X, sd_Y and sd_Z represent the interval distances in the X, Y and Z directions.

The X-axis image sequence {I^view_Xn, n ∈ [1,N]} is obtained at the locations P_X1, P_X2, ..., P_XN. These locations construct a spatial straight line, and the projection of this straight line onto the image coordinate frames is also a straight line, denoted by X^view, as shown in Fig. 11(d). In the same way, we obtain the Y-axis image sequence {I^view_Ym, m ∈ [1,M]} and the Z-axis image sequence {I^view_Zk, k ∈ [1,L]}; {I^view_Ym} is captured at the locations P_Y1, P_Y2, ..., P_YM, and {I^view_Zk} at the locations P_Z1, P_Z2, ..., P_ZL. These two kinds of locations construct two different spatial straight lines, whose projections onto the image coordinate frames are denoted by Y^view and Z^view, respectively. The three sets {P_Xn, n ∈ [1,N]}, {P_Ym, m ∈ [1,M]} and {P_Zk, k ∈ [1,L]} represent the actual image capturing locations in the WCF, and each location has its own world coordinates. The actual movement path of each grid point constructs a straight line, which is parallel to the X, Y or Z axis.

The calibration sample consists of 49 grid points. When the calibration sample moves from location 1 to location 2 in the WCF, the 49 grid points have the same relative displacements. We reconstruct their world coordinates at locations 1 and 2 by Eq. (19) and calculate their relative displacements; we can then obtain the residuals between the actual and measured relative displacements. The reconstruction precision from location 1 to location 2 can be evaluated based on these residuals.

The true-value sets of world coordinates corresponding to the X, Y and Z-axis stereo image sequences are denoted by {P^{n,[i,j]}_{X,A} | n ∈ [1,N], i ∈ [1,7], j ∈ [1,7]}, {P^{m,[i,j]}_{Y,A} | m ∈ [1,M], i ∈ [1,7], j ∈ [1,7]} and {P^{k,[i,j]}_{Z,A} | k ∈ [1,L], i ∈ [1,7], j ∈ [1,7]}, respectively. More conveniently, P^{n,[i,j]}_{X,A}, P^{m,[i,j]}_{Y,A} and P^{k,[i,j]}_{Z,A} are denoted by one symbol P^{t,[i,j]}_{AXIS,A}, where AXIS = X, Y, Z, t = n, m, k, and the value of t is determined by AXIS. P^{t,[i,j]}_{AXIS,A} represents the true-value vector of world coordinates corresponding to grid point [i,j] in the AXIS-axis stereo image sequence, where t is the index number of the image capturing location. If the starting image capturing location (i.e., t = 1) is regarded as the reference of grid point [i,j], the relative displacement at the other locations on the movement path of grid point [i,j] is defined as ΔP^{t,[i,j]}_{AXIS,A} = P^{t,[i,j]}_{AXIS,A} − P^{1,[i,j]}_{AXIS,A}. Based on the above analysis, the 49 grid points have the same ΔP^{t,[i,j]}_{AXIS,A} values. We calculate the world coordinates of all grid points in the X, Y and Z-axis stereo image sequences by Eq. (19). The measured values of world coordinates corresponding to the X, Y and Z-axis stereo image sequences are obtained and


Fig. 11. Image capturing solution. (a) Calibration sample. (b) Focus plane of SLM. (c) Image capturing in world coordinate frame. (d) Projection of movement path to image coordinate frame.

Fig. 12. Distribution of the S_X, S_Y and S_Z values based on different residual intervals, with respect to the initial vision model of Eq. (19).

denoted by the three sets {P^{n,[i,j]}_{X,P}}, {P^{m,[i,j]}_{Y,P}} and {P^{k,[i,j]}_{Z,P}}. In the same way, P^{n,[i,j]}_{X,P}, P^{m,[i,j]}_{Y,P} and P^{k,[i,j]}_{Z,P} are denoted by one symbol P^{t,[i,j]}_{AXIS,P}, where P^{t,[i,j]}_{AXIS,P} = (X^{t,[i,j]}_{AXIS,P}, Y^{t,[i,j]}_{AXIS,P}, Z^{t,[i,j]}_{AXIS,P})^T and the subscript P represents the measured-value vector. The measured relative displacement at location t is defined as ΔP^{t,[i,j]}_{AXIS,P} = P^{t,[i,j]}_{AXIS,P} − P^{1,[i,j]}_{AXIS,P}. Different grid points may have different ΔP^{t,[i,j]}_{AXIS,P} values. Based on ΔP^{t,[i,j]}_{AXIS,P} and ΔP^{t,[i,j]}_{AXIS,A}, the residual vector is defined as E^{t,[i,j]}_{AXIS} = ΔP^{t,[i,j]}_{AXIS,P} − ΔP^{t,[i,j]}_{AXIS,A}, where E^{t,[i,j]}_{AXIS} = (E^{t,[i,j]}_{AXIS,X}, E^{t,[i,j]}_{AXIS,Y}, E^{t,[i,j]}_{AXIS,Z})^T. It describes three kinds of vectors: E^{t,[i,j]}_X, E^{t,[i,j]}_Y and E^{t,[i,j]}_Z, where E^{t,[i,j]}_X = (E^{t,[i,j]}_{X,X}, E^{t,[i,j]}_{X,Y}, E^{t,[i,j]}_{X,Z})^T, E^{t,[i,j]}_Y = (E^{t,[i,j]}_{Y,X}, E^{t,[i,j]}_{Y,Y}, E^{t,[i,j]}_{Y,Z})^T and E^{t,[i,j]}_Z = (E^{t,[i,j]}_{Z,X}, E^{t,[i,j]}_{Z,Y}, E^{t,[i,j]}_{Z,Z})^T. It actually involves 9 residual elements.

We define three kinds of technical specifications to evaluate the reconstruction precision of a microscopic vision model. The first is the reconstruction residual. The elements of E^{t,[i,j]}_{AXIS} represent the residuals in different directions: E^{t,[i,j]}_{X,X}, E^{t,[i,j]}_{Y,Y} and E^{t,[i,j]}_{Z,Z} describe the axial reconstruction errors along the X, Y and Z axes, respectively, while E^{t,[i,j]}_{X,Y} and E^{t,[i,j]}_{X,Z} describe the radial reconstruction errors along the Y and Z axes corresponding to the X-axis stereo image sequence. To explain their meaning clearly, we give an example for E^{t,[i,j]}_{X,Y} and E^{t,[i,j]}_{X,Z}. If an object point moves along the X axis, its measured movement path reconstructed by the microscopic vision model should in theory coincide with the actual X axis. However, the measured movement path generally deviates from the actual X axis because of reconstruction errors, and the deviations consist of the offsets along the Y and Z axes (i.e., E^{t,[i,j]}_{X,Y} and E^{t,[i,j]}_{X,Z}). These offsets should be as small as possible for a vision


model with a better performance. Therefore, we use E^{t,[i,j]}_{X,Y} and E^{t,[i,j]}_{X,Z} to describe the offsets along the Y and Z axes. The radial reconstruction precisions actually describe the degree of coincidence between the actual coordinate axis and the movement path reconstructed by a vision model. In the same way, we use E^{t,[i,j]}_{Y,X} and E^{t,[i,j]}_{Y,Z} to evaluate the radial reconstruction precisions along the X and Z axes corresponding to the Y-axis stereo image sequence, and E^{t,[i,j]}_{Z,X} and E^{t,[i,j]}_{Z,Y} to evaluate the radial reconstruction precisions along the X and Y axes corresponding to the Z-axis stereo image sequence.

The second and third specifications are the residual interval and the validate scale coefficient, respectively. Given a residual sample whose number of elements is G, and a residual interval defined as ±A, where A is a real number, we search all the sample elements, select those located in the residual interval ±A, and count their number K. We define the validate scale coefficient S = K/G. S = 100% indicates that all elements of the residual sample are located in the residual interval ±A. A large S at a small A indicates higher reconstruction precision and less divergence of the residual data.
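The validate scale coefficient is straightforward to compute; the sketch below evaluates S over the residual intervals used in Section 5.3 (array names are illustrative, not from the paper).

```python
import numpy as np

# Sketch of the validate scale coefficient S = K/G for a residual sample.
def validate_scale(residuals, A):
    """Fraction of residual elements lying in the interval [-A, +A]."""
    r = np.asarray(residuals)
    return np.count_nonzero(np.abs(r) <= A) / r.size

# Evaluate S over the residual intervals RA1-RA5 of Section 5.3:
# for A in (0.005, 0.0075, 0.01, 0.0125, 0.015):
#     print(A, validate_scale(E_sample, A))
```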

5.3. Residual compensation model

We use the image capturing solution of Section 5.2. X-, Y- and Z-axis stereo image sequences are captured with an interval distance of 0.1 mm, and the maximum capture lengths of the X-, Y- and Z-axis sequences are 4.0 mm, 3.0 mm and 2.25 mm, respectively. To classify the residuals of the reconstructed data, we define five residual intervals: RA1 [−0.005 mm, 0.005 mm], RA2 [−0.0075 mm, 0.0075 mm], RA3 [−0.01 mm, 0.01 mm], RA4 [−0.0125 mm, 0.0125 mm] and RA5 [−0.015 mm, 0.015 mm], and observe the residual distribution over these intervals. In an SLM microscopic vision system, X and Y coordinate reconstruction is generally more precise than Z coordinate reconstruction. For our micromanipulation tasks, RA3 is regarded as the normal precision interval for X and Y coordinate reconstruction; RA1 and RA2 are high-precision intervals, while RA4 and RA5 are low-precision intervals. For Z coordinate reconstruction, RA5 is regarded as the normal precision interval. The validate scale coefficients for X and Y coordinate reconstruction, calculated over RA3, should be higher than 95%, and the validate scale coefficients for Z coordinate reconstruction, calculated over RA5, should be higher than 95% as well.

We reconstruct the world coordinates of the grid points by Eq. (19) and calculate the validate scale coefficients over the different residual intervals; their distributions are shown in Fig. 12. From Fig. 12(a), $S_{X,X}$ is 100% over RA1 and $S_{X,Y}$ is 99.5% over RA3, so both meet the precision requirement for X and Y coordinate reconstruction. $S_{X,Z}$ is 94.6% over RA2 and 99.5% over RA3, well above the precision requirement for Z coordinate reconstruction. It is therefore unnecessary to compensate $E_{X,X}$, $E_{X,Y}$ and $E_{X,Z}$. From Fig. 12(b), $S_{Y,X}$ is 77.3% over RA3, far below the 95% level; it does not meet the precision requirement, and $E_{Y,X}$ should be compensated. $S_{Y,Y}$ is 100% over RA2 and meets the requirement, but $S_{Y,Z}$ is only 23.3% over RA5, so $E_{Y,Z}$ must be compensated as well. From Fig. 12(c), $S_{Z,X}$ and $S_{Z,Y}$ are 63.2% and 50.2% over RA3, respectively, far below the 95% level; they do not meet the precision requirement and should be compensated. $S_{Z,Z}$ is 96.7% over RA5 and meets the requirement. In summary, $E_{Y,X}$, $E_{Y,Z}$, $E_{Z,X}$ and $E_{Z,Y}$ should be compensated.

We use a linear method to compensate $E_{Y,X}$, $E_{Y,Z}$, $E_{Z,X}$ and $E_{Z,Y}$. First, we compensate $E_{Y,X}$ and $E_{Y,Z}$. The world coordinate vector of a point Q in the WCS reconstructed by Eq. (19) is denoted by $(X_Q, Y_Q, Z_Q)^T$. The linear compensation equation is given by

\[
\begin{pmatrix} X_{QC1} \\ Y_{QC1} \\ Z_{QC1} \end{pmatrix} =
\begin{pmatrix} X_Q + (K_{YX} Y_Q + B_{YX}) \\ Y_Q \\ Z_Q + (K_{YZ} Y_Q + B_{YZ}) \end{pmatrix}
\tag{21}
\]

where $K_{YX}$, $B_{YX}$, $K_{YZ}$ and $B_{YZ}$ are compensation parameters, and $(X_{QC1}, Y_{QC1}, Z_{QC1})^T$ represents the new world coordinate vector of point Q. Second, we compensate $E_{Z,X}$ and $E_{Z,Y}$ with the same linear form:

\[
\begin{pmatrix} X_{QC2} \\ Y_{QC2} \\ Z_{QC2} \end{pmatrix} =
\begin{pmatrix} X_{QC1} + (K_{ZX} Z_{QC1} + B_{ZX}) \\ Y_{QC1} + (K_{ZY} Z_{QC1} + B_{ZY}) \\ Z_{QC1} \end{pmatrix}
\tag{22}
\]

where $K_{ZX}$, $B_{ZX}$, $K_{ZY}$ and $B_{ZY}$ are compensation parameters, and $(X_{QC2}, Y_{QC2}, Z_{QC2})^T$ represents the final world coordinate vector of point Q.
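A minimal Python sketch of this two-stage linear compensation, Eqs. (21) and (22), is shown below; the function name is ours and the coordinates are assumed to come from the initial vision model of Eq. (19).

```python
import numpy as np

def compensate(P, KYX, BYX, KYZ, BYZ, KZX, BZX, KZY, BZY):
    """Apply the two-stage linear residual compensation to the initial
    world coordinates P = (XQ, YQ, ZQ) reconstructed by Eq. (19)."""
    XQ, YQ, ZQ = P
    # Stage 1, Eq. (21): correct the Y-dependent offsets E_{Y,X} and E_{Y,Z}.
    XC1 = XQ + (KYX * YQ + BYX)
    YC1 = YQ
    ZC1 = ZQ + (KYZ * YQ + BYZ)
    # Stage 2, Eq. (22): correct the Z-dependent offsets E_{Z,X} and E_{Z,Y}.
    XC2 = XC1 + (KZX * ZC1 + BZX)
    YC2 = YC1 + (KZY * ZC1 + BZY)
    ZC2 = ZC1
    return np.array([XC2, YC2, ZC2])
```

With, for example, the Test 1 parameters of Table 4, the call would read compensate(P, -0.0050, 0.0016, -0.0214, 0.0015, -0.0056, -0.0011, -0.0141, -0.0037).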

5.4. Parameter estimation

Our microscopic vision model contains twelve parameters: $\theta_X$, $E_X$, $E_Y$, $E_Z$, $K_{YX}$, $B_{YX}$, $K_{YZ}$, $B_{YZ}$, $K_{ZX}$, $B_{ZX}$, $K_{ZY}$ and $B_{ZY}$. Three types of coordinate samples are used for parameter estimation: image coordinates, actual world coordinates and initial world coordinates. The estimation is based on the image capturing solution of Section 5.2. Image coordinates of the grid points are extracted with sub-pixel accuracy by the Halcon software, actual world coordinates are supplied by the image capturing procedure, and initial world coordinates are calculated by Eq. (19).

1) Estimation of the parameter $\theta_X$. The parameter $\theta_X$ is calculated from the X-axis stereo image sequence. The image coordinate sets of the grid points are denoted by $\{(w^{l,n,[i,j]}_{1,X}, w^{l,n,[i,j]}_{2,X})\,|\,n \in [1,N],\ i \in [1,7],\ j \in [1,7]\}$ and $\{(w^{r,n,[i,j]}_{1,X}, w^{r,n,[i,j]}_{2,X})\,|\,n \in [1,N],\ i \in [1,7],\ j \in [1,7]\}$. A linear fitting method is applied to the movement path of each grid point in the image plane. The slope of the fitted straight line of grid point $[i,j]$ is denoted by $k^{view,[i,j]}_X$ and is calculated by the least squares method as follows.

\[
k^{view,[i,j]}_X = \frac{\displaystyle\sum_{n=1}^{N}\,[w^{view,n,[i,j]}_{1,X} - \mathrm{avg}(w^{view,n,[i,j]}_{1,X})]\,[w^{view,n,[i,j]}_{2,X} - \mathrm{avg}(w^{view,n,[i,j]}_{2,X})]}{\displaystyle\sum_{n=1}^{N}\,[w^{view,n,[i,j]}_{1,X} - \mathrm{avg}(w^{view,n,[i,j]}_{1,X})]^2}
\tag{23}
\]

where $\mathrm{avg}(w^{view,n,[i,j]}_{1,X})$ and $\mathrm{avg}(w^{view,n,[i,j]}_{2,X})$ represent the mean values of the image coordinates over the N tracking positions. From the slopes we obtain the angles between the fitted straight lines and the axis $w^{view}_1$. There are 98 grid points in the left and right images, and the following mean value is regarded as the estimate of $\theta_X$.

\[
\theta_X = -\,\frac{1}{2 N_r N_c} \sum_{j=1}^{N_c} \sum_{i=1}^{N_r} \left[\arctan\!\big(k^{l,[i,j]}_X\big) + \arctan\!\big(k^{r,[i,j]}_X\big)\right]
\tag{24}
\]

where $N_r = N_c = 7$; $N_r$ ($N_c$) is the number of rows (columns) of grid points. The stereo images are then rotated by $-\theta_X$, and the new image coordinates are calculated as follows:

\[
\begin{pmatrix} w^{view,u}_1 \\ w^{view,u}_2 \end{pmatrix} =
\begin{pmatrix} w^{view}_1 \cos\theta_X - w^{view}_2 \sin\theta_X \\ w^{view}_1 \sin\theta_X + w^{view}_2 \cos\theta_X \end{pmatrix}
\tag{25}
\]

where the superscript u represents a new (rotated) image coordinate.
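A compact Python sketch of Eqs. (23)-(25) follows; the function names and array layout (one coordinate array per grid point per view) are our own assumptions.

```python
import numpy as np

def path_slope(w1, w2):
    """Least-squares slope of a grid point's movement path, Eq. (23).
    w1, w2 : arrays of image coordinates over the N tracking positions."""
    d1 = w1 - w1.mean()
    d2 = w2 - w2.mean()
    return np.sum(d1 * d2) / np.sum(d1 ** 2)

def estimate_theta_x(paths_left, paths_right):
    """Mean tilt angle over all 2*Nr*Nc fitted paths, Eq. (24).
    paths_* : iterables of (w1, w2) coordinate arrays, one per grid point."""
    angles = [np.arctan(path_slope(w1, w2)) for w1, w2 in paths_left]
    angles += [np.arctan(path_slope(w1, w2)) for w1, w2 in paths_right]
    return -np.mean(angles)

def rotate(w1, w2, theta):
    """New image coordinates after the rotation of Eq. (25)."""
    return (w1 * np.cos(theta) - w2 * np.sin(theta),
            w1 * np.sin(theta) + w2 * np.cos(theta))
```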

2) Estimation of the parameters $E_X$, $E_Y$ and $E_Z$. The parameter $E_X$ is estimated from the X-axis stereo image sequence and the true-value set $\{\Delta X^{n,[i,j]}_{X,A}\,|\,\Delta X^{n,[i,j]}_{X,A} = X^{n,[i,j]}_{X,A} - X^{1,[i,j]}_{X,A},\ n \in [1,N],\ i \in [1,7],\ j \in [1,7]\}$ of Section 5.2. The mean value of $w^{l,u,n,[i,j]}_{1,X}$ and $w^{r,u,n,[i,j]}_{1,X}$ is denoted by $v^{n,[i,j]}_X = (w^{l,u,n,[i,j]}_{1,X} + w^{r,u,n,[i,j]}_{1,X})/2$, where the superscript n represents the index along the movement path. The set of relative increments of $v^{n,[i,j]}_X$ is defined as $\{\Delta v^{n,[i,j]}_X\,|\,\Delta v^{n,[i,j]}_X = v^{n,[i,j]}_X - v^{1,[i,j]}_X,\ n \in [1,N],\ i \in [1,7],\ j \in [1,7]\}$. The parameter $E_X$ is calculated as the following mean value:

\[
E_X = \frac{1}{N_c N_r (N-1)} \sum_{n=2}^{N} \sum_{j=1}^{N_c} \sum_{i=1}^{N_r} \frac{\Delta v^{n,[i,j]}_X}{\Delta X^{n,[i,j]}_{X,A}}
\tag{26}
\]

The parameter $E_Y$ is estimated from the Y-axis stereo image sequence and the true-value set $\{\Delta Y^{m,[i,j]}_{Y,A}\,|\,\Delta Y^{m,[i,j]}_{Y,A} = Y^{m,[i,j]}_{Y,A} - Y^{1,[i,j]}_{Y,A},\ m \in [1,M],\ i \in [1,7],\ j \in [1,7]\}$ of Section 5.2. In the same way, the mean value of $w^{l,u,m,[i,j]}_{2,Y}$ and $w^{r,u,m,[i,j]}_{2,Y}$ is denoted by $v^{m,[i,j]}_Y = (w^{l,u,m,[i,j]}_{2,Y} + w^{r,u,m,[i,j]}_{2,Y})/2$, where the superscript m represents the index along the movement path, and the increment set of $v^{m,[i,j]}_Y$ is denoted by $\{\Delta v^{m,[i,j]}_Y\,|\,\Delta v^{m,[i,j]}_Y = v^{m,[i,j]}_Y - v^{1,[i,j]}_Y,\ m \in [1,M],\ i \in [1,7],\ j \in [1,7]\}$. The parameter $E_Y$ is estimated as follows:

\[
E_Y = \frac{1}{N_c N_r (M-1)} \sum_{m=2}^{M} \sum_{j=1}^{N_c} \sum_{i=1}^{N_r} \frac{\Delta v^{m,[i,j]}_Y}{\Delta Y^{m,[i,j]}_{Y,A}}
\tag{27}
\]

The parameter $E_Z$ is estimated from the Z-axis stereo image sequence. When grid point $[i,j]$ is located at tracking position k, its disparity is denoted by $D^{k,[i,j]}_Z = w^{l,u,k,[i,j]}_{1,Z} - w^{r,u,k,[i,j]}_{1,Z}$, and its increment by $\Delta D^{k,[i,j]}_Z = D^{k,[i,j]}_Z - D^{1,[i,j]}_Z$. With the true-value set $\{\Delta Z^{k,[i,j]}_{Z,A}\,|\,k \in [1,L],\ i \in [1,7],\ j \in [1,7]\}$, the parameter $E_Z$ is estimated as follows:

\[
E_Z = \frac{1}{N_c N_r (L-1)} \sum_{k=2}^{L} \sum_{j=1}^{N_c} \sum_{i=1}^{N_r} \frac{\Delta D^{k,[i,j]}_Z}{\Delta Z^{k,[i,j]}_{Z,A}}
\tag{28}
\]
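The three scale factors share one pattern: a mean ratio of image-space increments to true stage displacements. A hedged Python sketch follows; the array names and shapes are our own assumptions.

```python
import numpy as np

def scale_factor(v, truth):
    """Mean ratio of image increments to true displacements, Eqs. (26)-(28).

    v, truth : arrays of shape (N, Nr, Nc) holding the image quantity
               (v, or the disparity D for E_Z) and the true coordinate at
               each tracking position for every grid point.
    """
    dv = v[1:] - v[0]          # increments relative to tracking position 1
    dt = truth[1:] - truth[0]  # true displacement increments
    return np.mean(dv / dt)    # average over positions 2..N and all grid points

# E_X from the rotated mean coordinate v_X, E_Y from v_Y, E_Z from disparity D:
# EX = scale_factor(vX, X_true); EY = scale_factor(vY, Y_true); EZ = scale_factor(D, Z_true)
```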

3) Estimation of the parameters $K_{YX}$, $B_{YX}$, $K_{YZ}$ and $B_{YZ}$. These parameters are estimated from the Y-axis stereo image sequence, the true-value set $\{\Delta P^{m,[i,j]}_{Y,A}\,|\,m \in [1,M],\ i \in [1,7],\ j \in [1,7]\}$ and the measured-value set $\{\Delta P^{m,[i,j]}_{Y,P}\,|\,m \in [1,M],\ i \in [1,7],\ j \in [1,7]\}$ of Section 5.2; the measured set consists of the world coordinates reconstructed by Eq. (19). The least squares method is applied to these parameters based on Eq. (21):

\[
K_{YX} = \frac{\displaystyle\sum_{j=1}^{N_c}\sum_{i=1}^{N_r}\sum_{m=2}^{M} (\Delta Y^{m,[i,j]}_{Y,P} - \mathrm{avg}_{\Delta Y,Y})\,[(\Delta X^{m,[i,j]}_{Y,A} - \Delta X^{m,[i,j]}_{Y,P}) - \mathrm{avg}_{\Delta XA-\Delta XP,Y}]}{\displaystyle\sum_{j=1}^{N_c}\sum_{i=1}^{N_r}\sum_{m=2}^{M} (\Delta Y^{m,[i,j]}_{Y,P} - \mathrm{avg}_{\Delta Y,Y})^2}, \qquad
B_{YX} = \mathrm{avg}_{\Delta XA-\Delta XP,Y} - K_{YX}\,\mathrm{avg}_{\Delta Y,Y}
\tag{29}
\]

where $\mathrm{avg}_{\Delta Y,Y}$ is the mean value of $\Delta Y^{m,[i,j]}_{Y,P}$, and $\mathrm{avg}_{\Delta XA-\Delta XP,Y}$ is the mean value of $\Delta X^{m,[i,j]}_{Y,A} - \Delta X^{m,[i,j]}_{Y,P}$. In the same way, $K_{YZ}$ and $B_{YZ}$ are calculated as follows:

\[
K_{YZ} = \frac{\displaystyle\sum_{j=1}^{N_c}\sum_{i=1}^{N_r}\sum_{m=2}^{M} (\Delta Y^{m,[i,j]}_{Y,P} - \mathrm{avg}_{\Delta Y,Y})\,[(\Delta Z^{m,[i,j]}_{Y,A} - \Delta Z^{m,[i,j]}_{Y,P}) - \mathrm{avg}_{\Delta ZA-\Delta ZP,Y}]}{\displaystyle\sum_{j=1}^{N_c}\sum_{i=1}^{N_r}\sum_{m=2}^{M} (\Delta Y^{m,[i,j]}_{Y,P} - \mathrm{avg}_{\Delta Y,Y})^2}, \qquad
B_{YZ} = \mathrm{avg}_{\Delta ZA-\Delta ZP,Y} - K_{YZ}\,\mathrm{avg}_{\Delta Y,Y}
\tag{30}
\]

where $\mathrm{avg}_{\Delta ZA-\Delta ZP,Y}$ is the mean value of $\Delta Z^{m,[i,j]}_{Y,A} - \Delta Z^{m,[i,j]}_{Y,P}$. New world coordinates can then be calculated by Eq. (21) with the parameters of Eqs. (29) and (30). The new world coordinate vector is denoted by $P^{t,[i,j]}_{AXIS,PC1} = (X^{t,[i,j]}_{AXIS,PC1}, Y^{t,[i,j]}_{AXIS,PC1}, Z^{t,[i,j]}_{AXIS,PC1})^T$, and the increment vector by $\Delta P^{t,[i,j]}_{AXIS,PC1} = P^{t,[i,j]}_{AXIS,PC1} - P^{1,[i,j]}_{AXIS,PC1}$, where the subscript PC1 denotes the first error compensation.

4) Estimation of the parameters $K_{ZX}$, $B_{ZX}$, $K_{ZY}$ and $B_{ZY}$. These parameters are estimated from the Z-axis stereo image sequence and the sets $\{\Delta P^{k,[i,j]}_{Z,A}\,|\,k \in [1,L],\ i \in [1,7],\ j \in [1,7]\}$ and $\{\Delta P^{k,[i,j]}_{Z,PC1}\,|\,k \in [1,L],\ i \in [1,7],\ j \in [1,7]\}$ of Section 5.2, where the latter contains the coordinates after the first compensation. The least squares method is applied to these parameters based on Eq. (22):


Table 3
Parameters of image capturing used in five groups of calibration experiments.

Parameters              Test 1   Test 2   Test 3   Test 4   Test 5
Interval distance (mm)  0.1      0.2      0.3      0.4      0.5
NUM_X                   42       21       14       9        9
NUM_Y                   30       15       10       8        7
NUM_Z                   32       16       11       8        7

\[
K_{ZX} = \frac{\displaystyle\sum_{j=1}^{N_c}\sum_{i=1}^{N_r}\sum_{k=2}^{L} (\Delta Z^{k,[i,j]}_{Z,PC1} - \mathrm{avg}_{\Delta Z,Z})\,[(\Delta X^{k,[i,j]}_{Z,A} - \Delta X^{k,[i,j]}_{Z,PC1}) - \mathrm{avg}_{\Delta XA-\Delta XP,Z}]}{\displaystyle\sum_{j=1}^{N_c}\sum_{i=1}^{N_r}\sum_{k=2}^{L} (\Delta Z^{k,[i,j]}_{Z,PC1} - \mathrm{avg}_{\Delta Z,Z})^2}, \qquad
B_{ZX} = \mathrm{avg}_{\Delta XA-\Delta XP,Z} - K_{ZX}\,\mathrm{avg}_{\Delta Z,Z}
\tag{31}
\]

where $\mathrm{avg}_{\Delta Z,Z}$ and $\mathrm{avg}_{\Delta XA-\Delta XP,Z}$ are the mean values of $\Delta Z^{k,[i,j]}_{Z,PC1}$ and $\Delta X^{k,[i,j]}_{Z,A} - \Delta X^{k,[i,j]}_{Z,PC1}$, respectively. In the same way, $K_{ZY}$ and $B_{ZY}$ can be estimated as follows:

\[
K_{ZY} = \frac{\displaystyle\sum_{j=1}^{N_c}\sum_{i=1}^{N_r}\sum_{k=2}^{L} (\Delta Z^{k,[i,j]}_{Z,PC1} - \mathrm{avg}_{\Delta Z,Z})\,[(\Delta Y^{k,[i,j]}_{Z,A} - \Delta Y^{k,[i,j]}_{Z,PC1}) - \mathrm{avg}_{\Delta YA-\Delta YP,Z}]}{\displaystyle\sum_{j=1}^{N_c}\sum_{i=1}^{N_r}\sum_{k=2}^{L} (\Delta Z^{k,[i,j]}_{Z,PC1} - \mathrm{avg}_{\Delta Z,Z})^2}, \qquad
B_{ZY} = \mathrm{avg}_{\Delta YA-\Delta YP,Z} - K_{ZY}\,\mathrm{avg}_{\Delta Z,Z}
\tag{32}
\]

where $\mathrm{avg}_{\Delta YA-\Delta YP,Z}$ is the mean value of $\Delta Y^{k,[i,j]}_{Z,A} - \Delta Y^{k,[i,j]}_{Z,PC1}$.
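Each (K, B) pair in Eqs. (29)-(32) is an ordinary least-squares straight-line fit of a reconstruction error against a driving coordinate increment. A generic Python helper (our own sketch, not the authors' code; the example variable names are hypothetical) could read:

```python
import numpy as np

def fit_compensation(drive, residual):
    """Least-squares slope K and intercept B such that
    residual ~ K * drive + B, the common pattern of Eqs. (29)-(32).

    drive    : e.g. dY_P increments (Eqs. 29/30) or dZ_PC1 (Eqs. 31/32)
    residual : e.g. dX_A - dX_P, the reconstruction error to compensate
    """
    drive = np.ravel(drive)
    residual = np.ravel(residual)
    d = drive - drive.mean()
    K = np.sum(d * (residual - residual.mean())) / np.sum(d ** 2)
    B = residual.mean() - K * drive.mean()
    return K, B

# e.g. KYX, BYX = fit_compensation(dY_YP, dX_YA - dX_YP)
```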

6. Results and analysis

6.1. Parameter estimation

The experiments are based on the conditions listed in Table 3, with the interval distance varying from 0.1 mm to 0.5 mm. In Table 3, the symbols NUM_X, NUM_Y and NUM_Z represent the numbers of stereo image pairs in the X-, Y- and Z-axis stereo image sequences, respectively. The results of parameter estimation are listed in Table 4, where Mean, Std and Re denote the mean value, standard deviation and relative error of each parameter. The parameters $\theta_X$, $E_X$, $E_Y$ and $E_Z$ have small relative errors, whereas the parameters $K_{YX}$, $B_{YX}$, $B_{YZ}$, $K_{ZX}$, $B_{ZX}$, $K_{ZY}$ and $B_{ZY}$ have relative errors of 2–5%.

6.2. Residual analysis

The model described by Eq. (19) is named Model 0; it contains no compensation parameters. We divide the parameters of the residual compensation model into two groups. The first group consists of $K_{YX}$, $B_{YX}$, $K_{YZ}$ and $B_{YZ}$; the second group consists of $K_{YX}$, $B_{YX}$, $K_{YZ}$, $B_{YZ}$, $K_{ZX}$, $B_{ZX}$, $K_{ZY}$ and $B_{ZY}$. The two groups of parameters represent two residual compensation models, named Model 1 and Model 2, respectively. Model 0, Model 1 and Model 2 are used to reconstruct the world coordinates of the grid points in the X-, Y- and Z-axis stereo image sequences of Section 6.1; the axial and radial reconstruction precisions of the three models are compared, and their compensation results are observed.

We capture another stereo image sequence under the Test 1 conditions of Table 3. The displacements of the grid points in the X, Y and Z directions lie in the ranges [−4.1 mm, 0 mm], [0 mm, 2.9 mm] and [−2.25 mm, 0 mm], respectively, and the numbers of grid points in the X-, Y- and Z-axis stereo image sequences are 294, 210 and 161, respectively. The model parameters of Test 1 are used for world coordinate reconstruction, and $E^{t,[i,j]}_{AXIS,U}$, defined in Section 5.2, describes the axial and radial reconstruction precisions.

Fig. 13 is the scatter diagram of the residual distribution for Model 0. Dashed lines represent the boundaries of the RA3

zone. Most of the scattered points of $E_{X,X}$, $E_{X,Y}$ and $E_{X,Z}$ lie in the RA3 zone. Some scattered points of $E_{Y,X}$ and $E_{Y,Z}$ fall outside the RA3 zone when $|\Delta Y_{Y,A}| > 2.2$ mm (for $E_{Y,X}$) and $|\Delta Y_{Y,A}| > 0.5$ mm (for $E_{Y,Z}$), while most scattered points of $E_{Y,Y}$ lie in the RA3 zone. Some scattered points of $E_{Z,X}$ and $E_{Z,Y}$ fall outside the RA3 zone, and most scattered points of $E_{Z,Z}$ lie in the RA5 zone. These observations confirm that $E_{Y,X}$, $E_{Y,Z}$, $E_{Z,X}$ and $E_{Z,Y}$ should be compensated.

Fig. 14 is the scatter diagram of the residual distribution for Model 1. Most scattered points of $E_{Y,X}$ and $E_{Y,Z}$ now lie in the RA3 zone, so the compensation of $E_{Y,X}$ and $E_{Y,Z}$ is effective. However, some scattered points of $E_{Z,X}$ and $E_{Z,Y}$ remain far outside the RA3 boundaries, which shows that the first group of parameters cannot fit all the residuals. Fig. 15 is the scatter diagram of the residual distribution for Model 2. Most scattered points of all error items except $E_{Z,Z}$ lie in the RA2 or RA3 zones, and most scattered points of $E_{Z,Z}$ lie in the RA5 zone; the second group of parameters therefore compensates all the residuals effectively.

We calculate the S values of Model 2 over the different residual intervals, as shown in Fig. 16. The values of all error items except $S_{Z,Z}$ are already close to or above 95% in the RA2 zone, which shows that the residual data are mainly distributed in the RA2 or RA3 zones and that the corresponding reconstruction precisions meet the requirements. The values of $S_{Z,Z}$ are 89.8% over RA4 and 96.3% over RA5, so the residual data of $E_{Z,Z}$ are mainly distributed in the RA5 zone and the reconstruction precision of the Z coordinates meets the requirement as well.

6.3. Comparison of our model with the traditional pinhole camera model

To compare our model with the TPCM while guaranteeing the reliability of the data reconstruction, we directly use the Halcon software to reconstruct the world coordinates of the grid points in the X-, Y- and Z-axis stereo image sequences of Section 6.2.


Fig. 13. Distribution of the residuals of the world coordinates calculated by Model 0. (a), (b) and (c) are the distributions of the residuals with respect to the X direction; (d), (e) and (f) with respect to the Y direction; (g), (h) and (i) with respect to the Z direction.

Fig. 14. Distribution of the residuals of the world coordinates calculated by Model 1. (a), (b) and (c) are the distributions of the residuals with respect to the X direction; (d), (e) and (f) with respect to the Y direction; (g), (h) and (i) with respect to the Z direction.

The Halcon software provides calibration and 3D reconstruction modules for the TPCM, which output world coordinates based on the TPCM. We calculate the residuals of the reconstructed data and compare our model with the TPCM. Fig. 17 is the scatter diagram of the residual distribution for the TPCM. The scattered points of $E_{X,X}$ and $E_{X,Y}$ are distributed within the RA3 zone, whereas the scattered points of $E_{X,Z}$ are distributed widely, with the maximum of $|E_{X,Z}|$ close to 0.15 mm. The maxima of $|E_{Y,X}|$, $|E_{Y,Y}|$, $|E_{Z,X}|$, $|E_{Z,Y}|$ and $|E_{Z,Z}|$ are 0.04 mm, 0.08 mm, 0.1 mm, 0.2 mm and 0.2 mm, respectively, and most scattered points of $E_{Y,Z}$ lie within the interval [−0.02 mm, 0.02 mm]. Comparing Fig. 17 with the results of Model 0, Model 1 and Model 2, the reconstruction precision of the TPCM is clearly lower than that of our model. A CMO-style SLM consists of two optical sub-systems that share one common main objective lens; the optical axes of the sub-systems and of the main objective lens are parallel and separate. An incident ray enters an optical sub-system at a position on the surface of the main objective lens, and this position is actually far from the optical center of the main objective lens. This imaging principle is inconsistent with the imaging requirement of the TPCM, which demands that rays pass through the optical center of the lens. Therefore, remarkable reconstruction errors occur when the TPCM is directly applied to an SLM microscopic vision system, as our experimental results have confirmed.


Fig. 15. Distribution of the residuals of the world coordinates calculated by Model 2. (a), (b) and (c) are the distributions of the residuals with respect to the X direction; (d), (e) and (f) with respect to the Y direction; (g), (h) and (i) with respect to the Z direction.

Fig. 16. Distribution of the $S_X$, $S_Y$ and $S_Z$ values based on different residual intervals, with respect to Model 2.

Table 4
Results of parameter calibration based on five groups of experiments.

Parameters    Test 1    Test 2    Test 3    Test 4    Test 5    Mean      Std      Relative error (%)
θX (deg)      3.0400    3.0347    3.0297    3.0282    3.0407    3.0347    0.0057   0.2
EX (mm/pix)   −0.0030   −0.0030   −0.0030   −0.0030   −0.0030   −0.0030   0        0
EY (mm/pix)   −0.0030   −0.0030   −0.0030   −0.0030   −0.0030   −0.0030   0        0
EZ (mm/pix)   0.0125    0.0125    0.0125    0.0125    0.0125    0.0125    0        0
KYX           −0.0050   −0.0050   −0.0052   −0.0052   −0.0052   −0.0051   0.0001   2.1
BYX (mm)      0.0016    0.0015    0.0017    0.0015    0.0019    0.0016    0.0002   4.2
KYZ           −0.0214   −0.0213   −0.0213   −0.0215   −0.0214   −0.0214   0.0001   0.4
BYZ (mm)      0.0015    0.0013    0.0009    0.0009    0.0011    0.0011    0.0003   2.9
KZX           −0.0056   −0.0057   −0.0057   −0.0059   −0.0054   −0.0057   0.0002   3.2
BZX (mm)      −0.0011   −0.0011   −0.0013   −0.0013   −0.0011   −0.0012   0.0001   3.3
KZY           −0.0141   −0.0143   −0.0144   −0.0147   −0.0142   −0.0143   0.0002   2.1
BZY (mm)      −0.0037   −0.0041   −0.0041   −0.0044   −0.0046   −0.0042   0.0003   4.2


Fig. 17. Distribution of the residuals of the world coordinates calculated by the traditional pinhole camera model. (a), (b) and (c) are the distributions of the residuals with respect to the X direction; (d), (e) and (f) with respect to the Y direction; (g), (h) and (i) with respect to the Z direction.


Acknowledgment

This work was supported by the National Natural Science Foundation of China through Grant No. 51175009.

7. Conclusion

In order to grip copper wires with a diameter of 100 µm, we develop a micro-gripping system based on an SLM microscopic vision system, with the focus of this paper on microscopic vision modeling. We propose a novel method for microscopic vision modeling based on direct mapping analysis. The method consists of four parts: image distortion correction, disparity distortion correction, an initial vision model and a residual compensation model. First, a method of image distortion correction is proposed: image data are obtained from stereo images of a calibration sample, and image distortions are corrected by linear or polynomial fitting. Second, a method of disparity distortion correction is proposed: the relationship between disparity distortion and image coordinates is observed by analyzing stereo images of the calibration sample, and disparity distortion is corrected by linear or polynomial fitting. Third, we derive the initial vision model and the residual compensation model by analyzing the relationships between the different coordinate frames and the residuals of the reconstructed data. The results show that the precisions of X, Y and Z coordinate reconstruction are about 0.01 mm, 0.01 mm and 0.015 mm, respectively, when the reconstruction ranges are limited to 4.1 mm, 2.9 mm and 2.25 mm; this precision meets our requirements. Other useful results, i.e., the principles of image acquisition and the precision evaluation of a microscopic vision model over a large field of view, are obtained as well. Comparison of our vision model with the TPCM shows that remarkable reconstruction errors arise when the TPCM is directly applied to an SLM microscopic vision system, and that the TPCM has a much lower reconstruction precision than the model proposed in this paper. The proposed microscopic vision modeling method reduces the difficulty of parameter calibration and improves the reconstruction precision through the residual compensation model. It has demonstrated high flexibility and efficiency, and can be applied to other SLM vision systems.

References

Kim, N.H., Bovik, A.C., 1990. Shape description of biological objects via stereo light microscopy. IEEE Trans. Syst. Man Cybern. 20, 475–489.
Windecker, R., Fleischer, M., Tiziani, H.J., 1997. Three-dimensional topometry with stereo microscopes. Soc. Photo Opt. Instrum. Eng. 36, 3372–3376.
Eckert, L., Grigat, R.R., 2001. Biologically motivated, precise and simple calibration and reconstruction using a stereo light microscope. IEEE Int. Conf. Comput. Vis. 2, 94–101.
Yuezong, W., Chong, L., Liding, W., Xiaodong, W., Zhan, S., 2003. Parameter calibration of stereo light microscope in micromanipulation imaging system. Chin. J. Mech. Eng. 39, 81–86.
Hasler, M., Haist, T., Osten, W., 2012. Stereo vision in spatial-light-modulator-based microscopy. Opt. Lett. 37, 2238–2240.
Maodong, R., Jin, L., Zhengzong, T., Xiang, G., Leigang, L., Miao, Y., 2014. Microscopic three-dimensional deformation measurement system based on digital image correlation. Acta Opt. Sin. 34, 1–10.
Danuser, G., 1999. Photogrammetric calibration of a stereo light microscope. J. Microsc. 193, 62–83.
Lee, S.J., Kim, K., Deok-Ho, K., 2001. Recognizing and tracking of 3D-shaped micro parts using multiple visions for micromanipulation. 2001 Int. Symp. Micromechatron. Hum. Sci., 203–210.
Sano, T., Yamamoto, H., Endo, H., 1998. A visual feedback system for micromanipulation with stereoscopic microscope. IEEE Instrum. Meas. Technol. Conf., 1127–1132.
Yamamoto, H., Sano, T., 2002. Study of micromanipulation using stereoscopic microscope. IEEE Trans. Instrum. Meas. 51, 182–187.
Larsson, L., Sjodahl, M., Thuvander, F., 2004. Microscopic 3-D displacement field measurements using digital speckle photography. Opt. Lasers Eng. 41, 767–777.
Wei, P., Yongying, Z., Zheng, X., Chenxiang, W., 2011. Accuracy analysis of SLM based micro stereo vision system. Int. Conf. Syst. Sci. Eng., 363–368.
Yongying, Z., Wei, P., Zheng, X., Chenxiang, W., 2011. Micro stereo occlusion correction algorithm. 4th Int. Congr. Image Signal Process., 1433–1437.
Rogerio, R., Marcin, B., Raphael, S., Eric, M., Taylor, R., 2012. Vision-based proximity detection in retinal surgery. IEEE Trans. Biomed. Eng. 59, 2291–2301.
Guangjun, Z., Weixian, L., Zhenzhong, W., Zhipeng, G., Yali, W., 2012. Microscopic vision based on the adaptive positioning of the camera coordinate frame. Microsc. Res. Tech. 75, 1281–1291.
Guangjun, Z., Zhengzhong, W., Weixian, L., Zhipeng, C., Yali, W., 2013. Microscopic vision measurement method based on adaptive positioning of camera coordinate frame. US Patent No. US 2013/0058581 A1.
Weixian, L., Zhenzhong, W., Guangjun, Z., 2014. Affine calibration based on invariable extrinsic parameters for stereo light microscope. Opt. Eng. 53, 1–7.