ARSAC: Efficient model estimation via adaptively ranked sample consensus

Neurocomputing 328 (2019) 88–96

Rui Li a, Jinqiu Sun b,∗, Dong Gong a, Yu Zhu a, Haisen Li a, Yanning Zhang a

a School of Computer Science, Northwestern Polytechnical University, Xi'an, China
b School of Astronautics, Northwestern Polytechnical University, Xi'an, China

Article history: Received 15 November 2017; Revised 20 February 2018; Accepted 28 February 2018; Available online 20 August 2018

Keywords: RANSAC; Robust model estimation; Efficiency; Adaptively ranked measurements; Non-uniform sampling; Geometric constraint

Abstract

RANSAC is a popular robust model estimation algorithm in various computer vision applications. However, the speed of RANSAC declines dramatically as the inlier rate of the measurements decreases. In this paper, a novel Adaptively Ranked Sample Consensus (ARSAC) algorithm is presented to boost the speed and robustness of RANSAC. The algorithm adopts non-uniform sampling based on ranked measurements to speed up the sampling process. Instead of a fixed measurement ranking, we design an adaptive scheme which updates the ranking of the measurements, so as to incorporate high-quality measurements into samples at high priority. At the same time, a geometric constraint is imposed during the sampling process to select measurements with a scattered distribution in the images, which alleviates degenerate cases in epipolar geometry estimation. Experiments on both synthetic and real-world data demonstrate the superiority in efficiency and robustness of the proposed algorithm compared to state-of-the-art methods.

1. Introduction

Model estimation is a crucial step in various computer vision problems, including structure from motion (SFM), image retrieval, and simultaneous localization and mapping (SLAM). One of the main challenges in model estimation is that the measurements (often points or correspondences) are unavoidably contaminated with outliers, due to the imperfection of current measurement acquisition algorithms. These contaminated measurements can lead to an arbitrarily bad model, with a catastrophic impact on the final result in real-world applications. To identify and exclude the outliers, a variety of robust model fitting techniques [1–8] have been studied for years; they can be roughly categorized into two groups. One group of methods [3,9–11] analyzes each measurement by its residuals under different model hypotheses and then distinguishes inliers from outliers. These methods are usually used in multi-model fitting problems and are limited by their requirement for abundant model hypotheses. The other group generates model hypotheses by sampling the measurement set and then finds the best model by optimizing a well-designed cost function. The cost function can be solved by random sampling [2,6–8,12] or by branch-and-bound (BnB) algorithms, which

∗ Corresponding author. E-mail addresses: [email protected] (R. Li), [email protected] (J. Sun), [email protected] (D. Gong), [email protected] (Y. Zhu), [email protected] (H. Li), [email protected] (Y. Zhang). https://doi.org/10.1016/j.neucom.2018.02.103


guarantee an optimal solution at the cost of, however, much more computation time [4,5,13]. In real-world applications, these algorithms are carefully selected according to the usage. Among them, the random sample consensus (RANSAC) algorithm [1] is one of the most popular for robust model estimation, broadly recognized for its simplicity and effectiveness. The algorithm follows a vote-and-verify paradigm. In each trial, it draws a minimal measurement set by random sampling and computes the model parameters from this sample to form a model hypothesis. Every model hypothesis then undergoes a verification step: the residuals of all measurements under the current model parameters are computed, and the measurements whose residuals fall below a certain threshold are grouped as the consensus set of that hypothesis. After sufficient trials of model generation and verification, RANSAC selects the hypothesis with the largest consensus set as the optimal hypothesis, and the final model parameters are calculated from its consensus set. Though proven effective, RANSAC still needs improvement for better efficiency. To obtain at least one outlier-free sample with confidence η0, we must draw at least k trials,

\[ k \ge \frac{\log(1-\eta_0)}{\log(1-\epsilon^m)}, \tag{1} \]

where ε is the fraction of inliers in the dataset and m denotes the size of the minimal sample set. Obviously, as ε declines, k grows dramatically, which places a heavy computational burden on the algorithm.
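For concreteness, the short sketch below evaluates Eq. (1); the function name and interface are ours, not part of the paper.

```python
import math

def min_trials(inlier_rate: float, m: int, confidence: float = 0.95) -> int:
    """Smallest k satisfying Eq. (1): with probability `confidence` (eta_0),
    at least one of the k minimal samples is outlier-free."""
    p_outlier_free = inlier_rate ** m  # chance that a single sample is all-inlier
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_outlier_free))

# For fundamental matrix estimation (m = 7): min_trials(0.5, 7) gives 382 trials,
# while min_trials(0.2, 7) already exceeds 230,000 trials.
```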


This phenomenon results from the uniform sampling strategy of RANSAC, which samples every measurement with equal probability: valuable information indicating true inliers, which could be used to cut down the running time, is overlooked. In this paper, we address this problem and present ARSAC, which aims to boost the efficiency of the algorithm. We adopt a non-uniform sampling strategy in which the samples of each trial are carefully chosen to improve the convergence speed. An adaptively ranked sampling algorithm is proposed for drawing new samples: the ranking of the measurements is first computed from their individual quality and then updated by analyzing the current best model hypothesis. This ranking gives a glimpse of the probability of each measurement being an inlier, which we leverage when incorporating measurements into new samples. In addition, during non-uniform sampling, measurements of higher quality tend to cluster in the images, which creates a risk of degeneracy when estimating epipolar geometry. To avoid this, a geometric constraint enforces that the sampled measurements have a scattered distribution in the images; this constraint is shown to be effective in the experimental section. Our main contributions can be summarized as follows:

• We propose a method that adopts non-uniform sampling based on adaptively ranked measurements. It updates the ranking of the measurements in each trial and incorporates the most prominent measurements into the sample, which is experimentally shown to achieve high efficiency.
• We design a simple geometric constraint that samples measurements with a scattered distribution in the images. It effectively alleviates degenerate cases in epipolar geometry estimation and improves the overall robustness of the algorithm.
• We develop ARSAC, an efficient algorithm that achieves better speed and robustness than other efficient RANSAC-like algorithms.

2. Related work

Due to the inefficiency of RANSAC, a number of schemes have been proposed to improve its performance; the basic ideas focus mainly on the sampling and the model verification stages.

For the sampling stage, instead of picking measurements into samples at random, non-uniform sampling is conducted by leveraging prior knowledge about the measurements. NAPSAC [14] selects measurements that lie within a hypersphere of radius r in high-dimensional space. It can handle the problem of poor sampling in higher-dimensional spaces, but may cause degeneracy because the selected measurements can be too close to each other. PROSAC [15] ranks the measurements by their quality and then draws samples non-uniformly, in descending order of quality. However, a fixed quality order cannot precisely reflect the probability of a measurement being a true inlier, especially in scenes whose inlier feature points show no apparent quality advantage over the outliers. PROSAC also needs safeguards against degeneracy [16] during non-uniform sampling. GroupSAC [17] divides the measurements into different groups and samples from the most prominent group. It performs well on well-grouped images, but relies on the effectiveness of the grouping function, which is supposed to divide the scene into distinct groups.

For the model verification stage, a subset of all measurements is often chosen for verification. Preemptive methods [18,19] follow the paradigm that a very small number of randomly selected measurements undergo a rough verification to filter out extremely invalid models; however, this still requires pairwise matching, which takes a lot of time on large-scale datasets. The SPRT test [20,21] applies Wald's theory of sequential testing [22] to model verification, with an adaptive likelihood ratio as the criterion. However, since the conclusion is usually drawn before all measurements are verified, this method may misjudge true inliers.

Among algorithms designed to improve accuracy and robustness, some methods [23,24] extend the inlier set by conducting extra RANSAC trials based on the current best consensus set, exploring potential inliers that are not strictly consistent with the current best model. To handle degeneracy, QDEGSAC [25] uses several runs of RANSAC to find the most constraining model, which minimizes the probability of degeneracy. However, these methods require further runs of RANSAC, which helps little for efficiency.

Our method adopts non-uniform sampling for efficiency by maintaining an ordered measurement set. Different from the methods above, we propose an adaptive scheme that updates the ranking of the measurements to obtain better samples, which is crucial for the fast convergence of the algorithm. At the same time, to alleviate degeneracy during the sampling process, we select measurements with a scattered distribution, which requires no extra trials compared with related algorithms.

3. Adaptively ranked sample consensus algorithm

To improve the efficiency of RANSAC, ARSAC adopts non-uniform sampling in which measurements of high quality are sampled first. Different from methods with a fixed ranking of measurements [15], ARSAC iteratively updates the ranking, which offers a better description of the measurements' quality. At the same time, a geometric constraint is imposed on the sampling process. Let X = {x_i}, i = 1, …, N, be the dataset with N measurements, where i indexes the measurements. Mj is the minimal sample set of size m, where j indexes the trials. θj is the model hypothesis generated by Mj, and its consensus set is denoted Ij. U is the ranked measurement set, and Un denotes its top n measurements. The whole procedure of ARSAC is described in Algorithm 1.

Algorithm 1 Adaptively ranked sample consensus algorithm.
Input: Ranked measurement set U, minimal sample length m, length of the measurement set N;
Output: The best model parameter θj;
1: Initialize j ← 0, n ← m, Un ← U(1:n), αn ← 0;
2: while stopping conditions are not achieved do
3:    j ← j + 1;
4:    Update the ranking of U and select the current sample Mj (see Section 3.1.2);
5:    Compute the current model parameter θj and its consensus set Ij;
6:    Set θj to the model parameter with the largest consensus set;
7:    Set the stopping criteria of the current iteration (see Section 3.2);
8: return θj.
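For orientation, a minimal Python sketch of this loop is given below. All names are ours; the adaptive re-ranking of Section 3.1.2 is deliberately reduced to a plain progressive schedule, and only the maximality test of Eq. (6) (Section 3.2) is used as the stopping condition, so this illustrates the control flow rather than the full method.

```python
import numpy as np

def arsac_sketch(U, m, solve_model, residuals, tol, max_trials=10000):
    """Illustrative skeleton of Algorithm 1. `U` is an array of measurements
    ranked best-first; `solve_model` and `residuals` are problem-specific
    callables (e.g., a 7-point fundamental matrix solver and Sampson error)."""
    best_theta, best_consensus = None, np.empty(0, dtype=int)
    n = m                                        # current top-n subset U_n = U[:n]
    for k in range(1, max_trials + 1):
        if n > m:                                # progressive sample (Section 3.1.1):
            idx = np.append(                     # the n-th measurement plus m - 1
                np.random.choice(n - 1, m - 1, replace=False), n - 1)
        else:
            idx = np.arange(m)
        theta = solve_model(U[idx])              # hypothesis from the minimal sample
        consensus = np.flatnonzero(residuals(theta, U) < tol)
        if consensus.size > best_consensus.size:
            best_theta, best_consensus = theta, consensus
        n = min(n + 1, len(U))                   # let U_n grow (re-ranking omitted)
        eps = best_consensus.size / len(U)       # running inlier-rate estimate
        if eps > 0 and k * np.log1p(-eps ** m) <= np.log(0.05):
            break                                # maximality criterion, Eq. (6)
    return best_theta, best_consensus
```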

3.1. Adaptively ranked progressive sampling

In ARSAC, we adopt non-uniform sampling based on an adaptively ranked measurement updating scheme. We first describe the sampling process when a ranked measurement set is given, and then introduce the updating strategy for the measurement ranking together with the geometric constraint.


3.1.1. Non-uniform sampling with ranked measurements

Given a set of ranked measurements U, it is difficult to decide which measurements should be selected, or how many times a measurement should be selected. We therefore use the progressive scheme introduced by Chum and Matas [15]: a subset Un containing the n top-ranked measurements is chosen to draw a minimal-sized sample, and Un grows gradually as the sampling process continues. This guarantees that, in the worst case where the ranking of the measurements is totally arbitrary, ARSAC performs no worse than RANSAC. Let SN be the total number of samples drawn in standard RANSAC, with a default value of 85,000 in this paper, and let Sn denote the average number of those samples that contain only measurements from Un:

\[ S_n = S_N \frac{\binom{n}{m}}{\binom{N}{m}} = S_N \prod_{i=0}^{m-1} \frac{n-i}{N-i}. \tag{2} \]

Since the samples counted by Sn−1 are also counted by Sn, the number of new samples S′n that must be drawn for the current Un is

\[ S'_n = \lceil S_n \rceil - \lceil S_{n-1} \rceil, \tag{3} \]

where ⌈·⌉ denotes the ceiling operation. Each new sample drawn from Un follows the principle that the n-th measurement of Un must be selected, while the remaining m − 1 measurements are chosen at random from Un−1. Once S′n samples have been drawn under the current Un, n increases by 1 and Un expands by including the best inlier from the remaining measurements U\Un.
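A small sketch of the growth schedule implied by Eqs. (2) and (3); the function and variable names are ours.

```python
from math import ceil, prod

def growth_schedule(N: int, m: int, S_N: int = 85000) -> dict:
    """S'_n for each subset size n (Eqs. (2)-(3)): how many samples to draw
    from U_n before letting it grow to U_{n+1}."""
    def S(n):
        # Eq. (2): expected number of the S_N RANSAC samples drawn only from U_n.
        return S_N * prod((n - i) / (N - i) for i in range(m))
    return {n: ceil(S(n)) - ceil(S(n - 1)) for n in range(m, N + 1)}

# With N = 1000 measurements and minimal sample size m = 7, the schedule draws
# only a handful of samples while n is small and progressively more as U_n
# approaches the full set, mirroring plain RANSAC in the worst case.
```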

3.1.2. Measurements updating

In the measurement updating process, we focus on bringing high-quality measurements into model estimation so as to obtain accurate results as fast as possible. An adaptively ranked measurement updating strategy is proposed for this purpose. At the same time, to reduce degenerate cases in epipolar geometry estimation, a geometric constraint guides the measurement updating scheme. Experiments in Section 4 demonstrate the effectiveness of both components.

Adaptively ranked measurement updating: Measurement updating in ARSAC is tied to the sampling process. The ranking of the measurements is initialized by evaluating each measurement in isolation. In this paper, we use SIFT [26] to detect and match feature points, and we evaluate the quality of an individual correspondence by Ri, the ratio of the second-shortest matching distance to the shortest matching distance of measurement i. The initial ranking is generated in descending order of Ri and reflects the quality of the measurements in terms of their feature correspondences. However, this ranking should not be considered fully reliable. As shown in Fig. 1(a), a dominant proportion of the measurements have Ri close to 1, which indicates that the shortest distance shows no distinct advantage over the second-shortest one. So even when such measurements sit near the front of the initial ranking, their reliability as true inliers remains questionable; they may well be outliers, see Fig. 1(b). To handle this problem, we use the quality ranking of the feature correspondences only as an initialization and iteratively update it to obtain better samples. The first sample consists of the top m measurements of the initial ranking; afterwards, the rest of the ranking is iteratively updated whenever Un is about to expand (see Section 3.1.1 for details of the sampling process), using the information provided by the current best model. We assume that the current best model θj in the j-th trial is more reliable for evaluating the quality of the remaining measurements U\Un than the initial ranking.

Fig. 1. Initially ranked measurements by feature quality. (a) shows all measurements of an image pair ranked by their feature quality (distance ratio) in descending order, where most measurements have distance ratios near 1. (b) shows some measurements (correspondences) whose positions are indicated in (a); green lines denote correct correspondences and red lines wrong ones. The figure demonstrates that it is unreliable to identify inliers merely by their positions in the initial ranking when their distance ratios are close to 1. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

In ARSAC, the residuals of the measurements in U\Un are calculated under the current best model θj in each trial. The measurement with the lowest residual is taken as the current best inlier xj, as it shows the highest consistency with the current measurements in Un. Once the best inlier is selected, its position in the ranking is used to judge the reliability of the current ranking. Considering that xj sometimes lies not far from the best measurement of the initial ranking (i.e., the first element of U\Un), a safe region is designed to avoid unnecessary insertion operations and to preserve the structure of the initial ranking whenever it appears reliable in the current trial. For the measurements in U\Un, the safe region is defined as the top ω measurements of the set, with

\[ \omega = \begin{cases} 0.05 \times N_{U \backslash U_n} & \text{if } N_{U \backslash U_n} > 100, \\ 0 & \text{otherwise}, \end{cases} \tag{4} \]

where NU\Un denotes the number of measurements in U\Un. If xj lies within the safe region, the current ranking is considered reliable and the measurements are not updated, which avoids unnecessary operations. If xj lies outside the safe region of U\Un, the current ranking is considered unreliable and xj is inserted at the beginning of U\Un, yielding an updated U that better describes the latent inliers. The updating process of our algorithm is illustrated in Fig. 2. This step offers another perspective for evaluating the quality of the measurements: they are rearranged into a new order according to their residuals. It shows high potential to reach the largest consensus set within very few trials, which contributes to both the accuracy and the efficiency of model estimation.
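The safe-region rule can be summarized in a few lines; the following sketch uses our own names and treats U\Un as a Python list of indices ordered by the current ranking.

```python
import numpy as np

def update_rest_ranking(rest, residuals_rest):
    """Safe-region update (Eq. (4)) of the remaining measurements U \\ U_n.
    `rest`: indices of U \\ U_n in current ranking order; `residuals_rest`:
    their residuals under the current best model theta_j."""
    j = int(np.argmin(residuals_rest))           # current best inlier x_j
    omega = int(0.05 * len(rest)) if len(rest) > 100 else 0
    if j < omega:                                # x_j inside the safe region:
        return rest                              # ranking deemed reliable, keep it
    # otherwise the ranking is deemed unreliable: move x_j to the front
    return [rest[j]] + rest[:j] + rest[j + 1:]
```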


Fig. 2. Updating of the measurement ranking. The best inlier xj under the current best model is inserted at the beginning of U\Un if it does not lie in the safe region.

Geometric constraint: We further enhance the sampling strategy with a geometric constraint to alleviate degeneracy. Inspired by the situations illustrated in [27], we observe that measurements leading to degeneracy tend to gather in limited areas of the images, while measurements with a scattered distribution are more robust against degeneracy. Fewer degenerate cases can thus be obtained directly by sampling measurements with a broader distribution. In ARSAC, we introduce a constraining circle CUn, which contains all measurements in Un with the smallest radius. Its center cUn is the centroid of the measurements in Un, i.e., the mean of their coordinates, and its radius rUn is set to

\[ r_{U_n} = \max_{x_i \in U_n} \| x_i - c_{U_n} \|_2 + \lambda, \tag{5} \]

where λ determines the extra expansion of the circle. A new measurement to be included in Un must lie outside CUn in the image, so as to avoid the degeneracy that results from clustered measurements. This geometric constraint, which copes with the challenges of epipolar geometry estimation by sampling measurements with a larger spatial spread, is simple yet effective, as illustrated in Section 4.2.

To combine the two ranking updating strategies above, we slightly alter both to trade off high efficiency against few degenerate cases. The whole updating process of ARSAC is summarized in Algorithm 2. Parameter tuning is also crucial to the final performance: ideal behavior means obtaining a highly accurate result in very limited time. We mainly discuss two parameters, nb and λ, through which we strike a balance between accuracy and efficiency. nb is the maximum number of comparisons ARSAC performs to choose a measurement with a comparatively low residual and broad distribution; its value influences the quality of the measurements and the efficiency of the algorithm. In practice, for image pairs that are liable to produce wrong results (those containing repetitive patterns or dominant planes), we set nb to large values in the range [10, 25] to obtain more reliable results. For image pairs with distinct textures and little noise, smaller values in the range [1, 5] are recommended for higher efficiency. λ is designed for selecting measurements robust to degeneracy: the larger λ, the more robust the selected measurements, but the more time is consumed, so λ should be adapted to the data. For typical 1K × 1K images, if a dominant plane exists in the image pair, a larger λ (20–50 pixels) forces the algorithm to sample more scattered measurements, which helps obtain a non-degenerate model at the cost of extra time. If there is no dominant plane, a small λ (2–10 pixels) is enough to obtain reliable results.

Algorithm 2 Ranking updating algorithm.
Input: Current ranked measurement set U, subset Un, λ, current best model θj, number of selected measurements nb;
Output: Newly updated measurement set U, expanded subset Un+1;
1: Calculate the top nb best inliers {xz}, z = 1, …, nb, under θj;
2: Calculate the parameters of the constraining circle CUn;
3: for z = 1 to nb do
4:    if xz lies inside CUn in the images then
5:        continue;
6:    if xz lies out of the safe region of U\Un then
7:        Move xz to the beginning of U\Un to update U;
8:        Append xz to Un to form Un+1; break;
9:    else
10:       Append the first measurement of U\Un to Un to form Un+1; break;
11: if no selected measurement has been appended yet then
12:    Append x1 to Un to form Un+1 and update U;
13: return the ranked measurement set U and the updated subset Un+1.
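Under our reading of Algorithm 2, the combined update can be sketched as below; the names, the array layout, and the return convention are ours, and it builds on the safe-region rule shown earlier.

```python
import numpy as np

def choose_expansion(Un_xy, rest_xy, residuals_rest, n_b, lam):
    """Sketch of Algorithm 2: decide which measurement of U \\ U_n to append
    to U_n. `Un_xy`/`rest_xy` hold image coordinates (k x 2 arrays), the
    latter in current ranking order. Returns (index into U \\ U_n,
    move-to-front flag)."""
    c = Un_xy.mean(axis=0)                             # center of C_{U_n}
    r = np.linalg.norm(Un_xy - c, axis=1).max() + lam  # radius, Eq. (5)
    omega = int(0.05 * len(rest_xy)) if len(rest_xy) > 100 else 0
    cand = np.argsort(residuals_rest)[:n_b]            # top n_b best inliers
    for z in cand:
        if np.linalg.norm(rest_xy[z] - c) <= r:
            continue                                   # inside C_{U_n}: degeneracy risk
        if z >= omega:                                 # outside the safe region:
            return int(z), True                        # move x_z to the front, append it
        return 0, False                                # ranking reliable: append first of U\U_n
    return int(cand[0]), True                          # fallback: best inlier x_1
```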

3.2. Stopping criteria

To obtain an optimal model within a limited number of trials, the stopping criteria of ARSAC are composed of three elements: a maximality criterion, a non-randomness criterion, and a geometric constraint criterion. The maximality criterion guarantees that, after kη trials, the probability that another model with a larger consensus set exists falls below a threshold η:

\[ (1-\epsilon^m)^{k_\eta} \le \eta, \tag{6} \]

where ε denotes the inlier rate of the measurements. The number of trials k of our algorithm must exceed kη before termination. In ARSAC, η is set to 0.05 and ε is updated in each trial.

The non-randomness criterion is designed to keep below a threshold Ψ the probability that a bad model is calculated yet supported by a random consensus set. The probability that a consensus set of size g is supported by random "inliers" follows the binomial distribution B(n, β):

\[ P^n(g) = \beta^{g-m} (1-\beta)^{n-g+m} \binom{n-m}{g-m}, \tag{7} \]

where β is the probability that a measurement in Un is misclassified as an inlier by a wrong model.


Fig. 3. The inlier rate ε among the top n measurements for different algorithms on the synthetic dataset. (a) RANSAC, which draws uniform samples; (b) PROSAC, which draws non-uniform samples with a fixed ranking; (c) ARSAC, which adopts adaptively ranked non-uniform sampling. The blue dashed line denotes the inlier rate of the whole dataset, which is 0.45. The green circles in (b) and (c) indicate the subset size at which the two algorithms terminate. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Thus, for each n, the minimal size of the consensus set, Lnmin, is

\[ L^n_{\min} = \min\Big\{ j : \sum_{g=j}^{n} P^n(g) < \Psi \Big\}. \tag{8} \]

To cut down the running time, we further assume that the subset Un is large enough for the distribution to be approximated by a Gaussian, according to the central limit theorem:

\[ B(n, \beta) \sim \mathcal{N}(\mu, \sigma), \tag{9} \]

where μ = nβ and σ = √(nβ(1 − β)). Eq. (8) can then be reformulated through the Chi-square distribution, so the minimal size of the consensus set becomes

\[ L^n_{\min} = m + \mu + \sigma \sqrt{\chi^2}, \tag{10} \]

where χ2 is determined by the threshold Ψ, whose default value is set to 0.05 in our experiments. The trials of ARSAC do not stop until the size of the best consensus set exceeds Lnmin.

The geometric constraint criterion is related to the geometric constraint introduced in Section 3.1.2. It judges whether the measurements in Un possess a scattered distribution after non-uniform sampling with the geometric constraint in epipolar geometry estimation. Let DUn be the bounding box of the measurements in Un, and Dtotal the bounding box of all measurements in the image. The algorithm terminates when the following condition is met in both images:

\[ \frac{D_{U_n}}{D_{\text{total}}} > r_{\text{range}}, \tag{11} \]

where rrange is the acceptable area ratio of the two bounding boxes. In our experiments, a larger rrange is preferable when the image pair contains a dominant plane; the area ratio of the dominant plane to the whole scene gives a rough guess for the value of rrange.
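As an illustration of the non-randomness bound, the sketch below evaluates Eq. (10). The names are ours, and the choice of one degree of freedom for the chi-square quantile is our assumption, since the paper does not state it.

```python
from math import sqrt
from scipy.stats import chi2

def min_consensus_size(n: int, m: int, beta: float = 0.05, psi: float = 0.05) -> int:
    """Gaussian approximation of the non-randomness criterion (Eqs. (7)-(10)):
    the smallest consensus set that a random model is unlikely (level psi)
    to collect by chance among n measurements."""
    mu = n * beta                        # mean of B(n, beta)
    sigma = sqrt(n * beta * (1 - beta))  # standard deviation of B(n, beta)
    # chi-square quantile; df=1 is an assumption, recovering the usual Gaussian bound
    return int(m + mu + sigma * sqrt(chi2.ppf(1.0 - psi, df=1)))
```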

4. Experiments

In this section, we conduct experiments on synthetic and real-world data to verify the efficiency and robustness of ARSAC. We evaluate the algorithms on two well-known model estimation problems: homography (H) estimation and fundamental matrix (F) estimation. Both require an effective outlier removal process to guarantee the accuracy of the final solution. A homography describes the transformation between two planar objects, while the fundamental matrix constrains the 3D spatial relationship between two views (epipolar geometry); hence, in real-world applications, degeneracy is one of the key challenges in fundamental matrix estimation. Both H and F can be estimated by least-squares methods; the minimal sample size is 4 for a homography and 7 for a fundamental matrix. As discussed in Section 2, several RANSAC-like algorithms have been designed for better efficiency in the sampling stage, so besides RANSAC we compare the proposed ARSAC with the following state-of-the-art algorithms: NAPSAC, PROSAC, and GroupSAC. We implement all algorithms in Matlab and run all evaluations on an Intel i7 CPU with 32 GB RAM.

4.1. Synthetic data

We first test the quality of the samples drawn by ARSAC. A synthetic dataset of 1000 measurements, contaminated by Gaussian noise with variance fixed to 2, is generated for fundamental matrix estimation; the inlier rate is set to 0.45. We run RANSAC, PROSAC, and ARSAC on it. As shown in Fig. 3, the inlier rate among the top n measurements for RANSAC fluctuates around the overall inlier rate, since RANSAC does not rank measurements and samples uniformly at random. Both PROSAC and ARSAC keep a high inlier rate in the top fraction of the ranked measurements. However, the inlier rate of PROSAC falls quickly as n grows, with ups and downs in its curve, which means its fixed ranking cannot effectively single out true inliers. ARSAC finds the largest number of top-ranked measurements with a very high inlier rate. This means that within limited sampling trials, ARSAC has the highest probability of drawing all-inlier samples, which strongly boosts the convergence speed of the algorithm.

Further experiments compare the overall performance of the different algorithms. We generate several datasets, each containing 1000 feature correspondences in the range [0, 1000] on both axes. Gaussian noise with σ varying from 0.5 to 2.0 is added to the correspondences, and the inlier ratio ε is set in the range [0.3, 0.6]. The initial rankings of the measurements are arranged manually. The results are given in Table 1, where datasets A and B simulate measurements under a fundamental matrix, while C and D simulate measurements under a specific homography. For each algorithm, the table lists the found inliers (I), the number of trials (k), and the total runtime (Time) in milliseconds; the error is the Sampson error [28], and all values are averaged over 500 executions.

As shown in Table 1, the number of RANSAC trials grows significantly as the inlier rate declines, resulting in the lowest efficiency. NAPSAC improves on RANSAC in many cases, but its performance is not stable enough, due to the configuration of the hypersphere radius. PROSAC speeds up RANSAC significantly, but it sometimes fails when the initial ranking of the measurements cannot truly reflect which measurements are inliers, as in case D. GroupSAC offers a good scheme for efficiency, but its performance is still limited. Among all the algorithms, ARSAC delivers solutions with the highest speed and the lowest error in most cases. It provides an adaptively ranked paradigm that brings the current best measurements into the sample to boost the convergence speed of the algorithm.


Table 1 Baseline comparison of ARSAC with other algorithms on synthetic data.

Dataset        Metric   RANSAC     NAPSAC     PROSAC    GroupSAC   ARSAC
A: ε = 0.59    I        588.32     562.53     579.95    566.95     554.54
               k        271.58     156.57     21.92     18.45      12.85
               Time     1453.73    880.25     259.27    190.18     186.37
               Error    0.25       2.24       3.24      3.21       2.25
B: ε = 0.35    I        313.55     309.34     302.20    290.05     295.90
               k        21439.80   7379.82    427.25    65.70      57.75
               Time     58367.53   24836.98   2589.36   383.48     391.63
               Error    0.76       0.81       1.83      2.65       1.14
C: ε = 0.55    I        530.76     536.49     536.21    517.70     539.00
               k        31.24      31.35      6.07      4.35       4.16
               Time     91.22      115.48     33.39     36.28      31.47
               Error    1.14       1.14       1.14      2.72       2.14
D: ε = 0.31    I        292.42     292.79     284.03    292.95     288.35
               k        336.22     276.68     225.73    10.10      8.70
               Time     710.09     625.63     583.72    37.65      64.89
               Error    1.12       1.12       2.37      1.98       1.13

Table 2 Baseline comparison on real-world data for homography estimation.

Dataset                  Metric   RANSAC     NAPSAC    PROSAC   GroupSAC   ARSAC
A: ε = 0.52, N = 2540    I        1281.10    1196.13   1189.24  1162.70    1134.20
                         k        93.55      63.90     18.27    8.45       8.00
                         Time     545.29     1500.35   169.73   289.89     91.5
                         Error    0.73       2.37      3.23     0.69       0.74
B: ε = 0.15, N = 514     I        73.30      68.14     68.70    66.73      67.90
                         k        12266.65   6847.9    15.38    7.46       13.15
                         Time     16474.05   7951.13   44.98    53.67      40.5
                         Error    0.40       9.39      8.41     1.63       0.40
C: ε = 0.34, N = 1967    I        632.10     603.53    426.03   409.62     448.5
                         k        482.05     313.10    12.18    8.9        11.25
                         Time     2218.73    2401.05   72.97    127.32     74.27
                         Error    0.80       11.54     3.34     1.65       1.34
D: ε = 0.12, N = 979     I        107.90     106.40    100.95   97.40      93.38
                         k        32585.20   10114     861.06   177.15     159.30
                         Time     73441.53   14825.28  2220.70  282.08     211.83
                         Error    0.47       6.53      1.27     1.47       0.46

4.2. Real-world data

The distribution and quality of measurements in real-world data are much more complicated than in synthetic data. To reflect the real performance of ARSAC, we therefore test the algorithm on real-world images. The datasets are provided by [29]; each presents challenges in terms of low inlier rate and degeneracy. In the experiments, datasets A–D are used for homography estimation, while datasets E–H are used for fundamental matrix estimation. Since the ground-truth inliers are unknown for real-world images, we approximate them by performing 10^6 trials of random sampling. The baseline comparisons are shown in Table 2 for homography estimation and Table 3 for fundamental matrix estimation. For each algorithm, the tables list the found inliers (I), the number of trials (k), and the total runtime (Time) in milliseconds; the error is the Sampson error [28], and the mean value over 500 executions is reported.

As shown in Tables 2 and 3, in most cases ARSAC requires fewer trials and less time than the other algorithms, while its average error stays at the lowest level. It is worth noticing that RANSAC can find more inliers and estimate models with lower error, but this performance builds on the very large number of trials we are trying to avoid. For fundamental matrix estimation, we additionally run partial implementations of ARSAC with "adaptively ranked measurement updating only" (ARMU only) and "geometric constraint only" (GC only) configurations to verify the effectiveness of the main components of ARSAC. The results reveal that both components contribute to the accuracy of the model. The difference is that ARSAC (ARMU only) chooses measurements with reliable correspondences and little noise, so the accuracy of the model builds on pairwise correspondences, while ARSAC (GC only) avoids degenerate estimates via the geometric constraint, so its accuracy relies on the rejection of individual degenerate model estimates.

Table 3 Baseline comparison on real-world data for fundamental matrix estimation.

Dataset                  Metric  RANSAC     NAPSAC    PROSAC  GroupSAC  ARSAC (ARMU only)  ARSAC (GC only)  ARSAC
E: ε = 0.48, N = 3154    I       1429.15    1476.43   610.43  1433.95   604.41             587.23           595.35
                         k       1234.50    868.10    4.91    55.26     3.94               5.61             4.35
                         Time    12230.63   10351.39  563.47  4213.75   410.21             649.27           447.13
                         Error   0.54       1.25      1.46    0.96      1.01               0.65             0.73
F: ε = 0.22, N = 1516    I       275.20     254.70    291.45  255.85    252.32             264.73           283.55
                         k       50000      5023      36.85   32.45     3.97               73.14            4.1
                         Time    140720.86  14300     248.20  746.98    53.46              541.34           76.61
                         Error   2.85       20.35     14.59   6.14      3.23               6.25             3.59
G: ε = 0.39, N = 422     I       147.05     142.37    140.70  138.60    135.10             143.34           155.3
                         k       6501.05    3886.47   283.70  20.95     8.96               301.22           11.30
                         Time    8093.21    7203.83   352.24  262.92    125.13             402.24           139.28
                         Error   4.19       9.89      10.51   2.15      7.4                0.21             0.54
H: ε = 0.92, N = 786     I       687.55     688.2     604.00  605.10    600.91             611.24           604.00
                         k       12.35      11.83     2.75    4.3       1.79               3.04             1.86
                         Time    153.21     251.22    57.32   294.07    46.23              60.87            50.27
                         Error   0.71       4.35      5.58    12.56     4.92               4.38             3.19

Fig. 4. Average number of trials needed by each algorithm to reach the predefined error threshold δ. The labels on the X-axis correspond to the datasets in Tables 2 and 3; the Y-axis denotes the minimal number of trials each algorithm needs to make the error of the model fall below δ, where δ is set to 3.0. The plots represent average values over 500 runs.

As shown in Table 3, for image pairs E, G, and H, which contain dominant planes, ARSAC (GC only) yields lower error than ARSAC (ARMU only). But for image pair F, which possesses plenty of erroneous correspondences and high noise but no dominant plane, ARSAC (ARMU only) performs better. The results also show that ARSAC (ARMU only) boosts the speed of the algorithm, since its high-quality measurements lead to fast convergence, while ARSAC (GC only) needs extra trials to find measurements that are unlikely to cause degeneracy.

To compare the efficiency of the different methods directly, we report the smallest number of trials each method needs to make its average residual fall below an error threshold. Since ARSAC, PROSAC, and GroupSAC perform far better in efficiency than NAPSAC, we compare only these three, as shown in Fig. 4. As illustrated in the diagrams, all methods deliver satisfactory efficiency in some cases, but in specific cases, e.g., low inlier rate or degeneracy, ARSAC shows its superiority with the fastest convergence.

We further compare the robustness of ARSAC with the other algorithms. Table 4 shows the number of degenerate cases (kDeg) among the total trials (k) in fundamental matrix estimation on the real-world datasets E–H.

Table 4 Degenerate cases for fundamental matrix estimation.

          NAPSAC              PROSAC            GroupSAC          ARSAC
Dataset   kDeg     k          kDeg    k         kDeg   k          kDeg   k
E         261.75   868.10     1.31    4.91      3.22   75.20      0.06   4.30
F         23.41    5023       0.67    36.85     1.26   32.45      0      4.10
G         869.51   3886.47    231.80  334.60    3.54   20.95      0.91   11.30
H         4.27     11.83      0.93    2.75      3.34   4.3        0.52   1.86

Each algorithm is executed 500 times and the mean value is reported. The results show that ARSAC performs best, with very few degenerate cases on every dataset, thanks to its geometric constraint, which samples measurements with a scattered distribution. The true inlier rates of the different algorithms' optimal consensus sets also demonstrate the robustness of ARSAC, as the optimal consensus set determines the final model parameters. As shown in Fig. 5, ARSAC prevails over all the other non-uniform sampling algorithms. At the same time, the true inlier rate of ARSAC reaches the same level as that of RANSAC, with a significant improvement in efficiency.


Fig. 5. Fraction of true inliers returned by each algorithm for the fundamental matrix and homography estimation problems. The labels on the X-axis correspond to the datasets in Tables 2 and 3; the Y-axis denotes the inlier rate of each dataset as estimated by the different algorithms. The plots represent average values over 500 runs.

It can be observed that, in the context of efficient model estimation, algorithms like PROSAC and NAPSAC improve efficiency considerably, but the errors of their estimated models tend to be somewhat larger than those of other methods. GroupSAC performs well in many cases, but it can suffer from degeneracy in fundamental matrix estimation for scenes with dominant planes. Among all the algorithms, ARSAC presents the best performance in terms of both efficiency and robustness, providing a better solution for efficient model estimation problems.

5. Conclusion

We present ARSAC, a novel variant of RANSAC that draws non-uniform samples from adaptively ranked measurements. In each trial, ARSAC selects the measurements of highest quality for the new sample set. We also propose a geometric constraint to alleviate degeneracy in epipolar geometry estimation. Our algorithm is capable of handling measurements with low inlier rates and is shown to be more efficient and robust than state-of-the-art algorithms. Though proven effective, our algorithm may be burdened by a number of parameter configurations; in future work, we plan to explore a parameter-free strategy to simplify the algorithm in both theory and practice.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61231016), the National High-tech R&D Program of China (No. 2015AA016402), and the Seed Foundation of Innovation and Creation for Graduate Students in Northwestern Polytechnical University (Z2017184).

References

[1] M.A. Fischler, R.C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Commun. ACM 24 (6) (1981) 381–395.
[2] P.J. Rousseeuw, Least median of squares regression, J. Am. Stat. Assoc. 79 (388) (1984) 871–880.
[3] R. Toldo, A. Fusiello, Robust multiple structures estimation with J-linkage, in: Proceedings of the European Conference on Computer Vision, 2008, pp. 537–547.
[4] H. Li, Consensus set maximization with guaranteed global optimality for robust geometry estimation, in: Proceedings of the IEEE International Conference on Computer Vision, 2009, pp. 1074–1080.
[5] T.J. Chin, H.K. Yang, A. Eriksson, F. Neumann, Guaranteed outlier removal with mixed integer linear programs, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 5858–5866.

[6] J.V. Miller, C.V. Stewart, MUSE: robust surface fitting using unbiased scale estimates, in: Proceedings of the Conference on Computer Vision and Pattern Recognition, 1996, p. 300.
[7] K.M. Lee, P. Meer, R.H. Park, Robust Adaptive Segmentation of Range Images, IEEE Computer Society, 1998.
[8] C.V. Stewart, MINPRAN: a new robust estimator for computer vision, IEEE Trans. Pattern Anal. Mach. Intell. 17 (10) (1995) 925–938.
[9] L. Magri, A. Fusiello, T-linkage: a continuous relaxation of J-linkage for multi-model fitting, in: Proceedings of the Computer Vision and Pattern Recognition, 2014, pp. 3954–3961.
[10] W. Zhang, J. Kosecka, A new inlier identification procedure for robust estimation problems, in: Robotics: Science and Systems, 2006.
[11] T.J. Chin, J. Yu, D. Suter, Accelerated hypothesis generation for multi-structure robust fitting, in: Proceedings of the European Conference on Computer Vision, 2010, pp. 533–546.
[12] H. Chen, P. Meer, Robust regression with projection based M-estimators, in: Proceedings of the IEEE International Conference on Computer Vision, vol. 2, 2003, pp. 878–885.
[13] Y. Zheng, S. Sugimoto, M. Okutomi, Deterministically maximizing feasible subsystem for robust model fitting with unit norm constraint, in: Proceedings of the Computer Vision and Pattern Recognition, 2011, pp. 1825–1832.
[14] D. Nasuto, J.M.B.R. Craddock, NAPSAC: high noise, high dimensional robust estimation - it's in the bag, Proc. Brit. Mach. Vision Conf. (2002) 458–467.
[15] O. Chum, J. Matas, Matching with PROSAC - progressive sample consensus, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, IEEE, 2005, pp. 220–226.
[16] R. Raguram, J.-M. Frahm, M. Pollefeys, A comparative analysis of RANSAC techniques leading to adaptive real-time random sample consensus, in: Proceedings of the Computer Vision – ECCV 2008, 2008, pp. 500–513.
[17] K. Ni, H. Jin, F. Dellaert, GroupSAC: efficient consensus in the presence of groupings, in: Proceedings of the Twelfth IEEE International Conference on Computer Vision, IEEE, 2009, pp. 2193–2200.
[18] O. Chum, J. Matas, Randomized RANSAC with T(d,d) test, in: Proceedings of the British Machine Vision Conference, vol. 2, 2002, pp. 448–457.
[19] D.P. Capel, An effective bail-out test for RANSAC consensus scoring, in: Proceedings of the BMVC, 2005.
[20] J. Matas, O. Chum, Randomized RANSAC with sequential probability ratio test, in: Proceedings of the Tenth IEEE International Conference on Computer Vision, vol. 2, IEEE, 2005, pp. 1727–1732.
[21] O. Chum, J. Matas, Optimal randomized RANSAC, IEEE Trans. Pattern Anal. Mach. Intell. 30 (8) (2008) 1472–1482.
[22] T.L. Lai, Sequential Analysis, Wiley Online Library, 2001.
[23] O. Chum, J. Matas, J. Kittler, Locally optimized RANSAC, in: Proceedings of the Joint Pattern Recognition Symposium, 2003, pp. 236–243.
[24] R. Raguram, J.M. Frahm, M. Pollefeys, Exploiting uncertainty in random sample consensus, in: Proceedings of the IEEE International Conference on Computer Vision, 2010, pp. 2074–2081.
[25] J.-M. Frahm, M. Pollefeys, RANSAC for (quasi-)degenerate data (QDEGSAC), in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, IEEE, 2006, pp. 453–460.
[26] D.G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, Kluwer Academic Publishers, 2004.
[27] O. Chum, T. Werner, J. Matas, Two-view geometry estimation unaffected by a dominant plane, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, IEEE, 2005, pp. 772–779.
[28] R. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, second ed., Cambridge University Press, 2000.
[29] R. Raguram, O. Chum, M. Pollefeys, J. Matas, J.-M. Frahm, USAC: a universal framework for random sample consensus, IEEE Trans. Pattern Anal. Mach. Intell. 35 (8) (2013) 2022–2038.


Rui Li received the B.S. degree in the School of Automation and is currently working toward the M.S. degree in the School of Computer Science at Northwestern Polytechnical University, China. His research interests include robust estimation and 3D reconstruction.

Yu Zhu received the B.S. and M.S. degrees from Northwestern Polytechnical University, Xi'an, China, in 2008 and 2011, respectively, where he is currently pursuing the Ph.D. degree with the School of Computer Science. His current research interests include image processing and image super-resolution.

Jinqiu Sun received her B.S. degree from Northwestern Polytechnical University in 1999, and her M.S. and Ph.D. degrees from Northwestern Polytechnical University in 2004 and 2005, respectively. She is presently a professor in the School of Astronautics, Northwestern Polytechnical University. Her research focuses on signal and image processing, computer vision, and pattern recognition.

Haisen Li received the B.Eng. and M.Eng. degrees from Northwestern Polytechnical University, Xi’an, China, in 2007 and 2011, respectively, where he is currently pursuing the Ph.D. degree with the School of Computer Science. His current research interests include image processing, sparse representation, and related problems.

Dong Gong received the B.S. degree in computer science from the Northwestern Polytechnical University, Xi’an, China. He is currently pursuing the Ph.D. degree with the School of Computer Science, Northwestern Polytechnical University. His current research interests include machine learning and optimization techniques and their applications in image processing (e.g., image deblurring and image denoising) and computer vision.

Yanning Zhang received her B.S. degree from Dalian University of Science and Engineering in 1988, and her M.S. and Ph.D. degrees from Northwestern Polytechnical University in 1993 and 1996, respectively. She is presently a professor in the School of Computer Science and Technology, Northwestern Polytechnical University. She was the organization chair of ACCV 2009 and the publicity chair of ICME 2012. Her research focuses on signal and image processing, computer vision, and pattern recognition. She has published over 200 papers in these fields, including the ICCV 2011 best student paper. She is a member of the IEEE.