Wireless RSSI fingerprinting localization

Signal Processing 131 (2017) 235–244

Contents lists available at ScienceDirect

Signal Processing journal homepage: www.elsevier.com/locate/sigpro

Review

Simon Yiu*, Marzieh Dashti, Holger Claussen, Fernando Perez-Cruz

Bell Laboratories, Nokia, 600 Mountain Avenue, Murray Hill, NJ 07974, USA

Article info

Article history: Received 22 March 2016; received in revised form 17 May 2016; accepted 5 July 2016; available online 14 July 2016

Keywords: Fingerprinting localization; Location-based service (LBS); Received signal strength indicator (RSSI); Pathloss model; Gaussian Process; Non-parametric model; Machine learning

Abstract

Localization has attracted a lot of research effort in the last decade due to the explosion of location-based services (LBS). In particular, wireless fingerprinting localization has received much attention due to its simplicity and compatibility with existing hardware. In this work, we take a closer look at the underlying aspects of wireless fingerprinting localization. First, we review the various methods to create a radiomap. In particular, we look at the traditional fingerprinting method, which is based purely on measurements, the parametric pathloss regression model, and the non-parametric Gaussian Process (GP) regression model. Then, based on these three methods and measurements from a real-world deployment, we examine the aspects that affect the performance of fingerprinting localization, such as the density of access points (APs) and the impact of an outdated signature map. At the end of the paper, the reader should have a better understanding of what to expect from fingerprinting localization in a real-world deployment. © Published by Elsevier B.V.

Contents

1. Introduction
2. Related work
3. Problem statement
3.1. Offline phase
3.2. Online phase
4. Background
4.1. Traditional fingerprinting
4.2. Parametric model – pathloss model
4.3. Non-parametric model – Gaussian process
5. Experiments
5.1. Benchmark experiment
5.2. k-NN for traditional fingerprinting
5.3. Combining RSSI measurements from multiple MAC addresses
5.4. Density of training database
5.5. Density of access points
5.6. Density of RSSI signature map
5.7. Impact from outdated signature map
6. Conclusion
References

* Corresponding author: S. Yiu. http://dx.doi.org/10.1016/j.sigpro.2016.07.005 0165-1684/© Published by Elsevier B.V.

1. Introduction

Recent applications in location-based services (LBS) have stimulated extensive research on wireless localization [1–4]. Among all the localization technologies, wireless fingerprinting has proven to be an effective technique due to its simplicity and deployment practicability [5–12]. Wireless fingerprinting localization avoids hardware deployment cost and effort by relying on existing network infrastructure such as WiFi (e.g. IEEE 802.11 [13]) or cellular (e.g. long term evolution (LTE) [14,15]).

Fingerprinting localization works in two phases: an offline training phase and an online localization phase. During the training phase, radio frequency (RF) measurements (also known as signatures or fingerprints) at known locations are collected in a database. The fingerprint database is also referred to as the radiomap [16], and we use these terms interchangeably in this work. During the online phase, users determine their location by comparing the real-time RF measurement with the entries in the database.

The majority of previous research on fingerprinting localization uses received signal strength as the RF measurement due to its availability at both the transmitter and receiver sides. For example, there are billions of WiFi access points (APs) and their coverage is almost universal. At any location in dense urban areas, we can measure the received signal strength indicator (RSSI) from several tens (or hundreds) of them, which can be acquired easily by any Android device together with a position reference signal (PRS) from cellular communication standards.

To paraphrase Churchill (House of Commons, November 11, 1947): “Many forms of [localization] have been tried, and will be tried in this world of sin and woe. No one pretends that [RSSI] is perfect or all wise. Indeed, it has been said that [RSSI] is the worst form of [localization] except for all those other forms that have been tried from time to time”. RSSI-based localization has many limitations, such as the heavy dependency of the received power on the environment, the chipset, the antenna, and the orientation of the device [17]. However, RSSI does not have a showstopper of the kind that time of arrival and angle of arrival seem to have: the need for stringent synchronization or multiple antennas, and the strong effect of multipath biases, significantly impede the use of those technologies [18]. RSSI can be assisted with accelerometers, gyroscopes, magnetometers, barometers or Bluetooth beacons to become more accurate [19]. It can also make use of blueprints or maps and nonlinear tracking to remove outliers [16]. However, in order to assist RSSI localization, we first need a baseline probabilistic RSSI-only localization algorithm.

Although fingerprinting localization is one of the most exploited techniques in localization, many research problems remain unsolved. For example, previous research has reported localization errors ranging from 3 m to 10 m for fingerprinting localization using WiFi RSSI [5–12], and errors in iOS or Android wide-area localization are even larger. However, it is unclear what contributes to this performance discrepancy. In this paper, we attempt to provide a comprehensive review of fingerprinting localization and answer these open questions.

We consider three methods of generating the radiomap in this work. We begin by reviewing the simplest form of fingerprinting localization, which relies only on measurements to create the radiomap. This method is referred to as traditional fingerprinting. In this method, users are localized by comparing the real-time measurement with the entries in the radiomap using a k-nearest neighbor (k-NN) algorithm. Next, we consider the scenario where a parametric pathloss regression method is used to help generate the radiomap. Finally, we consider the non-parametric GP [20] regression method to help generate the radiomap. The parametric and non-parametric regression methods are useful in large areas where an exhaustive measurement campaign is not practical due to time and labor constraints. We would like to find out how the three aforementioned methods compare in terms of localization performance in a real-world deployment.

In our experiments, which are based on a real-world deployment, we look at several elements that may affect the performance of fingerprinting localization. First, we consider the scenario where we have measurements in only a subset of locations. In particular, we would like to find out how sparse measurements affect localization performance. Second, the impact of the density of APs on the localization performance is investigated. Third, we present several methods to combine multiple temporal measurements should they be available. Fourth, we attempt to improve the performance by combining measurements from multiple media access control (MAC) addresses. Fifth, we shed light on how sparse the radiomap can be when a regression method is used to generate it. This is important because a smaller database reduces the search effort and computational complexity. Finally, the impact of an outdated signature map is studied.

We summarize the major findings and contributions of this paper as follows:

• Traditional fingerprinting performs best when the same database is used for training and online testing, an unrealistic setting in which the independent and identically distributed (i.i.d.) assumption holds. The performance degrades as the available database becomes sparser. The performance also degrades significantly if a different database (e.g. taken at a different time or day, with a different device, or by a different person) is used for testing, because in this case the i.i.d. assumption does not hold.

• The GP and pathloss models are more robust than traditional fingerprinting localization when only partial measurements in a subset of locations are available. The performance of the GP and pathloss models is relatively stable as the number of training points decreases, until a certain point at which there is not enough information and the performance degrades rapidly.

• When multiple temporal measurements are available at a location, we could not reach a conclusion on whether the mean or the maximum value of the measurements provides the best performance. There are arguments for using either, but we found no empirical evidence to support either.

• The localization performance decreases with a decreasing number of APs.

• When an AP transmits using multiple MAC addresses, i.e., there are different RSSI entries corresponding to those MAC addresses, it is beneficial to consolidate these RSSI values when the training dataset is small. For a full training dataset, it is better to treat the entries as if they came from different APs.

• For the GP and pathloss regression models, the radiomap can be generated offline a priori. We do not see a degradation in performance until the generated radiomap becomes sparser than one fingerprint signature per 3 m², i.e., a fingerprint signature is generated every 3 m² or larger.

• Among the three methods of generating the radiomap, GP is the most robust to a mismatch between the database and the test track, i.e., when online testing is performed at a different time or date or with a different device. Traditional fingerprinting, on the other hand, performs the worst.

The rest of the paper is organized as follows. In Section 2, related work in the literature is presented. The network model under consideration is presented in Section 3. Background material on nearest neighbor, the parametric pathloss regression model, and the non-parametric Gaussian process model is introduced in Section 4. Experimental results with different parameters are studied and discussed in Section 5, followed by some conclusions in Section 6.

2. Related work

Localization techniques can generally be classified into two main categories: infrastructure-free and infrastructure-based approaches. Infrastructure-free approaches focus on leveraging existing infrastructure such as WiFi [21–26], FM, TV [27–29], Global System for Mobile communications (GSM) [30,31], geo-magnetic [32], and sound signals [33] to enable localization. Infrastructure-based approaches, on the other hand, rely on deploying dedicated RF infrastructure such as RFID [34], infrared [35], ultrasound [36], Bluetooth and/or visible light [37] for localization purposes.

In this work, we consider location fingerprinting (LF), which is an infrastructure-free approach that does not require deploying expensive hardware. LF relies on existing RF infrastructure and determines the location of a user equipment (UE) by comparing the UE's real-time RSSI readings against the pre-recorded entries in the radiomap database. The radiomap is constructed in an offline training phase and contains the RSSI readings from detectable access points (APs) at multiple known locations (reference points (RPs) or calibration points). LF requires an up-to-date radiomap of the area of interest to provide an accuracy that meets the requirements of commercial LBS.

A common practice to construct a radiomap is to manually collect fingerprints at multiple known locations throughout the building. Manual calibration is a labor-intensive, tedious and time-consuming task, especially when measuring large areas. To reduce the human effort, self-guided robots equipped with inertial measurement unit (IMU) sensors can roam around and explore the space of interest to collect training data [38]. Using robots is currently not an economical approach at scale.

Radiomaps can also be constructed automatically using crowd-sourcing and machine learning methods [39–42]. Crowd-sourcing relies on volunteers willing to participate in data collection [43]. This approach makes use of random traces of measurements (RF measurements and inertial sensor measurements) collected by volunteers carrying smartphones as they walk around the localization area during their daily routines. Obtaining a large enough number of traces to cover the whole building takes a long time (e.g., a week). The crowd-sourcing approach is computationally complex and time consuming, and obtaining high accuracy with it is challenging.

Recently, a new method has been proposed in which an AP acts as a fixed site-survey collector [44]. Every AP scans the RSSI values from other APs. The GP method is then used to learn the power distribution of the AP's signal over the whole area of interest. This method requires prior knowledge of the positions of all APs, which is often not available. The number of distributed APs must also be sufficient, which might be a limiting factor.

The time and effort required to build the RF signature map during the offline phase have prompted research in simultaneous localization and mapping (SLAM) [45,46]. However, although the effort to build the RF signature map is eliminated, the performance is generally not good enough for most practical indoor applications.

In this paper, the effort of building the RF signature map is reduced by modeling the received signal strength as a GP. Previously, GP was used for WiFi position estimation in an industrial environment [47]. It was integrated with the laser-localizer on the hot metal carrier (HMC). The mean function and covariance kernel used are different from the ones assumed in this work. In [48], GP was used to predict WiFi and GSM signal strength for location estimation. A different mean function was assumed and the hyperparameters were estimated using the conjugate gradient descent method. Bekkali et al. [49] applied GP to an indoor environment. It is unclear what algorithm was used to train the hyperparameters. Also, the mean function was not utilized. In [50], the WiFi-SLAM problem is addressed by using GP latent variable models. An adaptive particle filter was used as the localizer in [51] and the gradient descent algorithm was used for hyperparameter estimation. Similar to [50], a mean function was not employed. In this paper, we propose a GP algorithm that exploits only a few sparse, manually collected training points and builds a reliable radiomap efficiently. Our GP algorithm does not require any prior knowledge of the APs' locations; hence, it works in different environments and is suitable for industrial implementation.

Other existing works propose using data from additional sensors in smartphones, e.g. accelerometers and compasses, together with RSSI information in the filter to enhance the location estimation accuracy [19]. However, these sensors are still not widely available in many cell phones. Moreover, continuously scanning inertial measurements drains the phone battery rapidly. In order to have a solution that works for every basic phone and smartphone, with minimum energy overhead, we only use the measurements that are part of the standard reporting of all cell phones. Signal strength measurements are part of the standard per call measurement data (PCMD) reporting.

3. Problem statement

We consider a two-dimensional area $\mathcal{A}$ where a localization service is of interest.¹ It is assumed that wireless service is provided to the area with a homogeneous wireless technology. In this work, it is assumed that the entire area has WiFi coverage. In particular, it is assumed that the area is served by a sufficient number of APs: there is enough redundancy (several tens of APs) to compensate for the errors and deviations mentioned in the introduction. It should be noted that not all the APs serve the whole area $\mathcal{A}$. This condition accounts for dead zones in which APs are not heard, as well as scalability issues when covering a large area.

¹ We consider a multistory building as disconnected 2-D spaces and we do not attempt 3D localization on a given floor.

We consider a downlink localization technology based on the downlink signal transmitted from the APs. An AP advertises its service availability by broadcasting its MAC address. It should be noted that some modern APs can broadcast more than one MAC address in the same or different wireless channels. At the receiver, e.g., a mobile phone or tablet, the power of the received RF signal from all APs is measured as RSSI. APs that are far enough away to result in an RF signal below the receive antenna's sensitivity level will not be detected by the receiver.

3.1. Offline phase

It is assumed that the area $\mathcal{A} \subset \mathbb{R}^2$ is discretized into a set of $L$ known locations $\mathcal{X} = \{x_l \mid l = 1, \ldots, L\}$, where $x_l$ represents the 2-dimensional (2-D) Cartesian coordinate of location $l$. A radiomap (RM) consisting of RSSI values is collected a priori at these locations offline. Commonly, RSSI is scanned for a certain period of time to record multiple temporal samples from every AP in order to tolerate some degree of noise. It is assumed that $T$ temporal samples from all $N$ unique MAC addresses are collected for all $L$ locations. The RSSI values are collected in a three-dimensional matrix $D$ with dimension $L \times N \times T$. The RSSI sample collected at location $x_l$ from MAC address $n$ at time index $t$ is denoted as $\psi_n^t(x_l)$, $t = 1, \ldots, T$, $l = 1, \ldots, L$, and $n = 1, \ldots, N$. For locations where certain APs cannot be heard, $\psi_n^t(x_l)$ can be replaced by a constant such as −110, i.e., the device sensitivity level.

3.2. Online phase

During the online phase, a receiver at an unknown location $x$ listens to all the APs in the area and collects the RSSI measurements $r_n^p(x)$ in a two-dimensional database $R$ with dimension $N \times P$, where $P$ is the number of temporal measurements made at each location during the online phase. It should be noted that $P$ may not be the same as $T$. Furthermore, it is assumed that the same $N$ MAC addresses that were heard in the offline phase are also heard during the online phase. In practice this is clearly not the case: the $N$ APs in the offline database might not all be heard from each location, and there might be new APs heard during the online phase that are not in the database. For the latter, the data is discarded for localization purposes (it can be stored for improving the fingerprinting database), because we do not have a reference in the database for comparison. For the APs that are in the database but not heard by the device there are two alternatives: set a value that is equal to the sensitivity of the device (about −110 dBm), or ignore the AP altogether. The first is the best approach if the listening time is long enough. In this case, we can say that we have gathered all the available MAC addresses at our current location and conclude that the device is far away from the missing APs. The second is the best approach if we have only listened for a short period of time and there might be MAC addresses that are available but whose RSSI values we did not get to record. In our measurements, we listen for about 5 s and assume that we get most of (if not all) the available APs.
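The two alternatives for unheard APs can be illustrated with a minimal sketch. The function name `build_online_matrix`, the dictionary-based scan format and the use of `None` to mark an ignored entry are assumptions made for this illustration; they are not taken from the paper.

```python
SENSITIVITY_DBM = -110.0  # device sensitivity level quoted in the text

def build_online_matrix(scan, database_macs, fill_missing=True):
    """Align one online scan with the N MAC addresses of the offline database.

    scan: dict mapping MAC address -> list of RSSI samples in dBm (hypothetical format).
    database_macs: list of the N MAC addresses known from the offline phase.
    Returns an N x P list of lists. Database APs that were not heard are set to
    the sensitivity level (fill_missing=True) or to None (fill_missing=False),
    so that a later distance computation can skip them.
    MAC addresses that are not in the database are discarded.
    """
    known = {mac: samples for mac, samples in scan.items() if mac in database_macs}
    p = max((len(s) for s in known.values()), default=1)
    fill = SENSITIVITY_DBM if fill_missing else None
    matrix = []
    for mac in database_macs:
        samples = list(known.get(mac, []))
        # Pad short or missing rows so that every row has P entries.
        matrix.append(samples + [fill] * (p - len(samples)))
    return matrix
```

With a long listening time, `fill_missing=True` corresponds to the first alternative in the text; `fill_missing=False` corresponds to ignoring the AP.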

4. Background

The radiomap database can be obtained or generated by different methods. The three methods used to generate the fingerprint database are introduced in this section.

[Figure 1: histogram; x-axis: RSSI (dBm), from −85 to −5; y-axis: count, from 0 to 150.]

Fig. 1. Histogram of RSSI values for an AP 2 m away from the reading device. The RSSI value is an 8-bit number in which 255 represents 0 dBm and 160 represents −95 dBm.
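As a small illustration of the 8-bit scale in the caption, the sketch below converts a raw reading to dBm. It assumes a linear scale of 1 dB per step, which is inferred from the two anchor points in the caption (255 for 0 dBm, 160 for −95 dBm) and is not stated explicitly in the paper; the function name is hypothetical.

```python
def raw_rssi_to_dbm(raw):
    """Convert an 8-bit RSSI reading to dBm, assuming 1 dB per step and 255 = 0 dBm."""
    if not 0 <= raw <= 255:
        raise ValueError("raw RSSI must be an 8-bit value in [0, 255]")
    return raw - 255
```

Under this assumption a raw value of 200 corresponds to −55 dBm.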

4.1. Traditional fingerprinting

For traditional fingerprinting, a measurement campaign has to be carried out for all $L$ locations offline. For ease of exposition, we first consider the case of $T = P = 1$. For $T = P = 1$, the radiomap is essentially an $L \times N$ matrix and the online measurement at the unknown location $x$ in the area $\mathcal{A}$ is a $1 \times N$ vector. Recall that $\mathcal{A} \subset \mathbb{R}^2$ and therefore $x$ may not coincide with any $x_l$. The one nearest neighbor (1-NN) algorithm can be used to determine the best location estimate $x^*$:

$$x^* = \arg\min_{x_l} \left\{ \sum_{n=1}^{N} \left( \psi_n(x_l) - r_n(x) \right)^2 \right\}. \tag{1}$$

Basically, the 1-NN algorithm outputs the location that is closest to the real-time received signal in the RSSI space. It can be seen that 1-NN only localizes the user to one of the $L$ locations with prior measurements. This means that when an error occurs, the error is equal to or greater than the minimum distance between all known locations $x_l$ with measurements, given by

$$\min_{x_i, x_j \in \mathcal{X},\; i \neq j} \| x_i - x_j \|_2, \tag{2}$$

where $\mathcal{X}$ denotes the set of $L$ known locations. As we see in the experimental section, this is the fundamental reason why 1-NN performs poorly when only a sparse radiomap is available.

Instead of finding the location $x_l$ which minimizes (1), one can rank the locations $x_l$ according to the cost function $\sum_{n=1}^{N} (\psi_n(x_l) - r_n(x))^2$ in ascending order. The mean of the $k$ highest-ranked locations (those with the lowest cost) can be used as the location estimate. This is referred to as the k-NN algorithm and, as we see in Section 5.2, it generally performs better than the 1-NN algorithm. It should be noted that while the user is only localized to one of the $L$ known locations for the 1-NN algorithm, the location estimate $x^*$ obtained from the k-NN ($k > 1$) algorithm generally does not coincide with any of the known locations due to the averaging.

For $T \neq 1$ and $P \neq 1$, there is more than one temporal measurement for both the database and the online measurement at a given location. In this case, either the mean or the max RSSI values can be used to replace $\psi_n(x_l)$ and $r_n(x)$ in (1):

$$\bar{r}_n(x) = \operatorname{mean}_p \, r_n^p(x), \qquad \bar{\psi}_n(x_l) = \operatorname{mean}_t \, \psi_n^t(x_l) \tag{3}$$

$$\bar{r}_n(x) = \max_p \, r_n^p(x), \qquad \bar{\psi}_n(x_l) = \max_t \, \psi_n^t(x_l). \tag{4}$$

The rationale behind using the mean value is the assumption that the deviation in RSSI is governed by thermal noise, so that averaging should provide a more accurate estimate. However, the measured RSSI is in many cases affected by fading and interference, and the change in RSSI might be considerable. For example, if we are physically close to the WiFi APs, we observe drops in the RSSI measurements in the range of 20 dB to 50 dB when the beacon packets collide. In Fig. 1, we show the histogram of the RSSI values for an AP that is 2 m away from the reading device. The mean value would typically be around 200 (−55 dBm), which suggests that the reading device is significantly further away than it actually is.
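The k-NN search in (1), combined with the mean or max aggregation in (3) and (4), can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function names and the list-based radiomap layout are assumptions.

```python
def aggregate(samples, how="mean"):
    """Collapse temporal RSSI samples into one value, as in (3) (mean) or (4) (max)."""
    return max(samples) if how == "max" else sum(samples) / len(samples)

def knn_localize(radiomap, online, k=1, how="mean"):
    """Estimate a location with k-NN in RSSI space.

    radiomap: list of (location, fingerprint) pairs; location is an (x, y) tuple
        and fingerprint is a list of N per-AP sample lists (T samples each).
    online: list of N per-AP sample lists (P samples each) from the unknown location.
    """
    r = [aggregate(s, how) for s in online]
    scored = []
    for location, fingerprint in radiomap:
        psi = [aggregate(s, how) for s in fingerprint]
        cost = sum((a - b) ** 2 for a, b in zip(psi, r))  # cost function in (1)
        scored.append((cost, location))
    scored.sort(key=lambda item: item[0])
    best = [location for _, location in scored[:k]]
    # Average the k lowest-cost locations; for k = 1 this is the 1-NN estimate.
    return tuple(sum(coords) / len(best) for coords in zip(*best))
```

For k = 1 the result is always one of the surveyed locations, whereas for k > 1 the averaging generally produces a point between them.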

4.2. Parametric model – pathloss model

As mentioned before, obtaining the full radiomap of an area is time consuming and labor intensive. To reduce the cost and effort of collecting a full radiomap database, a parametric model can be used to model the RSSI measurements. As we are modeling the radio signal, a natural choice for the model is the pathloss model. In particular, the following two models are considered, and the estimated received RSSI from AP $n$ at location $z \triangleq (x, 0)$ is given by

$$\psi_n^*(z) = C + \alpha \log_{10}\left( \| z - z_n^{AP} \| \right) \tag{5}$$

$$\psi_n^*(z) = C + \alpha \log_{10}\left( \| z - z_n^{AP} \| \right) + \beta \, \| z - z_n^{AP} \|. \tag{6}$$

Eqs. (5) and (6) are referred to as the hyperbolic model and the mixture model in this work, respectively. In the above equations, $z_n^{AP}$ represents the location of AP $n$ relative to $z$ in three-dimensional Cartesian coordinates, $\alpha$ is the pathloss exponent (for simplicity, the factor 10 is absorbed into $\alpha$), and $C$ is the pathloss at a reference point 1 unit away from $z_n^{AP}$. The only difference between (5) and (6) is that a term linear in the distance, $\beta \, \| z - z_n^{AP} \|$, is introduced in (6). This term is useful to model the linear pathloss due to walls (and other obstacles) commonly found in office spaces.

In order to apply (5) and (6) to RSSI estimation, one first needs to learn the parameters $C$, $\alpha$, $\beta$ and $z_n^{AP}$. For convenience, all non-zero temporal RSSI measurements at the $L$ training locations are collected in a column vector $y_n$, $y_n \triangleq \{\psi_n^t(z_l) \mid \psi_n^t(z_l) \neq 0, \; \forall\, l, t\}$, and the corresponding training locations $z_l$ are collected in $v_n$. Applying $v_n$ to (5) and (6), we obtain the estimated RSSI vector $y_n^* = \psi_n^*(v_n)$. It should be noted that the same location $z_l$ can appear in $v_n$ more than once if there are multiple non-zero temporal measurements for that location.

Algorithm 1. Optimization algorithm to learn parameters of the pathloss model.

1: Initialize $t = 1$, $z_n^{AP}$, $\hat{z}_n^{AP}(t) = z_n^{AP}$;
2: Obtain solution for $C$, $\alpha$ and $\beta$ using $y_n$ and $v_n$ assuming $\hat{z}_n^{AP}(t)$ is the location of AP $n$;
3: Compute $y_n^*$ using $\hat{z}_n^{AP}(t)$ and the optimized parameters $C$, $\alpha$ and $\beta$;
4: Compute the standard deviation of the error: $\sigma(t) = \mathrm{std}(y_n^* - y_n)$;
5: Set $t = 2$;
6: while $t <$ MaxGeneration do
7:     $\hat{z}_n^{AP}(t) = \hat{z}_n^{AP}(t-1) + \mathrm{rand}(3, 1)$;
8:     Obtain new solution for $C$, $\alpha$ and $\beta$ using $y_n$ and $v_n$ assuming $\hat{z}_n^{AP}(t)$ is the location of AP $n$;
9:     Compute $y_n^*$ using $\hat{z}_n^{AP}(t)$ and the optimized parameters $C$, $\alpha$ and $\beta$;
10:    Compute the standard deviation of the error: $\sigma(t) = \mathrm{std}(y_n^* - y_n)$;
11:    if $\exp\left( -(\sigma(t) - \sigma(t-1)) \times \mathrm{length}(y_n) \right) < \mathrm{rand}$ then
12:        $\sigma(t) = \sigma(t-1)$;
13:        $\hat{z}_n^{AP}(t) = \hat{z}_n^{AP}(t-1)$;
14:    end if
15:    $t = t + 1$;
16: end while
17: $C_{opt} = C$, $\alpha_{opt} = \alpha$, $\beta_{opt} = \beta$, $z_{n,opt}^{AP} = \sum_t \hat{z}_n^{AP}(t) / \mathrm{MaxGeneration}$
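The structure of Algorithm 1 can be sketched in a few lines of Python. This is a simplified illustration under several stated assumptions, not the authors' implementation: it fits only the hyperbolic model (5), so that the least-squares step has a closed form; the random-walk step is drawn from a zero-mean Gaussian instead of `rand(3, 1)`; the averaged AP location is divided by the number of retained samples; and the returned C and α belong to the last retained AP location. All function names are hypothetical.

```python
import math
import random

def fit_c_alpha(distances, rssi):
    """Least-squares fit of rssi = C + alpha * log10(distance), i.e. model (5)."""
    u = [math.log10(d) for d in distances]
    mean_u, mean_y = sum(u) / len(u), sum(rssi) / len(rssi)
    var_u = sum((a - mean_u) ** 2 for a in u)
    cov = sum((a - mean_u) * (y - mean_y) for a, y in zip(u, rssi))
    alpha = cov / var_u if var_u > 0 else 0.0
    return mean_y - alpha * mean_u, alpha

def error_std(ap, points, rssi):
    """Steps 2-4: fit C and alpha for a candidate AP location, return (sigma, C, alpha)."""
    dist = [max(math.dist(p, ap), 1e-3) for p in points]  # guard against log10(0)
    c, alpha = fit_c_alpha(dist, rssi)
    err = [c + alpha * math.log10(d) - y for d, y in zip(dist, rssi)]
    mean_err = sum(err) / len(err)
    sigma = math.sqrt(sum((e - mean_err) ** 2 for e in err) / len(err))
    return sigma, c, alpha

def learn_pathloss(points, rssi, start, max_generation=2000, step=1.0, seed=0):
    """Random-walk search over the AP location with the accept/reject test of step 11."""
    rng = random.Random(seed)
    ap = list(start)
    sigma, c, alpha = error_std(ap, points, rssi)
    total, count = list(ap), 1  # running sum of the retained AP locations
    for _ in range(2, max_generation):  # t = 2, ..., MaxGeneration - 1
        cand = [a + rng.gauss(0.0, step) for a in ap]  # assumption: Gaussian step
        s_new, c_new, a_new = error_std(cand, points, rssi)
        exponent = -(s_new - sigma) * len(rssi)
        # Reject when exp(exponent) < rand; improvements (exponent >= 0) are always kept.
        if exponent >= 0 or math.exp(exponent) >= rng.random():
            ap, sigma, c, alpha = cand, s_new, c_new, a_new
        total = [t + a for t, a in zip(total, ap)]
        count += 1
    ap_opt = [t / count for t in total]
    return ap_opt, c, alpha, sigma
```

Extending the sketch to the mixture model (6) would only change `fit_c_alpha`, which would then solve a three-parameter linear least-squares problem for C, α and β.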

Based on the set of training data  n = {yn , vn }, we use the optimization described in Algorithm 1 to learn the parameters for each AP. The algorithm rational is based on the Metropolis–Hastings sampling algorithm [52]. For a given location of the APs, the values of C, α and β can be computed by least squares and we update the position of the APs using a random walk. If the new position of the AP provides a lower error we accept it, if the new position of the AP provides a larger error, we accept with a probability that is proportional to the ratio of the errors. The Metropolis–Hastings algorithms allows exploring the potential locations of the APs and avoid getting trapped in local minima. In the first step, the time index is initialized to t¼1. A random AP AP AP location z AP is generated and set z^ (t ) = z AP . With z^ (t ), the n

n

n

n

solution to C, α and β which minimizes the sum of squared error is obtained by using the least squares method. This is done in step 2. AP The estimated RSSI vector y* is obtained by using z^ (t ), and the n

n

optimized C, α and β in step 3. The standard deviation of the error

239

vector σ (t ) = yn* − yn is logged in step 4. The time step is incremented to 2 in step 5 and the algorithm enters a while loop in step 6. Within the while loop, a new AP location is generated acAP AP cording to z^ (t ) = z^ (t − 1) + rand (3, 1) in step 7. Then in step 8, n

n

parameters C, α and β are obtained again from the least squares method using the new AP location. In step 9, the estimated RSSI vector yn* is computed and the corresponding standard deviation of the error vector σ (t ) is obtained in step 10. If the standard deviation in the previous time step is smaller than the standard deviation in the current time step, discard the newly generated AP AP AP z^ (t ) and set σ (t ) = σ (t − 1) and z^ (t ) = z^ (t − 1). This is ren

n

n

peated for a pre-defined number of iterations MaxGeneration. Finally, the optimized parameters are given by Copt = C , αopt = α , AP β = β , z AP = ∑ z^ (t ) /MaxGeneration . n, opt

opt

t

n

Once the optimized parameters are obtained, (1) and (2) can be used to estimate the received RSSI from AP $n$ at any arbitrary location $z$. The algorithm can be used to generate a radiomap of any resolution in the area of interest when only measurements from a few training locations are available. Suppose (1) and (2) are used to generate a radiomap consisting of RSSI predictions at locations $z_q$, $q = 1, \ldots, Q$. The following localization algorithm can be used to estimate the location of a user with real-time RSSI measurement $r_n(z)$:

$$\hat{z} = \arg\min_{z_q} \left\{ \sum_{n=1}^{N} \left( \psi_n^*(z_q) - r_n(z) \right)^2 \right\}. \qquad (7)$$
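As an illustration, the decision rule in (7) can be sketched as a nearest-neighbor search in signal space. The function name and the data layout (one row of predicted RSSI values per radiomap location) are assumptions made for this sketch. The optional parameter k is also an assumption: k = 1 reproduces (7), while k > 1 returns the unweighted centroid of the k best locations, which is one common way to realize k-NN.

```python
from math import fsum

def localize_knn(radiomap, locations, r, k=1):
    """Return the location estimate for one online RSSI vector.

    radiomap[q][n]: predicted RSSI of AP n at radiomap location q.
    locations[q]:   coordinates (x, y) of radiomap location q.
    r[n]:           online RSSI measurement of AP n.
    """
    # Sum of squared RSSI differences for every candidate location, as in (7).
    costs = [fsum((p - m) ** 2 for p, m in zip(row, r)) for row in radiomap]
    best = sorted(range(len(radiomap)), key=costs.__getitem__)[:k]
    dims = len(locations[0])
    # Unweighted centroid of the k best candidates (k = 1 gives the argmin).
    return tuple(fsum(locations[q][d] for q in best) / len(best) for d in range(dims))
```

With several online scans, the per-AP mean or maximum would be computed first and passed in as r.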

If more than one time measurement is available during the online phase, either the mean of the received RSSI $\bar{r}_n(z) = \mathrm{mean}_p\, r_n^p(z)$ or the maximum value of the received RSSI $\bar{r}_n(z) = \max_p r_n^p(z)$ can be used to replace $r_n(z)$ in (7).

4.3. Non-parametric model – Gaussian process

The parametric pathloss model presented in the last section generates the radiomap deterministically. Another way to generate the radiomap is to use a non-parametric model. In this case, the radiomap is generated probabilistically. In particular, we consider the GP. GPs are ideally suited for representing the complex likelihood models of RSSI measurements [53]. They overcome various limitations of other existing techniques: they do not rely on a discrete representation of space, they are non-parametric and can thus represent arbitrary likelihood models, they correctly represent uncertainty due to sparse training data, and they enable the consistent estimation of hyperparameters.

We model the RSSI measurement $\psi_n^t(x)$ as a noisy measurement given by $\psi_n^t(x) = f_n(x) + n_n^t$, where $f_n(x)$ is the noiseless RSSI value at location $x$ from AP $n$ and $n_n^t$ is the measurement noise, which is modeled as an i.i.d. zero-mean Gaussian random variable with variance $\sigma_{n_n^t}^2$. It should be noted that the temporal variation behavior of RSSI is captured in the noise term. The noiseless RSSI value $f_n(x)$ is modeled as a GP with zero mean and covariance function $c_n(x_m, x_k)$ for any two locations $x_m, x_k$ in the area of interest. As with the parametric model, all non-zero temporal RSSI measurements at the $L$ training locations are collected in a column vector $y_n \triangleq \{\psi_n^t(x_l) \mid \psi_n^t(x_l) \neq 0, \forall l, t\}$, and the corresponding training locations $x_l$ are collected in the rows of $w_n$. Again, it should be noted that the same location $x_l$ can appear in $w_n$ more than once if there are multiple non-zero measurements for that location.

An important property of a GP is that, for any function $f_n(x)$ drawn from the GP, the marginal distribution over any set of input points $x$ in the area of interest has a joint multivariate Gaussian distribution. Since $n_n^t$ is an i.i.d. zero-mean Gaussian random variable, $\psi_n^t(x)$ is also a GP. Furthermore, the marginal distribution of $y_n$ over the input locations $w_n$ must also have a joint multivariate Gaussian distribution. In particular,


S. Yiu et al. / Signal Processing 131 (2017) 235–244

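$$y_n \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix},\; \begin{bmatrix} c_n(w_n^1, w_n^1) & \cdots & c_n(w_n^1, w_n^M) \\ \vdots & \ddots & \vdots \\ c_n(w_n^M, w_n^1) & \cdots & c_n(w_n^M, w_n^M) \end{bmatrix} + \sigma_{n_n^t}^2 I_{M \times M} \right). \qquad (8)$$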

Eq. (8) can be written compactly in vector form as $y_n \sim \mathcal{N}(0, C_n + \sigma_{n_n^t}^2 I_{M \times M})$. In the above equation, $w_n^m$ indicates the $m$-th row of $w_n$. Note that a mean function can be used to replace the zero vector in the above equation. However, for this work, we simply shift all the RSSI values by 93, as the smallest observed RSSI value in the database is −92 dB (or by 110, which is the device sensitivity). A different covariance function can be used for $c_n(w_n^m, w_n^k)$. For example, a squared-exponential covariance function

$$c_n(w_n^m, w_n^k) = \sigma_n^2 \exp\left( -(w_n^m - w_n^k)/l_n \right) \qquad (9)$$

was used in this work. $\theta_n = [\sigma_n^2, l_n, \sigma_{n_n}^2]$ is referred to as the hyperparameters of the GP. They are called hyperparameters because they loosely define the structure of the non-parametric model. Training of the hyperparameters is explained later in this section.

Our goal is to use the non-parametric model to predict RSSI measurements at any arbitrary location $x^*$. It should be noted that the marginal distribution of $\psi_n^t(x)$ over the training locations $w_n$ and the test location $x^*$ has a joint multivariate Gaussian distribution. Therefore, by using the rules of conditional probability for Gaussian random variables and assuming knowledge of the trained hyperparameters $\theta_n$, it follows that the estimated RSSI $\psi_n^t(x^*)$ at $x^*$ given prior measurements $y_n$ taken at locations $w_n$ is normally distributed as follows:

$$\psi_n^t(x^*) \mid y_n, w_n, x^*, \theta_n \sim \mathcal{N}\left( \mu_n(x^*), \sigma_n^2(x^*) \right) \qquad (10)$$

where

$$\mu_n(x^*) = -93 + b_n^T \left( C_n + \sigma_{n_n^t}^2 I_{M \times M} \right)^{-1} (y_n + 93) \qquad (11)$$

and

$$\sigma_n^2(x^*) = p_n - b_n^T C_n^{-1} b_n. \qquad (12)$$

In the above equations, $[b_n]_m = c_n(x^*, w_n^m)$, $m = 1, \ldots, M$, and $p_n = c_n(x^*, x^*)$. Essentially, given the training data $\mathcal{D}_n = \{y_n, w_n\}$, the posterior predictive distribution of $\psi_n^t(x)$ at any arbitrary location $x$ in the area of interest without prior measurements can be obtained. Note that the above method can also be used to predict the measurement at a location with known measurements. In this case, the GP is able to indicate how noisy the measurements are. Similar to the radiomap generated by the parametric model presented in the last section, a user can be localized to any location in the area of interest, including locations without prior measurements. This is contrary to traditional fingerprinting localization with 1-NN, where a user can only be localized to the locations with prior measurements.²

The hyperparameters $\theta_n$ for each AP can be trained by maximizing the log marginal likelihood of observing $y_n$ given $w_n$ and the hyperparameters. Note that this is simply a multivariate Gaussian random variable with zero mean and covariance matrix $C_n + \sigma_{n_n^t}^2 I_{M \times M}$. To maximize this probability, denote the log marginal likelihood as $\log p(y_n \mid w_n, \theta_n)$. By taking the partial derivatives of the log marginal likelihood with respect to each hyperparameter, a numerical optimization method such as conjugate gradients can be used to optimize $\theta_n$. This optimization problem is generally non-convex and therefore care should be taken to ensure that the solution is not a bad local maximum [54].

It should be noted that the training data used for training the hyperparameters do not have to be the same as the training data $\mathcal{D}_n$ used for making the RSSI prediction in (10)–(12). As mentioned before, the hyperparameters only loosely define the structure of the RSSI measurement through the prior model. The length scale $l_n$ determines the length of the wiggles in the function. In general, measurements that are taken $l_n$ units away can be considered as uncorrelated. The output variance $\sigma_n^2$ determines the average distance of the random function away from its mean. Since measurements from all APs are available at all locations, it is possible to train the hyperparameters by using all available measurements, i.e., $[y_1; y_2; \ldots; y_N]$ and $[w_1; w_2; \ldots; w_N]$. In this case, the trained hyperparameters can be used for all APs. During the prediction phase, the training data $\mathcal{D}_n$ from each AP can be used in (10)–(12) to predict the mean and the variance of the RSSI measurement from that AP.

Suppose Eqs. (10)–(12) are used to generate the mean and variance of the predicted RSSI at locations $x_q$, $q = 1, \ldots, Q$, i.e., a radiomap consisting of mean and variance at $Q$ locations is obtained. The best location estimate given the received RSSI $r_n(x)$ at unknown location $x$ is given by the maximum likelihood decision rule

$$\hat{x} = \arg\max_{x_q} \prod_{n=1}^{N} p_n\left( r_n(x) \mid \mu_n(x_q), \sigma_n^2(x_q) \right), \qquad (13)$$

$$p_n\left( r_n(x) \mid \mu_n(x_q), \sigma_n^2(x_q) \right) \propto \frac{1}{\sqrt{2\pi\sigma_n^2(x_q)}} \exp\left( \frac{-(r_n(x) - \mu_n(x_q))^2}{2\sigma_n^2(x_q)} \right). \qquad (14)$$

If more than one time measurement is available during the online phase, either the mean of the received RSSI $\bar{r}_n(x) = \mathrm{mean}_p\, r_n^p(x)$ or the maximum value of the received RSSI $\bar{r}_n(x) = \max_p r_n^p(x)$ can be used to replace $r_n(x)$ in (13) and (14).

² As we have seen in Section 4.1, for traditional fingerprinting, post-processing can be performed to arrive at a location estimate without prior measurement. For example, if a localization algorithm such as k-NN is used, the final location estimate is not in the fingerprint database.

5. Experiments

In this section, we present some experiments based on the three different methods to generate the radiomap introduced in Section 4. The test area is a 2500 m² enterprise building, with 27 office cubicles, 16 meeting rooms, and corridors. A radiomap consisting of measurements from $L = 235$ locations is collected. The measurement locations are shown in Fig. 2. To obtain the measurements, the data collector walks through the whole building for less than an hour and collects the data using a Google Nexus tablet and a data collection application (app) developed for this purpose. Each measurement location is geotagged manually by the data collector tapping his/her location on the building map displayed on the tablet's screen. At each measurement location, $T = 5$ consecutive scans were made. Recall that a MAC address may appear in one scan but not in another. The database referenced $N = 174$ different MAC addresses and is referred to as Track 0. As mentioned earlier, some of these MAC addresses belong to the same WiFi router. We discuss how to aggregate these MAC addresses in Section 5.3.

To simulate the online phase, three different track databases are considered. Tracks 1, 2, and 3 consist of 171, 106, and 29 locations, respectively. $P = 5$ measurements were taken at each location, and all track measurements were taken on different dates. The measurement locations of the radiomap and all three tracks are shown in Fig. 2.

[Figure: measurement locations plotted on axes x [m] and y [m]; legend: Radiomap, Track 1, Track 2, Track 3, 5% of Radiomap.]
Fig. 2. Measurement locations of the radiomap and three test tracks. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this paper.)

[Figure: empirical CDF versus localization error [m]; curves: GP, Hyperbolic, Mixture: α=−40, Traditional: 1-NN.]
Fig. 3. Localization error of fingerprinting based on various radiomaps. Full measurement database. N = 174 MAC addresses. Track 1.

Table 1
RMS localization error in meters for all tracks and radiomaps generated by different methods. Full measurement database. N = 174 MAC addresses.

           GP       Hyperbolic   Mixture   Traditional: 1-NN
Track 1    6.0957   8.2512       7.9879    7.7251
Track 2    5.4712   6.5876       6.5004    6.1220
Track 3    5.1695   7.0417       6.4434    7.3700

Performance metrics: The performance metric used in this work is the localization error, defined as

$$e = \| x - \hat{x} \|_2. \qquad (15)$$

We also consider the root-mean-square (RMS) error. The RMS error for an error vector of length $M$ is defined as

$$e_{rms} = \sqrt{ \frac{1}{M} \sum_{m=1}^{M} e_m^2 }. \qquad (16)$$
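For concreteness, the two metrics in (15) and (16), together with the empirical CDF value used in the CDF figures, can be computed as in the following minimal sketch. The function names are chosen for illustration only.

```python
import math

def localization_error(true_xy, est_xy):
    """Euclidean distance between the true and estimated location, as in (15)."""
    return math.dist(true_xy, est_xy)  # requires Python 3.8+

def rms_error(errors):
    """Root-mean-square of a list of localization errors, as in (16)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def empirical_cdf(errors, threshold):
    """Fraction of test locations whose error does not exceed the threshold."""
    return sum(e <= threshold for e in errors) / len(errors)
```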

5.1. Benchmark experiment

We performed a long list of experiments in this work. In this subsection, we define a benchmark experiment so that the later experiments can be compared with it. The benchmark experiment compares the performance of traditional fingerprinting and the radiomaps generated by the three methods presented in Section 4. For traditional fingerprinting, each location in the test track can be localized to one of the 235 locations in the fingerprint database (Track 0). Specifically, the radiomap is purely based on measurements. The other three radiomaps are generated based on the GP, mixture, and hyperbolic models. For the parametric models (hyperbolic and mixture), the training data $\mathcal{D}_n$ is used to optimize the model parameters for each AP. The optimized model parameters are then used to estimate the RSSI value for the entire area at a resolution of 1 m². It is found that the length of $y_n$ for some APs is very short, meaning that those APs are far away from the area. These far-away and weak APs are excluded as they do not provide high quality RSSI information, and only those APs with $\mathrm{card}(y_n) > 50$ are trained and used for localization. For the non-parametric model (GP), measurements from all APs are used to train the hyperparameters. Then, $\mathcal{D}_n$ is used to estimate the RSSI value for the entire area at a resolution of 1 m². Similar to the parametric case, only those APs with $\mathrm{card}(y_n) > 50$ are used for localization. Finally, the mean of the temporal scans was used in the localization rules in (7) and (13).

Fig. 3 shows the cumulative distribution function (CDF) of the localization error for Track 1. It can be seen that the radiomap generated by GP outperforms the radiomap obtained by actual measurements and those of the two parametric models. Table 1 shows the RMS localization error $e_{rms}$ of all tracks and radiomaps. The RMS localization error is lowest when the radiomap is generated by GP, consistently across all three tracks.
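The GP radiomap generation and the maximum likelihood rule in (13) and (14) used in this benchmark can be sketched as follows. This is a hedged illustration, not the implementation used for the experiments. It assumes the textbook squared-exponential kernel $\sigma^2 \exp(-\|a-b\|^2/(2l^2))$, whose scaling may differ from (9), and the textbook predictive variance for a noisy reading, which differs from (12) as printed because it inverts the noisy covariance matrix and adds the noise variance. The function names and argument layout are hypothetical, and the hyperparameters are taken as given rather than trained.

```python
import numpy as np

SHIFT = 93.0  # RSSI offset applied before and removed after prediction, as in (11)

def sq_exp_cov(a, b, sigma2, length):
    """Assumed kernel: sigma2 * exp(-||a - b||^2 / (2 * length^2))."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return sigma2 * np.exp(-d2 / (2.0 * length ** 2))

def gp_radiomap(w, y, grid, sigma2, length, noise2):
    """Posterior mean and variance of one AP's RSSI at every grid location.

    w: (M, 2) training locations, y: (M,) RSSI values, grid: (Q, 2) locations.
    """
    K = sq_exp_cov(w, w, sigma2, length) + noise2 * np.eye(len(w))
    B = sq_exp_cov(grid, w, sigma2, length)            # row q holds b_n for x_q
    mean = -SHIFT + B @ np.linalg.solve(K, y + SHIFT)  # mean as in (11)
    # Textbook variance of a noisy reading: prior - explained + noise.
    var = sigma2 - np.einsum("qm,mq->q", B, np.linalg.solve(K, B.T)) + noise2
    return mean, var

def ml_localize(means, variances, r):
    """Index q maximizing the product of Gaussian likelihoods in (13)-(14).

    means, variances: (Q, N) radiomap; r: (N,) online RSSI vector.
    """
    loglik = -0.5 * (np.log(2 * np.pi * variances) + (r - means) ** 2 / variances)
    return int(np.argmax(loglik.sum(axis=1)))
```

Working with the sum of log-likelihoods instead of the product in (13) gives the same maximizer and avoids numerical underflow when many APs are used.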

[Figure: empirical CDF versus localization error [m]; curves: 1-NN, 4-NN.]
Fig. 4. Localization error with 1-NN and 4-NN. Full measurement database. N = 174 MAC addresses. Track 1. 174 APs.

5.2. k-NN for traditional fingerprinting

1-NN was considered in the experiment shown in Fig. 3 for traditional fingerprinting. It is found that k-NN with k > 1 generally performs better than 1-NN. Fig. 4 compares the localization error of 1-NN and 4-NN for Track 1. At the 90th percentile, 4-NN is almost 3 m better than 1-NN. The RMS localization errors for 1-NN and 4-NN are 7.7251 m and 6.7283 m, respectively.

5.3. Combining RSSI measurements from multiple MAC addresses

As mentioned before, some APs are capable of broadcasting multiple MAC addresses. The RSSI measurements from the same AP but with different MAC addresses can be aggregated. There are $J = 50$ aggregated APs after combining the RSSI measurements from the same physical AP (and same 2.4 GHz channel) with different MAC addresses. Therefore, each aggregated AP can potentially have more than 5 RSSI measurements due to the combined RSSI values from multiple MAC addresses transmitted from the same AP. This is essentially a method to obtain more temporal measurements without spending time on additional scans. After aggregating the MAC addresses, the length of the training vector $y_j$, $j = 1, \ldots, J$, will typically be longer than that of $y_n$, $n = 1, \ldots, N$.

In general, the localization error is similar whether the MAC addresses are combined or not if full measurements are available for all radiomaps. Table 2 shows the RMS localization error of Tracks 1 and 3 with and without MAC address aggregation for various radiomaps, assuming full measurements are available. As seen in the table, a conclusion cannot be drawn on whether there is a benefit to aggregating the MAC addresses from the same AP when full measurements are available. However, as we will see in the next subsection, when only partial measurements are available to construct the radiomap, consolidating the MAC addresses improves the localization performance significantly.
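The aggregation step can be sketched as pooling the readings of all MAC addresses that belong to the same physical AP. The mapping from MAC address to physical AP is assumed to be known here (the name mac_to_ap and its labels are hypothetical); how that mapping is obtained is not covered by this sketch.

```python
from collections import defaultdict

def aggregate_scans(scans, mac_to_ap):
    """Pool RSSI readings of all MAC addresses of the same physical AP.

    scans:     list of {mac: rssi} dictionaries taken at one location.
    mac_to_ap: hypothetical mapping from MAC address to a physical-AP label;
               an unmapped MAC address is treated as its own AP.
    Returns {ap_label: [rssi, ...]}.
    """
    pooled = defaultdict(list)
    for scan in scans:
        for mac, rssi in scan.items():
            pooled[mac_to_ap.get(mac, mac)].append(rssi)
    return dict(pooled)
```

An AP that broadcasts several MAC addresses then contributes several readings per scan, which is why the aggregated training vectors tend to be longer.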

5.4. Density of training database

The major drawback of traditional fingerprinting is the requirement of acquiring (and maintaining) the radiomap database. This is a labor-intensive and time-consuming effort. On the other hand, when only a subset of the area is fingerprinted to reduce this effort, it is expected that the localization error increases. This is investigated in Fig. 5, where only 5% (12 locations) of the full database (Track 0) is assumed to be available. The 12 locations, represented by the filled red circles in Fig. 2, were selected heuristically such that the mutual minimum distance among all locations is maximized. Needless to say, once those locations are determined, the labor required to obtain measurements in those 12 locations is reduced significantly.

The same experiment as in Fig. 3 is considered assuming 5% (12 locations) of the full database is available. Also, the MAC addresses are aggregated to $J = 50$ APs as in the last subsection. For the radiomaps generated by the non-parametric and parametric models, training is done similarly to the full database case except that only the measurements from the 12 locations are assumed to be available. In other words, for each AP $n$, instead of training the hyperparameters based on $\mathcal{D}_n$, only the subset of $\mathcal{D}_n$ corresponding to the 12 locations is used. The trained hyperparameters are then used to generate the fingerprint database using the same approach as in the full database case.

Fig. 5 shows the localization error of Track 1. It can be seen that the radiomap generated by GP performs better than the other radiomaps. In particular, the hyperbolic model performs the worst. We found that with few data points available for training (measurements from only 12 locations), the solution obtained for $\alpha_n$ for some APs is significantly off the typical range between −20 and −40, which impacts the predicted radiomap negatively. Table 3 shows the RMS localization error as a function of $\delta$ for Track 1, where $\delta$ is the percentage of the database assumed to be available for generating the radiomap. It can be seen that as $\delta$ decreases, the RMS localization error of GP stays relatively flat, whereas the error grows for all other radiomaps. As mentioned in the previous subsection, when only a small subset of the radiomap is available, aggregating the MAC addresses improves the localization error. Table 4 shows the RMS localization error of all three tracks with and without MAC address aggregation when only 5% of the full database is assumed to be available. It can be seen that aggregating the MAC addresses reduces the localization error across the board.

5.5. Density of access points

As seen in the last subsection, consolidating the MAC addresses can potentially improve the localization performance. In this section, we investigate whether further reducing the number of APs has an impact on the localization performance. To further reduce the number of APs, we keep the APs which broadcast the most MAC addresses. We observed that for the $J = 50$ physical APs in the area, some APs broadcast up to 5 MAC addresses. Fig. 6 shows the histogram of the number of broadcast MAC addresses of each AP, and it is found that there are 17 APs which broadcast 5 MAC addresses. We investigate the impact on the localization error if only those 17 APs are used. The rationale behind using the APs with the most broadcast MAC addresses is that they provide more measurements and allow the non-parametric and parametric models to capture the diversity of the measurements. Table 5 compares the RMS localization error of 50 vs 17 APs. It is found that further reducing the number of APs actually degrades the performance.
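The AP selection used in this subsection, keeping only the APs that broadcast the most MAC addresses, can be sketched in a few lines. The input structure ap_to_macs is a hypothetical representation chosen for illustration.

```python
def select_densest_aps(ap_to_macs):
    """Keep only the APs that broadcast the maximum number of MAC addresses.

    ap_to_macs: hypothetical {ap_label: set of MAC addresses it broadcasts}.
    Returns the selected AP labels in sorted order.
    """
    most = max(len(macs) for macs in ap_to_macs.values())
    return sorted(ap for ap, macs in ap_to_macs.items() if len(macs) == most)
```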

Table 2
RMS localization error in meters of Track 1 and Track 3 with and without MAC addresses aggregation. Full measurement database.

                                GP       Hyperbolic   Mixture   Traditional: 1-NN
Track 1   Without aggregation   6.0957   8.2512       7.9879    7.7251
          With aggregation      6.5748   7.8301       7.4186    8.0721
Track 3   Without aggregation   5.1695   7.0417       6.4434    6.3570
          With aggregation      5.5863   6.2945       6.0344    7.2831

[Figure: empirical CDF versus localization error [m]; curves: GP, Hyperbolic, Mixture: α=−40, Traditional: 4-NN.]
Fig. 5. Localization error of fingerprinting based on various radiomaps. 5% of full database (12 locations). J = 50 aggregated APs. Track 1.

5.6. Density of RSSI signature map

The radiomaps generated by the non-parametric and parametric models so far have a resolution of 1 m². It should be noted that the radiomap density has a direct impact on the localization complexity. As the radiomap density decreases, the complexity of the localization decision rule decreases, as fewer comparisons need to be made with the radiomap. Table 6 shows the RMS localization error of Track 3 assuming the full database and different radiomap resolutions. The density of the radiomap and the complexity decrease as the resolution value (the area covered by each radiomap point) increases. It is found that the RMS localization error stays relatively the same from resolution 1 m² to 4 m². After that, the RMS localization error increases gradually. It should be noted that the column for traditional fingerprinting with 4-NN stays the same regardless of the resolution because the radiomap of traditional fingerprinting depends only on the actual measurements, and full measurements are assumed for this experiment.

Table 3
RMS localization error in meters of Track 1 and radiomaps generated by different methods assuming δ% of full database was available. J = 50 aggregated APs.

δ       GP       Hyperbolic   Mixture   Traditional: 4-NN
100%    6.5748   7.8301       7.4186    7.1746
75%     6.2907   8.4694       7.2697    7.1133
56%     6.2958   8.5210       7.3984    7.7858
42%     6.4423   8.4591       7.5715    7.6563
32%     6.3315   9.3177       7.2636    8.1945
24%     6.4227   8.1986       7.4166    8.3877
18%     6.3167   9.2503       7.7663    8.9986
13%     7.1246   11.9746      8.0909    11.0387
10%     7.0740   13.2857      11.1986   11.3684
8%      6.7417   14.8389      8.5420    12.2231
5%      8.8566   19.0852      14.1696   12.6178

Table 4
RMS localization error in meters for all tracks and radiomaps generated by GP. 5% of database was assumed to be available. With and without MAC addresses aggregation.

                      Track 1   Track 2   Track 3
With aggregation      8.8566    7.9087    9.6042
Without aggregation   10.6867   11.595    11.4876

[Figure: number of broadcast MAC addresses (vertical axis, 0 to 5) for each AP (horizontal axis, 1 to 50).]
Fig. 6. Number of broadcast MAC addresses of each AP.

Table 5
RMS localization error in meters of all tracks with 17 APs and 50 APs. Full measurement database.

                   GP       Hyperbolic   Mixture   Traditional: 4-NN
Track 1   17 APs   7.6964   8.7235       8.1284    7.4932
          50 APs   6.5748   7.8301       7.4186    7.1746
Track 2   17 APs   7.2827   6.9676       6.8025    8.3214
          50 APs   5.4305   6.4508       6.0703    5.4049
Track 3   17 APs   5.9161   9.1388       8.4710    7.3815
          50 APs   5.5863   6.2945       6.0344    6.0169

Table 6
RMS localization error in meters of Track 3 and radiomaps generated by different methods at different resolutions. Full measurement database. N = 174 MAC addresses.

Resolution [m²]   GP       Hyperbolic   Mixture   Traditional: 4-NN
1                 5.1695   7.0417       6.4434    6.357
2                 5.1327   7.1535       6.5285    6.357
4                 5.0923   7.6834       6.2146    6.357
8                 6.7262   7.6474       7.3532    6.357
16                7.9307   7.0466       7.0466    6.357
32                20.504   22.1593      21.1399   6.357

5.7. Impact from outdated signature map

In this subsection, we investigate the impact of an outdated radiomap on localization performance. From our experience, if the test track is the same as the one used to generate the radiomap, the performance tends to be better. However, this is unrealistic in practice. In other words, the localization performance will be overestimated if the same measurements are used for the radiomap and the test track. This is true even when one of the five temporal measurements is reserved solely for the test track and the other four measurements are used as the fingerprint database. In light of this, we would like to investigate this effect. In particular, we use the same measurements from the fingerprint database (Track 0) as the test track. The RMS localization error decreases for all radiomaps. Note that the localization error is 0 in the first row of Table 7 for traditional fingerprinting because, when the test track is identical to the radiomap, the true location minimizes (7). In the last row of Table 7, Track 0 is split into two parts: 56% and 44% of the database. 56% of the database was used as the radiomap, whereas the remaining 44% was used for the test track. As can be seen from the table, even though the radiomap and the track were taken on the same date around the same time, the localization error is greater than when the track is identical to the radiomap.
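A split of Track 0 into a radiomap part and a test part, as used for the last row of Table 7, can be sketched as follows. How the 56% and 44% subsets were actually chosen is not stated, so the random split by location below is purely an assumption for illustration, as are the function and parameter names.

```python
import random

def split_by_location(location_ids, radiomap_fraction=0.56, seed=0):
    """Randomly split measurement locations into radiomap and test subsets.

    All scans of a location stay on the same side of the split, so no
    measurement location is shared between the radiomap and the test track.
    """
    ids = sorted(set(location_ids))
    random.Random(seed).shuffle(ids)
    cut = round(radiomap_fraction * len(ids))
    return ids[:cut], ids[cut:]
```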

Table 7
RMS localization error in meters of Track 0 and Track 1 and radiomaps generated by different methods based on Track 0. J = 50 APs.

Radiomap         Track            GP       Hyperbolic   Mixture   Traditional: 1-NN
Track 0          Track 0          4.0106   6.3514       5.5994    0
Track 0          Track 1          6.5748   7.8301       7.4186    8.0721
56% of Track 0   44% of Track 0   5.7115   7.6805       6.7104    8.3187

6. Conclusion

This paper provides a comprehensive review of RSSI localization. RSSI localization is an important topic as it might become the de facto form of localization. This is especially true when most of the literature concentrates on how to augment RSSI localization with other sensors or tracking, and the RSSI mapping decisions are not actually explained. On the other hand, if RSSI is to be useful, it has to work on its own as well, because there are going to be many devices that can only work with the simplest measurements and without further assistance.

In this paper, we have shown how RSSI localization works at its simplest. We have also concentrated on those little decisions that are typically neglected: what can be done with repeated measurements, how sparse the radiomap can be, which interpolation techniques make more sense, and even how to treat the sensitivity of the device. Finally, we have shown how the different interpolation techniques compare to each other and why we believe GP presents the best trade-off between accuracy and ease of fingerprinting. To make RSSI localization deployable in practice, it will have to rely on a sparse number of measurements.

References

[1] C. Papamanthou, R.P. Preparata, R. Tamassia, Algorithms for location estimation based on rssi sampling, in: Proceedings of Algorithmic Aspects of Wireless Sensor Networks, Fourth International Workshop, ALGOSENSORS 2008, Reykjavik, Iceland, July 2008, pp. 72–86.
[2] Z. Farid, R. Nordin, M. Ismail, Recent advances in wireless indoor localization techniques and system, J. Comput. Netw. Commun. 2013 (2013) 1–12.
[3] H. Liu, H. Darabi, P. Banerjee, J. Liu, Survey of wireless indoor positioning techniques and systems, IEEE Trans. Syst., Man, Cybern. Part C: Appl. Rev. 2007 (2007) 1067–1080.
[4] A.R. Kulaib, R.M. Shubair, M.A. Al-Qutayri, J.W.P. Ng, An overview of localization techniques for wireless sensor networks, in: Proceedings of 2011 International Conference on Innovations in Information Technology (IIT), Abu Dhabi, April 2011, pp. 167–172.
[5] P. Bahl, V.N. Padmanabhan, Radar: an in-building rf-based user location and tracking system, in: Proceedings of Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies, INFOCOM, vol. 2, 2000, pp. 775–784.
[6] Martin Klepal, Stéphane Beauregard, et al., A novel backtracking particle filter for pattern matching indoor localization, in: Proceedings of the First ACM International Workshop on Mobile Entity Localization and Tracking in GPS-less Environments, ACM, San Francisco, USA, 2008, pp. 79–84.
[7] A.M. Ladd, K.E. Bekris, A. Rudys, L.E. Kavraki, D.S. Wallach, Robotics-based location sensing using wireless ethernet, Wirel. Netw. 11 (1–2) (2005) 189–204.
[8] M.A. Youssef, A. Agrawala, A. Udaya Shankar, Wlan location determination via clustering and probability distributions, in: Proceedings of the First IEEE International Conference on Pervasive Computing and Communications (PerCom), 2003, pp. 143–150.
[9] Veljo Otsason, Alex Varshavsky, Anthony LaMarca, Eyal De Lara, Accurate gsm indoor localization, in: Ubiquitous Computing (UbiComp), Springer, Tokyo, Japan, 2005, pp. 141–158.
[10] Alex Varshavsky, Eyal de Lara, Jeffrey Hightower, Anthony LaMarca, Veljo Otsason, Gsm indoor localization, Pervasive Mob. Comput. 3 (6) (2007) 698–720.
[11] Andrew M. Ladd, Kostas E. Bekris, Algis P. Rudys, Dan S. Wallach, Lydia E. Kavraki, On the feasibility of using wireless ethernet for indoor localization, IEEE Trans. Robot. Autom. 20 (3) (2004) 555–559.
[12] Chen Feng, Wain Sy Anthea Au, Shahrokh Valaee, Zhenhui Tan, Received-signal-strength-based indoor positioning using compressive sensing, IEEE Trans. Mob. Comput. 11 (12) (2012) 1983–1993.
[13] A. Petrick, B. O'Hara, IEEE 802.11 Handbook: A Designer's Companion, Standards Information Network IEEE Press, New York, NY, USA, 2005.
[14] S. Sesia, I. Toufik, M. Baker, LTE—The UMTS Long Term Evolution: From Theory to Practice, 2nd edition, Wiley, West Sussex, UK, 2011.
[15] E. Dahlman, S. Parkvall, J. Skold, 4G: LTE/LTE-Advanced for Mobile Broadband, 2nd edition, Academic Press, Waltham, MA, USA, 2014.
[16] M. Dashti, S. Yiu, S. Yousefi, F. Perez-Cruz, H. Claussen, RSSI localization with Gaussian processes and tracking, in: Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM), San Diego, CA, December 2015.
[17] E. Martin, O. Vinyals, G. Friedland, R. Bajcsy, Precise indoor localization using smart phones, in: Proceedings of the 18th ACM International Conference on Multimedia (MM'10), New York, NY, October 2010, pp. 787–790.
[18] J.-R. Jiang, C.-M. Lin, F.-Y. Lin, S.-T. Huang, ALRD: AoA localization with RSSI differences of directional antennas for wireless sensor networks, Int. J. Distrib. Sens. Netw. 2013 (2013) 1–11.
[19] Henri Nurminen, Anssi Ristimaki, Simo Ali-Loytty, Robert Piché, Particle filter and smoother for indoor localization, in: Proceedings of International Conference on Indoor Positioning and Indoor Navigation (IPIN), 2013, pp. 1–10.
[20] F. Perez-Cruz, S.V. Vaerenbergh, J.J. Murillo-Fuentes, M. Lazaro-Gredilla, I. Santamaria, Gaussian processes for nonlinear signal processing, IEEE Signal Process. Mag. 30 (June (4)) (2013) 40–50.
[21] A. Haeberlen, E. Flannery, A.M. Ladd, A. Rudys, D.S. Wallach, L.E. Kavraki, Practical robust localization over large-scale 802.11 wireless networks, in: Proceedings of MobiCom, 2004.
[22] S. Sen, B. Radunovic, R.R. Choudhury, T. Minka, Precise indoor localization using phy layer information, in: Proceedings of HotNets, 2011.
[23] S. Sen, B. Radunovic, R.R. Choudhury, T. Minka, You are facing the Mona Lisa: spot localization using phy layer information, in: Proceedings of the 10th International Conference on Mobile Systems, Applications, and Services, MobiSys '12, 2012.
[24] H. Wang, S. Sen, A. Elgohary, M. Farid, M. Youssef, R.R. Choudhury, No need to war-drive: unsupervised indoor localization, in: Proceedings of the 10th International Conference on Mobile Systems, Applications, and Services, MobiSys '12, New York, NY, 2012, pp. 197–210.
[25] Z. Xiao, H. Wen, A. Markham, N. Trigoni, Lightweight map matching for indoor localization using conditional random fields, in: Proceedings of the International Conference on Information Processing in Sensor Networks (IPSN'14), Berlin, Germany, 2014.
[26] C. Zhang, J. Luo, J. Wu, A dual-sensor enabled indoor localization system with crowdsensing spot survey, in: Proceedings of the 10th IEEE DCOSS, 2014, pp. 75–82.
[27] A.G. Dempster, V. Moghtadaiee, B. Li, Accuracy indicator for fingerprinting localization systems, in: Proceedings of PLANS, IEEE/ION 2012.
[28] V. Moghtadaiee, A.G. Dempster, S. Lim, Indoor localization using fm radio signals: a fingerprinting approach, in: Proceedings of the International Conference on Indoor Positioning and Indoor Navigation (IPIN), 2011.
[29] A. Youssef, J. Krumm, E. Miller, G. Cermak, E. Horvitz, Computing location from ambient fm radio signals, in: Proceedings of IEEE Wireless Communication and Networking Conference (WCNC), 2005.
[30] V. Otsason, A. Varshavsky, A.L. Marca, E. de Lara, Accurate gsm indoor localization, in: Proceedings of Ubiquitous Computing (UbiComp), 2005.
[31] A. Varshavsky, E. de Lara, J. Hightower, A. LaMarca, V. Otsason, Gsm indoor localization, in: Proceedings of Pervasive Mobile Computing, 2007.
[32] J. Chung, M. Donahoe, C. Schmandt, I.-J. Kim, P. Razavai, M. Wiseman, Indoor location sensing using geo-magnetism, in: Proceedings of MobiSys, 2011.
[33] S.P. Tarzia, P.A. Dinda, R.P. Dick, G. Memik, Indoor localization without infrastructure using the acoustic background spectrum, in: Proceedings of MobiSys, 2011.
[34] L.M. Ni, Y. Liu, Y.C. Lau, A.P. Patil, Landmarc: indoor location sensing using active rfid, in: Proceedings of the First IEEE International Conference on Pervasive Computing and Communications, 2003 (PerCom 2003), Fort Worth, TX, 2003, pp. 407–415.
[35] R. Want, A. Hopper, V. Falcao, J. Gibbons, The active badge location system, ACM Trans. Inf. Syst. 10 (1992) 91–102.
[36] P. Lazik, A. Rowe, Indoor pseudo-ranging of mobile devices using ultrasonic chirps, in: Proceedings of the 10th ACM Conference on Embedded Network Sensor Systems, SenSys'12, New York, NY, 2012, pp. 99–112.
[37] G. Pirkl, P. Lukowicz, Robust, low cost indoor positioning using magnetic resonant coupling, in: Proceedings of the 2012 ACM Conference on Ubiquitous Computing (Ubicomp-2012), 2012, pp. 431–440.
[38] Lun-Wu Yeh, Ming-Hsiu Hsu, Hong-Ying Huang, Yu-Chee Tseng, Design and implementation of a self-guided indoor robot based on a two-tier localization architecture, Pervasive Mob. Comput. 8 (2) (2012) 271–281.
[39] C. Wu, Z. Yang, Y. Liu, W. Xi, Will: wireless indoor localization without site survey, in: Proceedings of IEEE INFOCOM, IEEE, 2012, pp. 64–72.
[40] A. Rai, K.K. Chintalapudi, V.N. Padmanabhan, R. Sen, Zee: zero-effort crowdsourcing for indoor localization, in: Proceedings of the 18th Annual International Conference on Mobile Computing and Networking, ACM, Istanbul, Turkey, 2012, pp. 293–304.
[41] D. Tao, X. Li, X. Wu, S.J. Maybank, General tensor discriminant analysis and gabor features for gait recognition, IEEE Trans. Pattern Anal. Mach. Intell. 29 (October) (2007) 1700–1715.
[42] J. Yu, Y. Rui, Y.Y. Tang, D. Tao, High-order distance-based multiview stochastic learning in image classification, IEEE Trans. Cybern. 44 (March) (2014) 2431–2442.
[43] Moritz Kessel, Michael Werner, Automated wlan calibration with a backtracking particle filter, in: Proceedings of International Conference on Indoor Positioning and Indoor Navigation (IPIN), 2012, pp. 1–10.
[44] Mohamed M. Atia, Aboelmagd Noureldin, Michael J. Korenberg, Dynamic online-calibrated radio maps for indoor positioning in wireless local area networks, IEEE Trans. Mob. Comput. 12 (9) (2013) 1774–1787.
[45] P. Mirowski, T.K. Ho, S. Yi, Simultaneous localization and mapping with mixed WiFi, Bluetooth, LTE and magnetic signals, in: International Conference on Indoor Positioning and Indoor Navigation, Montbeliard, France, October 2013, pp. 1–10.
[46] M.W.M.G. Dissanayake, P.M. Newman, S. Clark, H.F. Durrant-Whyte, M. Csorba, A solution to the simultaneous localization and map building SLAM problem, IEEE Trans. Robot. Autom. 17 (June (3)) (2001) 229–241.
[47] F. Duvallet, A.D. Tews, WiFi position estimation in industrial environments using Gaussian processes, in: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008, pp. 2216–2221.
[48] B. Ferris, D. Haehnel, D. Fox, Gaussian processes for signal strength-based location estimation, in: Proceedings of Robotics Science and Systems, 2006.
[49] A. Bekkali, T. Masuo, T. Tominaga, N. Nakamoto, Gaussian processes for learning-based indoor localization, in: Proceedings of 2011 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), Xi'an, China, September 2011, pp. 1–6.
[50] B. Ferris, F. Dieter, N. Lawrence, WiFi-SLAM using Gaussian process latent variable models, in: Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, 2007, pp. 2480–2485.
[51] A. Brooks, A. Makarenko, B. Upcroft, Gaussian process models for indoor and outdoor sensor-centric robot localization, IEEE Trans. Robot. 24 (December) (2008) 1341–1351.
[52] D.J.C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press, 2003.
[53] Brian Ferris, Dirk Haehnel, Dieter Fox, Gaussian processes for signal strength-based location estimation, in: Proceedings of Robotics Science and Systems, Citeseer, 2006.
[54] S. Yiu, K. Yang, Gaussian process assisted fingerprinting localization, IEEE J. Internet Things PP (2015) 1.