ARTICLE IN PRESS
Journal of the Franklin Institute 345 (2008) 592–617 www.elsevier.com/locate/jfranklin
Theoretical and experimental study of through-wall microwave tomography inverse problems Eugene M. Lavely, Yan Zhang, Edward H. Hill III, Yi-san Lai, Peter Weichman, Angela Chapman BAE Systems Advanced Information Technologies, 6 New England Executive Park, Burlington, MA 01803, USA Received 15 November 2007; accepted 24 January 2008
Abstract We consider the problem of building structure estimation using microwave ray tomography. A Bayesian formulation is developed, and a Markov chain Monte Carlo (MCMC) procedure is used to sample the posterior distribution, which is based on a data likelihood defined in terms of a residual misfit between observed and predicted waveforms. To accelerate model optimization, a simulated annealing approach is combined with the MCMC, using specific model moves to explore each component of the model space. Our approach is applicable to data acquired in the frequency or time domain and for monostatic or bistatic acquisition modes. Experimental data for a multi-wall laboratory test structure were acquired using a horn antenna connected to a vector network analyzer and used to validate both the forward model and the inversion approach. Although, in true remote sensing problems for building structure, the model order is usually unknown, in this initial study, the actual inversion experiment is performed in a reduced-dimension model space for which a subset of the variables are taken as known or fixed. Generalization to the variable-dimension problem can be achieved by using reversible jump MCMC sampling procedures. r 2008 The Franklin Institute. Published by Elsevier Ltd. All rights reserved. PACS: 01.30.y Keywords: Microwaves; Bayesian inference; Tomography; Markov chain Monte Carlo; Simulated annealing; Through-wall imaging
Corresponding author. Tel.: +1 781 273 3388; fax: +1 781 273 9345.
E-mail address:
[email protected] (E.M. Lavely). 0016-0032/$32.00 r 2008 The Franklin Institute. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.jfranklin.2008.01.006
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
593
1. Introduction The ability to detect and classify objects or people behind walls and inside buildings is of major interest to law enforcement officials, search-and-rescue workers, and the U.S. military [1]. Microwave sensing is an advantageous modality for these applications, since microwave signals can penetrate through non-metallic walls. This is confirmed by common experience, as cell phones, hand-held phones, and wireless routers typically operate successfully in the frequency band from 900 MHz to 2.5 GHz. An additional advantage is that typical microwave wavelengths, on the order of 10–30 cm, provide the physical basis for high spatial resolution and localization when combined with appropriate exploitation algorithms. Model-based inference methods for the microwave band can readily be implemented, since geometric optics is largely applicable to typical wall structures and building configurations of interest. The geometric optics, ray-based picture does not capture all of the relevant propagation effects, but the geometric formulation can be extended to model significant non-ray effects such as edge diffraction or Bragg diffraction due to periodic structures (walls with rebar or studs, cinderblock walls, and others). This is of important practical consequence, since ray-based forward models are much more computationally efficient than full-wave solvers such as those using finite difference time domain (FDTD) methods. Model-based inference for waves just below the microwave band, from 100 MHz to 500 MHz (corresponding to VHF and UHF), almost certainly require a strongly numerical forward model, since the wavelengths are on the order of typical structural scales within buildings. This would likely prove impractical for solving inversion problems in realistic time frames. In addition, the imaging resolution of such waves is degraded relative to that of microwaves, and directional focusing requires rather large transmitting antennas relative to that in the microwave band. On the other hand, frequencies in the upper range of the microwave band and higher, above 3 or 4 GHz, are strongly attenuated by walls due to the presence of small but finite conductivity in typical dielectric building materials. This limits signal measurability for multi-wall scenarios. Thus, we are led to conclude that the low to middle microwave band is the ideal choice for radio frequency (RF)-based remote sensing of building interiors. Various RF measurements can provide evidence of motion behind a single wall, and depending on the experimental aperture, excellent spatial localization can be achieved. However, localization of objects or tracking of occupants becomes much more difficult when the targets are behind multiple walls or inside a complex structure; this follows due to propagation effects such as wave multipathing and dispersion, edge diffraction, and Bragg diffraction [2,3]. These effects must be deconvolved from signal observations in order to optimize detection, classification, and tracking algorithms for through-the-wall inference. Model-based estimation of wall and building structures, when combined with a fast, highfidelity propagation model that accounts for these effects, can provide the information required to improve performance for these applications. Our forward model (described in Section 2.1) is fundamentally a geometric optics approach, extended to describe the most important diffraction and multipathing effects. The principal objective of this paper is to develop a building model estimation formalism in order to enable or enhance the various end-user applications listed above. The building tomography problem presents a number of challenges, including issues pertaining to model representation, model order, data sensitivity, and feasible/practical
ARTICLE IN PRESS 594
E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
experimental aperture. Model representation is non-trivial due to (i) the high dimensionality of the model space, (ii) the need for a representation suitable for high-fidelity forward computation and use in numerical codes, and (iii) the trade-off between model fidelity and model compactness. The model order issue is also difficult, since the model configuration (or model basis) is unknown; that is, the number and geometric arrangement of walls, doors, clutter objects, and so forth is not necessarily known a priori. In addition, the wall types, and hence the associated number and identity of the corresponding model parameters, may also vary, and must be determined even for a fixed geometrical configuration. A simple example of a single model configuration is a rectangular building with four parallel interior walls that span the ‘short side,’ each of which contains a door. There are many realizations of this single configuration, since the walls may be translated along the ‘long side’ and the doors may be placed anywhere within each wall. If the configuration is known, the model order is also known (i.e., the model selection problem is solved), and the problem reduces to one of parameter estimation. In general, of course, the model configuration is unknown, and the model and its parameters must be simultaneously estimated. The issues of data sensitivity and experimental aperture are also vexing ones for building tomography. Data sensitivity pertains to the intrinsic sensitivity of the measurable microwave data to the building structures, i.e., the Fre´chet derivative of the data with respect to changes in the model parameters, which is fundamentally determined by the physics of wave propagation. Another obvious sensitivity issue is signal observability. Wave attenuation due to finite conductivity of dielectric structures ultimately limits how deep waves can propagate before dipping below the dynamic range of the sensor. In addition, signals of interest, such as primary reflections from walls deep within a building, may ultimately be unobservable if they are buried within a sea of simultaneous arrivals caused by multipathing and reverberation. Finally, for realistic data acquisition, the available experimental aperture, i.e., source and receiver positions and frequency bandwidth, may be restricted. Thus, it is generally not possible to fully access a structure in ways similar to medical imaging where data sampling can be more easily controlled. Instead, the acquired data may be spatially irregular and geometrically sparse. In addition, there may be uncertainty in the experimental parameters such as sensor location. One example concept of operations for whole-building data acquisition is shown in Fig. 1, which displays the notion of multiple transmitters and a receiver array attached to a moving vehicle which acquires data in front of a building. In this example, vector network analyzers may be used in stepped-frequency mode to drive the transmitter antennas over a bandwidth sufficient to obtain the required spatial resolution. The frequency scan time and signal integration time are key limiting factors for resolution, and define the upper limit of the vehicle speed with the on-board array (e.g., a few m/s for useful resolutions). Partially mitigating some of these inversion issues is the fact that real buildings have spatial regularity. This property can be used to inform the design of prior probability densities on the model and to design model representations that aid in regularization of the inversion. We introduce these constraints through the use of graphs for representation of selected geometric features of the model such as the wall layout. Any available a priori data, such as prior distributions on wall types or wall materials, can provide additional model regularization. Priors on model complexity such as number of model components, entropy of wall placement, etc., may also be introduced to improve inversion performance. We represent dielectric structures with cuboids (rectangular parallelopipeds), each of
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
595
Fig. 1. Example concept of operations, in which a single N M receiver array is sequentially translated along the front of a subject building. Active data are acquired using transmitters (indicated by the small spheres) above and below the array, for 17 distinct array positions. The line segments extending from the sphere indicated the peak direction of the main lobe antenna pattern.
which may be homogeneous in its dielectric properties (permittivity, electrical conductivity and magnetic permeability) or stratified along its shortest dimension. The cuboid representation supports high-fidelity modeling of reflection and transmission events. This representation must be augmented with additional model features when embedded periodic structures are present. In this paper we illustrate these ideas by solving a low-dimensional inverse problem in which we are able to effectively convert the variable-dimension problem to a fixeddimension problem. We develop a global optimization approach based on simulated annealing (SA), and we perform model sampling using a strategy based on Markov chain Monte Carlo (MCMC) [4–6], for which specialized model moves are designed. The estimation scheme searches over a model space consisting of binary indicator variables for wall existence on a structured graph and for graph variables that define the spatial geometry of the graph edges where walls may be placed. Our formulation identifies additional degrees of freedom of the general problem, but in the limited study presented here, many of these are taken as known or held fixed, and we only invert for the ‘wall existence’ and ‘edge location’ variables described above. The inversion framework, as formulated, preserves the dimensionality of the model space. Therefore, we are able to use fixed-dimension MCMC sampling strategies such as Metropolis-Hastings (MH) [6]. The more general case of unknown model order constitutes a system identification problem. In principle, this can be addressed with trans-dimensional inversion techniques [7] such as reversible jump MCMC [4,6–10]. The latter presents a natural framework for the required extension, and will be addressed in future work. The remainder of this paper is organized as follows. In Section 2 we describe the algorithmic basis of our problem formulation, including the forward model, time-domain signal synthesis, model parameter specification, and our Bayesian inference paradigm. The latter includes specification of the prior and posterior densities, MCMC model sampling strategies, and SA optimization. Our numerical results using an experimental data set are presented in Section 3, and concluding comments are presented in Section 4. 2. Algorithms This section provides a theoretical specification of the forward and inverse problems. The inverse approach can be implemented with any forward model that captures the
ARTICLE IN PRESS 596
E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
essential physics of the structure under study. Therefore, in the following, we only briefly summarize the forward model, concentrating instead on the inversion formulation. The algorithmic basis of the geometric optics, ray-based forward model is discussed in Section 2.1. The forward data are computed in the frequency domain, and the experimental data are acquired in this domain as well. Our inversions, however, are performed in the time domain, and the required time-domain signal synthesis is presented in Section 2.2. Our formulation of the inverse problem utilizes structured graphs to facilitate model representation and design of ‘model moves;’ the graph formalism and a detailed discussion of the model parameters is presented in Section 2.3. We use a MCMC method for model sampling and couple this approach with a SA framework for model optimization, as described in Section 2.4. Specialized moves for exploring the model space and their corresponding acceptance probabilities are presented in Section 2.5. 2.1. Forward model The forward model used here is an extended geometric optics ray tracer called ATrace [11,12], which we developed for rapid in-the-loop simulation of multipath electromagnetic propagation in realistic building structures. It incorporates novel algorithms to improve computational efficiency and to capture several propagation phenomena beyond the range of validity of geometric optics, including certain types of diffraction. For high multipath scenarios, such as propagation inside a building, the most expensive operation for the majority of ray and beam tracers is path discovery (see Fig. 2). A key computational strategy in ATrace is to segregate, as much as possible, electromagnetic calculations such as reflection and transmission coefficients from the predominantly geometric path discovery sub-problem. As a result, the forward modeler runs in two stages. The first stage of the ATrace implementation is wavefront and path discovery. Rays are initiated from a surface surrounding the transmitter with a specified angular distribution and density, and are propagated using a variety of mechanisms, including specular reflection and transmission through cuboid objects (representing walls, floors, ceilings, clutter, etc.) and Bragg diffraction induced by objects with internal periodic structures (e.g., walls with rebar, or cinderblock walls). Paths that ultimately connect a transmitter to a receiver are saved in an n-ary tree data structure (see Fig. 3), and all others are discarded. To compensate for the systematic over- and under-sampling that occurs in many ray and beam tracers, ATrace uses a novel logical path encoding algorithm. If all surfaces within a scene are planar (so that there are, for example, no caustics) then one can show that there is a direct correspondence between logical paths and individual wavefronts. Here, logical paths are strings such as Tx-7 Face-32-R Face-19-T Face-107-R Rx-22 which encode the ray history, including transmitter identifier (number 7, in this case), objects (or parts thereof) encountered (the numerical identifier for the cuboid face on which the ray is incident, in this case), and type of ray interaction with each (a reflection, transmission, reflection sequence, in this case), and receiver identifier (number 22, in this case). A group of rays with the same logical path compose a single (nearly planar) wavefront in the vicinity of the receiver. Given the wavefront encodings, ATrace can rapidly sort paths to identify the ones which best represent each wavefront at each receiver. The best ray
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
597
Fig. 2. Example of ray multipathing computed with ATrace for a one room building structure (with an internal partition). ATrace discovers all paths linking a transmitter–receiver (Tx–Rx) pair up to a specified bounce depth, or in accordance with a specified ray pruning strategy. In this example there is a horizontal receiver line array on the right and a single transmitter centered about the receivers. Tx Bw Bw Hw
Hw Hw Rx
Rx
Bw Rx
Rx
Bw Rx
Rx
Rx
Bw Rx
Rx
Rx
Rx
Rx
Bw
De
Rx
Rx
Rx
Rx
Fig. 3. Example n-ary tree structure used in ATrace for ray composition. The representation corresponds to a directed acyclic graph in which each node is a ray interaction event and each leg is a ray segment. The transmitter (Tx) and receiver (Rx) nodes indicate the birth and death of a ray, respectively, while all other nodes indicate a ray interaction with a particular scatterer (implicitly, a scatterer index is attached to each node). The notation Hw indicates a wall with homogeneous or stratified dielectric properties (in its short dimension), for which specular reflections and transmissions are the canonical events. The notation Bw indicates Bragg walls in which diffracted, non-specular rays are generated by wall-periodic structures, such as cinderblocks, rebar or studs. The notation De indicates a diffracting edge from which, in principle, a continuum of rays is spawned.
paths are those which pass closest to the receiver phase center, and we refer to these paths as ‘eigenrays’ (see Fig. 4). Since the entire ray path histories, including the launch directions, are available within the n-ary tree data structure, in future work it will be easy to implement further adaptive path improvement algorithms. As better ray paths are discovered, inferior paths may be discarded. The n-ary tree is also useful for controlling computational complexity, e.g., by limiting ray paths to a maximum total bounce depth, or the total number of diffractive events allowed in any given ray path. In the second stage, ATrace uses the eigenray paths to calculate the contribution of each wavefront to the signal measured by each receiver. Analytic expressions [3] for
ARTICLE IN PRESS 598
E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
Fig. 4. In the geometrical optics picture, each wavefront is represented by a bundle of nearly parallel rays (three of which appear in this sketch). In computing the voltage receiver response, each bundle should be counted exactly once. In ATrace, rays are assigned to bundles based on their bounce history: members of a bundle have encountered the identical obstacles in the identical order. The n-ary structure (Fig. 3) provide an extremely efficient means of comparing ray histories. The optimal ray for each bundle is the one that passes closest to the center of the receiver. These rays are the eigenrays (highlighted in the figure), and are the ones used to compute the receiver voltage.
transmission and reflection coefficients in homogenous or multi-layered media as a function of layer thickness, permittivity, permeability, and conductivity, may be used for this purpose. Along with the geometric path information and the transmitter spatial intensity function, the total contribution along each wavefront can be computed for each frequency of interest. Ray theory predicts the complex amplitude V^ for a signal of frequency o transmitted from xT and received at xR in the (slightly simplified) form ^ R ; xT ; oÞ p^ , ^ p Gðx V^ ðxR ; xT ; oÞ ¼ SðoÞ^ R T
(1)
^ is the EM tensor Green function for ^ in which SðoÞ is the complex source spectrum, and G the building, describing propagation from xT to xR in the form ^ R ; xT ; oÞ ¼ Gðx
Nr X i¼1
Ai eiol i =c Ti ;
Ti ¼
ni Y
TðiÞ k
(2)
k¼1
and represented here as a sum over N r rays connecting the two points. The vectors p^ R;T describe receiver and transmitter aperture, polarization, and orientation effects. The factor Ai for each ray contains all geometric spreading factors (inversely proportional to the optical path length l i ), while the overall phase factor also incorporates the optical path length. For the ith ray, the ray tensor Ti is the product of the sequence of ni transmission, reflection, or diffraction complex tensor factors TðiÞ k applied to that ray at each vertex
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
599
where it undergoes an interaction. The factors are complicated functions of the wall dielectric parameters, wave frequency, and incident ray geometry [2,3,11]. The frequency domain data vector d^ ¼ ½V^ 1 ; V^ 2 ; . . . ; V^ N d consists of a set of frequency series taken for different transmitter and receiver parameters (location, orientation, polarization, etc.). In the inversion, the vector s^ ðxÞ ¼ ½V^ 1 ðxÞ; V^ 2 ðxÞ; . . . ; V^ N d ðxÞ of corresponding predictions for a hypothesized model x is compared against d^ (or some function thereof) in the search for models with acceptable data misfit and prior plausibility. The corresponding time-domain quantities, obtained via discrete Fourier transform, will be denoted ðd; sÞ. The correctness of the ATrace forward model has been validated by comparison of ATrace simulations to laboratory measurements and to simulations from a full-wave FDTD solver. As an example, Fig. 5 shows time-domain data observations for an I-wall structure acquired with the experimental equipment described in Section 3. 2.2. Time-domain synthesis Time-domain synthesis is a method for converting electromagnetic waves in the frequency domain to pulses in the time domain. In the time domain, time delays represent the optical distance the pulses travel from transmitter to receiver, including scattering by objects in between. For a single frequency, the time dependence is given simply by sðtÞ ¼ Re½V^ ðoÞeiot , (3) where V^ ðoÞ is the complex signal in the frequency domain for a given receiver. If there are multiple frequencies, a narrow pulse may be synthesized in the form Z 1 do ^ sðtÞ ¼ PðoÞ V^ ðoÞeiot , (4) 1 p ^ ^ ^ , where PðoÞ is the spectrum of the transmitted pulse. It is assumed that PðoÞ ¼ PðoÞ ^ ^ V ðoÞ ¼ V ðoÞ so that sðtÞ is real. We use Gaussian pulses 2 s pffiffiffi iðoo0 Þt0 s2 ðoo0 Þ2 =2 2 ^ p½e PðoÞ ¼ e þ eiðoþo0 Þt0 es ðoþo0 Þ =2 , (5) 2 where s is the pulse width in the time domain, t0 is the pulse starting time, and o0 ¼ 2pf 0 is the center frequency. If one has a discrete set of N f frequencies on ¼ omin þ nDo, n ¼ 0; 1; . . . ; N f 1, where ½omin ; omax is the sampling interval, and ðomax omin Þ=ðN f 1Þ is the spacing, then we define "N 1 # f X ion t ^ ^ Pðon ÞV ðon Þe sðtÞ ¼ Re , (6) n¼0
which corresponds to a pulse that repeats with period T ¼ 2p=Do. The sum is a discrete Fourier series, which can be efficiently implemented with a fast Fourier transform (FFT). 2.3. Model space specification and graph concepts For the inverse problem, we use structured graphs to define building structures (Fig. 6). This is convenient choice due to the regularity and rectilinearity of common architectural designs. The graph structure itself can be used to represent the geometric properties of buildings (or structures with wall-like components). For example, graph edges can be used
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
600
to indicate the presence or absence of walls, and the vertices of the graph may be associated with spatial coordinates that define the location and extent of the edges (or walls). In our formulation, edges are only allowed to ‘live’ in one of two directions (e.g., the x and y
Scanner
2.25" [5.7cm]
1.5" [3.8cm]
6’ [182.9cm]
10’ [304.8cm] 3 layers of 3/4" plywood
5’10" [177.8cm]
6’ [182.9cm]
2 layers of 3/4" plywood
Horizontal scan limit for test
1.5" [3.8cm]
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
601
7
14
21
28
35
42
49
56
63
70
6
13
20
27
34
41
48
55
62
69
5
12
19
26
33
40
47
54
61
68
4
11
18
25
32
39
46
53
60
67
3
10
17
24
31
38
45
52
59
66
2
9
16
23
30
37
44
51
58
65
1
8
15
22
29
36
43
50
57
64
Fig. 6. Example of structured graph showing active vertices, edges (wall segments) and structured but spatially varying grid line coordinates. In the graph shown here, only the vertices 17, 24, 31 and 38 are not part of the active set since they are not connected to any wall segment.
directions). The separation distances between each grid line of constant coordinate value need not be the same. In addition, the edges can be associated with various attributes, or with categorical variables that indicate wall type (e.g., homogeneous walls, cinderblock walls, etc.). Formally, the graph representation of a model x takes the form x ¼ ðGn ; Cm ; xb ; xs Þ
(7)
in which, Gn is a set of vectors gi , i ¼ 1; 2; . . . ; n which contain the coordinates and labels of the n ‘active’ vertices (out of a total of N vertices on the graph), i.e., those connected to a wall segment, and Cm is a set of vectors cj , j ¼ 1; 2; . . . ; m, containing properties (bounding vertex labels, thickness, EM parameters, rebar spacing, cinder block dimensions, etc.) of
Fig. 5. Time-domain syntheses of data measurements (below left) and ATrace simulation (below right) for an ‘I’shaped structure (sketched above) made of three sheets of plywood, demonstrating reasonable correspondence. The sheets were all 8 ft high and 6 ft wide, but the front wall was 2.25 in thick, while the back and center walls were 1.5 in thick. Data were acquired from the front of the structure with a vector network analyzer and an ETSLindgren 3164 horn over a 0.7–3.1 GHz frequency band at 81 positions along a linear track with an inter-element spacing of 0.875 in centered at 48 in above the ground, with a 10 ft standoff from the front plywood sheet. The known parameters of the experiment were used in the simulation, though a floor slab was not included. The power was 5 dBm. The vertical stripe in the center of the ATrace image is actually an artifact and should be ignored: since we do not yet have the ability to model the very complicated diffraction of rays striking and entering the subwavelength thick edge of the center wall, they are simply dropped, leading to the shadowed region.
ARTICLE IN PRESS 602
E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
the wall segments occupying the m ‘active’ edges (out of the total of M edges on the graph). The parameter vector xb contains any further structural features not conveniently described by the graph representation (e.g., doors, windows, pipes, stairs, discrete clutter items, such as desks and cabinets, within rooms), and xs describes the parameters of the imaging experiment (e.g., source and receiver locations and orientations and polarizations, measurement frequencies, transmitter radiation pattern and receiver sensitivity pattern for each frequency). For the purposes of the MCMC implementation, a set of ‘moves’ is defined on the set x that allow one to add and delete walls and clutter items, continuously vary their EM properties, adjust the spacing of the grid lines, etc. Such moves can be local, focused on a single wall, or more global, effecting a simultaneous change to many walls, e.g., those designated to be in the same class, hence constrained to have identical properties throughout the building. Correspondingly, there will be moves that allow a set of walls of the same class to break up into subsets with different class labels, or, conversely, to merge with another set to make a bigger set of walls of the same class. The acceptance or rejection of all such moves must be governed by their ability to improve agreement between model predictions and data. The graph representation is a convenient way of defining and labeling the universe of allowed models, and a notion of distance between them, in the sense of the type and number of moves required to evolve one model to another. The parameter vector x is connected to the representation of the physical model residing in through a ‘mapper’ which transports the graph model to a building model with physicsbased descriptors and representations suitable for use in forward electromagnetic propagation calculations, in our case carried out by ATrace. The representation x can be made very general, but in the numerical studies described in Section 3 we demonstrate only a very specialized implementation. As mentioned previously, dielectric structures in ATrace are represented with cuboids, and, by superposition, they can represent quite complex dielectric scenes. In Ref. [12] cuboids were used to compose two-story buildings with multiple rooms, and they are naturally suited for the multi-wall test structure discussed in Section 3.1 (see also Fig. 8). Of course, cuboids are not suitable for representing objects with curved surfaces or noncuboid compact scatterers such as cylinders or spheres. The model parameter vector for the ith cuboid can be written xi ¼ ðgi ; ai ; ci Þ;
1pipN c ,
(8)
where gi is a nine-component vector defining its geometry (three center of mass position coordinates, three orientation coordinates about the center of mass, and three edge lengths), ai is the wall type or class index, and ci is a property vector containing all electromagnetic parameters, and internal geometric information, required to fully model it. Some examples are shown in Table 1. 2.4. Cost function, search method, and inversion fidelity The search for the model that best fits the available data is controlled by a cost function C½d; xX0 which vanishes when (but not necessarily only when) the predicted data, controlled by the model x, perfectly matches the collected data d. In practice, there are many elements of uncertainty (measurement noise, both external and internal/instrument;
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
603
Table 1 Some possible examples of wall class types and associated parameter vectors Wall type
Class label
Model parameters
No wall (null case) Homogeneous Concrete with vertical rebar (1D) Floor or ceiling with crossed rebar (2D) Cinderblock
n h v c b
– ch ¼ ðth ; hr ; mh ; sh Þ cv ¼ ðtv ; vr ; sv ; rvd ; rvs Þ cc ¼ ðtc ; cr ; sc ; rcd ; rcs Þ cb ¼ ðtb ; br ; sb ; nbv Þ
Here, ðta ; ar ; mar sa Þ define, respectively, the thickness, relative permittivity, relative permeability, and conductivity for wall class a 2 fh; v; c; bg. Rebar diameter and rebar spacing for wall types a 2 fv; cg are defined by ðrad ; rad Þ, and nbv specifies the number of voids (typically two or three) for cinderblock walls. For the latter, it is assumed here that the dimension and geometry of the void elements are fixed from prior knowledge once the number of voids is known. Otherwise, these properties would need to be included in cb as well. One could also extend this table with walls containing more complicated layered structure (e.g., sheetrock over plywood, vinyl over concrete block, plaster over brick).
building model uncertainty, such as real walls imperfectly described by the chosen types; forward model uncertainty, such as electromagnetic effects lying outside of the model fidelity) that preclude such a perfect match. One therefore seeks either the ‘best’ model, defined as the one that minimizes C over the available space, or perhaps a broader characterization, in terms of the set of models that come ‘close’ to minimizing C. In the latter approach, for example, by seeing which model features are most commonly shared across the elements of the set, one may estimate the (posterior) probability with which those features have been correctly inverted. Mathematically rigorous formulations of such probabilistic interpretations require that both the cost function and the search method be designed to yield the correct posterior probability distribution. For example, the classic MH Monte Carlo algorithm was designed to simulate thermodynamic systems, and the posterior distribution could rigorously be shown to converge to the correct Gibbs distribution determined by the Hamiltonian of the system. However, in most real world applications the posterior distribution (in our case, reflecting building practises in a particular region of the world, the types of noise that may corrupt the measurement, and even the psychology of the group of people of interest, in terms of the types of buildings they might choose to inhabit, and the ways that they might alter them to conform to their needs) is only known, at best, at a very coarse level (e.g., general constraints on building materials available, quality of construction, size and geometry of rooms, type of room contents, presence of RF interference). Lacking such knowledge, at the operational level, one instead invents a convenient cost function [13] that (i) properly reflects the data fidelity (e.g., does not attach very high costs to data features that are known to be noise sensitive), and a search method that (ii) encodes any prejudices (prior constraints) on the model space that one wishes to impose, and (iii) is capable of searching this space in a numerically efficient manner. This implementation defines a posterior distribution on the model space, which although having little or no rigorous basis, nevertheless should provide some insight into the uncertainties in the inversion.
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
604
2.4.1. Cost function A very common choice of cost function is the Gaussian form Cðd; xÞ ¼ 12½d dðpredÞ ðxÞT R1 ½d dðpredÞ ðxÞ
(9)
in which dðpÞ ðxÞ is the model prediction, and the covariance matrix R encodes the data 2 uncertainty. For example, if datum j enters via a diagonal term jd j d ðpÞ j j =2Sj , a large value of Sj would reflect a an expected large noise impact on d j , and a correspondingly smaller cost to deviations from the model prediction, while a small value would reflect a more robust determination. As presented, Cðd; xÞ is a global cost function, using all available data d to constrain the totality of model parameters x. Such a function is appropriate for small-scale problems including the experimental data inversion that we consider in the following. However, in general, dominant features in the data may control global features of an inversion solution. In addition, for larger problems it is necessary to control model complexity through inversion strategies that limit the model order. For example, in whole-building tomography, a hierarchical inversion approach can be defined in which data subsets partitioned on the basis of source–receiver pairs in order to provide spatially localized sensitivity. Additional localization sensitivity (in depth) can be achieved by inverting selected time-windows of the spatially partitioned time-domain data. This strategy can be an effective means to enhance model estimation accuracy and control computational requirements of model search schemes [14]. For frequency domain data collected in a noisy environment, Sj ¼ Sðoj Þ may be the variance of the RF noise at frequency oj . In the applications considered here, where the data are collected under laboratory conditions, external noise is very small, and the uncertainties are instead dominated by unmodeled aspects of the environment. It is then better to consider a time-domain formulation in which one can use time-windowing to focus on radar returns from particular distances. A convenient choice is Cðd; xÞ ¼
1X 2 W ðtj Þ½d steer ðtj Þ d ðpredÞ steer ðx; tj Þ , 2 j
(10)
in which W ðtj Þ is a window function that vanishes outside the desired range, and the subscript ‘steer’ includes data processing steps, such as beam steering, that coherently combine measurements from different transmitter–receiver positions to focus on a particular cross-range. 2.4.2. Prior constraints and prior distribution We list here a few of the prior constraints one might consider imposing: (1) There are obvious constraints that can be imposed from an exterior view of the building. For example, exterior wall dimensions, window and door placements can be very accurately specified in advance from a few simple photographic measurements. (2) The graph approach already imposes rectilinear constraints on the wall positions. One may also impose constraints on the spacing between grid lines, for example strongly penalize model realizations in which any spacing l closer than l 0 (e.g., 1 m) apart. One could either use a hard constraint, simply assigning infinite cost to neighboring grid
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
605
lines spaced by less than l 0 , or a softer constraint, e.g., through a cost ðl 0 =lÞr for some power r. (3) As mentioned, local construction practices will strongly bias expected wall types. For example, desert areas will use more stone and less wood. Also, cinderblock and brick materials and geometries might be limited to those available from a few local manufacturers. (4) Walls segments occupying the same grid line may be expected to have similar properties. For example, a lengthy exterior wall is not likely to randomly alternate between class types along its length. One may again add terms to the cost function that penalize strong differences between neighboring walls segments. (5) Similarly, exterior and interior walls should have a number distinguishing features (thickness, surface properties, etc.) that can be used to bias the cost function. These constraints basically determine a prior probability distribution (i.e., one that is uninformed by the measured data) on the model space. As indicated, the latter may be expressed in terms of a prior cost function Cprior ðxÞ that is added to Cðd; xÞ and that can be used to initialize the search (see below). 2.4.3. Model space search procedure In MCMC, independent samples are generated by performing a random walk. The rules of the random walk determine the (posterior) probability distribution with which one explores the model space. As indicated earlier, in many applications, this distribution has little or nothing to do with physical reality, but is determined instead by numerical convenience. In our approach, the random walk is using the MH method [6]. A move is proposed from the current model xp to a new model xq which is either accepted or rejected. The proposed move is made using a single step transition probability Qðxq jxp Þ, described further below. The move is then accepted with probability a given by Pðxq jdÞ Qðxp jxq Þ p q aðx ; x Þ ¼ min 1; (11) Pðxp jdÞ Qðxq jxp Þ in which Pðxp jdÞ is the desired posterior distribution, constrained by the data d. Thus, if the ratio in Eq. (11) is greater than one, the move is automatically accepted, whereas if it is less than one, it is only accepted probabilistically. This definition of a is precisely what is required to ensure that the model space is, after many iterations, explored with distribution P [6]. In many cases, the random walk obeys detailed balance, Qðxp jxq Þ ¼ Qðmq jxp Þ, and a is independent of Q. In our case we define P through a Boltzmann distribution constructed from the cost function: Pðxjd; TÞ ¼
1 eCtot ðd;xÞ=T , Zðd; TÞ
(12)
in which Ctot ðd; xÞ ¼ Cðd; xÞ þ TCprior ðxÞ is the total cost function (by convention, the prior cost usually appears unscaled by T in P), and Zðd; TÞ is a normalization factor that need not actually be computed since it drops out of the ratio in Eq. (11). The temperature T40 is a free parameter, which determines how strongly peaked P is about low cost regions.
ARTICLE IN PRESS 606
E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
In the limit T ! 0 its weight is completely concentrated at the model x0 which minimizes the cost. 2.4.4. Simulated annealing Since the models of interest to us are precisely those that have small cost, the low temperature limit is of great relevance. However, according to Eq. (11), low temperature also has the effect of trapping the random walk in local, as opposed to global, minima: escaping a local minimum becomes exponentially unlikely. The SA approach [15] provides a method for increasing the probability that the random walk settles down to the desired global minimum by slowly decreasing the temperature from larger values. In this way, the random walk is given time to explore as much of the model space as possible at the same time that it attempts to locate the minimum. In the approach, a prescribed number of model realizations are sampled from Pðxjd; TÞ at temperature T i , and this procedure is iterated over a sequence of monotonically decreasing temperatures T i ! 0. It can be shown [16] that x settles to the global minimum with probability one for a sufficiently slow cooling schedule. This is usually not feasible in practice, and accelerated schedules are commonly used instead, with corresponding less certainty that one has actually discovered the true minimum. A block diagram for the SA optimization procedure with additional details is shown in Fig. 7.
Fig. 7. Flow chart of the inversion algorithm using the Markov chain Monte Carlo method with simulated annealing. In the inner loop the model space is explored using a given temperature T i . Candidate model moves A; B; C; . . . are selected with, respectively, probability tA ; tB ; tC ; . . . (which add to unity), according to a random number u1 uniformly distributed on ½0; 1. The actual acceptance of the move is controlled by a second random number u2 in the same interval. The outer loop increments the temperature downwards after every M circuits through the inner loop, until it reaches the final temperature T N .
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
607
2.5. Model moves In order to explore the model space, it is necessary to introduce model moves for each of the components in the model vector x. In Section 2.3, we defined a graph representation for various aspects of x. In addition, a cuboid representation was used to define dielectric properties of objects within the model. These representations inform the specific algorithms required for each type of model move, including incorporation of constraints, available degrees of freedom, and the nature of the move (perturbation, translation, birth, death, etc.). In the simplified inverse problem presented here, since a number of parameters, such as the number of grid lines and the wall element electromagnetic parameters, are fixed in the experiment, we consider only a small subset of all possible model moves. Thus, the only degrees of freedom we will consider in the inversion are the class labels C (assigned to the graph edges) and the grid geometry G (defining edge locations). The values of the remaining variables were assigned and held fixed using the prior knowledge available from the test experiment. Two different move types, A and B, are now described which change each of these degrees of freedom. 2.5.1. Move type A: change class type of edge Edges are defined by their locations within the graph, their class type, and a model parameter set associated with each of the class types. Move type A is introduced to sample the model space for the class type attribute, and is attempted with probability tA . An edge ei , with currently assigned class type ci , is selected with uniform probability 1=N e , where N e is the total number of edges in the graph. A proposed new class type c0i aci is selected according with uniform probability 1=ðN c 1Þ, where N c is the number of class types. Since we have chosen uniform probabilities, the reverse move probability in this case is identical to the forward move probability. The transition probabilities tA (13) QA ðxq jxp Þ ¼ QA ðxp jxq Þ ¼ N e ðN c 1Þ are also equal, and therefore cancel out of Eq. (11). The latter therefore reduces to aA ðxq jxp Þ ¼ minf1; e½Ctot ðd;x
p
ÞCtot ðd;xq Þ=T
g,
(14)
a function only of the cost difference between the current and proposed models. Note that, since ‘no wall’ is a class type, move type A can add and delete wall elements. 2.5.2. Move type B: translate grid line Move type B is used to sample hypothesized wall positions within the model space. In our formulation, individual edges may not be translated. Instead, edge (or wall) repositioning is accomplished by translating a selected grid line, which translates all edge elements attached to the line. The translation move, defined in this way, helps enforce or preserve the prior constraint of structural rectilinearity. We select a grid line with uniform probability 1=N g , where N g is the total number of lines (in all directions). For a given grid line, there are many ways of choosing a translation distance (orthogonal to the line direction). One possibility is to simply select a translation distance d from an interval ½a=2; a=2, where the length a, fixed throughout the search procedure, is somewhat arbitrary, but depends on the spatial scale of the scene under
ARTICLE IN PRESS 608
E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
study, and on the density of grid lines within that scene. The transition probabilities tB QB ðxq jxp Þ ¼ QB ðxp jxq Þ ¼ (15) aN g are then again symmetric, and the acceptance probability (11) again reduces to Eq. (14). Prior constraints on the line positions are contained Cprior , which is set to infinity if the translated grid line steps outside of the specified model domain, or if it lies within a threshold value l 0 of another (parallel) grid line, chosen to prevent ‘collision’ or interpenetration of walls. The acceptance ratio vanishes in these cases, and the move is automatically rejected. Another possibility is to allow the translation interval to depend on the present grid line conformation. For example, if grid line i is selected, one imposes minimum and maximum distances l i , Li between neighboring lines, and do not allow them to jump across each other, then one could choose the new position g0i with uniform probability on the interval g0min ðgi1 ; giþ1 Þog0i og0max ðgi1 ; giþ1 Þ, g0min maxfgi1 þ l i ; giþ1 Liþ g, g0max minfgiþ1 l iþ ; gi1 þ Li g,
(16)
in which gi1 are the positions of the neighboring grid lines. In the case of the first or last grid line along one of the dimensions, gi1 ! Gmin , giþ1 ! G max , respectively, where Gmin ; G max are the bounds of the model domain. The transition probabilities tB (17) QB ðxq jxp Þ ¼ QB ðxp jxq Þ ¼ N g ðg0max g0min Þ are symmetric and the acceptance probability once again reduces to Eq. (14). 3. Experimental demonstration To demonstrate the inversion methodology from Section 2, we conducted a data collection experiment using a known multi-walled structure. The parameters of the experiment are described in Section 3.1. The model moves used for optimization are specialized from those in Section 2.5 and are described in Section 3.2. The inversion results are presented in Section 3.3. 3.1. Experiment description The measurement data used in this study were collected for the structure shown in Fig. 8 using an Agilent vector network analyzer and an ETS-Lindgren quad-ridged horn antenna (Model 3164-04). The acquisition was performed in the S11 , or monostatic, measurement mode for a series of frequencies in the range of 700 MHz to 3.1 GHz, with a spacing of 3 MHz. The data were collected at a constant x-offset of 2 m from the structure (corresponding to the left wall in Figs. 8(a) and 10(a,b), and the front wall in Fig. 8(b)), and at a constant height of 1.22 m from the floor. The horn was positioned in the y direction in a linear sequence of 81 positions with 0.0222 m separation, providing a total horizontal aperture of approximately 1.8 m. In our forward simulations, the horn radiation pattern was approximated as constant in horizontal cross-range and in frequency.
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
609
Fig. 8. (a) Photograph of the experimental test structure consisting of three 6 ft 8 ft plywood walls (each constructed from two sheets of 34 in plywood) arranged in parallel fashion, and connected with two additional walls. (b) Plan view depiction of the test structure.
Although the ETS-Lindgren horn has a directional radiation pattern, the front lobe is fairly broad, and does not vary strongly over the width of the structure. In addition, the phase center for the horn varies by several centimeters over the band of interest, but this effect was ignored in the forward model. The stepped-frequency data were transformed to the time domain, where the inversion was performed. Pulse synthesis was accomplished by applying the Gaussian window function (5). Sidelobe control and the desired pulse width specification were achieved using 1 s ¼ 16 ns, t0 ¼ 3s, and N f ¼ 3083 frequencies. These choices yielded a time resolution Dt ¼ 1=ðNDf Þ ¼ 0:1081 ns. 3.2. Detailed model moves for the experiment The inversion was formulated to estimate the class type and location of each edge in the structured graph representation. We consider a reduced-dimension wall type set so that the only allowed class types for each edge are either no wall (n) or a homogeneous wall (h)—a simple binary variable. We assume that all instances of a homogeneous wall in the model have the same defining parameter set, which is assumed to be known a priori, with components determined experimentally by other means: we use wall thickness w ¼ 3:81 cm (two layers of 34 in plywood), wall height h ¼ 2:44 m (8 ft), relative permittivity r ¼ 2:2, and conductivity s ¼ 0:074 S=m. The floor model was independently estimated, and by design, there were no clutter objects, doors, or windows in the scene. All sensor parameters were also either known or separately estimated. Hence, the parameter vector xb was not operative in the inversion, and xs was taken as known and fixed. Throughout the inversion, the cardinality of the edge set and the vertex set remains constant. Thus we fix the number of grid lines along x to be 3, and along y to be 4. With
ARTICLE IN PRESS 610
E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
the origin of coordinates at the center of the sensor array (a line of length 1.776 m along y), the values of gy;1 ¼ G y;min ¼ 0:9144 m and gy;4 ¼ G y;max ¼ 0:9144 m, defining the extent of the three parallel walls oriented along x, were taken to be fixed. The model bounds along x were taken as G x;min ¼ 1:55 m, G x;min ¼ 6:05 m. We are left with five degrees of freedom fgx;1 ; gx;2 ; gx;3 ; gy;2 ; gy;3 g, the minimal number required to describe the experimental five wall system. In addition, for the edges defined by this 3 4 grid, we introduced the constraint that edges on the minimum and maximum values in the y direction are taken as open. This means that the class label associated with these edges is restricted to be ‘n’. There are then 13 remaining edges with class labels to be determined by the inversion. We have therefore simplified the problem to 5 continuous degrees of freedom, and 13 discrete (binary) degrees of freedom. Clearly, move type A (class labeling) has 13 model selection opportunities at each iteration of the search, and move type B (grid line translation) for which we use Eqs. (16) and (17) with parameters listed in Table 2, has N g ¼ 5 model selection opportunities. 5 Accordingly, we have set tA 13 18 and tB 18. Finally, we did not use any further model prior, as we have already enforced all prior beliefs in the specialized problem formulation. 3.3. Inversion results In our inversion example, the SA temperature cooling schedule is taken as T i ¼ T 0 ai with i ¼ 0; 1; . . . ; N 1, where T 0 is the initial temperature and a is the cooling rate. Specifically, we set T 0 ¼ Cðd; xð0Þ Þ=50 (where xð0Þ is the initial model), a ¼ 0:5, and N ¼ 5. The number of model moves at each temperature step was taken to be M ¼ 200. The choices for these parameters were partly driven by the limitation of computational resources. With the above choices, 1000 model moves were generated under the SA algorithm, and of these, 378 were accepted. The iteration converged after 750 steps, as shown in Fig. 9, where the sparser density of circles for iteration index greater than 750 indicates a higher rejection rate. Occasional increases in cost function values demonstrate a characteristic feature of SA algorithms, which is the ability to climb out of local minima in the search for the global minimum. Fig. 10 shows the wall locations before (Fig. 10(a)) and after (Fig. 10(b)) the inversion in dark gray, in comparison with the ground truth in light gray. The sensor lies to the left. Notice that in Fig. 10(b), the estimated model shows a spurious side wall between the second and third walls that face the sensors. This demonstrates that the side walls, especially those further to the right which are separated from the sensors by several intervening walls, are not easy to invert correctly. Similarly, one wall panel on the third Table 2 Parameters (in units of meters) constraining grid line motion used in Eqs. (16) and (17)
True gi l i Li l iþ Liþ
gx1
gx2
gx3
gy2
gy3
2.3881 0.45 1.05 0.8 1.4
3.4 0.8 1.4 0.5 1.1
4.2 0.5 1.1 0.95 2.75
0.3048 0.5 0.7 0.5 0.7
0.3048 0.5 0.7 0.4288 0.8288
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
611
Cost Function 3.5 Cost Function
3 2.5 2 1.5 1 0.5 0
0
100
200
300
400
500
600
700
800
900
1000
Number of Iterations
ors Sens
Sens
ors
Fig. 9. Evolution of the cost function C as a function of global iteration number. The initial temperature was T 0 ’ 2:3=50 ¼ 0:046.
Fig. 10. Wall models for the demonstration of inversion: Light gray—true walls. Dark gray—estimated walls. (a) Initial estimation versus the true wall model. (b) Wall model after inversion versus the true model. The cost function difference between the inverted and true models was DC ’ 0:015, comparable to the temperature between the second and third cooling steps.
wall, furthest from the sensor, is missing due to the weak response after signal attenuations by the first and second walls. The cost of the true model (C ’ 0:445) is lower than that of the inverted model (C ’ 0:459), so the true model in principle should emerge from a sufficiently long simulation. However, the cost difference DC ’ 0:015 is very small on the scale of the cost variations seen in Fig. 9, even at later times. Therefore it is not too surprising, given the noise levels in the data, that full convergence to the true model would be not achieved without much longer runs. Fig. 11 compares the experimental time domain back-projection (TDBP) (Fig. 11(a)) to the predicted images for the initial model of the inversion (Fig. 11(b)), the model at the final iteration of the inversion (Fig. 11(c)), and the measurement (Fig. 11(d)). As previously noted in the caption to Fig. 5, there are vertical strips appearing in the TDBP images for the simulation, which are model artifacts due to deletion of rays striking the wall edges. Numerical experiments show that these strips do not significantly affect the inversion of walls that are aligned parallel to the linear experimental aperture, but do decrease the
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
612
0
0 30
30
-5
-5
-10
-10 25 -15
20
-20
Delay Time [ns]
Delay Time [ns]
25
-15
20
-20
-25
-25 15
15
-30
-30
10
0
20 40 60 Sensor Position
80
10
-35
0
20 40 60 Sensor Position
80
0
0 30
30
-5
-5
-10
-10 25 -15
20
-20
Delay Time [ns]
Delay Time [ns]
25
-15
20
-20
-25
-25 15
15
-30
-30
10
-35
0
20 40 60 Sensor Position
80
-35
10
0
20 40 60 Sensor Position
80
-35
Fig. 11. Time domain back-projection images. (a) Obtained from true model experimental data. (b) Obtained from initial model, Fig. 10(a). (c) Obtained from inverted model, Fig. 10(b). (d) Obtained from true model predicted data. The lower two images are very similar, with the true model yielding only a slightly lower cost.
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
613
accuracy of side-wall inversions. Regardless of the ray-dropping issue, side-wall inversion can significantly benefit from extension of the data collection aperture. Fig. 12 shows plots of 1D slices of the TDBP images at sensor location 40 comparing the experimental data to the predicted data before inversion, after inversion, and for the true model. In order to test robustness of the inversion scheme, the pulses created by the initial model (Fig. 12(a)) were intentionally set up to be very different in both shape and location, from those in the experimental data. As shown in Fig. 12(b), the pulse positions for the inverted model are well aligned with those in the data; however, the amplitude of the second and third pulses do not match. Almost identical amplitude mismatch is seen in Fig. 12(c) for the prediction based on the true model. Possible sources of these mismatches include: (1) Inhomogeneities and layering in the plywood are not captured by our homogeneous wall forward model; (2) there are other scattering sources in the data, such as the ground and other objects, that are not included in the model, and may, for example, explain some of the unmodeled secondary peaks; and (3) the full frequency dependence of
0.03
0.03
Data Initial model
Data Inverted model
0.025
0.02
Amplitude [a.u.]
Amplitude [a.u.]
0.025
0.015 0.01 0.005 0
0.02 0.015 0.01 0.005 0
0
10
20 30 40 Delay Time [ns]
50
0
10
20 30 Delay Time [ns]
40
50
0.03 Data True model
Amplitude [a.u.]
0.025 0.02 0.015 0.01 0.005 0 0
10
20 30 Delay Time [ns]
40
50
Fig. 12. One-dimensional data slices from Fig. 11 at sensor position 40 comparing measurement data (solid line) to (a) predicted data for initial model, (b) predicted data for inverted model, and (c) predicted data for true model.
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
614
the transmitter radiation pattern and receiver sensitivity pattern are unknown in the experiment, and the simplifying assumptions made for them in the forward model are distorting the synthesized time-domain signal. Model samples generated from the SA procedure eventually converge to the target Boltzmann distribution, and would converge to the posterior distribution in the absence of a cooling schedule (i.e., if all temperatures T i are set to unity). We are interested in the ensemble of models with an acceptable data fit, not just the single optimal model [17]. We used the finite sample set as described below to compute spatially dependent occupation probabilities for walls in the solution space. This provides an effective means of displaying the model ensemble, and since we are using SA, the samples will be concentrated on the mode(s) of the posterior distribution. The inversion result for wall locations is presented statistically in Fig. 13. Note that the views here have been rotated by 90 relative to those in Fig. 10, so that the sensors now lie below the structure. These figures can be interpreted as spatial occupation probability maps for walls in the indicated 2D space. The probabilities are computed using the model samples generated during the course of the inversion and are displayed for successive
1
0.9 450
450
0.8
0.8
0.7
400
0.9
400
0.7
350
0.5 0.4
300
y (cm)
y (cm)
0.6
0.6
350
0.5 300
0.4
0.3 250
0.3 250
0.2 0.1
200
0 -150
-100
-50
0 x (cm)
50
0.2 0.1
200 -150
100
-100
-50
0 x (cm)
50
100
1
1
0.9
450
450
0.9
0.8
0.8 400
0.7 0.6
350
0.5 0.4
300
y (cm)
y (cm)
400
0.7 0.6
350
0.5 300
0.4
0.3 250
0.2 0.1
200
0.3 250
0.2 0.1
200
0 -150
-100
-50
0 x (cm)
50
100
0
-150
-100
-50
0 x (cm)
50
100
0
Fig. 13. Occupation probability map of wall existence/location for (a) iterations 1–200, (b) iterations 201–400, (c) iterations 401–600, and (d) iterations 601–1000. The corresponding ground truth plot is shown in Fig. 14.
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
615
1 0.9
450
0.8
y (cm)
400
0.7 0.6
350
0.5 0.4
300
0.3 250
0.2 0.1
200
0 -150
-100
-50
0 x (cm)
50
100
Fig. 14. Occupation probability map corresponding to the ground truth.
iteration intervals to indicate the evolution of the probability space. The probabilities for iterations 1–200, 201–400, 401–600, and 601–1000 are shown in Figs. 13(a–d), respectively. These results should be compared to Fig. 14, which shows the corresponding ‘ground truth’ plot. Several conclusions follow from these plots. The first (bottom) wall that faces the sensors is the easiest to invert. After 200 iterations, its position converges with a small fraction of a centimeter of the ground truth, with essentially 100% probability. The second wall facing the sensors converges to a position within about 1 cm of the ground truth and with 470% probability after 400 iterations. As expected, convergence for the third wall is slow. It takes more than 600 iterations to converge to a location within about 2 cm of the ground truth with a probability of 50%. The side walls are less sensitive to the data collected for the given aperture; therefore, they converge slowly and less accurately. For example, the first side wall between the first and second front walls converges to within about 3 cm of the ground truth, with a probability higher than 40% only after 600 iterations. The side walls between the second and third front walls are the least accurate— either misplaced with a probability of about 40% (right side wall) or showing a large, 5 cm offset from the ground truth with a probability of 40%. 4. Conclusions We have developed a microwave tomography method for building structure estimation using a Bayesian formulation. The Metropolis-Hastings sampling procedure was used to generate model realizations from the posterior distribution, and specialized model moves (along with associated proposal densities) were designed for this purpose. The samples generated by the search provide the essential input for estimators of various properties of the model space. For more efficient model optimization, we developed a simulated annealing method for preferential sampling in the vicinity of the modes of the Boltzmann distribution. We demonstrated our optimization method through inversion of an
ARTICLE IN PRESS 616
E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
experimental data set, collected for a proxy building consisting of a multi-wall structure with five partitions. Measurements were acquired in the frequency domain with a vector network analyzer and transformed to the time domain, where the inversion was performed. The demonstration experiment showed that the accuracy and model confidence for wall position estimates depends on their relative location and orientation with respect to the sensors. In general, walls that were oriented parallel to the experimental aperture (in this case an effective horizontal line array) and that were not separated by intervening walls could be estimated with the highest accuracy. Since it is more difficult to obtain position estimates for walls that are oriented orthogonal to the experimental aperture, the confidence of their existence is much lower as well. These results suggest the utility of a spatially diverse experimental aperture. In this study, the wall class types, thicknesses, and dielectric constants were all assumed to be known. In the future, these should be inverted for as well. Although the number of grid lines along which wall segments were permitted to lie was fixed, the number of wall elements along each line was permitted to vary. Thus, we have demonstrated a solution to a (relatively simple) problem with unknown model order. In future applications, the number of grid lines will be allowed to vary as well. In summary, we have demonstrated here the utility of the graph-based method, which allows for both continuous moves, such as wall location adjustment, and discontinuous moves, such as adding or deleting an entire wall, for a very simple set-up. Inversion of larger and more complex structures will be the subject of future work. Acknowledgments This work was supported by DARPA Strategic Technology Office (STO), under Contract No. HR0011-06-C-0110. We would like to thank Dr. Fauzia Ahmad of Villanova University for constructing the multi-wall structure used in this study and for collecting the microwave sounding data used for the inversion. We would like to thank Professor Moeness Amin for organizing this special issue. References [1] F. Ahmad, M.G. Amin, G. Mandapati, Autofocusing of through-the-wall radar imagery under unknown wall characteristics, IEEE Trans. Image Process. 16 (7) (2007) 1785–1795. [2] H.L. Bertoni, Radio Propagation for Modern Wireless Systems, Prentice-Hall, New York, NY, 1999. [3] M. Born, E. Wolf, Principles of Optics, Pergamon, Oxford, 1970. [4] K.E. Andersen, S.P. Brooks, M.B. Hansen, Bayesian Inversion of Geoelectrical Resistivity Data, J. R. Stat. Soc. Ser. B 65 (2003) 619–642. [5] C. Andrieu, N. de Freitas, A. Doucet, M.I. Jordan, An introduction to MCMC for machine learning, Mach. Learn. 50 (2003) 5–43. [6] A. Gelman, J.B. Carlin, H.S. Stern, D.B. Rubin, Bayesian Data Analysis, second ed., Chapman & Hall, CRC, London, 2004. [7] M. Sambridge, K. Gallagher, A. Jackson, P. Rickwood, Trans-dimensional inverse problems, model comparison and the evidence, Geophys. J. Int. 167 (2006) 528–542. [8] S.P. Brooks, N. Friel, R. King, Classical model selection via simulated annealing, J. R. Stat. Soc. Ser. B 65 (2003) 503–520. [9] P.J. Green, Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika (1995) 711–732. [10] P.J. Green, Trans-dimensional Markov chain Monte Carlo, in: P.J. Green, N.L. Hjort, S. Richardson (Eds.), Highly Structured Stochastic Systems, Oxford Statistical Science Series, 2003, pp. 179–198 (Chapter 6).
ARTICLE IN PRESS E.M. Lavely et al. / Journal of the Franklin Institute 345 (2008) 592–617
617
[11] P.B. Weichman, Ultra-fast forward modeling of microwave propagation through multipathing media for efficient solution of tomographic inverse problems (invited presentation), URSI 2007 North American Radio Science Meeting, July 22nd–26th, 2007. [12] Y. Zhang, E.M. Lavely, P.B Weichman, E. Hill III, Y. Lai, The VisiBuilding inverse problem (closed session presentation), ASAP 2007, June 5th–6th, 2007. [13] C.F. Mecklenbra¨uker, P. Gerstoft, Objective functions for ocean acoustic inversion derived by likelihood methods, J. Comput. Acoust. 8 (2) (2000) 259–270. [14] E.M. Lavely, Y. Zhang, E. Hill III, Y. Lai, P. Weichman, The building tomography inverse problem (invited presentation), URSI 2008 National Radio Science Meeting, January 3rd–6th, 2008. [15] S. Kirkpatrick, C.D. Gelatt Jr., M.P. Vecchi, Optimization by simulated annealing, Science 220 (4598) (1983) 671–680. [16] S. Geman, D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Mach. Intell. 6 (1984) 721–741. [17] M. Sambridge, Geophysical inversion with a neighbourhood algorithm—II. Appraising the ensemble, Geophys. J. Int. 138 (1999) 727–746.