Computers & Geosciences 105 (2017) 81–90
Contents lists available at ScienceDirect
Computers & Geosciences journal homepage: www.elsevier.com/locate/cageo
Research paper
Large Crater Clustering tool
MARK
⁎
Jason Laura , James A. Skinner Jr., Marc A. Hunter U.S. Geological Survey, Astrogeology Science Center, 2255 N. Gemini Dr., Flagstaff, AZ 86001, United States
A R T I C L E I N F O
A BS T RAC T
Keywords: Planetary science GIS Clustering DBScan Impact craters ArcMap Spatial analysis Astrogeology Computer science
In this paper we present the Large Crater Clustering (LCC) tool set, an ArcGIS plugin that supports the quantitative approximation of a primary impact location from user-identified locations of possible secondary impact craters or the long-axes of clustered secondary craters. The identification of primary impact craters directly supports planetary geologic mapping and topical science studies where the chronostratigraphic age of some geologic units may be known, but more distant features have questionable geologic ages. Previous works (e.g., McEwen et al., 2005; Dundas and McEwen, 2007) have shown that the source of secondary impact craters can be estimated from secondary impact craters. This work adapts those methods into a statistically robust tool set. We describe the four individual tools within the LCC tool set to support: (1) processing individually digitized point observations (craters), (2) estimating the directional distribution of a clustered set of craters, back projecting the potential flight paths (crater clusters or linearly approximated catenae or lineaments), (3) intersecting projected paths, and (4) intersecting back-projected trajectories to approximate the local of potential source primary craters. We present two case studies using secondary impact features mapped in two regions of Mars. We demonstrate that the tool is able to quantitatively identify primary impacts and supports the improved qualitative interpretation of potential secondary crater flight trajectories.
1. Introduction Planetary mapping and topical science studies rely upon the temporal partitioning of discrete geologic units in order to understand the processes that drove the evolution of the surface. Impact crater size-frequency distributions and cross-cutting relationships jointly provide the data necessary to perform relative surface dating (e.g., Tanaka, 1986; Barlow, 1988; Platz et al., 2013; Tanaka et al., 2014). However, the interpretation of both relative and model absolute craterbased ages is not trivial and often yields conflicting temporal assignments for equivalent geologic units. The accurate quantitative association of primary impact craters (those created by collision of an exogenic impactor with a planetary surface) and secondary impact craters (those created by collision of material ejected from a primary impact on the same body) establishes a compelling ability to not only assess distal stratigraphic associations but also provides a means to explore impact dynamics as related to target material, gravitational fields, and atmospheric effects. There is expansive literature that explores and summarizes the physics of crater formation, including the excavation and ejection of material from a primary impact and the dynamical circumstances wherein a secondary crater will form. Secondary impact craters have been shown to have maximum diameters that are <5% of the parent
⁎
primary diameter (Shoemaker, 1965; Bierhaus et al., 2001), regardless of target body. The secondary crater diameter has been shown to decrease with distance from the primary crater (e.g. McEwen et al., 2005). Closer to the primary crater (<10 crater radii), secondary impact craters tend to be circular to elliptical in shape, have depth/diameter ratios lower than of primary impacts, and commonly display a distinctive chevron, linear, loop, or cluster pattern (Wilhelms et al., 1978, Wilhelms et al., 1987). The computational work described herein relies on the user to identify (i.e., map) potential secondary craters (as points) or crater clusters (as lines). The nuances of such mapping effort are not the focus on this paper but rather the input resulting from such a mapping effort. See Melosh (1989) or McEwen and Bierhaus (2006) for more complete information regarding the cratering process and the characteristics and distributions of secondary craters. Secondary-to-primary impact relationships have been analyzed for the Moon, Mars, and some Jovian satellites. Wilhelms (1976) leveraged the visual association of both secondary craters and related landforms to larger impact basins on the Moon to help establish a lunar timestratigraphic scheme. Similarly, Dundas and McEwen (2007) demonstrate that secondary impact craters can be visually mapped back to the source impact using the Lunar crater Tycho. Likewise, Preblich et al. (2007) and McEwen et al. (2005) both report similar findings for the Zunil crater on Mars with McEwen et al. (2005) suggesting that the
Corresponding author. E-mail address:
[email protected] (J. Laura).
http://dx.doi.org/10.1016/j.cageo.2017.04.011 Received 14 September 2016; Received in revised form 24 April 2017; Accepted 26 April 2017 Available online 03 May 2017 0098-3004/ Published by Elsevier Ltd.
Computers & Geosciences 105 (2017) 81–90
J. Laura et al.
majority of small-diameter craters (10–200 m) and crater clusters on Mars could be the product of secondary impacts. This assertion is supported by Bierhaus et al. (2005) who suggest that, for Europa, the majority of small craters (95% of all craters with diameters ≤ a few kilometers (km)) are the product of secondary impacts. The differentiation of primary and secondary impact craters significantly affects the use of crater size-frequency distributions as a surface dating mechanism (Shoemaker et al., 1963), particularly at smaller crater diameters. Finally, Bierhaus et al. (2001, 2005) suggest that almost all secondary impacts on the surface of Europa are attributable to a limited number of large primaries. Focusing more on secondary impact cratering models, Popova et al. (2003, 2007) identify two classes of Martian crater clusters: (1) small clusters with many craters with diameter less than approximately 10 m and spatial extents measurable in the hundreds of meters and (2) large clusters with many craters with diameters of approximately 100 m and spatial extents measurable in kilometers (km). The process for secondary impact creation for the former is attributed to the breakdown of the meteor in the atmosphere and the latter is the product of large, secondary ejections post impact. Both small and large crater clusters can be visually back-projected to potential primary impacts to support the determination of stratigraphic age (Wilhelms et al., 1978). However, visual association between alignments of secondary impact crater chains (and associated landforms) and potential source (primary) impacts alone is not always sufficient for confident spatiotemporal linkage, particularly for those primary impacts that are smaller than basin scales. As such, a much more quantitative method is needed to explore the source-secondary impact relationships. We leverage the secondary-to-primary relationship and inherent clustering in many secondary crater fields on Mars to demonstrate that the visual back-projection of secondary impact craters can be improved through computational modeling and statistical validation. The resultant secondary-to-primary relationship can then serve as another constraint to extend the stratigraphic record, i.e surface age, from well determined geologic features such as lunar basins and large Martian impact craters. Computational modeling of secondary-to-primary impacts provides a critical data product for use in a holistic approach to surface dating with the goal of improving our understanding of planetary geologic processes. This work offers the capability to do four things. First, statistically driven secondary clustering supports pattern identification that might otherwise be lost in the complexity (noise) of temporally layered primary and secondary impacts. Second, the computational identification of the secondary-to-primary relationship supports repeatability and testing of past interpretations and inferences about temporal sequencing of geologic processes upon which these impactrelated landforms reside. Next, the ability to automate the search for secondary-to-primary relationships supports future geologic mapping that can depend heavily on chronostratigraphic relationships. Finally, the ability to map secondary-to-primary craters can support the informal identification of a more granular temporal stratigraphic scheme. Collectively, we provide a computational tool with wide science applicability to the planetary geologic community. The remainder of this work is organized as follows. First, in Section 2 we describe the computational model, including input data sets and the assumptions the model makes. In Section 3 we describe the computational methods used for secondary-to-primary back projections and in Section 4 we apply our methods to the Mare Acidalium and Lunae Palus quadrangles on Mars. Section 6 describes sources of uncertainty in the model. Finally, we conclude with Section 7 and offer areas of potential future work.
Fig. 1. Overview of the surface processes the LCC toolset seeks to model where secondary impact craters are back-projected along inferred ejecta trajectories to identify potential primary impact craters.
tively relate mapped secondary craters (as points) or crater chains (as lines) to distant primary (source) craters. The presented plugin builds upon previous work by Nava et al. (2009), Nava and Skinner (2010), Skinner and Nava (2011) who prototyped the underlying concepts for a single crater (Zunil) on the Martian surface in ArcGIS 9.x. We call this model the Large Crater Clustering (LCC) tool set. Fig. 1 broadly illustrates the process of primary and secondary impact emplacement. The LCC tool set identifies clusters of secondary impact craters and back projects (along the inferred ejecta trajectory) secondaries seeking to locate primary impact craters while accounting for ejection and orbital properties, of the given planetary body. 2.1. Input data sets The identification of primary impact craters using the developed back-projection method requires the digitization, from remotely sensed data, of secondary impact craters. For the purposes of this tool, we classify secondary impacts into three groups: traditional clusters, crater chains (catenae) and lineated terrains (lineaments). Fig. 2 illustrates the supported data types. Each input data type (point or line) defines an entry point into the secondary-to-primary modeling process. Broadly, our model functions by extending great circle arcs, that measure orthodromic (spherical) distance, from either the semi-major axis of a bounding ellipse defined by spatial clustering of digitized points or defined linear features. This makes the assumption that clusters have been well defined, either computationally, or manually by the user. The length of these arcs is defined by a user and is a function of post impact ejection speed and flight time. Therefore, digitized points are first statistically clustered using traditional point pattern clustering techniques so that a bounding ellipse can be computed. For line and polygon geometries, great circle arcs can be extended immediately from the ending or semi-major axis points. 2.2. Model assumptions The two primary assumptions made follow Bierhaus et al. (2001). First, secondary craters do not follow the Complete Spatial Randomness (CSR) assumption and instead occur in some identifiable cluster. Second, we assume that large crater clusters are the product of post-impact spallation and ejecta. By extension, the impacting object is also large, resulting in a primary of significant size. In making these assumptions, we also leverage the behavior models developed by Popova et al. (2003, 2007) that are supported by the numerical modeling of Vickery (1986, 1987). That is, large crater clusters are defined by secondaries created by post-impact spallation and ejection of fragments on the order of 5–50 m. Flight times are assumed to generally last between 1000 and 2000 s with the potential for longer 5000 s flights for the largest impacts (Popova et al., 2007). In the latter case, total travel distance, assuming velocities of 3 km/s, and flight times up to 5000 s could be as far as 15,000 km (Popova et al., 2007) with a maximum width dispersion on the order of 25 km (given the radius, gravity, and rotation speed of Mars).
2. Computational model The primary contribution of this work is an ArcMap 10.x plugin, written in Visual Basic (VB) dotNET (.NET) that attempts to quantita82
Computers & Geosciences 105 (2017) 81–90
J. Laura et al.
Fig. 2. Crater clusters (a), catenae (b), and lineaments (c) on the Martian surface. The crater clusters (a) can be digitized as individual points or a visually approximated best fit line, the catenae (b) can be digitized using individual points, and the lineaments (c) can be digitized using a line. The LCC tools support representation of secondary impacts using all three geometries.
3. Methods
B = k1 (1 −
The number of processing steps required to perform secondary-toprimary back projection is determined by the input data types. Fig. 2 illustrates the different geometric representations secondary crater identification can take. The broad process requires that a series of individual points, a best fit line drawn through a visually identified cluster of craters or catenae, or polyline representation of a lineament have been identified. Once identified, the geometry can be projected across the planetary surface to a user defined distance. The projected lines are potential, back-projected, flight lines and the intersection of multiple flight lines can be indicative of a primary impact. The secondary-to-primary estimation process is implemented as four separate tools within the LCC tool set: clustering to identify discrete secondary crater clusters (visually or computationally), great-circle back projection, great-circle intersection, and clustering to identify discrete primary impacts. Below, we describe the process in detail, beginning with a digitized point representation of secondary craters. We note that, had the user manually identified clusters or provided polyline inputs, computational point clustering would not be required.
cos(2σm ) = cos σ −
s = βA (σ − Δσ ),
k1 =
a2 − b 2 b2
(3.1)
(3.2)
1 + u2 − 1 1 + u2 + 1
(3.3)
1
A=
1 B [cos σ (−1 + 2 cos2 (2σm )) 4 (3.7) (3.8)
1 The level of error is, as is well known, a function of the total distance between observations. This level of error is reported for the distance between New York City and New Orleans, comparing Euclidean distance with Geodetic distance measures.
1 + 4 (k1)2 1 − k1
(3.6)
where a is the length of the semi-major axis of the ellipsoid (equatorial radius), b is the length of the semi-minor axis (radius at the pole), α is the azimuth at the equator, k1 is Helmert's expansion parameter, σ is the arc length between points on the sphere, cos(2σm ) is derived when computing λ with U1 and U2 being equal to arctan [(1 − flattening)tan θ ], the reduced latitude on the auxiliary sphere, and s is the ellipsoidal distance between points. Clearly, the computational cost for ellipsoidal distance is significantly more expensive than planar, euclidean distance. The use of planar distance measures has the potential to induce significant error (Banerjee, 2005) in the secondary-to-primary impact crater estimation for the following two reasons. First, the overall error in distance is anisotropic. Therefore, secondary-to-primary relationships estimated over long distances have the potential to incur significant error. Second, we utilize spatial autocorrelation as a key indicator that observations can be clustered and this assumption requires that a critical distance, beyond which the likelihood that a cluster exists is extremely low, be identified. Error in distance computation can invalidate this assumption. For example, Popova et al. (2007) suggests that maximum, suborbital, travel distance of ejected blocks can be on the order of 1500 km with a maximum width dispersion of 25 km. Banerjee (2005) reports error upwards of 14% when comparing Geodetic to Euclidean distance measures.1 Therefore, small errors in distance computation can manifest in significantly different point distributions within a cluster. Within the LCC tool set, computation of the distance matrix uses a brute-force (O (n2 )) implementation where the distance between each digitized secondary crater and all other observations is computed. Once computed, the top k distances are stored and written to disk as a table within an ESRI Geodatabase. A brute-force approach was used for implementation simplicity and we store the distance matrix to disk to reduce the need for recomputation. The LCC tool set implements two point clustering algorithms, a classic, unsupervised, hierarchical agglomerative clustering approach
where x and y are Cartesian coordinates, and d is the distance between said coordinates. Geodetic distance is computed utilizing Vincenty's inverse method (Vincenty, 1975) and assumes that the planet is an oblate spheroid. We provide a truncated formulation, computed once λ, the longitudinal distance between two points, has converged. For a full formulation of the iterative method to compute λ see Vincenty (1975).
u2 = cos2 α
2 sin U1])sin U2 cos2 σ
1 − B cos(2σm )(−3 + 4sin2 σ )(−3 + 4 cos2 (2σm ))]} 6
This section assumes that the user has digitized individual craters and wishes to utilize quantitative clustering. Prior to the quantitative identification of crater clusters, pairwise distances between each crater (a digitized point) must be computed. Broadly, methods for computing spatial data can be defined as geodetic or planar. Both classes of methods provide different levels of accuracy at different computational costs. The LCC tool supports both Euclidean and geodetic distance computation. Two-dimensional Euclidean is computed utilizing the Pythagorean theorem as:
(x2 − x1)2 + ( y2 − y1)2 ,
(3.5)
Δσ = B sin σ {cos(2σm ) +
3.1. Observation clustering tool
d=
3 (k1)2 ) 8
(3.4) 83
Computers & Geosciences 105 (2017) 81–90
J. Laura et al.
a small spatial distance. Finally, DBScan runs in O (n log n ) time, a significant speedup over hierarchical clustering. We formulate DBScan as:
(Han et al., 2001) and a density based DBScan (Ester et al., 1996). Agglomerative clustering operates by selecting unvisited points, applying some distance measure to identify all neighbors and aggregating or combining all candidate points into a cluster. This is a single-linkage, irreversible Greedy approach (Murtagh and Contreras, 2012). It is known that this approach can result in clusters with long semi-major and short semi-minor axes as the adjacent pairwise inter-point distances are small but the global end-to-end point distances can be quite large. The implementation of this method is intentional as the propensity to cluster in long chains closely models the secondaryimpact emplacement process. We do not allow the clustering process to continue until all clusters have been aggregated. Instead, we terminate the clustering process once all point observations are either within a cluster, or marked as noise because their nearest neighbor distance is greater than a user-defined threshold. Finally, this type of clustering has O (n3) run time, dependent upon the implementation method. The LCC tool set implements the hierarchical clustering as:
In Fig. 3 we illustrate a potential realization of this hierarchical clustering approach. At initialization, the number of clusters equals the number of observations. Point 1 is selected (a) and a new cluster formed containing all points within the user defined distance (ϵ) (b). The next point to be visited is Point 2 and all points within ϵ are selected (c). Finally, Point 3 is visited and the red and green clusters are merged. Ester et al. (1996) developed a density based clustering method, DBScan, in order to (1) reduce the local domain knowledge necessary to parameterize existing clustering methods, (2) locate clusters of varying shape, and (3) support robust computation of large data sets. DBScan requires that the user provide the minimum density at which clusters are formed as well as the maximum distance between points within a cluster. It is not necessary to define the number of expected clusters a priori as one must with other unsupervised clustering methods such as K-Means (MacQueen, 1967). Though testing shows that elongated clusters (low inverse flattening) are most indicative of clustered secondary impacts, the shape of these features can vary. Therefore, DBScan is more robust than hierarchical clustering methods in the identification of a range of cluster shapes and orientations. In practice the shape distinction provides a means to identify intertwined clusters and differentiate between two or more clusters that fall within
In Fig. 4 we illustrate one potential realization of the clustering process, where ϵ is the threshold distance beyond which observations are not clustered. In (a) a randomly selected point is checked to see if sufficient points fall within ϵ. Next, (b), a point n less than the threshold size is marked as noise. Finally, (c) a point previously marked as noise can be assigned to a cluster if a member of another point's neighborhood exceeds n. Due to the stochastic start and handling of border points, DBScan is not deterministic. 3.2. Directional distribution tool Once observations have been clustered the directional distribution of the geometry is computed. The directionality is determined by the ellipticity (flattening) of the cluster, defined by
f=
(a − b ) , a
(3.9)
where a is the semi-major axis of the ellipsoid bounding the cluster and 1 b is the semi-minor axis. LCC utilizes the inverse flattening ( f ). As described below in Section 3.4, inverse flattening is usable as a 1 confidence weighting metric. As f approaches infinity ( f = 0) flight
Fig. 3. Hierarchical clustering.
84
Computers & Geosciences 105 (2017) 81–90
J. Laura et al.
Fig. 4. One potential realization of the DBScan clustering process illustrating (a) a randomly selected point with an insufficiently dense neighborhood within ϵ, (b) an existing cluster centered on 3, and (c) the inclusion of the previously skipped point 1 into the cluster previously centered on 3.
the reduced latitude (on the auxiliary sphere2) computed as:
trajectory confidence decreases. Computation of the reference ellipsoid is a five phase process. First, each point is read into memory. The coordinates of each point are reprojected to geographic coordinates and then to a conformal Transverse Mercator projection. This transformation ensures that issues with the −180, 180 boundary are overcome. Next, the ellipsoid center point, semi-major axis length, semi-minor axis length, semimajor azimuth and rotation are computed and stored as geometry attributes. Third, the center point of the ellipsoid is set as the center of an azimuthal, equidistant projection to ensure that the extension of the axes occurs to the correct length. Fourth, the semi-major and semiminor axes are extended to the defined geodesic distance such that each input point is bounded. The length of extension is computed as the product of the given axis and a user defined number of standard deviations. This ensures that the ellipsoid bounds all points in the cluster. Finally, the semi-major and semi-minor axes are utilized to compute the inverse flattening.
U = arctan[(1 − f )tan ϕ]
where f is the inverse flattening, and ϕ is the latitude of the point. With the bearings of the two endpoints (σ1 and σ2) and known flight distances along the geodesic, back projection endpoints are computed. Finally, having computed flight azimuths and end points, a geodesic curve is computed and written to disk. In Fig. 5 we illustrate the potential shift in flight trajectory caused by planetary rotation, the Coriolis effect. The Coriolis effect is used to model the impact of a rotating body under ejected material and account for the apparent deflection that occurs as the reference sphere moves independently of the ejecta. This is implemented by fixing the ejection point in space and computing the offset in landing site based on a Newtonian motion equation with a Coriolis component. This interaction is implemented as:
t= 3.3. Trajectory estimation tool
(3.10)
⎛ ⎞ cosU1 sin λ σ2 = arctan ⎜ ⎟ ⎝ cos U1 sin U2 − sin U1 cos U2 cos λ ⎠
(3.11)
distance velocity
⎞ ⎛ 2π ω=⎜ ⎟ * sin(ϕ) ⎝ rotationperiod ⎠
The next processing step is trajectory estimation where computed bounding ellipses, user supplied polylines, or any combination of the two are utilized as seed points from which potential flight trajectories are extended. The flight trajectory of ejected blocks is estimated from the endpoints (P0 and P1) of a line (semi-major axis in the case of an ellipse) and bidirectionally extended along potential flight trajectories (e.g., a geodesic arc). Bi-directional back projection is utilized because secondary crater emplacements do not typically provide sufficient a priori information (e.g., deformation) to suggest a known impact direction. That is, the boundary shape of the individual secondary craters is generally too circular to have an identifiable impact direction. The LCC tools request user defined minimum cluster length and threshold inverse flattening. Both constraints allow control over the potential accuracy of the final result. As noted above, flattening and total length both serve as a guide for the potential for the group of secondary craters to point back to a primary with some higher level of confidence. The generation of each trajectory is a five phase process. First, we compute the geodesic bearing between P0 and P1 utilizing (as above, we omit the computation of λ Vincenty (1975)):
⎛ ⎞ cosU2 sin λ σ1 = arctan ⎜ ⎟ ⎝ cos U1 sin U2 − sin U1 cos U2 cos λ ⎠
(3.12)
(3.13)
(3.14)
where distance is the geodesic distance traveled, velocity is the user defined ejection velocity for secondary impact material, t is the travel time in seconds, rotationperiod is a user defined length of the planetary sidereal day, in seconds, ϕ is starting latitude and ω is the angular speed. In order to compute the longitudinal shift due to rotation, we compute:
λ1 = λ + (ω*t )
(3.15)
where, ϕ is the original longitude in radians, and λ1 is the shifted longitude in radians. The Coriolis effect is most pronounced approaching the poles and least impactful nearing the equator. 3.4. Primary impact approximation tool We have identified two methods for primary impact approximation utilizing the presented back projection method, a quantitative and a qualitative approach. Both approaches leverage the geodesic curves computed during trajectory estimation. The quantitative approach applies a classic line intersection algorithm, converting intersections to point observations, and then applies point pattern clustering, as above. The statistically significant clusters over some threshold size are then stored, along with associated metadata (mean or weighted
where λ is the refined longitudinal delta between the end points, U is
2
85
The spherical model associated with a given ellipsoid.
Computers & Geosciences 105 (2017) 81–90
J. Laura et al.
statistics to populate our basic parameters, frequently discarding the upper or lower quartile of the data (as appropriate to the method). Additionally, the tools report the full set of descriptive statistics to the user, allowing for modification to the parameters after optimization has been performed. In practice, we see that simple removal of what is likely noise in the data supports more accurate primary impact estimation. 4. Case study We test and demonstrate the LCC tool set capabilities using two quadrangles on Mars: Mare Acidalium and Lunae Palus. These regions, though adjacent, contain geologic units of varying age and strength. As a result, these regions contain slightly different secondary crater landforms, which provides an excellent test of the analytical capabilities and broad application of the LCC tool set. In both cases, we identify secondary impact craters and related landforms (described per region below) using the controlled Thermal Emission Imaging System (THEMIS) infrared (IR) basemap, which is 100 m per pixel. This basemap affords identification of individual features larger than approximately 300 m in diameter. Features were mapped into a dedicated GIS geodatabase at a scale of 1:500,000. For both case studies we utilize both the computational and visual primary impact approximation methods. The latter, visual identification, was performed by two external users: User A and User B.
Fig. 5. Shift in flight trajectory when accounting for planetary rotation, the Coriolis effect. Without account for planetary rotation (a) flight trajectories follow the geodesic. Account for rotation (b) the trajectory follows the geodesic, but is also shifted due to the rotation of the surface during flight.
4.1. Acidalia The Mare Acidalium quadrangle extends from 300° to 360° East longitude and 30–65° North latitude in the western hemisphere of Mars, Fig. 7. The region is interpreted to be largely comprised of 3.5 billion year old sedimentary plains (Tanaka et al., 2014) located between −4000 and −5200 m elevation bounded by higher-standing, slightly older volcanic and cratered plains located between −1000 and −4000 m elevation. Secondary impact-related landforms in the Mare Acidalium study area are predominantly large crater clusters. Therein, we identify 5956 secondary impact craters (mapped as points) that occur in variably elongated clusters. We also identify 151 individual crater clusters (mapped as lines). The approach of identifying both points and lines provides an opportunity to compare the LCC tool set functions with respect to input features. Fig. 8 illustrates the results of secondary-to-primary approximation for computationally identified clusters using point and line input data sets as well as user-identified primary impact craters using a value-byalpha mapping technique. The mapping between manually digitized point crater data set and linear approximation data set is not symmetric, as some of the digitized crater clusters are non-elliptical. We see rough agreement (3/4 cases) between the linearly digitized crater clusters and User B. The unmatched primary, at approximately 34° North and 353° East is located in an area of low trajectory overlap when compared to the center of the quadrangle. This illustrates the challenge in computationally identifying primary impact craters, using a single parameter set, when the density of trajectory intersections varies spatially. We see computationally clustered secondary craters identifying one potential primary impact, in agreement with a linear feature and User B located primary. Finally, User A identified one potential primary impact citing excessive noise in the value-by-alpha map making identification challenging.
Fig. 6. Value by alpha mapping over a black background to avoid bias in interpretation. User identified primaries are indicated in red. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
centroid and mean inter-point distance). A centroid for each cluster approximates the primary impact location. Computation of the mean centroid or weighted centroid, where weight is a function of the inverse flattening of the input ellipsoid are supported. Qualitative approximation utilizes the Value-By-Alpha (VBA) mapping (Roth et al., 2010) approach whereby trajectories are buffered to a user-defined width and the alpha value (opacity) of each trajectory set low, (e.g. 1% opacity or 100 stacked trajectories will achieve full color saturation). Fig. 6 illustrates a sample VBA map where high overlap incidence is indicative of a potential source crater. The VBA map is intentionally displayed over a black background to avoid bias induced by a base map. 3.5. Optimization
4.2. Lunae Palus Each stage of the secondary-to-primary approximation is dependent on user parameterization. For example, the definition of ϵ within the context of DBScan or the minimum length of a digitized feature to be included in the trajectory estimation phase. The LCC tool set supports automated optimization functionality that seeks to parameterize the data in a reasonable way. The tool utilizes simple descriptive
The Lunae Palus quadrangle extends from 270° to 315° East longitude and 0–30° North latitude in the western hemisphere of Mars. The region is comprised, almost in equals parts, of 3.7–4.1 billion year old cratered terrains and 3.6–3.7 billion year old lava plains (Tanaka et al., 2014; Platz et al., 2013). These regions are 86
Computers & Geosciences 105 (2017) 81–90
J. Laura et al.
Fig. 7. Input data sets in the Mare Acidalium quadrangle showing individually digitized secondary craters.
generally located between 1000 and −500 m elevation. Secondary impact-related landforms in the Lunae Palus study area are both clustered individual craters (n=93, similar to those observed and mapped in Mare Acidalium) but also include chains of overlapping craters (n=359, termed “catenae”) and crater-related lineaments (n=85). This varied range of interpreted crater-related secondary landforms provides an additional opportunity to compare LCC tool set function on multiple features. In addition, these landforms are very similar to those observed on the Moon, exemplifying the range of tool set application Fig. 9. Quantitative and qualitative identification of primary source craters, using the LCC Tool, was a four step process. First, individual point craters were clustered utilizing the DBScan method and the directional distribution of said clusters computed. Next, we perform back-projected trajectory estimation for input data sets out to 25000 km. This yields a derived data set with 537 candidate, bidirectional flight trajectories. Third, primary impact approximation was performed utilizing: (1) the qualitative VBA mapping approach (with two inde-
pendent users and no base map), (2) for each individual input type (crater clusters, catenae, and lineaments individually) and (3) as an aggregation of all input types. Fig. 10 illustrates the results over a Thermal Emission Imaging System (THEMIS) (Christensen et al., 2004) daytime IR base. We symbolize approximate, quantitatively computed, primary impacts with yellow, purple, green, and red stars to represent the catenae, crater cluster, lineament, and aggregated data sets, respectively. We see rough agreement between the quantitative and qualitative approaches, with both identifying the same potential primary impact locations. Two examples of this convergence are at approximately 4° North, 308° East and 22° North, 273° East, Fig. 10. For this use case, we see catenae providing the most accurate approximations of primary impacts with digitized clusters performing most poorly. This is attributable to the difficulty in visually approximating the directional distribution of a non-chained cluster of secondary impacts. We also identify general linear trending from North-East to South-West for potential primary impact craters approximated from catenae and
Fig. 8. Quantitative and qualitative results in the Mare Acidalium quadrangle using digitized crater clusters.
87
Computers & Geosciences 105 (2017) 81–90
J. Laura et al.
Fig. 9. Input data sets in the Lunae Palus quadrangle showing clustered crater chains (points), catenae (polygons), and lineaments over a THEMIS daytime infrared base map.
Fig. 10. Quantitative and qualitative results in the Lunae Palus quadrangle using digitized crater clusters, lineaments, and catenae.
6. Uncertainties and traditional spatial analysis issues
attribute this trending to the fact that many catenae are generally trending in this direction and the developed method does not remove potential trajectories once intersection has occurred. That is, a trajectory contributes multiple intersections to all other trajectories and is not removed from the sample once intersected.
The LCC toolset computational model incurs uncertainties from four sources. First, digitization error in visually approximating a best fit line or computational clustering error can result in significant flight path deviation potentially undercutting accurate determination of primary craters. For example, a crater cluster, digitized using a visually identified best fit line can exhibit significant flight deviation due to small changes in direction, (i.e, angle of pointing). Intuitively, this uncertainty is propagated more significantly the longer the post-impact ejecta flight time. Second, the impact of back-projection accuracy to noise in the data is non-trivial. For that reason, we rely on the user to reduce outliers with each processing phase by selecting only those candidate clusters most likely to point to primary impacts (e.g., long, highly elliptical catenae or lineaments or digitized clusters of significant size with high inverse flattening). We note that the tool does not offer an automated method to identify and deconvolve spatially overlapping
5. Comparison of human identified and computationally identified primary impact craters We compare the identification of potential primary impact craters using computational methods and human selection for two reasons. First, it is not possible to definitively correlate whether the mapped secondary craters are sourced from the identified primary crater. Therefore, we do not offer an assessment of the absolute accuracy of our tool. Second, the LCC toolset is designed to provide a first assessment of the potential secondary to primary relationship. This relationship can then be further strengthened using other investigative approaches such as geologic mapping. 88
Computers & Geosciences 105 (2017) 81–90
J. Laura et al.
potentially informing target material strength, ejection speed, and atmospheric conditions at the time of emplacement. The LCC tool set could also be used to approximate the orientation and extent of globalscale fractures that occur on icy satellites such as Ganymede and Europa. Deviation from predicted great circles could implicate lateral variations in the fractured crust of these icy bodies. The full utility of the LCC tool set has not yet been fully realized.
secondary craters without human intervention. For example, two overlapping catenae, digitized as point data and forming a rough ‘X′ shape would be consider a single crater cluster by our algorithm. Computationally, this is noisy data and this use case was a primary driver for the functionality to begin the process using line and polygon geometries. Third, the LCC toolset can be significantly impacted by edge effects. McEwen et al. (2005) and Preblich et al. (2007) illustrate a roughly circular emplacement pattern for secondary impact craters around Zunil crater, Mars. Had the digitization process been limited to an arbitrary boundary, the potential to either omit or mis-identify a primary would have been high. We recommend accounting for edge effects by expanding the study area beyond the immediate region of interest and not constraining mapping efforts to arbitrary boundaries. Finally, the LCC results are susceptible to issues of the Modifiable Areal Unit Problem (MAUP)3 and the scale of secondary digitization is important. We recommend mapping secondaries at 1:500,000 scale.
Acknowledgement Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. References Banerjee, S., 2005. On geodetic distance computations in spatial modeling. Biometrics 61 (2), 617–625. Barlow, N., 1988. Crater size-frequency distributions and a revised martian relative chronology. Icarus 75 (2), 285–305. Bierhaus, E.B., Chapman, C.R., Merline, W.J., 2005. Secondary craters on Europa and implications for cratered surfaces. Nature 437, 1125–1127. http://dx.doi.org/ 10.1038/nature04069. Bierhaus, E.B., Chapman, C.R., Merline, W.J., Brooks, E., Asphaug, S.M., 2001. Pwyll secondaries and other small craters on Europa. Icarus 153 (2), 264–276, (URL 〈http://www.sciencedirect.com/science/article/pii/ S0019103501966904〉). Christensen, P.R., Jakosky, B.M., Kieffer, H.H., Malin, M.C., Jr, H.Y.M., Nealson, K., Mehall, G.L., Silverman, S.H., Ferry, S., Caplinger, M., Ravine, M., 2004. 2001 Mars Odyssey. Springer Netherlands, Dordrecht, Ch. The Thermal Emission Imaging System (Themis) for the Mars 2001 Odyssey Mission, pp. 85–130. (URL 〈http://dx. doi.org/10.1007/978-0-306-48600-5〉). Dundas, C.M., McEwen, A.S., 2007. Rays and secondary craters of Tycho. Icarus 186 (1), 31–40, (URL 〈http://www.sciencedirect.com/science/article/pii/ S0019103506002788〉). Ester, M., Kriegel, H.-P., Sander, J., Xu, X., 1996. A Density-based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. AAAI Press, Portland, OR, USA, 226–231. Han, J., Kamber, M., Tung, A.K.H., 2001. Geographic Data Mining and Knowledge Discovery. Taylor and Francis, London, UK, Ch. Spatial Clustering Methods in Data Mining: A Survey. MacQueen, J., 1967. Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics. University of California Press, Berkeley, Calif., pp. 281–297. (URL 〈http://projecteuclid.org/euclid.bsmsp/ 1200512992〉). McEwen, A., Preblich, B., Turtle, E., Artemieva, N., Golombek, M., Hurst, M., Kirk, R., Burr, D., Christensen, P., 2005. The rayed crater Zunil and interpretations of small impact craters on Mars. Icarus 176 (2), 351–381. McEwen, A.S., Bierhaus, E.B., 2006. The importance of secondary cratering to age constraints on planetary surfaces. Annu. Rev. Earth Planet. Sci. 34, 535–567. Melosh, H.J., 1989. Impact Cratering: A Geologic Process/H.J. Melosh. Oxford University Press; Clarendon Press New York, Oxford, (URL 〈http://www.loc.gov/ catdir/enhancements/fy0638/88005353-t.html〉). Murtagh, F., Contreras, P., 2012. Algorithms for hierarchical clustering: an overview. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2(1), 86– 97. (URL 〈http://dx.doi.org/10.1002/widm.53〉). Nava, R.A., Skinner, J.A., 2010. GIS-based Analysis of Secondary Craters as Stratigraphic Markers, Mars, Abstract #2699. In: Proceedings of the Lunar and Planetary Science Conference, vol. 41 of Lunar and Planetary Science Conference. Nava, R.A., Skinner, J.A., Hare, T.M., 2009. Using distributional characteristics of superposed large-scale crater clusters as temporal indicators of geologic processes, abstract #2530. In: Proceedings of the Lunar and Planetary Science Conference, vol. 40 of Lunar and Planetary Science Conference. Platz, T., Michael, G., Tanaka, K.L., Jr., J.A.S., Fortezzo, C.M., 2013. Crater-based dating of geological units on Mars: methods and application for the new global geological map. Icarus, 225(1), 806–827. (URL 〈http://www.sciencedirect.com/science/ article/pii/S0019103513001863〉). Popova, O., Nemtchinov, I., Hartmann, W.K., 2003. Bolides in the present and past Martian atmosphere and effects on cratering processes. Meteorit. Planet. Sci. 38 (6), 905–925, (URL 〈http://dx.doi.org/10.1111/j.1945-5100.2003.tb00287.x〉). Popova, O.P., Hartmann, W.K., Nemtchinov, I.V., Richardson, D.C., Berman, D.C., 2007. Crater clusters on Mars: shedding light on martian ejecta launch conditions. Icarus 190 (1), 50–73, (URL 〈http://www.sciencedirect.com/science/article/pii/ S0019103507001340〉). Preblich, B.S., McEwen, A.S., Studer, D.M., 2007. Mapping rays and secondary craters from the Martian crater Zunil. J. Geophys. Res.: Planets 112 (E5), (n/a-n/a, e05006. URL 〈http://dx.doi.org/10.1029/2006JE002817〉). Roth, R.E., Woodruff, A.W., Johnson, Z.F., 2010. Value-by-alpha maps: an alternative technique to the cartogram. Cartogr. J. 47 (2), 130–140, (URL 〈http://dx.doi.org/10.1179/000870409X12488753453372〉).
7. Conclusion We have presented a large crater clustering tool that seeks to quantitatively support the identification of primary impact craters utilizing digitized secondary impacts. The LCC tool set provides a quantitative, reproducible tool set for the approximation of the secondary-to-primary impact crater relationship. The LCC tool set has been designed to straddle the accuracy-computational complexity divide (with runtimes a function of data set size, offering the ability to work in less accurate planar, or more accurate geodesic space. We anticipate broad use of this tool within the planetary geological mapping community not only where stratigraphic interpretation is essential to augment other cross-cutting relationships in determining the sequencing of geologic units but also where impact flight dynamics need to be investigated. The intended audience has driven the decision to release the tool for the ArcMap platform as an easily installed plugin. Future work will continue to draw from the secondary impact emplacement literature (e.g. Bierhaus et al., 2001, 2005; Popova et al., 2003, 2007) and will focus on four improvements. First, the qualitative primary impact approximation mirrors the physical width emplacement constraints proposed by Popova et al. (2007) better than the 2dimensional line geometries used by the quantitative approach. Modification of the quantitative approach to support cones of dispersion for intersection is warranted. Second, additional constraints on ejection dynamics will be added if appropriate a priori information is available. This includes attribution of crater shape that may be indicative of emplacement direction. Third, support of individual craters, digitized as polygons, with rich attribution (e.g., crater characteristics or used defined per geometry weights) can be added. This additional information allows the addition of size constraints (Popova et al., 2007) on primary and secondary impacts, with the goal of avoiding mis-identifying a secondary as a primary. Finally, computational efficiency will be improved by replacing the brute force distance computation methods with a KD-Tree. The LCC tool set described herein is a first cut approach to quantitatively linking large-scale secondary impact crater landforms to potential source craters. As discussed herein, the utility of this approach is rooted in the potential to assist assignment of ages of geologic terrains across a wide spatial area, particularly for units occurring in non-contiguous regions and/or in areas where geologic contacts are sparse. However, we identify several areas where the LCC tool set can be used in whole or in part for other planetary investigations, including assessment of parameters that contribute to the impact and ejection process. For example, the tool set could be used to identify the rate of break-up and dispersion of blocks along the flight path, 3 A potential bias in analysis causing divergent hypothesis testing results as a function of the scale at which analysis is performed.
89
Computers & Geosciences 105 (2017) 81–90
J. Laura et al.
224–236, (URL 〈http://www.sciencedirect.com/science/article/pii/ 0019103586901053〉). Vickery, A.M., 1987. Variation in ejecta size with ejection velocity. Geophys. Res. Lett. 14 (7), 726–729, (URL 〈http://dx.doi.org/10.1029/GL014i007p00726〉). Vincenty, T., 1975. Geodetic inverse solution between antipodal points. Tech. Rep. 176, Directorate of Overseas Surveys of the Ministry of Overseas Development, Tolworth, Surrey, UK. Wilhelms, D.E., 1976. Secondary impact craters of lunar basins. In: Proceedings of the Lunar Science Conference 7th. pp. 2883–2901. Wilhelms, D.E., McCauley, J.F., Trask, N.J., 1987. The geologic history of the Moon. Tech. rep. (URL 〈http://pubs.er.usgs.gov/publication/pp1348〉). Wilhelms, D.E., Oberbeck, V.R., Aggarwal, H.R., 1978. Size-frequency distributions of primary and secondary lunar impact craters. In: Proceedings of the Lunar and Planetary Science Conference, vol. 9 of Lunar and Planetary Science Conference. pp. 1256–1258.
Shoemaker, E.M., Hackman, Robert, J., Eggleton, R., 1963. Interplanetary correlation of geological time. Adv. Astronaut. Sci. 8, 70–89. Shoemaker, E.M., 1965. Preliminary analysis of the fine structure of the lunar surface in mare cognitum. In: Hess, W.N., Menzel, D.H., O′Keefe, J.A. (Eds.), The Nature of the Lunar Surface. p. 23. Skinner, J.A., Nava, R.A., 2011. Using large crater clusters to identify potential source craters on mars: technical methods and science applications, abstract #2502. In: Proceedings of the Lunar and Planetary Science Conference, vol. 42 of Lunar and Planetary Science Conference. Tanaka, K.L., 1986. The stratigraphy of Mars. J. Geophys. Res.: Solid Earth 91 (B13), E139–E158, (URL 〈http://dx.doi.org/10.1029/JB091iB13p0E139〉). Tanaka, K., Skinner, J.A., J., Dohm, J., Irwin, R.P., I., Kolb, E., Fortezzo, C., Platz, T., Michael, G., Hare, T., 2014. Geologic map of Mars: U.S. Geological Survey Scientific Investigations Map 3292, scale 1:20,000,000, pamphlet 43. USGS Publications. Vickery, A., 1986. Size-velocity distribution of large ejecta fragments. Icarus 67 (2),
90