Statistical methods are commonly employed to estimate spatial
probabilities of landslide release at the catchment or regional
scale. Travel distances and impact areas are often computed by means
of conceptual mass point models. The present work introduces a fully
automated procedure extending and combining both concepts to compute
an integrated spatial landslide probability: (i) The landslide
inventory is subset into release and deposition zones. (ii) We employ
a simple statistical approach to estimate the pixel-based landslide
release probability. (iii) We use the cumulative probability density
function of the angle of reach of the observed landslide pixels to
assign an impact probability to each pixel. (iv) We introduce the
zonal probability, i.e. the spatial probability that at least one
landslide pixel occurs within a zone of defined size. We quantify this
relationship by a set of empirical curves. (v) The integrated spatial
landslide probability is defined as the maximum of the release
probability and the product of the impact probability and the zonal
release probability relevant for each pixel. We demonstrate the
approach with a 637 km
Overviews of spatial landslide probability (susceptibility) at catchment or regional scales are useful for hazard indication zoning and for prioritizing target areas for risk mitigation. Computer models making use of Geographic Information Systems (GIS) are commonly employed to produce such overviews (Van Westen et al., 2006). From a purely technical point of view, physically-based modelling of landslide susceptibility has become an option for large areas as well, even with reasonably complex modelling tools (Mergili et al., 2014a, b). However, the parameterization of such models remains a challenge, limiting the quality of the results obtained. For this reason, statistical methods, often coupled with stochastic concepts, are commonly employed to relate the spatial patterns of landslide occurrence to those of environmental variables, and to estimate landslide susceptibility by applying these relationships (Guzzetti, 2006). A broad array of statistical methods for landslide susceptibility analysis has been developed, documented by a large number of publications (e.g. Carrara et al., 1991; Baeza and Corominas, 2001; Dai et al., 2001; Lee and Min, 2001; Brenning, 2005; Saha et al., 2005; Guzzetti, 2006; Komac, 2006; Lee and Sambath, 2006; Lee and Pradhan, 2007; Yalcin, 2008; Yilmaz, 2009; Nandi and Shakoor, 2010; Yalcin et al., 2011; Petschko et al., 2014). However, such methods only concern the release of landslides and disregard their propagation.
Whilst advanced physically-based models for landslide propagation (e.g. Christen et al., 2010a, b) are usually employed for local-scale studies, conceptual approaches have been developed to analyze and to estimate travel distances and impact areas at broader scales. Some build on the angle of reach or related parameters (e.g. Scheidegger (1973) for rock avalanches; Zimmermann et al. (1997) and Rickenmann (1999) for debris flows; Corominas et al. (2003) for various types of landslides; Noetzli et al. (2006) for rock/ice avalanches), others consist in semi-deterministic models employing the concept of Voellmy (1955) (Perla et al., 1980; Gamma, 2000; Wichmann and Becht, 2003; Horton et al., 2013). Mergili et al. (2015) have recently presented an automated approach to statistically derive cumulative density functions of the angle of reach from a given landslide inventory, and to apply these functions to compute a spatially distributed impact probability. Modelling approaches considering both the release and the propagation of landslides do exist (Mergili et al. (2012) and Horton et al. (2013) for debris flows; Gruber and Mergili (2013) for various high-mountain processes). However, they yield deterministic results distinguishing areas with an impact expected from those with no impact expected, or qualitative scores.
Integrated automated approaches to properly estimate the spatial probability of a given area to be affected by a landslide – considering both release and propagation – are still missing. The present work attempts to fill this gap by combining the two newly developed open source software tools r.landslides.statistics and r.randomwalk. We will next introduce our modelling strategy (Sect. 2) and the study area in Taiwan (Sect. 3). After presenting (Sect. 4) and discussing (Sect. 5) the results we will conclude with a set of key messages (Sect. 6).
Within the present article we use the term “landslide” in a broad sense, including all relevant types of gravitational mass movements.
We propose an integrated statistical procedure (containing stochastic elements) to compute the spatial probability of a given area (technically, a given GIS pixel) to be affected by a landslide either through its release or through its propagation. We first consider release and propagation separately and finally combine the two concepts. The entire work flow is illustrated in Fig. 1; its components are introduced in detail in Sects. 2.2–2.6.
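The final combination step can be illustrated with a minimal sketch. It assumes that the release probability, the impact probability and the zonal release probability are already available as arrays of equal shape, and it applies the rule that the integrated probability is the maximum of the release probability and the product of the impact probability and the zonal release probability. The function and argument names are hypothetical and do not reflect the interface of r.randomwalk.

```python
import numpy as np


def integrated_probability(p_release, p_impact, p_zonal_release):
    """Combine release and propagation into one probability per pixel.

    All three inputs are arrays (or nested lists) of equal shape with
    values in the range 0-1. The names are illustrative assumptions.
    """
    p_release = np.asarray(p_release, dtype=float)
    p_impact = np.asarray(p_impact, dtype=float)
    p_zonal_release = np.asarray(p_zonal_release, dtype=float)

    # Probability of being reached by a landslide released upslope.
    p_propagation = p_impact * p_zonal_release

    # A pixel is affected either through release or through propagation;
    # the larger of the two values is retained.
    return np.maximum(p_release, p_propagation)


if __name__ == "__main__":
    print(integrated_probability([0.1, 0.5], [0.8, 0.2], [0.5, 0.5]))
```

In this toy example the first pixel is dominated by propagation and the second by release.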
Two newly developed raster modules of the open source software package
GRASS GIS 7 (Neteler and Mitasova, 2007; GRASS Development Team, 2015)
are combined:
r.landslides.statistics allows inventory subsetting, estimation
of the spatial probability of landslide release, and the generation of
a zonal probability function. r.randomwalk, introduced by Mergili et al. (2015), employs
sets of constrained random walks to route hypothetic mass points down
through the digital elevation model (DEM) and assigns an impact
probability to each pixel. The cumulative probability density function
(CDF) used is derived from the analysis of the observed
landslides. Further, r.randomwalk includes an algorithm to combine
release probabilities and impact probabilities, making use of the
zonal probability function derived with r.landslides.statistics.
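How a CDF of the angle of reach may translate into an impact probability can be sketched as follows. The sketch assumes a Gaussian CDF, a straight-line horizontal distance between the release pixel and the target pixel, and that the impact probability equals the CDF value at the angle of reach of the target pixel. The mean and standard deviation in the usage example are hypothetical values, and none of the function names belong to r.randomwalk.

```python
import math


def angle_of_reach_deg(drop_m, horizontal_distance_m):
    """Angle (degrees) between a release pixel and a target pixel,
    from the elevation drop and the horizontal distance (assumption:
    straight-line distance instead of the actual flow path)."""
    return math.degrees(math.atan2(drop_m, horizontal_distance_m))


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def impact_probability(omega_deg, mean_deg, std_deg):
    """Fraction of observed angles of reach that are at or below
    omega_deg, read here as the probability that a mass point travels
    at least as far as the pixel with angle of reach omega_deg."""
    return gaussian_cdf(omega_deg, mean_deg, std_deg)


if __name__ == "__main__":
    # Hypothetical distribution parameters, for illustration only.
    mean_deg, std_deg = 30.0, 8.0
    for distance in (100.0, 200.0, 400.0):
        omega = angle_of_reach_deg(100.0, distance)
        print(distance, round(omega, 1),
              round(impact_probability(omega, mean_deg, std_deg), 3))
```

With a fixed elevation drop, the angle of reach and therefore the impact probability decrease with increasing travel distance.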
Both tools build on a combination of the Python and C programming
languages. The R software environment for statistical computing and
graphics (R Core Team, 2015) is used for
built-in validation and visualization
functions. r.landslides.statistics and r.randomwalk can be started in
a fully non-interactive way, i.e. all parameters are passed as command
line arguments. This strategy enables a straightforward combination of
multiple runs of the two models at the script level.
A central requirement is the strict separation of the data used for model development from the data used for model application and evaluation. In this sense, most operations are performed either for the model development area (MDA) or for the model evaluation area (MEA), but not for both. The only exception to this rule is the initial step of inventory subsetting.
All probabilities used in the context of the present work are summarized in Table 1.
Landslide inventories often suffer from a missing or unsatisfactory subsetting into release, transit and deposition areas. The reason for this problem, which also applies to our case study, is not necessarily related to deficient mapping efforts, but rather to the impossibility of identifying each zone in the field or from remotely sensed data. Appropriate subsetting, however, is required before using the inventory for statistical analyses of landslide release or propagation. We therefore suggest a reproducible procedure to deal with this problem.
We analyze the geometric properties of all landslides in a given
inventory in terms of inclination, minimum and maximum elevation,
elevation range, central, maximum and average 2-D and 3-D length and
width, 2-D and 3-D areas. Lengths and widths are defined as Euclidean
distances (the central 2-D and 3-D lengths
In the present work, we consider all observed landslide pixels with
observed release areas (ORA), where all release pixels are considered observed positives (OP), the rest of the landslide areas are considered no data, and all non-landslide pixels are considered observed negatives (ON);
observed deposition areas (ODA), where all deposition pixels are considered OP, the rest of the landslide areas are considered no data, and all non-landslide pixels are considered ON;
observed impact areas (OIA), where all landslide pixels are considered OP, and all non-landslide pixels are considered ON.
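The three definitions can be expressed as a simple labelling rule. The following sketch assumes that the landslide, release and deposition areas are given as boolean rasters of equal shape; the encoding (1 for OP, 0 for ON, -1 for no data) and all names are assumptions made for illustration.

```python
import numpy as np

NODATA = -1  # assumed code for pixels excluded from analysis and validation


def label_observed_areas(landslide, release, deposition):
    """Return ORA, ODA and OIA rasters coded as 1 (OP), 0 (ON) or NODATA."""
    landslide = np.asarray(landslide, dtype=bool)
    # Release and deposition pixels only count inside mapped landslides.
    release = np.asarray(release, dtype=bool) & landslide
    deposition = np.asarray(deposition, dtype=bool) & landslide

    # ORA: release pixels are OP, other landslide pixels are no data,
    # non-landslide pixels are ON.
    ora = np.where(release, 1, np.where(landslide, NODATA, 0))
    # ODA: same rule, applied to the deposition pixels.
    oda = np.where(deposition, 1, np.where(landslide, NODATA, 0))
    # OIA: every landslide pixel is OP, every other pixel is ON.
    oia = landslide.astype(int)
    return ora, oda, oia


if __name__ == "__main__":
    ora, oda, oia = label_observed_areas(
        landslide=[1, 1, 1, 0],
        release=[1, 0, 0, 0],
        deposition=[0, 0, 1, 0],
    )
    print(ora, oda, oia)
```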
These definitions keep pixels that we can assign neither to the ORA nor
to the ODA out of the statistical analysis and the validation procedure. To ensure
that all uncertain pixels are excluded, we have to choose conservative values of
Statistical analyses of landslide spatial release probability
(landslide susceptibility) have been treated exhaustively in previous
studies (see Sect. 1 for references). In the context of the present
work we are bound to a method yielding spatial probabilities in the
range 0–1. In this sense, we employ a simple approach building on the
spatial overlay of classified predictor maps. Considering separately
each of the resulting combinations of predictor classes, we compute
the fraction
The true positive (TP), true negative (TN), false positive (FP) and
false negative (FN) pixel counts are derived for selected levels of
It is useful for many purposes to work with pixel-based spatial
release probabilities (
a subset of the MDA with a randomized size and randomized centre
coordinates is selected. Within this subset, a set of sub-subsets with constant zone size
(2) is repeated for a large number of sets of sub-subsets
covering a broad range of
(1)–(3) are repeated for a large number of random subsets of the MDA.
This procedure results in a line cloud of
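The core quantity behind this procedure, the fraction of zones of a given size that contain at least one observed landslide pixel, can be sketched as follows. The sketch is a simplification: it uses a regular tiling of square zones over one raster instead of the randomized subsets and sub-subsets described above, and all names are hypothetical.

```python
import numpy as np


def zonal_probability(observed_positive, zone_size):
    """Fraction of square zones (zone_size x zone_size pixels) that
    contain at least one observed positive pixel.

    Rows and columns that do not fill a complete zone are discarded.
    """
    op = np.asarray(observed_positive, dtype=bool)
    rows = (op.shape[0] // zone_size) * zone_size
    cols = (op.shape[1] // zone_size) * zone_size
    if rows == 0 or cols == 0:
        raise ValueError("zone_size exceeds the raster extent")

    # Reshape into (zone row, pixel row, zone column, pixel column).
    tiles = op[:rows, :cols].reshape(
        rows // zone_size, zone_size, cols // zone_size, zone_size
    )
    # A zone counts as positive if any of its pixels is positive.
    zone_hit = tiles.any(axis=(1, 3))
    return float(zone_hit.mean())


if __name__ == "__main__":
    raster = np.zeros((4, 4), dtype=bool)
    raster[0, 0] = True
    for size in (1, 2, 4):
        print(size, zonal_probability(raster, size))
```

Evaluating the function for a range of zone sizes yields one empirical curve relating zone size to zonal probability; the value for a zone size of one pixel equals the pixel-based fraction of observed positives.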
The tool r.randomwalk (Mergili et al., 2015) is employed for
routing mass points representing hypothetic landslides through the
DEM. The specific impact probability The CDF describing the probability that a moving mass point
starting from an arbitrary release pixel leaves the OIA of the same
landslide at or below a certain threshold of The CDF is then employed to compute The same CDF is used for computing
The integrated spatial landslide probability
For this purpose, we come back to the function introduced in
Eq. (2). Thereby we assume that the shape of the logistic regression
function is insensitive to the zonal average of the computed values of
The expected error of
We note that the described procedure is supposed to yield smoothed
results due to averaging effects: (i) Eq. (5) builds on the
simplification of a uniformly distributed release probability over the
possible release zone. (ii) As highlighted in Sect. 2.5,
In the period from 7 to 9 August 2009, Typhoon Morakot triggered
a large number of landslides in Taiwan. According to Lin et al. (2011),
more than 22 000 landslides were recorded in Southern Taiwan. One of
the hot spots was the Kao Ping Watershed (Wu et al., 2011), where
extremely heavy rainfall (more than 2000
We consider a 637
The model tests are summarized in Table 2. The Kao Ping study area is
divided into four subsets (A–D in Fig. 6) to separate between MDA and
MEA. In each of the tests, three subsets are used as MDA and one
subset is used as MEA. The division lines between the subsets follow
catchment boundaries in order to ensure that all landslides are
clearly assigned to one of the four subsets and no landslide may
impact more than one subset. All tests are run at a pixel size of
20
We use values of
For back-calculating
Preliminary tests have further indicated that the largest, deep-seated
landslides in the test area are poorly predicted by the statistical
model applied. We hypothesize that landslides of this type are governed
by factors other than those which can be derived directly from the DEM
or other surface data. The analyses are therefore repeated excluding
all landslides with a total size of the OIA
We further run the model with a spatially constant value of
The model results are evaluated against the observed landslides at two
spatial levels using ROC Plots:
The pixel level.
The level of slope units. The slope units are derived using the
GRASS GIS module r.watershed (parameter half_basin), with a minimum
area of one slope unit of 10
Figure 7 illustrates the result maps for test 1C. For reasons of
clarity, we show only a subset of the test area (see Fig. 6). However,
the general patterns of the results are well represented in this area
and are also valid for the other tests. Figure 7a shows the result of
the inventory subsetting, the spatial variation of
The largest values of
Note that high values of
Figure 10 shows the distribution of
Considering all observed landslides (tests 1A–D), 7.5 % of the
entire test area is classified as OIA (i.e. the observed integrated
spatial landslide probability). The average value of
The ROC Plots for model evaluation are compiled in
Fig. 11.
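The construction of a ROC plot from TP, TN, FP and FN pixel counts can be sketched as follows. The sketch assumes that a pixel is predicted positive when its modelled probability is greater than or equal to the threshold, and that no data pixels in the observation raster are coded as -1; both conventions and all names are assumptions for illustration.

```python
import numpy as np


def confusion_counts(probability, observed, threshold, nodata=-1):
    """Return (TP, TN, FP, FN) pixel counts for one probability threshold.

    observed holds 1 for OP, 0 for ON and nodata for excluded pixels.
    """
    p = np.asarray(probability, dtype=float)
    o = np.asarray(observed)
    valid = o != nodata
    predicted = p >= threshold  # assumed convention: >= counts as positive

    tp = int(np.sum(valid & predicted & (o == 1)))
    tn = int(np.sum(valid & ~predicted & (o == 0)))
    fp = int(np.sum(valid & predicted & (o == 0)))
    fn = int(np.sum(valid & ~predicted & (o == 1)))
    return tp, tn, fp, fn


def roc_points(probability, observed, thresholds):
    """Return a list of (false positive rate, true positive rate) pairs."""
    points = []
    for threshold in thresholds:
        tp, tn, fp, fn = confusion_counts(probability, observed, threshold)
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        points.append((fpr, tpr))
    return points


if __name__ == "__main__":
    prob = [0.9, 0.6, 0.4, 0.1, 0.8]
    obs = [1, 0, 1, 0, -1]
    print(roc_points(prob, obs, [0.0, 0.5, 1.0]))
```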
Removing the largest landslides (OIA
In accordance with the patterns observed with regard to
The ROC Plots shown in Fig. 11h–l relate the modelled
distribution of
Slope units are not the suitable level to spatially aggregate
We have introduced a novel methodology to compute the spatial probability of an arbitrary raster pixel, or any other type of unit, to be affected by a landslide. Our approach considers both landslide release and propagation. It further introduces the concept of the zonal release probability for correcting (i) the release probability relevant for a certain impact pixel for the size of the possible release area, or (ii) any type of probability for a certain level of spatial aggregation.
The model results were evaluated at the pixel and slope unit
levels. Slope units have been used earlier for discretizing and
evaluating landslide release susceptibility maps (e.g. Rossi et al.,
2010; Jia et al., 2012). Marchesini et al. (2015) have shown that
a physically-based landslide susceptibility model performs better when
evaluated at the level of slope units instead of pixels. In the
present study, this phenomenon is confirmed for
Whilst traditional statistically-based landslide susceptibility
studies (e.g. Carrara et al., 1991; Baeza and Corominas, 2001;
Dai et al., 2001; Lee and Min, 2001; Saha et al., 2005; Guzzetti,
2006; Komac, 2006; Lee and Sambath, 2006; Lee and Pradhan, 2007;
Yalcin, 2008; Yilmaz, 2009; Nandi and Shakoor, 2010; Yalcin et al.,
2011; Petschko et al., 2014) are useful to identify likely release
areas at the pixel level, they appear to play a limited role when
(i) considering integrated landslide probability; or (ii) aggregating
the pixel-based results to larger spatial units. However, the strong
correlation between zone size and the zonal value of
The proposed approach is considered particularly useful for situations
where landslides are highly mobile, e.g. where they convert into
debris flows. It has to be used with care where landslides are not
mobile. In these cases, the CDF of the angle of reach would reflect
the length distribution of the ORAs rather than the mobility of the
landslides. In general, we note that the angles of reach used in the
present study rely on another concept than those included in published
relationships (e.g. Scheidegger, 1973; Zimmermann et al., 1997;
Rickenmann, 1999; Corominas et al., 2003; Noetzli et al., 2006):
whilst these and other authors refer to the angle between the highest
and the terminal point of the landslide, we consider the angles
between any release pixel of an observed or hypothetic landslide and
its terminal point. This is necessary to combine
The exclusion of large landslides improves the model performance. Particularly the well-investigated Hsiaolin Landslide (Kuo et al., 2013) is poorly predicted by the suggested approach with the parameters applied. We hypothesize that such events are sometimes characterized by very particular geotechnical and geological settings which cannot necessarily be deduced from a DEM or remotely sensed data only. Instead, understanding, modelling and predicting those events relies on detailed on-site investigations and more advanced physically-based models.
Whilst it was beyond the scope of the present study to extensively
evaluate the sensitivity of the model results to the various
parameters used, such an evaluation has to be the subject of future
studies, including (i) the predictors; (ii) the type of statistical
method for computing
We have presented an innovative approach for integrated statistical modelling of the spatial probability of landslides at catchment or broader scales. For this purpose we have combined the tools r.landslides.statistics and r.randomwalk. The release probability was computed using a simple overlay of the landslide inventory with a set of predictor layers, whilst landslide propagation, i.e. the impact probability, was deduced from the cumulative probability of the angle of reach of the observed landslide pixels. The concept of zonal release probability was introduced. It allows the release probability to be corrected for the size of the release area possibly affecting a given pixel before the impact probability and the release probability are combined.
The result approximates the probability of a pixel to be affected by
a landslide either through its release or through its
propagation. Analyzing the outcomes of the procedure leads us to a set
of key conclusions:
The predictors used explain the observed landslide distribution only at a moderate performance level. This observation may be related to the fact that the landslides are attributed to one single meteorological event (Typhoon Morakot).
The prediction quality does not decrease when using a constant release probability over the entire area. This indicates that the size of the possible release area is more important for the zonal release probability than the pixel-based release probability. This conclusion is supported by the outcome of the evaluation of the results on the basis of slope units.
Even though this effect may be less pronounced for areas where the distribution of the release areas is well explained by the environmental layers, we conclude that the outcomes of traditional statistical landslide susceptibility analyses are less relevant for the integrated landslide probability and for higher levels of spatial aggregation.
Removing the largest observed landslides from the analysis improves the prediction quality. We explain this phenomenon with particular geological settings not deducible from terrain data conditioning some of these events, and conclude that in-detail studies and physically-based models are needed in this context.
Confirming, refining and improving the results obtained will rely on
thorough tests of parameter sensitivity.
The support of Massimiliano Alvioli, Matthias Benedikt, Yi-Chin Chen, Julia Krenn and Ivan Marchesini is acknowledged.
Summary of the various probabilities as defined in the context of the present work.
Summary of model tests. All tests build on the combination of the tools r.landslides.statistics and r.randomwalk. Refer to Fig. 6 for the subsets A–D used to define the MDA and the MEA.
Key figures describing the results of the twelve tests introduced in Table 2. The IDs 1–3 refer to the combined results from each set A–D. All values given in per cent are averages over the area indicated.
Values marked with an asterisk represent averages for the entire test area.
Simplified work flow of the integrated statistical analysis of spatial landslide probability.
Landslide geometry and inventory subsetting. ORA and ODA are defined on the basis of
Approximation of the zonal release probability
Work flow for estimating the impact probability
Integrated spatial landslide probability
The test area in the Kao Ping Watershed in southern Taiwan. A–D refer to the subsets of the test area alternatively used as MDA and MEA (see Table 2). The comparison of pre- and post-event imagery for part of the test area illustrates the large number of landslides triggered by Typhoon Morakot.
Set of results of the test 1C. For readability, only a small subset of the test area (see Fig. 6) is shown.
Gaussian probability density functions and cumulative density functions (CDFs) of the observed angle of reach
Zonal release probability (see Fig. 3).
Integrated spatial landslide probability
ROC Plots relating the model results for the MEAs of all tests to the relevant observations.