Homogenous regions based on extremogram for regional frequency analysis of extreme skew storm surges

To resist marine submersion, coastal protection must be designed by taking into account the most accurate estimate of the return levels of extreme events, such as storm surges. However, because of the paucity of data, local statistical analyses often lead to poor frequency estimations. Regional Frequency Analysis (RFA) reduces the uncertainties associated with these estimations by extending the dataset from local (only the data available at the target site) to regional (data at all the neighboring sites, including the target site) and by assuming, at the scale of a region, a similar extremal behavior. RFA, based on the index flood method, assumes that, in a homogeneous region, observations at sites, normalized by a local index, follow the same probability distribution. In this work, the spatial extremogram approach is used to form a physically homogeneous region centered on the target site. The approach is applied to a database of extreme skew storm surges and used to carry out an RFA.


Introduction
To resist marine submersion, coastal protection must be designed by taking into account the most accurate estimate of the return levels of extreme events, such as extreme sea level or storm surges.
When performing a local analysis to estimate high return levels, the local duration of observation is often too short to yield precise estimates of the return levels sought (typically associated with a return period of 100 or 1000 years). For example, storm surge records calculated from tide gauge measurements at a single site are usually shorter than 30 years. These uncertainties can be reduced by a Regional Frequency Analysis (RFA), developed by Dalrymple (1960), which tries to exploit the similarities between sites. This kind of analysis is based on the index flood method and assumes that, within a homogeneous region, extreme events normalized by a local index representing local features are drawn from a common regional distribution. However, the delineation of a homogeneous region usually leads to the problem of the so-called "border effect". Indeed, if one is interested in a target site which is very close to the region limit, the information at a site located on the other side of the limit is excluded, even though both sites offer similar information and have similar asymptotic properties. For example, in the physically homogeneous regions obtained by Weiss et al. (2013), two French sites, Boulogne-Sur-Mer and Calais, are located in two different physically homogeneous regions although these cities are very close (they are separated by only about thirty kilometers). Similar issues arise for any pair of sites located on either side of a border between two regions. A physically homogeneous region as defined by Weiss et al. (2013) is a typical storm footprint, and it can be expected that very close sites facing similar storms are in the same area.
Moreover, Weiss et al. (2013) gather very distant sites in the same region, which raises the question of whether traces of heterogeneity remain, even in a region considered statistically homogeneous. Acreman and Wiltshire (1987) have suggested that the sites located near the border between two regions could be considered as partially belonging to each of those two regions. However, Burn (1990) notes that there is no need to define boundaries between regions and that a particular region can be defined for each site (consisting of sites similar to the site of interest in terms of extremes).
To form a physically homogeneous region centered on a target site, Hamdi et al. (2016) recently proposed an approach using the spatial extremal dependence between observations (the spatial extremogram) to measure the neighborhood between sites. Herein, we define a pairwise distance between sites and we use the spatial extremogram approach for the RFA applied to extreme skew storm surges. The composition of the regions built herein, which can be thought of as neighborhoods, is based on the similarity of site attributes. The higher the value of the spatial extremogram between the target site and another site, the greater the dependency of extreme storm surges; this indicates that storms impacting the target site also tend to impact the other site, which can therefore be included in the region of the target site. Indeed, in a specific region, the process generating storms and impacting the target site will also tend to impact the other sites in the region, and vice versa. We can then consider that the processes generating storms in a region are physically homogeneous.
It is therefore assumed that sites with a sufficiently high value of the spatial extremogram with the target site may be included in the same physically homogeneous region or, better, in the region of influence of the target site. The region may also be regarded as a typical storm footprint in the neighborhood of the target site.

Once a physically homogeneous region is formed around a target site, the statistical homogeneity is then checked. The whole procedure to estimate the regional law (and, in particular, the dependence model and the way to calculate the effective duration) is then applied in the same way as in Weiss et al. (2014), starting from the physically homogeneous regions defined from the spatial extremogram.
The details of the methodology are described in section 2. Section 3 presents an application of this method to a database of extreme skew storm surges collected at 67 sites located along the Spanish, French and UK coasts. In order to compare the results obtained in this study with those of Weiss (2014c), the database used in section 3 is the same as the one used by Weiss (2014c).

Methodology
The objective is to form physically homogeneous regions for RFA on extreme skew storm surges. The proposed method is based on the use of the spatial extremogram values.

Formation of physically homogeneous neighborhood of a target site by using the spatial extremogram
Let $X$ be the random variable representing the skew storm surges at a site S1 (the target site, for instance), and $Y$ be the random variable for the skew storm surges at site S2. Let $u_1$ and $u_2$ be the thresholds above which skew storm surges are considered as extreme.
We consider that the sites S1 and S2 are inside the same physically homogeneous region if at least a part of the extreme skew storm surges from each site are likely to be generated simultaneously by the same storms (for the same time lag), which means that the extreme skew storm surges of the two sites are dependent. The spatial, pairwise description of dependence should also include the temporal dimension, because storm conditions can last several days. A pairwise probability of dependence can then be defined by the spatial extremogram coefficient $\rho(X, Y, h)$ for bivariate time series, also used by Davis et al. (2011) and defined in Eq. (1):
$$\rho(X, Y, h) = \lim_{n \to \infty} P\left( \frac{Y_{t+h}}{a_{n,Y}} \in B \;\middle|\; \frac{X_t}{a_{n,X}} \in A \right) \qquad (1)$$
where $a_{n,X}$ and $a_{n,Y}$ are the $(1 - 1/n)$-quantiles of the distributions of $X$ and $Y$, and $A$ and $B$ are finite intervals that are bounded away from 0 ($A$ and $B$ will be included inside $[1; +\infty[$).
Nat. Hazards Earth Syst. Sci. Discuss., doi:10.5194/nhess-2016-378, 2017. Manuscript under review for journal Nat. Hazards Earth Syst. Sci. Discussion started: 27 February 2017. © Author(s) 2017. CC-BY 3.0 License.
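Davis et al. (2011) defined the natural estimator $\hat{\rho}(X, Y, h)$ of $\rho(X, Y, h)$, the empirical spatial extremogram, which will be used in this study and which can be written as follows:
$$\hat{\rho}(X, Y, h) = \frac{\sum_{t=1}^{n-h} \mathbf{1}\{X_t > q_X,\; Y_{t+h} > q_Y\}}{\sum_{t=1}^{n} \mathbf{1}\{X_t > q_X\}}$$
where $n$ is the number of data occurring at the same time at sites S1 and S2, and where $q_X$ and $q_Y$ are extreme quantiles.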
Note that we have to ensure that $n$ is large enough to guarantee that the probability of dependence is significant. Of course, $\hat{\rho}(X, Y, h) \in [0; 1]$, and $\hat{\rho}(X, Y, h) \neq 0$ indicates some dependency between $X$ and $Y$ on extreme values. A threshold $\rho_0$ is defined to indicate whether the probability of dependence between $X$ and $Y$ is high enough to consider S1 and S2 inside the same physically homogeneous region: if $\hat{\rho}(X, Y, h) > \rho_0$, then S1 and S2 are considered to be part of the same physically homogeneous region. There is very often a residual probability of dependence which can be considered as noise, even between sites that are far apart; therefore, $\rho_0$ has to be large enough to make sure that sites linked only by a residual probability of dependence (considered as noise) do not belong to the same region. The minimal value of $\rho_0$ can be found by analyzing the probability of dependence between all sites and the target site, which allows for the evaluation of the order of magnitude of the maximum residual noise. The choice of $\rho_0$ has to allow all sites that have an extremal dependence with the target site to be merged within the same region, and to set aside sites linked to the target site only by a probability of dependence considered as residual noise.

The empirical quantiles $q_X$ and $q_Y$ are set in order to select, in the $X$ and $Y$ series, only a few storms per year, which allows for the computation of the empirical spatial extremogram from the biggest storms of each year. Moreover, $h$, the lag time, is large enough to allow a storm that occurred at one site to propagate eventually to the other site. If $h$ is set to several values then, between two sites, several values of $\hat{\rho}$ are estimated and the greatest value of $\hat{\rho}$ is kept.
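The empirical estimator and the choice of the greatest value over several lags can be illustrated with a minimal Python sketch. It assumes two series observed at the same times, a non-negative lag expressed as a number of time steps, and exceedance sets of the form "above a quantile"; the function names and the toy values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def empirical_extremogram(x: Sequence[float], y: Sequence[float],
                          q_x: float, q_y: float, h: int = 0) -> float:
    """Share of extremes of x (above q_x) matched, h steps later, by an extreme of y (above q_y)."""
    if len(x) != len(y):
        raise ValueError("x and y must be observed at the same times")
    n = len(x)
    # Numerator: joint exceedances with lag h (sum over t = 1 .. n - h).
    joint = sum(1 for t in range(n - h) if x[t] > q_x and y[t + h] > q_y)
    # Denominator: exceedances at the first site (sum over t = 1 .. n).
    marginal = sum(1 for t in range(n) if x[t] > q_x)
    return joint / marginal if marginal else 0.0


def max_over_lags(x, y, q_x, q_y, lags=(0, 1)):
    """Greatest extremogram value over the tested lags (assumption: lags given in time steps)."""
    return max(empirical_extremogram(x, y, q_x, q_y, h) for h in lags)


# Toy series (hypothetical values, one value per tide).
x_toy = [0, 5, 0, 6, 0, 7]
y_toy = [0, 0, 5, 0, 7, 8]
rho_toy = max_over_lags(x_toy, y_toy, q_x=4, q_y=4, lags=(0, 1))
```

In this toy case two of the three extremes of the first series are followed one step later by an extreme of the second series, so the value retained is the one obtained with the lag of one step.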
Finally, the neighborhood of a target site is formed by all sites whose probability of dependence with the target site is greater than $\rho_0$. From a statistical point of view, this means that if a storm affects a given target site, this storm is likely (though not certain) to impact only sites enclosed in the target site's region, and vice versa. The neighborhood of the target site can also be seen as the region of influence around the target site (the so-called RoI approach, Burn, 1990).
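As an illustration of this selection rule, the short sketch below keeps the sites whose empirical extremogram with the target site exceeds the threshold. The site names, the values and the `min_years` parameter (used only to flag sites whose common observation period may be too short to be representative) are hypothetical; the study treats such cases individually rather than with a fixed rule.

```python
def neighborhood(rho_hat, common_years, rho_0=0.3, min_years=10):
    """Return (selected sites, selected sites to review because of a short common period)."""
    selected = sorted(site for site, rho in rho_hat.items() if rho > rho_0)
    # min_years is an illustrative assumption, not a value from the study.
    to_review = [site for site in selected if common_years[site] < min_years]
    return selected, to_review


# Hypothetical extremogram values and common observation periods (years).
rho_example = {"site_a": 0.62, "site_b": 0.41, "site_c": 0.12}
years_example = {"site_a": 26, "site_b": 2, "site_c": 30}
region, flagged = neighborhood(rho_example, years_example)
```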

Independent storms extraction
In this section, we describe the procedure to construct a sample of independent storms. The procedure is the same as the one used in Weiss et al. (2013) and applies to each neighborhood of each target site obtained from section 2.1. The procedure is summarized below. We define a storm as a physical event that generates extreme skew storm surges in at least one site in the neighborhood of a target site (the study area). In this section, an observation is considered as extreme for a given site if it exceeds $q_p$, where $q_p$ is the p-quantile of the initial at-site skew surge series, with p close to 1. Thus, a site is considered impacted by a storm if $q_p$ is exceeded. Moreover, at a given site, one or more extremes can occur during the same storm, depending on its duration. When several extremes appear, only the maximum value is retained. This operation allows us to get independent extremes extracted from storms at site scale.

The detection of storms, which propagate both in space and in time, relies on a spatio-temporal declustering procedure. The main idea is that extremes which are neighbors in time and in space are considered to be part of the same storm. In other words, two extremes are spatio-temporal neighbors if they:
- are among the γ-nearest neighbors of each other,
- occurred within Δ hours of each other.
Thus, to detect a storm, three parameters are needed: p, setting its impact on a given site, and (Δ, γ), which are linked and depend on its spatio-temporal propagation. In order to guarantee an accurate detection of the physical events, (p, Δ, γ) should be chosen carefully. Since we work on the same database of skew storm surges as the one used by Weiss et al. (2013), those parameters are the same as theirs, obtained after various tests: p = 0.995, Δ = 24 h, γ = 14. As in the study of Weiss et al. (2013), we also adopt the hypothesis, often accepted in the literature, that the declustering procedure leads to a sample of independent storms. In order to detect storms, Nissen et al. (2010) used a procedure similar to the one described above, but based on wind speed observations. Therefore, we assume that this procedure allows for the construction of independent storms for each region defined as the neighborhood of a target site.
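The declustering idea can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of Weiss et al. (2013): sites are given planar coordinates and "nearest" means Euclidean distance, two extremes at the same site are treated as spatial neighbors, and storms are taken as the connected groups of neighboring extremes (found with a small union-find). All names and data are hypothetical.

```python
import math
from collections import defaultdict


def nearest_sites(coords, gamma):
    """For each site, the set of its gamma nearest sites (Euclidean distance, an assumption)."""
    near = {}
    for s, (xs, ys) in coords.items():
        others = sorted((o for o in coords if o != s),
                        key=lambda o: math.hypot(coords[o][0] - xs, coords[o][1] - ys))
        near[s] = set(others[:gamma])
    return near


def decluster(extremes, coords, delta=24.0, gamma=14):
    """Group extremes (site, time in hours, value) into storms.

    Returns one dict {site: maximum value during the storm} per storm.
    """
    near = nearest_sites(coords, gamma)
    parent = list(range(len(extremes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (si, ti, _) in enumerate(extremes):
        for j in range(i + 1, len(extremes)):
            sj, tj, _ = extremes[j]
            mutual = si == sj or (sj in near[si] and si in near[sj])
            if mutual and abs(ti - tj) <= delta:
                parent[find(i)] = find(j)  # same storm

    storms = defaultdict(dict)
    for i, (site, _, value) in enumerate(extremes):
        root = find(i)
        # Keep only the maximum value per site within a storm.
        storms[root][site] = max(value, storms[root].get(site, float("-inf")))
    return list(storms.values())
```

Chaining neighbors transitively is a design choice of this sketch: a storm can then span more than Δ hours in total, which matches the idea of an event propagating in space and time.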
The storms extracted in this section represent physical events generating extremes; however, for statistical aspects, and in order to build the regional sample, a sub-selection of these storms is extracted in order to focus on the most intense events.
In particular, like Weiss (2014c), we redefine storms in such a way that, on average and at each site, there are λ storm(s) a year. Weiss (2014c) suggests using λ = 1, and we set λ to the same value, which enables us to carry out the statistical analysis from the biggest storms. Bernardara et al. (2014) recommend this "double-threshold" procedure to address autocorrelated environmental variables using the peak over the threshold method.
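One simple way to obtain a threshold exceeded on average λ times a year is sketched below: with n years of record, the threshold is taken as the k-th largest storm peak, with k = round(λ n). This is a minimal illustration under that assumption (ties and the exact convention used by Weiss (2014c) are not addressed); the function names and values are hypothetical.

```python
def storm_threshold(peaks, n_years, lam=1.0):
    """Statistical threshold reached by about lam storm peaks a year: the k-th largest peak."""
    k = round(lam * n_years)
    if not 1 <= k <= len(peaks):
        raise ValueError("lam * n_years must lie between 1 and the number of peaks")
    return sorted(peaks, reverse=True)[k - 1]


def select_storms(peaks, n_years, lam=1.0):
    """Return the threshold and the peaks kept for the statistical analysis."""
    u = storm_threshold(peaks, n_years, lam)
    return u, [p for p in peaks if p >= u]


# Hypothetical at-site storm peaks (m) observed over 3 years.
u_example, kept_example = select_storms([0.5, 1.2, 0.8, 2.0, 1.5, 0.9], n_years=3, lam=1.0)
```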

Regional statistical homogeneity
RFA of extreme storm surges requires statistically homogeneous regions. As in Weiss et al. (2013), the physically homogeneous regions obtained with the method described in section 2.1 should also be statistically homogeneous. Two tests are used to verify that regions are statistically homogeneous. The Hosking and Wallis test (Hosking and Wallis, 1997) can be used to assess the statistical homogeneity of a region.

Their heterogeneity indicator H measures whether the dispersion between sites is similar to the value that would be expected in a statistically homogeneous region. Hosking and Wallis suggest that a group of sites may be regarded as "acceptably homogeneous" if H < 1, "possibly heterogeneous" if 1 ≤ H < 2, and "definitely heterogeneous" if H ≥ 2.
The discordancy criterion Dc of Hosking and Wallis (1997) can identify discordant sites by indicating whether, in a given region, a site is significantly different, in terms of L-moments, from the other sites. If Dc > 3, a site can be considered as discordant.
For each region defined as the neighborhood of the target site, the homogeneity and the discordancy tests are performed. If one site is found to be discordant, then a second neighborhood of the target site is defined without the discordant site. For target sites where the quantile estimates are particularly important, an RFA can be carried out on the two neighborhoods (the one with the discordant site and the one without it) in order to compare the results and to keep the highest one (to remain conservative).
Finally, in all cases studied, we checked that the target site is not on the edge of its region.This must be the case most of the time.But, if it is not the case, we consider that another method (for example the one developed by Weiss, 2014c), could be better adapted to estimate the quantiles at the target site.
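To make the two criteria of this section concrete, the sketch below computes sample L-moment ratios (L-CV, L-skewness, L-kurtosis) from unbiased probability-weighted moments, the discordancy measure in the form D_i = (N/3) (u_i - ū)ᵀ A⁻¹ (u_i - ū) with A the sum of the outer products of the deviations, and the verbal class associated with a given H. Computing H itself requires Monte Carlo simulations of a fitted regional distribution and is not attempted here. The helper names and the synthetic samples are hypothetical.

```python
import random


def l_moment_ratios(sample):
    """Sample L-CV, L-skewness and L-kurtosis (needs at least 4 values)."""
    x = sorted(sample)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    b3 = sum(i * (i - 1) * (i - 2) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2) * (n - 3))
    l1, l2 = b0, 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return (l2 / l1, l3 / l2, l4 / l2)


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def discordancy(ratios):
    """Discordancy measure of each site from its (L-CV, L-skewness, L-kurtosis) triple."""
    n = len(ratios)
    mean = [sum(u[k] for u in ratios) / n for k in range(3)]
    dev = [[u[k] - mean[k] for k in range(3)] for u in ratios]
    a = [[sum(d[r] * d[c] for d in dev) for c in range(3)] for r in range(3)]
    return [n / 3 * sum(dk * sk for dk, sk in zip(d, solve(a, d))) for d in dev]


def classify_h(h):
    """Verbal class associated with the heterogeneity measure H."""
    if h < 1:
        return "acceptably homogeneous"
    return "possibly heterogeneous" if h < 2 else "definitely heterogeneous"


# Synthetic samples for 6 hypothetical sites (not data from the study).
random.seed(0)
sites = [[random.expovariate(1.0) + 1.0 for _ in range(40)] for _ in range(6)]
d_values = discordancy([l_moment_ratios(s) for s in sites])
discordant = [i for i, d in enumerate(d_values) if d > 3]
```

A useful sanity check on such an implementation is that the discordancy values of a region always sum to the number of sites.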

RFA assumes that within a homogeneous region, extreme events normalized by a local index are drawn from a common regional distribution. The local index represents local specificities. Here again, the same procedure as the one developed by Weiss (2014c) is used. This procedure is summarized below. For each site i, let ui denote the storm threshold which is exceeded by the skew surges λ times per year on average. Let the ni-sample Xi be the exceedances of ui. It is assumed that Xi is drawn from a GPD law.

The annual quantile of each site (the local index, denoted $\mu_i$) is used to normalize the local sample Xi. Zi denotes the normalized local sample for site i.
For each storm defined in section 2.2, we keep only the maximum normalized skew storm surge in the regional sample, ensuring sample independence. For statistical aspects, Weiss (2014c) uses a threshold which selects the most intense storms and corresponds to a value of λ equal to 1. Ms denotes this sub-selection of the maximum normalized skew storm surge sample for a specific region centered on a target site, which corresponds to λ = 1.
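The construction of the regional sample described above (normalization by a local index, then one maximum per storm) can be sketched as follows. The data layout (one dictionary of at-site peaks per storm) and the index values are hypothetical, and the final sub-selection corresponding to λ = 1 is not included.

```python
def regional_maxima(storms, local_index):
    """For each storm {site: peak}, keep the maximum of the peaks normalized by the local index."""
    return [max(peak / local_index[site] for site, peak in storm.items())
            for storm in storms]


# Hypothetical storms (peaks in m) and local indices (m).
storms_example = [{"A": 1.5, "B": 2.0}, {"C": 3.0}]
index_example = {"A": 1.0, "B": 2.0, "C": 1.5}
m_s_example = regional_maxima(storms_example, index_example)
```

In the first storm the largest raw peak is at site B, but after normalization the regional maximum comes from site A, which illustrates why the maximum is taken on normalized values.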
A Kolmogorov-Smirnov test is carried out in order to check whether the law of Zi can be considered as the law of Ms. If this is the case, Ms can be considered as the regional sample. A Generalized Pareto Distribution (GPD) is fitted to the regional sample, taking into account the seasonality (the threshold of the regional GPD law is equal to 1 when the peak over the threshold method is applied to the regional sample). Seasonal effects can be modelled through a sinusoid, and the regional distribution is a discrete mixture of GPDs where the scale parameter varies periodically and smoothly across the seasons of occurrence of storms. Four seasons are considered here: summer (June, July, August), autumn (September, October, November), winter (December, January, February) and spring (March, April, May). The AIC criterion is used to select the most adequate distribution. The adequacy of the fitted curve to the regional sample is also checked visually.
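The comparison of a normalized local sample with the regional sample can be illustrated with a two-sample Kolmogorov-Smirnov test. The sketch below is written with the standard library only; the p-value uses the usual asymptotic series of the Kolmogorov distribution with an effective-sample-size correction, which is an approximation chosen for this illustration and not necessarily the variant used in the study. The function name is hypothetical.

```python
import math


def ks_two_sample(a, b):
    """Two-sample KS statistic D and an asymptotic (approximate) p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    d, i, j = 0.0, 0, 0
    while i < n and j < m:
        v = min(a[i], b[j])
        # Advance both empirical distribution functions past v.
        while i < n and a[i] <= v:
            i += 1
        while j < m and b[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    ne = n * m / (n + m)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.3:
        return d, 1.0  # the series is numerically 1 for such small values
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam) for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def same_law(z_i, m_s, alpha=0.05):
    """True if the null hypothesis of a common distribution is not rejected at level alpha."""
    return ks_two_sample(z_i, m_s)[1] >= alpha
```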
The effective duration of the regional sample depends on the effective duration of each local sample (for sites i = 1 to N, where N is the number of sites included in the region) and is closely related to the spatial dependence (characterized by a dependence function). It is estimated as in Weiss et al. (2014b). Details about the dependence function and the regional effective duration are given in Appendix A.
Finally, the local T-return level $x_T^{(i)}$ of the target site i is calculated with the following equation:
$$x_T^{(i)} = \mu_i \, z_T^{(i)}$$
where $\mu_i$ is the local index of site i and $z_T^{(i)}$ is the regional T-return level of the region centered on site i.
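The last step can be illustrated numerically. The sketch below uses the standard return level of a non-seasonal GPD fitted to exceedances of a threshold u occurring λ times a year, z_T = u + (σ/ξ)((λT)^ξ - 1), and then multiplies it by the local index. It deliberately ignores the seasonal mixture used in the study, and all parameter values are hypothetical.

```python
import math


def gpd_return_level(T, lam, sigma, xi, u=1.0):
    """T-year return level for GPD exceedances of u (scale sigma, shape xi), lam events a year."""
    if lam * T <= 1:
        raise ValueError("lam * T must be greater than 1")
    if abs(xi) < 1e-12:
        # Exponential limit of the GPD (shape parameter equal to zero).
        return u + sigma * math.log(lam * T)
    return u + sigma / xi * ((lam * T) ** xi - 1.0)


def local_return_level(local_index, regional_level):
    """Index flood rescaling: local level = local index x regional (normalized) level."""
    return local_index * regional_level


# Hypothetical regional fit: threshold 1, one storm a year, sigma = 0.1, xi = 0.
z_1000 = gpd_return_level(T=1000, lam=1.0, sigma=0.1, xi=0.0)
x_1000 = local_return_level(local_index=0.9, regional_level=z_1000)
```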

Skew storm surge data
The database used in this study is the same as that of Weiss (2014c), which enables a simple comparison between our results and those obtained by Weiss (2014c). The raw data are time series of hourly sea level observations collected at 67 ports along the Spanish, French and U.K. coasts (see Fig. 1). In Appendix B, we recall some elements about the database.

The physically and statistically homogeneous regions obtained by Weiss (2014c) are presented in Fig. 2.
Note that some sites, like Calais in northern France (inside the red circle in Fig. 2), are located very close to the border between two regions. Nevertheless, on both sides of the border the process generating storms is likely the same. In addition, region 1 is very large (for example, Boulogne in the extreme north of France is inside the same region as Saint-Jean-de-Luz in the extreme south of France, or as sites in the north of Spain), with possible differences between the processes generating storms in the north of France and in the north of Spain (which may cause traces of heterogeneity). Furthermore, in the study of Weiss (2014c), the region which includes La Rochelle and Brest has a heterogeneity measure equal to 1.1, and is therefore only considered as "possibly heterogeneous" (according to the criteria of Hosking and Wallis described in section 2.3).

Formation of homogeneous region centered on a target site
Quantiles $q_X$ and $q_Y$ should not be too small, otherwise the spatial extremogram will not be computed from extreme values.
In addition, quantiles $q_X$ and $q_Y$ should not be too large, otherwise the spatial extremogram will not be computed from enough values. A trade-off therefore has to be found.
Values of $q_X$ and $q_Y$ were tested in order to select first 4 and then 6 storms a year, and the resulting spatial extremograms led to similar conclusions. Therefore, the empirical quantiles $q_X$ and $q_Y$ are set in order to select (in the $X$ and $Y$ series) only 4 storms a year. This value allows for the computation of the empirical spatial extremogram from the biggest storms of each year.
Moreover, h, the lag time, has to be large enough to allow a storm which occurs at one site to propagate eventually to the other site. If we denote by ds the time between two skew storm surges (about ±12 h), then tests performed with h > 24 h show little interest compared with tests where h = 0 or h = ds.
h is finally set to two values: 0 and ds. Then, between two sites, two values of $\hat{\rho}$ are estimated and the greatest value is kept.
Lastly, we set $\rho_0$ to 0.3, which eliminates from the target site's region any site whose spatial extremogram value looks like residual noise. However, we will see that, in some rare special cases, one can consider including a site even if its spatial extremogram with the target site is slightly smaller than $\rho_0$.

Application for several target sites
We apply our methodology to three sites, for which we estimate the 1000-year return levels.

As previously noted, the particularity of the Calais site is that it is located close to a border of one of the regions found by Weiss (2014c) or Weiss et al. (2013). The application of the spatial extremogram (see Fig. 3) leads, for Calais, to the region illustrated in Fig. 4.
Figure 3 shows, on the vertical axis, the probabilities of extremal dependence. The x-axis represents the sites, sorted in ascending order of geographical distance to the target site (the sites closest to the target site are on the left). The sites with extremal dependence probabilities greater than the threshold of 0.3 (represented by a red line) are considered as potential neighbors of the target site (Calais) and are thus part of the region of Calais, considered to be physically homogeneous (see Fig. 4). In brackets, next to the name of each site i, is the number of years during which the tide gauge of site i and the tide gauge of the target site operated simultaneously; it is therefore the number of years over which the extremogram was calculated. For example, between Calais and Dunkerque it is 26 years. If this period is too short, the probability of extremal dependence may not be relevant, and one can ask whether it is appropriate to add the site to the region of the target site. This question must be answered case by case. The region built around Calais is shown in Fig. 4.
As shown in Fig. 4, the region around Calais is slightly smaller than the one obtained by Weiss (2014c) or Weiss et al. (2013), but more centered on Calais (which is no longer located at the border of a region). This region is considered as a physically homogeneous region centered on Calais. We will see that, in general, we find smaller areas than those found by Weiss (2014c) or Weiss et al. (2013). The advantage of a smaller region centered on a target site is that it is, most likely, more physically homogeneous.
Once the physically homogeneous regions have been identified, the statistical homogeneity must be verified. The Hosking and Wallis homogeneity test and the discordancy test described in section 2.3 are used. A heterogeneity measure H of -0.13 was obtained and no discordant sites were found. The region is therefore considered as statistically homogeneous.

The physically and statistically homogeneous region for Brest
In the case of Brest, we note that the region is larger than the one built for Calais (see Fig. 6), with many sites for which the dependence probability is at the limit of the dependence threshold. This is especially the case for sites in the Bristol Channel (UK coast). A study was conducted with and without these sites, which were finally kept in the region: their absence led to the selection of a model that did not fit the data very well. The extremogram between Brest and all other sites is shown in Fig. 5, and Fig. 6 represents the region around Brest. Figure 5 shows, on the vertical axis, the probabilities of extremal dependence; along the horizontal axis, sites are sorted in ascending order of geographical distance to the target site (the sites closest to the target site are on the left). As in the previous case of Calais, the sites located above the threshold of 0.3 (red line) are integrated into the region of Brest (considered to be physically homogeneous). In brackets, next to the name of each site i, is the number of years during which the tide gauge of site i and the tide gauge of the target site operated simultaneously (it is therefore the number of years over which the extremogram was calculated). As shown in Fig. 6, the region around Brest is smaller than the one which includes Brest in the Weiss (2014c) or Weiss et al. (2013) studies (see region 1, Fig. 2), but is nevertheless better centered on Brest.
By applying the discordancy and homogeneity tests, we find no discordant site and the heterogeneity measure H is 0.99; therefore, the region is also considered as statistically homogeneous.

The physically and statistically homogeneous region for La Rochelle
The La Rochelle site has been the subject of many studies after the Xynthia storm (e.g. Hamdi et al., 2015). In this study, the region centered on the La Rochelle site raised the question of whether or not to add the Saint-Servan and Saint-Malo sites to the region. Indeed, although Saint-Servan has an extremal dependence probability of 0.4, this value was calculated over a common period with the target site of only 2 years (which may not be very representative). In addition, Saint-Malo shares a common period of 14 years with La Rochelle but has an extremal dependence probability of 0.29, just below the threshold.
Since Saint-Malo and Saint-Servan are very close (less than 2 km apart), it seems logical to either add them both to the region or exclude them both from it. It was finally decided to integrate them into the region of La Rochelle because Saint Helier (abbreviated "JER" in Fig. 7), which is also very close to Saint-Malo and Saint-Servan, was selected in the region of La Rochelle (the extremal dependence probability between Saint Helier and La Rochelle is calculated over 14 years and is equal to 0.3).
However, these examples show that the extremogram is not a tool to be used blindly and that the choice of the threshold, which serves as an effective aid to form the region as consistently as possible, does not necessarily need to be inflexible.
Figure 7 represents the extremogram for La Rochelle and Fig. 8 represents the region centered on La Rochelle. As shown in Fig. 8, and as in the cases of Calais and Brest, the region around the La Rochelle site is also smaller than that obtained by Weiss et al. (2013), but nevertheless better centered on the La Rochelle site.
By applying the discordancy test, we find that the site of Eyrac is discordant (Dc = 3.65). This site is located in the center of the region, and this discordancy could be explained by the specific sea conditions in the Arcachon basin. When Eyrac is removed from the region of La Rochelle, we find no discordant site and the heterogeneity measure H is 0.53. The region (without Eyrac) is therefore also considered statistically homogeneous. The final region for La Rochelle is shown in Fig. 9.

Step to build the regional sample
To estimate the regional distribution we used, for each constructed region, the regional pooling method; however, extreme events that can impact several sites during a single storm (due to the presence of intersite dependence) must be considered only once. The storms constructed with the procedure described in section 2.2 are a relevant way to remove the intersite dependence. The distribution of the storm regional maximum, denoted Ms, is now, for each region, supposed to be identical to the regional distribution. Of course, we must verify the validity of this assumption. In order to evaluate the null hypothesis that Zi and Ms have the same distribution for each site i, a Kolmogorov-Smirnov test, as explained in section 2.4, can be performed. For the three regions built for Calais, Brest, and La Rochelle, we find that no p-values are smaller than the risk level of 5%; consequently, for each region, the regional distribution can be estimated from the regional sample Ms.

Regional effective duration and comparison with results from the study of Weiss (2014c)
The delineated homogeneous region around a target site is characterized (by the nature of the method) by a strong dependence between sites. This spatial dependence impacts, in particular, the regional effective duration (which is all the lower as this dependence is high). The effective duration is calculated for each region centered on a target site, as shown in Table 1. Remember that Calais is part of region 2 in the study of Weiss (2014c), and that Brest and La Rochelle are part of region 1 in the study of Weiss (2014c).
In Table 1, one will notice that our effective durations associated with the region of Calais are of the same order of magnitude as those from the region constructed by Weiss (2014c) in which Calais is included.
For the regions of Brest and La Rochelle, one will notice that our effective durations are lower than the ones from the region constructed by Weiss (2014c) in which the target site is included. However, in all cases, our durations are higher than those found by a local analysis.

Check stationarity
In order to fit a GPD law with a fixed threshold, we have to test the stationarity of our samples. To do so, for each region, we perform a Student's t-test to check the equality of the means of two subsamples of our sample. All tests are carried out with a risk level of 5%.
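A minimal sketch of such a stationarity check is given below. It assumes that the two subsamples are the first and second halves of the chronologically ordered sample (the text does not specify how they are chosen), computes the pooled-variance Student statistic, and approximates the two-sided p-value with the normal distribution, which is only reasonable for fairly large samples. The function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def student_t(a, b):
    """Pooled-variance Student statistic for the equality of the means of a and b."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))


def looks_stationary(sample, alpha=0.05):
    """Compare the means of the two halves of the sample; returns (decision, t, p)."""
    half = len(sample) // 2
    t = student_t(sample[:half], sample[half:])
    # Normal approximation of the Student distribution (assumption: large samples).
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return p >= alpha, t, p
```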

Regional fitting for the region focused on Calais, Brest, and La Rochelle
A Generalized Pareto Distribution (GPD) is fitted to the regional sample, taking into account the four seasons. In Appendix C, details are given about the laws which are used. Eight models are possible, and we must now select, from these models, the one that best fits our observations. The most commonly used criterion in the literature is the AIC (Akaike Information Criterion), based on the maximum likelihood estimate. This criterion is used in this study. The Expsin model is selected for Calais and the Gpdcos sin model is selected for Brest and La Rochelle (see Appendix C for the definition of those models). These are the same models as those chosen respectively for regions 2 (which includes Calais) and 1 (which includes Brest and La Rochelle) in the study of Weiss (2014). Figure 10 shows all the fittings which are performed. As we can see in Fig. 10, the fit looks good: most of the points are inside the confidence intervals, which are not very wide.
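The AIC-based selection can be illustrated in a few lines: for each candidate model, AIC = 2k - 2 ln L, where k is the number of fitted parameters and ln L the maximized log-likelihood, and the model with the smallest AIC is retained. The model names and numerical values below are purely hypothetical and do not correspond to the eight models of Appendix C.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion of a fitted model."""
    return 2 * n_params - 2 * log_likelihood


def select_model(fits):
    """fits: {model name: (maximized log-likelihood, number of parameters)}; lowest AIC wins."""
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits of three candidate models to the same regional sample.
fits_example = {
    "model_a": (-100.0, 2),  # AIC = 204
    "model_b": (-99.5, 3),   # AIC = 205
    "model_c": (-97.0, 3),   # AIC = 200
}
best_example = select_model(fits_example)
```

The comparison of `model_a` and `model_b` shows the penalty at work: the extra parameter of `model_b` improves the likelihood too little to compensate for its cost.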

Those elements are also relevant to accept the credibility of the law used for the fitting.

Return levels and comparison with return levels obtained from the results of the Weiss (2014c) study procedure
The last step in our regional analysis is to calculate the local quantiles by rescaling the regional distribution, i.e. multiplying it by the local indices. Moreover, by following exactly the same procedure as Weiss (2014c) to find the quantiles and the 70% confidence intervals at the 3 sites of interest, we can finally compare results. We obtain the return levels shown in Table 2. It is worth noting that, compared to the procedure followed by Weiss (2014c), the confidence intervals herein are not always larger. In addition, in all cases the two studies find confidence intervals of almost the same magnitude.

Conclusion
The aim of our study is to produce extreme value statistics on skew storm surges and to reduce the uncertainties found in a local analysis by using RFA. An important step of the RFA is to form a physically homogeneous region. The method used to shape those physically homogeneous regions is based on the spatial extremogram. It helps to build the region of influence of a target site, and we made the assumption that a high enough probability of dependence allows one to consider the region as physically homogeneous.
In most cases, the approach includes in the same physically homogeneous region any site near the target site, so we can expect that, most of the time, a target site is not located at the border between two regions. In the introductory section, it was stated that the method proposed by Weiss (2014c) to form the region of interest leads to the problem of the so-called "border effect". In our study, the question "What would be the value of a quantile for a particular site located on the border of a region if this site belonged to the neighboring region?" should not arise often. This is one of the interesting features of the method we have used. Moreover, unlike the method used by Weiss (2014c), it does not seem to allow two very distant sites to be in the same region. This is reassuring because it reduces doubts regarding possible traces of physical heterogeneity that could be generated by relatively distant sites.
In addition, the regions seem more consistent geographically than those obtained by Weiss (2014c) because this method groups sites within a region where the climates are more similar (simply because the sites are closer). This leads to thousand-year quantiles and associated confidence intervals of the same magnitude as those found with the procedure of the Weiss (2014c) study. This observation consolidates the two approaches.
Finally, compared with the study of Weiss (2014c), the method used herein likely increases the physical homogeneity of the formed regions and decreases, in general though not in all cases, the effective duration of observation.
Nevertheless, cases like the site of Calais (which in this study is no longer close to a border and where the effective duration calculated is roughly the same as the one found by Weiss, 2014c) seem to be particularly interesting. A study on Dunkerque, not shown here, leads to the same conclusion as the study carried out for Calais, and other sites as interesting as these two can probably be found.
Furthermore, physical homogeneity likely has an impact on statistical homogeneity: if we apply the criterion of Hosking and Wallis (1997), all the regions that we have built are found to be statistically homogeneous. This contrasts with the regions built by Weiss (2014c), where regions 1 and 5 are only considered as possibly statistically homogeneous.
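For readers unfamiliar with the Hosking and Wallis (1997) criterion, the sketch below shows the usual form of the heterogeneity measure H: the weighted dispersion V of the at-site sample L-CVs is compared with the mean and standard deviation of V obtained from simulated homogeneous regions. The simulation step itself (in Hosking and Wallis, from a fitted kappa distribution) is not implemented here; `v_simulated` is assumed to be supplied by the user, and all function names are hypothetical. The "possibly" category mentioned above presumably corresponds to 1 ≤ H < 2.

```python
import numpy as np


def sample_lcv(x):
    """Sample L-CV (l2 / l1) from probability-weighted moments."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    b0 = x.mean()
    b1 = np.sum(np.arange(n) / (n - 1) * x) / n
    return (2.0 * b1 - b0) / b0


def dispersion_v(samples):
    """Record-length-weighted standard deviation of the at-site L-CVs."""
    n = np.array([len(s) for s in samples], dtype=float)
    t = np.array([sample_lcv(s) for s in samples])
    t_regional = np.sum(n * t) / n.sum()
    return float(np.sqrt(np.sum(n * (t - t_regional) ** 2) / n.sum()))


def heterogeneity_h(v_observed, v_simulated):
    """H = (V_obs - mean(V_sim)) / std(V_sim); simulations supplied by caller."""
    v_simulated = np.asarray(v_simulated, dtype=float)
    return float((v_observed - v_simulated.mean()) / v_simulated.std(ddof=1))


def classify(h):
    """Usual Hosking and Wallis thresholds on H."""
    if h < 1.0:
        return "acceptably homogeneous"
    if h < 2.0:
        return "possibly heterogeneous"
    return "definitely heterogeneous"
```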
However, the method has a limitation in the formation of the regions. When the tide gauges operated at different times, the common period between two sites used to calculate the extremogram may be short. Thus, the extremogram may sometimes be calculated on a very short series (or even without any data), which may not be relevant. We must therefore consider this detail during the formation of the regions and possibly add or remove a site for which the extremogram does not give us enough relevant information. This disadvantage was particularly highlighted in a study (not shown here) that aimed to form a region centered on Dieppe, where the extremogram did not allow us to construct a region roughly centered on Dieppe. This is why it could also be interesting to study the confidence intervals associated with the extremograms, in order to make our estimates of the extremal dependence between sites more reliable.
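The limitation described above can be made concrete with a minimal sketch of a pairwise empirical estimate of extremal dependence, namely the proportion of exceedances at the neighboring site among the exceedances at the target site, computed on the common observation period only. This is one common empirical estimator and is not necessarily the exact one used in this study. The two series are assumed to be aligned on the same time index with `NaN` marking missing observations; the quantile level `prob` and the guard `min_overlap` are hypothetical parameters.

```python
import numpy as np


def pairwise_extremogram(target, neighbour, prob=0.95, min_overlap=100):
    """Empirical P(neighbour > u_y | target > u_x) on the common period.

    Returns (estimate, n_common). The estimate is NaN when the common
    period is shorter than min_overlap, so that short overlaps are
    flagged rather than silently used.
    """
    target = np.asarray(target, dtype=float)
    neighbour = np.asarray(neighbour, dtype=float)

    # Keep only the time steps observed at both sites.
    common = ~np.isnan(target) & ~np.isnan(neighbour)
    n_common = int(common.sum())
    if n_common < min_overlap:
        return float("nan"), n_common

    x, y = target[common], neighbour[common]
    ux, uy = np.quantile(x, prob), np.quantile(y, prob)

    exceed_x = x > ux
    if not exceed_x.any():
        return float("nan"), n_common
    return float(np.mean(y[exceed_x] > uy)), n_common
```

Returning the size of the common period alongside the estimate makes it easy to decide, site by site, whether the value is informative enough to be used when forming the region.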
Finally, one possible future improvement would be to consider physical data complementary to the extremogram (such as atmospheric pressure or wind speed and direction), which would likely help in the formation of homogeneous regions.

Fig. 10. Top left, the regional fitting for the region of Calais. Top right, the regional fitting for the region of La Rochelle. On the bottom, the regional fitting for the region of Brest. The red crosses indicate the skew storm surges from the target site, so the black crosses show the contribution of the regional approach to the data. The confidence intervals at 70% and 95% are also

Nat. Hazards Earth Syst. Sci. Discuss., doi:10.5194/nhess-2016-378, 2017. Manuscript under review for journal Nat. Hazards Earth Syst. Sci. Discussion started: 27 February 2017. © Author(s) 2017. CC-BY 3.0 License.

Fig. 2. Five physically and statistically homogenous regions within the results of Weiss (2014c) are represented by dots in five colors.Inside the red circle is the site of Calais, very close to a border.

Fig. 8. Physically homogenous region for La Rochelle found in our study, represented by red dots (set number 1). La Rochelle is located inside the black circle. Set number 2 (blue dots) includes all the sites which are not a part of the physically