How to explain variations in sea cliff erosion rates? Insights from a literature synthesis

Rocky coast erosion (i.e. cliff retreat) is caused by a complex interaction of various forcings that could be marine, subaerial or due to rock mass property. It turns into variable erosion rates (over 4 orders of magnitude at least, from 1 mm.yr−1 to 10 m.yr−1). While numerous local studies exist and explain erosion processes on specific sites, there is a lack at global scale. In order to quantify and rank the various parameters influencing erosion rates, we compiled existing local studies in a global 5 database called GlobR2C2 (for Global Recession Rates of Coastal Cliffs). This database records erosion rates, cliff setting and measurement specifications; it is filled from peer reviewed articles and national databases. In order to be homogeneous, marine and climatic forcings were recorded from global models and reanalysis. Up to now, GlobR2C2 contains 58 publications which represents 1530 cliffs studied and more than 1680 erosion rate estimates. A statistical analysis was conducted on this database to explore links between erosion rate and forcings at global scale. Rock resistance, inferred through Hoek and Brown (1997) 10 criterion, is the strongest signal explaining variation in erosion rate. Median erosion rates are of 2.9 cm.yr−1 for hard rocks, 10 cm.yr−1 for medium rocks and 23cm.yr−1 for weak rocks. Concerning climate, only the number of frost days (number of day per year below 0°C) for weak rocks shows a significant, positive, trend with erosion rate. Every other relations with both climatic and marine forcings are very spread and non-significant. Copyright statement. TEXT 15

or because authors did not think relevant to mention it. Moreover, parameters describing rock state are either complex or technically expensive to describe and quantify or outside the authors' scientific field of expertise. They were characterise with a Boolean value to be integrated in the database. True refers to the presence of fracturing/weathering mentioned in the paper, false otherwise.

Cliff location 5
Cliff location is entered as a geographic information. Studied cliff site extent was digitized from to publication information and mapped using Google Earth . A primary key links this geographic file to the database.

Measure description
The measure entity contains the erosion rate values and measurement methodology (how erosion was measured, for how long etc.). Erosion is most of the time provided as an erosion rates in meters per year, occasionally as finite retreat (in meters), 10 minimum and maximum erosion rate or eroded volume (in cubic meters).
Cliff retreat measurement errors and time spans were also recorded. Indeed, measuring sea cliff erosion presents a wide range of techniques. Those techniques vary largely in terms of: (i) accuracy, from field observation and "expert" estimates (e.g., May, 1980) of volume loss to lidar (e.g., Dewez et al., 2013) ; (ii) time period surveyed, from twenty minutes (Rosser, 2016) to thousands years (Choi et al., 2012;Regard et al., 2013); and (iii) Spatial extent, from tens of meters (e.g., Letortu 15 et al., 2015) to kilometres (Gibb, 1978;Hapke et al., 2009) Moreover, these measurements can be divided into three classes of methods: 1D, 2D or 3D.
1D cliff retreat measurement techniques correspond to retreats calculated on single transects. Typically, they correspond to measure done with peg transects recording the cliff toe retreat or transects on aerial photographs to quantify cliff-top retreat (Kostrzewski et al., 2015;Lee, 2008;Pye and Blott, 2015). 2D measurements are mostly based on aerial photograph compari-20 son. They either quantify the area lost between two aerial photographs campaigns or average numerous transects (Costa et al., 2004;Letortu, 2013;Marques, 2006). 3D techniques record the evolution of the cliff face and quantify volumes (e.g., Letortu et al., 2015;Lim et al., 2005;Rosser et al., 2007). The oldest method is rockfall inventory (rockfall volume estimation based upon size of debris or scar, (e.g., May, 1971;Orviku et al., 2013;Teixeira, 2006) but now the two most used methods are lidar and SfM. . This rich systematic dataset was obviously included in GlobR2C2 but raise two caveats. On the one hand, the CEREMA study introduces a strong spatial bias for French oceanographic and climatic conditions in the database observation records.

30
This situation may risk to polarize analytical results but was recognized beforehand and specifically treated to prevent such 5 Earth Surf. Dynam. Discuss., https://doi.org/10.5194/esurf-2018-12 Manuscript under review for journal Earth Surf. Dynam. bias. On the other hand, being a systematic study for every stretch of coastal cliff around the country, it makes it more robust to scientific and funding biases. Research funds are often sought for areas combining coastal threats with societal interest.
Coasts with higher recession rates are therefore more often sampled, while quiet stretches of coastlines remain in the shadow.
Including this data therefore provides a more representative set of values existing along coastlines. Among little studied sectors it represents this CEREMA study contains hard rock coastal stretches (e.g. hard Proterozoic granites from French Brittany) 5 and erosion rates lower than the study's detection threshold.
Based on historical aerial photograph archives, CEREMA acknowledges that photographs quality limits the detectable cliff recession to rates higher than 10 cm/yr. Below this value, they deem recession rates as undetermined. We chose to record those undetermined values in the database but not to use them in the statistical analysis . We discuss this choice in the following.

10
The tidal range describes the water column height at cliff foot and produce wetting and drying cycles (Kanyaya and Trenhaile, 2005). Rather than referring to difficult use tidal records from tide gages, tidal modelling was performed with FES 2012 software (Carrère et al., 2012). This model which gives all the constituents of the harmonic tide analysis. 8 harmonics were considered: M2, N2, K2, S2, P1, K1, O1. The model produces time series of sea water height within a regular grid of 0.25 degree between two dates. The tide is computed for each study location for two entire years, of which is extracted the mean 15 amplitude over N cycles (i.e. height difference between successive high and low tides).

Waves
Wave properties were extracted from ERA-interim reanalysis dataset (Dee et al., 2011). This gridded data has a pixel size of 0.75 degree. Temporally, data spacing is 6 hours during the 1979-2016 period. Wave assault was characterised both in terms of mean agitation and extreme events. Three mean parameters characterise wave assailing force: significant wave height of 20 combined swell and wind, wave period and wave direction. For swell characteristics, mean significant wave height and wave period characterise the average sea agitation. The wave direction value records the most frequent wave direction for the duration of the reanalysis period .
Anticipating that mean sea state values may be deceptive metrics, a record of extreme events was also described. Those events were characterised by the 95 % percentile of wave significant height. To complete this quantile value, the number of 25 storms experienced at each cliff site was calculated between 1979 and 2016 according to (Castelle et al., 2015).

Climate
Climatic information was extracted from Climate Research Unit data between 1961and 1990Mitchell and Jones (2005. The grid size is 0.5 degree, at monthly time step. Chosen parameters likely to influence erosion rate are mean annual rainfalls, mean monthly temperatures and number of freezing days (number of days per year bellow 0°C).

Cliff height
Cliff height appeared to be often missing. Filling this value is not straightforward because cliff height can be strongly variable along the surveyed cliff. Nevertheless, in order to provide a robust estimate, a mean cliff height was extracted from the 8" global DEM (GMTED2010, Danielson and Gesch, 2011). Cliff height extraction consisted in computing a buffer around the cliff extension shapefile, in which the mean value of non-zero pixels (corresponding to the sea) is computed. To assess the 5 accuracy of these cliff height estimates, they were compared against those rare values presented in publications. Computation is close to value given in publication with a root mean square error of 19 m at global scale. It is quite good for a first attempt at the global scale, probably not so far from the cliff height accuracy in the publications.

Tidying the covariates: from database fields to predictors
The first purpose of the database is to collate raw data from original sources in the most traceable manner. This data does not 10 necessarily report information in an easily accessible fashion. This may be because: (i) fields translate different realities (e.g. recession rates vs retreat values or recession rates relate to profile-specific recession rate or to kilometre long cliff sections), (ii) value instances of a field is too broad and needs summarizing in fewer categories (e.g. lithology). Thus, post processing was applied to the database in order to make it more homogeneous and more readily usable for statistical analysis. 15 We mentioned earlier that measurement techniques were either 1D, 2D or 3D. These methods do not reflect exactly the same processes and a choice was made to force all measurements to report 2D type measurements homogeneously. This 3D measures in m 3 .yr −1 were divided by cliff face surface in a cliff top equivalent retreat in m.yr −1 . 1D measurements do not average information spatially. Cliff retreat is a stochastic process in time and space and 1D measurements profiles may happen to quantify erosion on a particular high or low erosion transect. Erosion rates of the transect measures were therefore averaged 20 for a unique study, cliff and period of time in order to limit the risk of over/under-representation.

Field unit conversion
Original data may be provided in different ways (for example the time span between 2 measurements may be given by a duration or start and end dates). As often as possible this information is summarized in a single duration field with homogeneous unit. -The mean cliff height is either obtained from a cliff height mean field or as the mean between height min, height max  -The error (m/yr) is a compilation of error value and error type.

Average site climate
Some explanatory variables were strongly correlated with each other (e.g. wave period vs wave significant height). This redundant information may lead to spurious correlation. New synthetic variables combine existing variables.
-Monthly mean temperature were converted into mean annual temperature and amplitude.

5
-Deep water swell energy flux was computed using swell period and significant height Where ρ is water density; Hs is significant wave height; Cg is wave group velocity; and T is wave period.
-Swell direction to the cliff.

Rock resistance inference
Macroscopic rock mass strength classes, though possibly crude, exhibits the ordered behaviour expected by literature: weak rock erode faster than medium strength rock, and medium strength rocks erode faster than hard rocks. Central erosion rate values increase by a factor 2 to 3 from one class to the next.
These values are in agreement with Woodroffe's work (2002), but, even if those distributions are distinct, they are broadly spread and multimodal.

Erosion vs climatic forcings
−1 ) in the database. Only a few studies concern harder rocks under cold 20 climate. However, even if a trend exists, data are really spread and Spearman's rank correlation coefficient is low (0.25 for frost, 0.07 for rainfalls) . Mean annual temperature do not show any clear correlation with erosion rate.

Comparison to previous studies
The GlobR2C2 database provides a quantitative overview of current rocky coast erosion knowledge. This database is the first 25 but narrow down assumed erosion rate ranges both towards lower and higher rates. We also observe that supposed hard rocks 30 10 In order to explore the influence of sea aggression, several variables were implemented in the database describing mean sea agitation and tidal range, and sea agitation during extreme events. All the variables concerning swell are strongly correlated.
Hence, only three independent marine parameters are analysed in the following scatterplots ( Fig.7): tidal range, wave energy Concerning climatic forcings, recession rates are compared to temperature variation, frost frequency and amount of rainfalls.
As for marine forcings, data is very scattered (Fig.8 as granites or basalts can erode as quickly as 1 m.yr −1 . This is because resistance to erosion does not depend on lithological category alone, but also on the degree of weathering, jointing, folding etc. This graph, presented at a conference with sedimentologists triggered deep-hearted reactions for the lack of rock classification robustness in their community. This result confirms the choice for a less debatable rock resistance criterion instead of lithology. This geotechnical criterion is not perfect either.
It was inferred based upon authors' description of the cliff, thus it can include a part of interpretation and some degree of 5 uncertainty.

What knowledge does GlobR2C2 compile?
The GlobR2C2 database is based on bibliographic references plus models and reanalysis used as proxies of forcings. Some bias are inherent to this kind of approach. The next paragraphs focus on different aspects of these limitations due to the use of (i): erosion rate as a proxy of erosion, (ii) the use of model and reanalysis as proxy of forcing, (iii) the use of peer-reviewed 10 journals.

Erosion rates, study duration and stochastic behaviour
Statistical exploratory data analysis (known as EDA) is a way to dissolve local particularity into a global analysis. Nonetheless, including every quantitative study implies mixing rates measured with different methods, accuracy, spatial and temporal extents, which could be a source of bias. Erosion is a stochastic event: the fortuitous occurrence of a rare big event would 15 influence the actual figure of the observed retreat rate. Rohmer and Dewez (2013) for instance, describe statistical indicators for testing the outlier nature of very large rock falls, with methods borrowed to hydrology, seismology and financial statistics.
These indicators were applied to a chalk cliff site in Normandy (northern France) in Dewez et al. (2013). During the 2.5 years terrestrial lidar monitoring period, a massive 70'000 m 3 rock fall caused a local cliff top retreat of more than 19 m (Dewez et al., 2013). That is more than one hundred years' worth of average retreat in one event. Estimated annual cliff recession rate 20 rose from 0.13 m.yr −1 to 0.94 m.yr −1 , a seven-fold increase, just by including this fortuitous, and definitely unrepresentative event (Dewez et al., 2013).

Forcing proxies
While publication-derived cliff recession rates and cliff conditions could be forced into a coherent database framework, envi-30 ronmental forcings were so scarcely and heterogeneously documented that the same rationalization process was not possible on

11
GlobR2C2 therefore addresses the concern of non-representative erosion values by compiling all studies available online, and retaining information from all sites and survey periods. In doing so, the actual dispersion of recession rate values is preserved and allows for recognizing outlying values (Fig.10). the publication basis alone. Instead, publicly available global climatic and sea conditions database were used. These databases present the advantage of being spatially and temporally continuous thanks to reanalysed climate and sea state models. Their principal limitation is their coarse-grained definition compared to site specificities. Nevertheless, they document external forcings (i) in a uniform fashion (regular spatial and temporal sampling steps), (ii) for the entire globe, and (iii) reflect forcing condition for durations spanning several decades. So, even if regional or continental data sets offer more resolved information 5 in space or time, the global extent ensures that all cliff sites worldwide are documented uniformly.

Literature biases as future tracks to improve cliff evolution understanding
GlobR2C2's worldwide compilation shows that research in this domain is very active. A lot of quantitative data already exist.
However, even if data coverage is somewhat global, publications turned out to focus mostly on a few western countries. This finding is reflects the strategy of literature search adopted: only international and national literature published in English,

10
French or Spanish were compiled. Due to the language barrier, we are aware that studies in Russian, German or Japanese languages, among others, were unwillingly obliterated.
Spatially, our search strategy did not flag scientific literature on the evolution of African and South American cliffs. Cliff recession studies are apparently focused on the richest areas where economically valuable coastal assets are exposed to losses.
This geographic distribution induces an over representation of temperate climates and a limited presence of some extreme 15 climates or wave condition like equatorial or polar regions. Those extrema could nevertheless be a key for understanding effects of climate and wave conditions on cliff erosion.
Studies also focus on fast eroding coasts because they represents bigger risks and also because of methodological limitation.
This fact biased the analysis by mostly documenting erosion distribution in higher values. The weight of this bias can be approached thanks to the French CEREMA study. This study contains null erosion values for coastal sectors where the cliff 20 was not seen to recess in a detectable manner on historical photographs. Yet this detection threshold is deemed to be of the order of 10 cm.yr −1 (Perherin et al., 2012), which is rather high, and null recession could reflect erosion situations anywhere in the spectrum between 0 and 10 cm.yr −1 .
These null values represent 67% of the study rocky coasts, which means that low erosion rocky coasts are common and ignoring this information can probably affect conclusions. In order to check the importance of the bias induced by those 25 values, we explored two extreme cases. The erosion value was set to either a small value of 1 mm.yr −1 or to the detection threshold of 10 cm.yr −1 positive correlation still exists between frost day frequency and a maximum tidal efficiency for tidal range between 1 and 3 m still is observed.

12
. Together with cliff settings compiled from publications, GlobR2C2 also records continental climate and marine conditions 15 at study sites from reanalysed models for their global, spatial and temporal sampling regularity. Both forcings exhibit weak relations with cliff recession rates. In relative terms, however, climate (i.e. frost days frequency) exhibits a stronger influence than marine forcing. Influence of the sea is only slightly visible in this dataset through a maximum efficiency of erosion for tidal ranges between 1 and 3 meters.
Our data divides into three classes of resistance, following the Hoek and Brown parameter. The most resisting (respectively 20 least resisting) rocks are found to lead to retreat rates less than 0.1 m.yr −1 (Q83) (respectively up to 85 cm). Medium-resistance rocks are not studied enough to give a precise range of retreat rates. Climate seems to be more efficient and frost seems to have the strongest influence.
Code availability. TEXT

Data availability. TEXT
Code and data availability. TEXT

13
We conclude at this stage that rocky coast erosion is primarily driven by cliff settings with second-order but non-negligible 25 modulations by marine and continental forcings (Fig.2). These findings are of primary interest for coastal erosion models ). Using solely a lithology denomination in the way of Woodroffe (2002) historical graph (Fig.9), lithologic types exhibit a similarly ordered behaviour (Fig.6), even if geologists contest the robustness of these denominations as proxies for rock strength.