Full 4D Change Analysis of Topographic Point Cloud Time Series using Kalman Filtering

Abstract. 4D topographic point cloud data contain information on surface change processes and their spatial and temporal characteristics, such as the duration, location, and extent of mass movements, e.g., rockfalls or debris flows. To automatically extract and analyse change and activity patterns from these data, methods considering the spatial and temporal properties are required. The commonly used M3C2 point cloud distance reduces uncertainty through spatial averaging for bitemporal analysis. To extend this concept into the full 4D domain, we use a Kalman filter for point cloud change analysis. The filter incorporates M3C2 distances together with uncertainties obtained through error propagation as Bayesian priors in a dynamic model. The Kalman filter yields a smoothed estimate of the change time series for each spatial location, again associated with an uncertainty. Through the temporal smoothing, the Kalman filter uncertainty is, in general, lower than the individual bitemporal uncertainties, which therefore allows detection of more change as significant. In our example time series of bi-hourly terrestrial laser scanning point clouds covering around 6 days (71 epochs) of a rockfall-affected high-mountain slope in Tyrol, Austria, we are able to almost double the number of points where change is deemed significant (from 14.9% to 28.6% of the area of interest). Since the Kalman filter allows interpolation and, under certain constraints, also extrapolation of the time series, the estimated change values can be temporally resampled. This can be critical for subsequent analyses that are unable to deal with missing data, as may be caused by, e.g., foggy or rainy weather conditions. We demonstrate two different clustering approaches, transforming the 4D data into 2D map visualisations that can be easily interpreted by analysts.
By comparison to two state-of-the-art 4D point cloud change methods, we highlight the main advantage of our method: the extraction of a smoothed best-estimate time series of change at each location. A main disadvantage remains that spatially overlapping change objects cannot be detected in a single pass. In conclusion, the combined consideration of temporal and spatial data enables a notable reduction in the uncertainty associated with the quantified change value for each point in space and time, in turn allowing the extraction of more information from the 4D point cloud dataset.


Introduction
Near-continuous time series of 3D topographic point clouds have recently become readily available through applications in research (Eitel et al., 2016), industry (Industry 4.0, e.g., Pasinetti et al., 2018), and the public sector (e.g., disaster management, Biasion et al., 2005). Commonly, terrestrial laser scanners are installed on surveying pillars to regularly (e.g. hourly) acquire three-dimensional representations of the surrounding topography. To interpret the data for geographic monitoring, especially in terms of topographic change processes acting on the surface, information needs to be extracted in the form of movement patterns (Travelletti et al., 2014), objects (Anders et al., 2020) or clusters (Kuschnerus et al., 2021). This information can then be used by experts to analyse change patterns and magnitudes with respect to their underlying causes, to predict future events, and to assess immediate dangers.
However, as with any measurement taken in the real world, uncertainties need to be considered. In the case of topographic laser scanning, uncertainty may result in estimated change values that seemingly correspond to change in the topography of the involved surfaces, though no real change has occurred. For example, a rockfall with a low velocity is only detectable after a certain period, when the change magnitude is larger than the random effects introduced by the measurement.
Two approaches can be combined to alleviate uncertainty: Statistical tests, such as a t-test, allow making statements about uncertain values by transforming them to thresholds or interval values using a confidence probability. For example, a change value of 0.01 m may have a 95% probability of being significantly different from zero. In the remaining 5% of cases, the value of 0.01 m would be caused by random errors and result in a false positive detection. The measurand (the quantity being measured) is seen not as a singular value but rather as a probability density function. An analysis of the cumulative distribution function (CDF) then gives the relation between the Type-I error probability α (or the specificity of a test, 1 − α) and the corresponding confidence interval. This moves the problem of change analysis or quantification to one of change detection.
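As a minimal illustration of this statistical-test view, the following sketch derives a 95% detection threshold from a hypothetical combined 1-sigma uncertainty of the change value, assuming Gaussian errors; the function names are illustrative and this is a simplification of the full M3C2 significance formula:

```python
def lod95(sigma_change):
    """Level of Detection at 95% confidence for a change value with
    combined 1-sigma uncertainty sigma_change, assuming Gaussian errors."""
    z = 1.96  # two-sided 95% quantile of the standard normal distribution
    return z * sigma_change

def is_significant(change, sigma_change):
    """True if the change magnitude exceeds its detection threshold."""
    return abs(change) > lod95(sigma_change)

# A 0.01 m change with a 0.004 m (1-sigma) uncertainty exceeds the
# ~0.0078 m threshold and is therefore deemed significant:
print(is_significant(0.01, 0.004))
```

Lowering the uncertainty, e.g. through averaging, directly lowers this threshold, which is the mechanism the remainder of the paper exploits.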
The other approach takes advantage of the fact that no two measurements are completely uncorrelated. Generally, the closer they are to each other, the more they are alike. In space, this has been described in Tobler's first law of geography (Tobler, 1970), and it logically extends into time. In the analysis of dense time series of 3D point clouds, this fact is used to reduce uncertainty in change detection. Consequently, lower thresholds for detectable change can be derived while keeping the same significance probability. Change can therefore be detected as statistically significant at lower change values, or generally at lower change rates. To achieve this reduction in uncertainty, some form of averaging or aggregation of multiple measurements of the same quantity into one value is required. This reduces the influence of random errors, but also reduces the information contained in the data through smoothing.
Spatial smoothing, i.e. aggregating points spatially before subjecting them to change analysis, reduces the spatial resolution at which change can be detected. In the widely employed multiscale model-to-model cloud comparison (M3C2) algorithm, a method to compare surfaces represented by two point clouds, a search cylinder is used to select and aggregate points of the two epochs before measuring the distance between them (Lague et al., 2013). This is beneficial over a simple cloud-to-cloud (nearest-neighbour) distance, because point clouds acquired with a laser scanner never sample the surface with the exact same pattern, and therefore no one-to-one correspondences can be established. Additionally, averaging the point positions reduces uncertainty in the position of the surface.
A simpler approach, the creation of digital elevation models of differences (DoDs), also includes spatial averaging by aggregating all points within a raster cell to a single value, but it is restricted to a single direction of analysis and cannot account for complex 3D topography.
In the time domain, measurement series are often interpreted as signals. Signal smoothing is widely used, and a multitude of methods have been established. In many approaches, a moving window is employed to aggregate multiple consecutive measurements or samples to remove or reduce outliers. Depending on the aggregation function, different filters result, which may be mathematically described as 1D convolutions (e.g., kernel-based smoothing; Kim and Cox, 1996). Alternatively, global methods such as the Fourier transform may be applied to eliminate high-frequency components of the signal, resulting in a low-pass filter (Kaiser and Reed, 1977). For point cloud analyses, a moving average filter has been successfully used to reduce daily patterns and random effects in time series (Kromer et al., 2015; Eltner et al., 2017; Anders et al., 2019).
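Such a moving-average filter can be written as a 1D convolution in a few lines; this is a generic NumPy sketch with illustrative data, not the exact filter used in the cited studies:

```python
import numpy as np

def moving_average(signal, window):
    """Smooth a 1D time series by convolving with a box kernel;
    other kernels (e.g. Gaussian) yield other classical smoothing filters."""
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="same")

# Toy example: ~6 days of bi-hourly epochs with a linear trend plus noise
t = np.linspace(0, 6, 72)
noisy = 0.05 * t + np.random.default_rng(0).normal(0.0, 0.01, t.size)
smooth = moving_average(noisy, window=5)
```

Note that `mode="same"` attenuates the first and last few samples, one of the edge effects that more principled temporal models, such as the Kalman filter used below, avoid.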
4D point cloud analyses have employed both spatial and temporal smoothing separately to increase the signal-to-noise ratio of the change signal. Kromer et al. (2015) go further and combine spatial and temporal neighbours in a median filter to remove noisy data. In this work, we present a complementary approach, in which we employ a Kalman filter to combine spatial and temporal smoothing. Kalman filters are mathematical descriptions of dynamic systems and are commonly used, e.g., in navigation (Cooper and Durrant-Whyte, 1994) or traffic congestion modelling (Sun et al., 2004). They allow the optimal consideration of uncertainties in observations over time. In our case, the observations are bitemporal point cloud distances. In a Bayesian sense, each observation provides prior information on the system. The Kalman filter combines this information in a joint probability distribution to obtain estimates for the target variables that are, in general, more accurate (less uncertain) than the original observations. When estimates of position, velocity and acceleration have been made, they can even be propagated into the future, beyond the newest measurement (Kalman, 1960).
We use the Kalman filter on change values between each epoch and a reference epoch to obtain a smoother, less uncertain time series of change for each spatial location. To obtain accurate uncertainty estimates for the change values, we apply M3C2-EP, a method that allows the propagation of measurement and alignment uncertainties in bitemporal point cloud analysis to the obtained change values (Winiwarter et al., 2021); other methods of uncertainty quantification are, however, also conceivable.

M3C2-EP contains an aggregation step derived from M3C2, where (spatial) neighbours are collected to create a local planar model of the surface. In combination, this leads to spatial and temporal smoothing, where the spatial smoothing step provides a Bayesian prior to the temporal smoothing step.
We further derive several features from the smoothed time series, in which noise has been reduced, and use them to form clusters of similar change. We also show how the smoothed time series can be used to improve the results obtained with established clustering methods, namely k-Means clustering, which has been applied to 4D point cloud data by Kuschnerus et al. (2021) to identify change patterns on a sandy beach.
To show the applicability of our method, we analyse a dense (bi-hourly) time series of terrestrial laser scanning (TLS) scans acquired in Vals, Tyrol (Schröder and Nowacki, 2021). After a rockfall in 2017, permanent TLS surveys were carried out to ensure the safety of workers repairing the road and moving debris. We showcase how our method allows the extraction of interpretable information from the large amount of data present in the time series of 71 epochs with about 20-23 million points each.
The contribution of our research is twofold: Firstly, we combine the existing method of M3C2-EP point cloud change quantification, including its quantification of associated uncertainty, with a Kalman filter to take advantage of the temporal domain, resulting in lower detection thresholds and less noise in the change extracted from the time series. Secondly, we show how the smoothed time series improves existing clustering techniques, present a complementary technique, and finally compare our results to two different state-of-the-art approaches.

Methods
In this section, we first present the dataset, which is subsequently used to explain the methods and later serves as an example dataset. We then highlight selected state-of-the-art methods in 4D point cloud analysis (Sect. 2.2), focusing on methods that can use more than two timestamps, i.e. more than bitemporal analysis. In Sect. 2.3, we show how measurement uncertainties can be propagated to bitemporal change values using M3C2-EP. The Kalman filter equations are presented in Sect. 2.4. We then extract different feature spaces from the time series and use them to cluster areas of similar change in a 4D topographic point cloud of a rockfall area.

Dataset: Vals rockfall

We are using TLS data of a rockfall-affected area in Vals, Tyrol, Austria (WGS84: 47°02'48" N 11°32'08" E) from August 2020. During this period, rock masses in the debris fan of the rockfall were relocated with excavators. These artificial, anthropogenic changes are clearly visible in the dataset, e.g. by simple differencing between the first and last epoch of the 5 days and 20 hours of data used in this research (Fig. 1a). Additionally, further up on the slope, erosion patterns are forming where precipitation and thunderstorms in the evenings cause the relocation of debris. These patterns give an idea of the activity at the study site.

The data was recorded using a RIEGL VZ-2000i laser scanner permanently installed on a survey pillar in a shelter on the opposite slope, about 800 m from the area affected by the rockfall. The rockfall depicted in the dataset originally occurred on 24 December 2017. A road located immediately beneath the rockfall slope was covered in 8 m of debris, and a total rock volume of 116,000 m³ was relocated (Hartl, 2019). Though no serious damage to buildings located in the area was reported, and a replacement road could be opened in 2019, the rockfall and the source area located above have since been continuously monitored to ensure that any movement or indicator of a new rockfall event would be detected.
The dataset presented here is part of this continuous monitoring setup, which was in operation from August to October 2020 and was designed to collect data for various research and development activities regarding the deployment of long-range laser scanners within a remotely controlled, web-based monitoring system from an engineering geodetic perspective. In addition to the laser scanner, a total station (LEICA TM30), inclination sensors on the pillars (PC-IN1-1° from POSITION CONTROL) and various meteorological sensors were used in the shelter and the area of the rockfall. These additional measurements are used to verify systematic error influences on the results; this is the subject of ongoing research. The methods presented in this paper are based on a part of the recorded data, from 2020-08-20 00:00 to 2020-08-25 20:00 (all times are local time), corresponding to a total of 71 individual scans. Every 2 hours, a high-resolution scan of the area was performed with a resolution of 15 mdeg in azimuth and elevation and a measurement rate of 50 kHz.
The first epoch (2020-08-20 00:00) was used as a reference epoch, also referred to as the null epoch. All subsequent epochs were aligned to this epoch by applying an ICP algorithm (Besl and McKay, 1992) implemented in OPALS (Glira et al., 2015) using stable surfaces adjacent to the rockfall area. The obtained transformations were applied to the dataset, and outliers and vegetation points were filtered using a statistical outlier filter (k=8, multiplier=10.0; Rusu et al., 2008) and the SMRF filter (cell size=0.5 m, slope=2; Pingel et al., 2013), as well as a filter on the waveform deviation (≤50), implemented in PDAL (PDAL Contributors, 2018). The parameter file is supplied with the code (see Code availability statement).
On these data, we quantified bitemporal change magnitudes and uncertainties using M3C2-EP (cf. Sect. 2.3). We used the same normal vectors for all epochs, calculated on the null epoch using a 5 m search radius. The M3C2 distancing was carried out on a subset of the null epoch ("core points") that was created by distance-based subsampling in CloudCompare, reducing

Point cloud time series analysis
The large amount of data acquired by permanent laser scanning setups makes it necessary to distil the information in the dataset.
Whereas the recorded near-continuous point cloud data is 4D, researchers, decision-makers and engineers require a human-readable presentation in 2D maps and 1D time series for selected locations. Therefore, the detection and selection of objects are crucial. Singular objects can then be located in space on a map and in time by plotting an aggregated change history of the concerned locations. For comparison with the state-of-the-art, we select two methods with fundamentally different approaches.

This allows us to discuss our method in the scope of the full spectrum of 4D change analysis.
Generally, unsupervised machine learning allows the clustering or segmentation of objects with similar properties. In the case of time series, these properties may be derived from the change history. For example, Anders et al. (2020) use the dynamic time warping distance to define a similarity metric between the change histories of two locations. They then use seeded region growing to find spatially connected components based on this similarity. A change object (a so-called 4D object-by-change), once found, can then be shown in a list, where it may be analysed in space and time. These objects may also overlap in space or in time, allowing multiple processes that impose change at a single location to be extracted separately.
Kuschnerus et al. (2021) use the relative elevation values at spatial locations as features for clustering algorithms. They compare the performance of k-Means, agglomerative clustering, and DBSCAN on a dataset of a sandy beach, where both natural and anthropogenic forces impact the surface morphology. In their research, they show that k-Means and agglomerative clustering perform similarly well, whereas DBSCAN suffers from the non-binary boundaries between neighbouring changes of similar properties. They also point out that their method is not able to cluster changes with similar properties if they occurred at different points in time. As they use the change values at the respective epochs as the feature space, all dimensions have the same scale and unit (i.e., meters), and locations where one or more measurements are missing (e.g., due to temporary occlusion) cannot be assigned to a cluster.
As any real-world measurement is subject to noise (Williams et al., 2018), derived change values will almost never be zero. A careful consideration of measurement and processing errors therefore allows a statistical test on the significance of a change value (Lague et al., 2013). For example, a change of 0.02 m may be significant for a relatively precise dataset, e.g. acquired with TLS.
The same change magnitude may not be distinguishable from the noise for a different, less precise dataset, e.g. one acquired by ALS. The so-called Level of Detection is a measure of how large a change value has to be in order to be attributed to actual change. As a probability measure and the result of a statistical test, the Level of Detection depends on a significance level, which is commonly set to 95% (Lague et al., 2013; James et al., 2017). We therefore speak of the LoDetection95%. In Winiwarter et al. (2021), we showed how the LoDetection95% can be derived by error propagation from knowledge of the sensor accuracies (e.g. from the sensor data sheet) and the alignment accuracy (using an ICP alignment; Besl and McKay, 1992). In addition, we weighted the individual measurements by their respective uncertainties to arrive at an unbiased estimate for the change value. We refer to this method as M3C2-EP, as it extends the M3C2 algorithm by error propagation.
The M3C2-EP point cloud distance measure allows transferring the uncertainty attributed to each of the original measurements, i.e., laser ranges and angular measurements, to an uncertainty in point cloud change for every individual core point. Thereby, the obtained M3C2-EP distance and its spatially heterogeneous uncertainty represent our knowledge of the point cloud change itself, not of the measurements. This property allows us to use the distance for further analyses, such as the one presented in the following section.

Kalman filtering

For a state vector containing the position, the velocity, and the acceleration of an object, the state transition matrix is given in Eq. 1:

F = \begin{pmatrix} 1 & \Delta t & \frac{\Delta t^2}{2} \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \end{pmatrix} \quad (1)

Here, the next position (at t_1 = t_0 + \Delta t) derives from the current position (at t_0), to which the velocity multiplied by the time step and the contribution of the acceleration are added. The diagonal entries of 1 ensure that the current position, velocity, and acceleration are transferred to the next point in time. The transition of the state vector \hat{x} and its covariance P from t_0 to t_1 is carried out using the prediction update equation (Eq. 2) in the prediction step:

\hat{x}_{t_1} = F \, \hat{x}_{t_0}, \qquad P_{t_1} = F \, P_{t_0} \, F^T + Q \quad (2)

A measurement may subsequently be introduced in the so-called correction step. For this, a linear approximation of the measurement function H is required. A measurement consisting only of the position of the object, or in our case, the change magnitude at a position, results in a matrix H as shown in Eq. 3:

H = \begin{pmatrix} 1 & 0 & 0 \end{pmatrix} \quad (3)

The velocity and the acceleration are not observed, so the second and third elements are zero. One could also imagine including physical measurements of velocity, e.g., using a Doppler radar system, or of acceleration, such as from an inertial measurement unit. In our application of terrestrial laser scanning repeated from a fixed position, we do not have such measurements.
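The prediction and correction steps referenced as Eqs. 1-3 can be sketched in NumPy as follows; the time step, noise values and observations are illustrative only, not the paper's actual configuration:

```python
import numpy as np

dt = 2 / 24  # bi-hourly epochs, expressed in days (illustrative)

# State transition matrix for a [change, velocity, acceleration] state (Eq. 1)
F = np.array([[1.0, dt, 0.5 * dt**2],
              [0.0, 1.0, dt],
              [0.0, 0.0, 1.0]])
# Only the change magnitude itself is observed (Eq. 3)
H = np.array([[1.0, 0.0, 0.0]])

def predict(x, P, Q):
    """Prediction step (Eq. 2): propagate state and grow the covariance."""
    return F @ x, F @ P @ F.T + Q

def correct(x, P, z, R):
    """Correction step: fuse one observed change value z with variance R."""
    S = H @ P @ H.T + R              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
    return x + K @ (z - H @ x), (np.eye(3) - K @ H) @ P

x, P = np.zeros((3, 1)), np.eye(3)   # uninformative initial state
Q = 1e-6 * np.eye(3)                 # toy process noise
for z in [0.10, 0.11, 0.12]:         # toy bitemporal change observations [m]
    x, P = predict(x, P, Q)
    x, P = correct(x, P, np.array([[z]]), np.array([[1e-4]]))
```

After a few observations, the filtered state tracks the observed change while its variance `P[0, 0]` drops well below the initial prior.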
The step size of the update ∆t is not necessarily equal to the measurement interval, leading to prediction steps without correction steps. This allows the estimation of the state for points in time where no measurement was recorded, based on previous measurements only. State estimation is not limited to interpolation but includes extrapolation into the future.
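Bridging missing epochs then amounts to running prediction steps without corrections, as in this sketch (values illustrative; the transition matrix follows Eq. 1, and the positional variance grows with every uncorrected step):

```python
import numpy as np

dt = 2 / 24  # bi-hourly epochs, in days (illustrative)
F = np.array([[1.0, dt, 0.5 * dt**2],
              [0.0, 1.0, dt],
              [0.0, 0.0, 1.0]])
Q = 1e-6 * np.eye(3)  # toy process noise

def bridge_gap(x, P, n_steps):
    """Run n prediction steps without corrections, e.g. to bridge epochs
    lost to fog or rain. Returns the propagated state, covariance, and
    the positional variance P[0, 0] after each step."""
    variances = []
    for _ in range(n_steps):
        x = F @ x
        P = F @ P @ F.T + Q
        variances.append(P[0, 0])
    return x, P, variances

# Toy state: 5 cm of change, creeping at 1 cm/day, no acceleration
x0 = np.array([[0.05], [0.01], [0.0]])
x1, P1, variances = bridge_gap(x0, 1e-4 * np.eye(3), n_steps=5)
```

The position is extrapolated linearly from the estimated velocity, while the growing variance honestly reflects that no new observation has constrained the state.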
As in adjustment computation, every measurement in the Kalman filter is attributed with an uncertainty, here represented by the measurement noise matrix R. In our application, we use the uncertainty in point cloud distance obtained by M3C2-EP for each epoch's change value.
Finally, the process noise matrix Q represents how much uncertainty is introduced in each prediction step, and therefore depends on the time step ∆t. By transitioning from t to t + ∆t, the state vector becomes more uncertain, unless new measurements are introduced. Q is representative of the system's ability to change outside of the filter constraints. In our example, we assume a system with constant acceleration; in reality, however, this is not the case, as, e.g., in a rockfall setting, friction coefficients between topsoil and stable subsurface will change with temperature, moisture and other parameters. We therefore allow for an adaptation of the acceleration over time.
A common approach to model the process noise is discrete white noise. Here, we define the variance of the highest-order element (the acceleration) as σ² and calculate the effect of this variance on the other elements of the system's state (i.e., velocity and position) according to Labbe (2014, Chapter 7). In consequence, the state of the system becomes less certain over longer time spans and can be made more certain by introducing a new measurement with adequate uncertainty. For example, in the case of permanent TLS, change can be estimated one day into the future after having acquired one week of hourly measurements. This allows estimating whether a larger interval between the measurements would still fully represent the expected changes.
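Under these assumptions, the discrete-white-noise process matrix can be computed as follows (a sketch following Labbe, 2014, Ch. 7; the variance value is illustrative):

```python
import numpy as np

def q_discrete_white_noise(dt, var):
    """Discrete white noise acting on the highest-order state element
    (the acceleration): Q = G * var * G^T, where G = [dt^2/2, dt, 1]^T
    maps acceleration noise onto position, velocity and acceleration."""
    g = np.array([[0.5 * dt**2], [dt], [1.0]])
    return var * (g @ g.T)

# Illustrative: bi-hourly step in days, variance sigma^2 in [m^2/day^4]
Q = q_discrete_white_noise(dt=2 / 24, var=0.01)
```

The resulting matrix is symmetric and rank-one; its bottom-right entry is σ² itself, while the positional term scales with ∆t⁴, so short prediction steps inject very little positional uncertainty.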
The exact choice of this process noise model, especially the choice of the value of σ² [m²/day⁴], is critical to the success of Kalman filtering. [...] The exact choice of these values does not matter, as long as they are larger than the expected magnitudes of velocity and acceleration (Labbe, 2014, Chapter 8).
Running the Kalman filter then results in estimates of the state and its uncertainty for each point in time, based on all previous states and measurements. This is referred to as a "forward pass", as the calculation on a time series starts with the first measurement and then continues forward in time (Gelb et al., 1974, p. 156). It is, however, also possible to include consecutive states and measurements, which can decrease uncertainty and lead to a better estimate of the state, as, e.g., outliers are much more easily detected than with a forward pass alone. The Rauch-Tung-Striebel (RTS) smoother is a linear Gaussian method (like the Kalman filter itself) to consider consecutive states of the system (Rauch et al., 1965). It operates backwards on the time series, starting with the latest Kalman state estimate ("backward pass"). The result is a smoothed, estimated time series that makes use of all of the available information (Gelb et al., 1974, p. 169). For more detail on the RTS smoother and its alternatives, the reader is referred to in-depth literature (e.g., Gelb et al., 1974; Labbe, 2014).

Time series feature extraction

We approach time-series feature definition from two different perspectives: First, we calculate several hand-picked features, grouped into four categories. The first group contains parameters that describe the most prominent events in the time series, such as the magnitudes of the largest and smallest change (i.e., signed maximum and minimum values), velocity and acceleration, as well as the acceleration at the point where the change value is maximal/minimal. The velocity at these points is zero (as the velocity is the first derivative) unless the change value is maximal/minimal at the very end of the time series. These parameters are grouped as "Event attribute" in Table 1.
In the second group, we collect parameters that are aggregated from the full time series. Here, the duration of the time series can have a large influence, as mean values for change, velocity and acceleration are calculated. Additionally, the mean absolute slope of the change values is calculated, as well as the total curvature (sum over the second derivative). Finally, the sum of squared residuals (change measurement − RTS smoother estimate) for the change values is computed. These parameters are summarised as "Full time series" in Table 2.
The third group contains all parameters related to the timing of the events whose magnitudes were represented in the first group, i.e., the epoch of the maximum change value, velocity, and acceleration. We collect them as "Event timing" in Table 3. The final group describes the final state of the filter, i.e., the change value, velocity and acceleration of the last epoch, and is shown as "Final state" in Table 4.
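A sketch of how such features might be computed from a smoothed change series (finite-difference derivatives; names and values are illustrative, not the paper's exact implementation):

```python
import numpy as np

def event_features(change, dt=2 / 24):
    """Illustrative 'event attribute' and 'event timing' features: signed
    extrema of change, velocity and acceleration, the acceleration at the
    epoch of maximum change, and that epoch's index."""
    velocity = np.gradient(change, dt)         # first derivative of change
    acceleration = np.gradient(velocity, dt)   # second derivative of change
    i_max = int(np.argmax(change))
    return {
        "max_change": float(np.max(change)),
        "min_change": float(np.min(change)),
        "max_velocity": float(np.max(velocity)),
        "max_acceleration": float(np.max(acceleration)),
        "accel_at_max_change": float(acceleration[i_max]),
        "epoch_of_max_change": i_max,  # belongs to the 'event timing' group
    }

# Toy smoothed change series: accumulation that levels off and recedes
feats = event_features(np.array([0.00, 0.01, 0.04, 0.05, 0.05, 0.04]))
```

Because these features mix units (meters, meters per day, epoch indices), they motivate the scale-aware clustering discussed below.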

Clustering and identification of change patterns
Clustering is an unsupervised machine learning method. It allows the aggregation of similar data points to groups or clusters.

Due to its unsupervised nature, no training data is required, which would often be lacking in the case of geomorphic monitoring. Instead, the resulting clusters can be analysed with respect to their size and magnitude, as well as visually by their shape in 3D space, and ultimately attributed to certain process types.
In this work, we selected two approaches to clustering: one uses the estimated change values themselves as a feature vector, and the other uses features extracted from the time series prior to clustering. These features allow a more physical interpretation of the processes forming the clusters (cf. Sect. 2.5).
For the first approach, we use a k-Means clustering, which has been found to perform well by Kuschnerus et al. (2021).
The clustering algorithm minimises the total sum of all distances from data points to the centroids of k clusters in an iterative approach (Hartigan and Wong, 1979). As the distance is Euclidean, all dimensions are expected to be in the same unit and scale.
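The principle can be illustrated with a minimal Lloyd's iteration (a stand-in for the scikit-learn/Hartigan-Wong implementations used in practice; initialisation and data are deliberately simplistic):

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Minimal Lloyd's k-Means: assign each row of X to the nearest
    centroid by Euclidean distance, then recompute centroids, repeating
    until the centroids stop moving."""
    # Simple deterministic initialisation; k-means++ would be used in practice
    centroids = X[np.linspace(0, len(X) - 1, k).astype(int)].astype(float)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Toy feature rows: stable ground (all zeros) vs. ~1 m of accumulation
X = np.vstack([np.zeros((5, 3)), np.ones((5, 3))])
labels, centroids = kmeans(X, k=2)
```

Because the assignment uses Euclidean distance, any feature dimension with a larger numeric range would dominate the clustering, which is exactly why this method fits the equal-unit change-value features.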
The second approach, as it extracts features from the time series, cannot satisfy the constraint of the same unit for all dimensions (cf. Sect. 2.5). We therefore use a Gaussian mixture model (GMM), which fits multidimensional Gaussian distributions to the data. These distributions include a full covariance matrix, i.e., they allow different scaling in all feature dimensions. GMMs as implemented in the Python package scikit-learn make use of the expectation-maximisation algorithm to estimate the mean and covariance of the cluster centroids (Pedregosa et al., 2011). We cluster the time series with GMMs of 20, 30, 50, 100 and 150 clusters.
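A usage sketch with scikit-learn's GaussianMixture on toy features of deliberately different scales (data and parameters are illustrative, not the paper's configuration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy feature matrix: two groups whose dimensions differ in scale by ~100x,
# so unscaled Euclidean k-Means would be dominated by the second dimension.
rng = np.random.default_rng(0)
A = rng.normal([0.0, 0.0], [0.01, 1.0], size=(50, 2))
B = rng.normal([0.5, 10.0], [0.01, 1.0], size=(50, 2))
X = np.vstack([A, B])

# A full covariance matrix lets each component scale independently
# in every feature dimension.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = gmm.fit_predict(X)
```

The fitted per-component covariances make the mixture insensitive to the differing units, which is the property that motivates choosing a GMM over k-Means for the engineered features.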

An important information layer in time series analysis is the point in time where a change can first be detected as significant, especially when analysing surface change properties in relation to other data, e.g., from environmental sensors. Also, for early warning systems in natural hazard protection, minimal yet significant changes are of relevance. Each core point is colour-coded by the first epoch in which the estimated change magnitude is larger than the respective LoDetection95% in Fig. 4a). This plot depicts clear patterns of different change processes occurring at different points in time. Note that for a specific core point location, the LoDetection95% threshold may be exceeded more than once. This is the case if the change is reverted, i.e., decreases in magnitude, or if the uncertainty increases, for example, when measurements are missing from the dataset over longer periods. We also display the epoch of maximum acceleration, corresponding to the onset of the most prominent event in the time series (Fig. 4b). At the bottom of the slope (in the area marked "IV"), these two values differ. Here, the first significant changes (between days 0 and 1) do not represent the changes with the maximum acceleration (occurring on day 4).
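Determining this first epoch of significant change per core point could look like the following sketch (a 1.96-sigma threshold is assumed; function name and values are illustrative):

```python
import numpy as np

def first_significant_epoch(change, sigma):
    """Index of the first epoch whose smoothed change magnitude exceeds
    its LoDetection95% (here 1.96 * sigma); None if the change never
    becomes significant."""
    exceed = np.abs(np.asarray(change)) > 1.96 * np.asarray(sigma)
    idx = np.flatnonzero(exceed)
    return int(idx[0]) if idx.size else None

# Toy smoothed series with a constant 4 mm (1-sigma) uncertainty:
# the ~7.8 mm threshold is first exceeded at the third epoch (index 2).
epoch = first_significant_epoch(
    change=[0.002, 0.004, 0.011, 0.020],
    sigma=[0.004, 0.004, 0.004, 0.004])
```

Because the smoothed uncertainty varies per epoch (growing over data gaps), the threshold is evaluated per epoch rather than as a single global value.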

The difference between the RTS smoother estimate and the original measurements, quantified by the sum of the squared residuals, provides insight into how well the dynamic model of the Kalman filter fits the data. This model includes the choice of the value of σ and the resulting process noise matrix Q (Sect. 2.4). This value is visualised for all core point locations for two options of σ in Fig. 5. Note the bead-like structures marked in the area of anthropogenic change (marked "II"), which closely relate to the different sections of excavator works carried out. These are also visible in Fig. 4 as different periods, but disappear from the visualisation of the final change magnitude at the end of the observation period (Fig. 3).
As presented in Sect. 2.6, the time series can be clustered using a k-Means algorithm on the time series values themselves.
Following Kuschnerus et al. (2021), we extract the smoothed estimates of the change value from the RTS smoother and use them as a feature vector for clustering. The result is presented in Fig. 6, where the former method results in no data (in grey) at a large share of locations: as soon as a single observation is missing from the time series, k-Means fails to assign the point to a cluster. The smoothed time series interpolates and extrapolates the observations and is therefore able to fill the gaps, including uncertainty estimates for these values. The uncertainty increases when measurements are not available (cf. Fig. 2).
Clusters are created as presented in Sect. 2.6, without any spatial information included. The obtained clusters nevertheless exhibit clear spatial boundaries, which line up with the expected changes.
Gaussian mixture model clusters resulting from the different subsets of features are presented in Fig. 7. The four different subset cluster results facilitate the interpretation of how clusters are formed and how they change over time. Clusters appear spatially connected, but some clusters also consist of two separate parts with similar properties, depending on the type of features used.
We compare the result of the engineered features to state-of-the-art time series characteristics extracted using tsfresh (Christ et al., 2018) for clustering, and compare this result to the one achieved by spatiotemporal segmentation using 4D Objects-By-Change (4D-OBCs; Anders et al., 2020), to the clusters obtained by k-Means clustering as presented by Kuschnerus et al. (2021), and to our adapted version, in which we use the RTS smoother result instead of the measurements themselves. To more clearly highlight differences and similarities, Fig. 8 depicts a bird's-eye view of the lower part of the slope, where most anthropogenic change has occurred (marked "II" in Fig. 1). Figure 8a) showcases very distinct features in the affected area, but lacks some clear patterns just above ("VI"). Figure 8b), resulting from the RTS smoother change estimates, clusters areas where erosion processes seem to have acted on the topography of the rockfall surface: the light pink cluster ("VII") is aligned with the local gradient (indicated by the black arrow). Similar results, yet only for a subset of the points, are yielded by clustering on the raw time series information (Fig. 8c). The 4D-OBCs follow a completely different approach, and therefore also require a different visualisation strategy: as, by design, multiple objects can overlap spatially if their change histories allow them to be separated, we display the outlines of segmented objects here. Overall, similar objects are recovered by all methods.

Visualisation of selected time series parameters furthermore allows exploration of the dataset. In Appendix A, we present a few hand-picked features, including FFT components, which allow the detection of periodic change patterns.
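The FFT-based detection of periodic change patterns can be sketched as follows; the epoch spacing matches the bi-hourly acquisition, while the amplitude and daily period of the synthetic signal are illustrative assumptions.

```python
import numpy as np

dt_h = 2.0                                    # bi-hourly epochs
t = np.arange(71) * dt_h                      # ~6 days of epochs, in hours
series = 0.02 * np.sin(2 * np.pi * t / 24.0)  # toy daily cycle, 2 cm amplitude

# Amplitude spectrum of the (real-valued) change time series.
amp = np.abs(np.fft.rfft(series)) / len(series)
freqs = np.fft.rfftfreq(len(series), d=dt_h)  # cycles per hour

# Dominant non-zero frequency component; skip the DC bin.
dominant = freqs[np.argmax(amp[1:]) + 1]
period_h = 1.0 / dominant                     # recovered period in hours
```

Storing a few leading FFT amplitudes per core point yields map layers in which periodically changing areas (e.g., diurnal processes) stand out.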
In contrast to the method presented by Kuschnerus et al. (2021), we can cluster every core point, as missing data is inter- or even extrapolated. Furthermore, using the result of the RTS smoother, measurement errors have largely been reduced or even eliminated, making a comparison of the resulting surface changes easier and more successful. k-Means requires the definition of a number of clusters, which we set to 150. As presented in Sect. 2.1, a large number of clusters forms in the forested parts of the dataset (cf. Fig. 5, areas marked with "V"), where change information is not representative of surface topography. The choice of 150 therefore allows enough clusters to form so that dominant change forms are also shown outside of this forest area.

Discussion
Kalman filtering provides some compelling advantages over established methods of point cloud-based change analysis. These include the informed smoothing of the time series, which reduces effects from measurement noise; temporal resampling and interpolation over missing epochs; and the option to predict future states. For example, k-Means clustering requires regularly sampled and complete data (cf. Kuschnerus et al., 2021), which is provided by the Kalman filter. This improvement through our approach is illustrated by Fig. 6, where the whole scene can be clustered using the Kalman filter-interpolated data.
In contrast, when using the change values themselves, only a small area in the bottom of the valley can be assigned to clusters.
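The predict-only handling of missing epochs can be sketched with a minimal constant-acceleration Kalman filter in NumPy; the noise levels and toy measurement series are illustrative assumptions, not the values used in our processing.

```python
import numpy as np

def f_matrix(dt):
    """Constant-acceleration transition matrix for state (change, velocity, acceleration)."""
    return np.array([[1.0, dt, 0.5 * dt**2],
                     [0.0, 1.0, dt],
                     [0.0, 0.0, 1.0]])

H = np.array([[1.0, 0.0, 0.0]])   # only the change value is observed
R = np.array([[0.005**2]])        # measurement variance (5 mm std, assumed)
Q = np.diag([1e-8, 1e-8, 1e-8])   # process noise (assumed)

x = np.zeros((3, 1))              # initial state
P = np.eye(3)                     # vague prior covariance

dt = 2.0                          # bi-hourly epochs
measurements = [0.00, 0.01, None, None, 0.04, 0.05]  # None = missing epoch
estimates = []
for z in measurements:
    F = f_matrix(dt)
    x = F @ x                     # predict
    P = F @ P @ F.T + Q
    if z is not None:             # update only when a measurement exists
        y = np.array([[z]]) - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ y
        P = (np.eye(3) - K @ H) @ P
    estimates.append(float(x[0, 0]))
```

Every epoch receives an estimate, so the downstream clustering sees a regular, gap-free time series; during the missing epochs the state evolves along the estimated velocity and acceleration rather than a straight line between neighbouring measurements.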

The practical implication is that the erosion rills (marked "III") are not visible when applying simple k-Means clustering, because of missing data. While a simple (e.g., linear) interpolation would also solve this issue, the Kalman filter additionally considers the estimated velocity and acceleration for interpolation. An alternative approach to 4D point cloud analysis is the extraction of 4D-OBCs as presented by Anders et al. (2020). Though our approach does not identify objects or clusters with spatial overlap, the outlines of the 4D-OBCs show close agreement with the clusters resulting from the point cloud. Moreover, the clusters derived from the different feature sets (cf. Fig. 8) suggest that overlapping objects may be extracted from the time series using these multiple layers of information. As our method assigns every core point to a cluster, similar locations can be identified even if no 4D-OBC of a minimum size could be extracted, complementing the state-of-the-art.

A clustering based on the timing of change events gives insights into events that co-occur temporally and that may therefore have a common external cause, such as rainfall. A clustering based on the final state of displacement magnitude, velocity and acceleration separates processes that have terminated (i.e., velocity and acceleration are close to zero) from ones that are still active, either decreasing or increasing in amplitude. A human interpreter may overlay and analyse these clusters with respect to the extracted parameters and the time series itself, to gain further insights on the Earth surface processes driving the observed surface changes. In Fig. 6 and 8, we see that the extracted features form spatially contiguous clusters, even though no spatial information was included. This supports the validity of the extracted information.
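The separation of terminated from still-active processes by final state can be sketched as a simple thresholding of the last Kalman state vector; the threshold values and example states are illustrative assumptions.

```python
import numpy as np

# Final Kalman state [change, velocity, acceleration] per core point (toy values).
final_state = np.array([
    [0.12, 0.0001, 0.00001],   # large change, now at rest  -> terminated
    [0.05, 0.0040, 0.00050],   # still moving               -> active
    [0.00, 0.0000, 0.00000],   # no change                  -> terminated (trivially)
])

# A process counts as terminated when both velocity and acceleration
# are near zero (thresholds assumed for illustration).
vel_thresh, acc_thresh = 0.001, 0.0001
terminated = (np.abs(final_state[:, 1]) < vel_thresh) & \
             (np.abs(final_state[:, 2]) < acc_thresh)
```

In practice such a rule would be applied per core point to produce a binary map layer, or the final states would simply be included in the clustering feature vector.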
Deriving the epoch-wise point cloud change values with M3C2-EP allowed us to incorporate uncertainty information into the time series smoothing. This becomes especially important when data from multiple sensors, with different levels of uncertainty, are combined. The Kalman filter ensures that a new measurement cannot make the estimated parameters less certain. In the clustering step, this parameter uncertainty was not included in any of the feature vectors, to avoid an influence of the threshold value (typically 95%) on the results. However, areas with no significant change are clustered separately from ones where change is significant. This becomes apparent when, e.g., comparing the points with significant change displayed in Fig. 4a with the clusters in Fig. 7a, where most of the points without significant change are represented by the large green cluster. Subsequent analyses of the results, where a single cluster is examined for its properties, may however again take advantage of the available uncertainty measures.
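That a Kalman update can never increase the state variance follows directly from the scalar update equations; this minimal numeric sketch (with illustrative variances) makes the point:

```python
# Scalar Kalman update: posterior variance P_post = (1 - K) * P_prior,
# with gain K = P_prior / (P_prior + R) in (0, 1) for any finite
# measurement variance R > 0, so P_post <= P_prior always holds.
P_prior = 4.0                       # prior variance of the change estimate
for R in [0.01, 1.0, 100.0, 1e6]:   # measurement variances, good to awful
    K = P_prior / (P_prior + R)     # scalar Kalman gain
    P_post = (1 - K) * P_prior      # posterior variance
    assert P_post <= P_prior        # even an awful measurement cannot hurt
```

A very noisy measurement simply receives a gain near zero and leaves the estimate (and its variance) almost unchanged.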
The manually extracted features and clusters presented here only represent one of many applications of Kalman-filtered and RTS-smoothed surface change time series. The overall concept of exploiting spatial and temporal autocorrelation allows an optimal consideration of uncertainty present in the data. The option of predicting future change values and associated uncertainty opens a broad new field of adapting subsequent (future) measurements to the existing time series and the already observed data. Currently, applications are still limited by the manual definition of the state variance (σ) and by the linearization in the Kalman filter equations. More elaborate algorithms, such as the Extended Kalman Filter or the Unscented Kalman Filter, may overcome these limitations in the future (Labbe, 2014).
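A minimal NumPy sketch of the RTS smoothing referred to above, i.e. a forward Kalman pass followed by a backward pass that conditions every state on the full time series; it uses a 1D constant-velocity model with illustrative noise values rather than our full constant-acceleration setup.

```python
import numpy as np

F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state (change, velocity), dt = 1 epoch
H = np.array([[1.0, 0.0]])               # only change is observed
Q = np.eye(2) * 1e-6                     # process noise (assumed)
R = np.array([[0.01]])                   # measurement variance (assumed)

zs = np.array([0.0, 0.1, 0.2, 0.3, 0.4])  # toy: steadily creeping surface

# Forward filter, storing filtered and predicted moments for the backward pass.
xs, Ps, xps, Pps = [], [], [], []
x, P = np.zeros((2, 1)), np.eye(2)
for z in zs:
    xp, Pp = F @ x, F @ P @ F.T + Q           # predict
    K = Pp @ H.T @ np.linalg.inv(H @ Pp @ H.T + R)
    x = xp + K @ (np.array([[z]]) - H @ xp)   # update
    P = (np.eye(2) - K @ H) @ Pp
    xs.append(x); Ps.append(P); xps.append(xp); Pps.append(Pp)

# Backward RTS pass: each state is re-estimated given all later epochs.
for k in range(len(zs) - 2, -1, -1):
    C = Ps[k] @ F.T @ np.linalg.inv(Pps[k + 1])
    xs[k] = xs[k] + C @ (xs[k + 1] - xps[k + 1])
    Ps[k] = Ps[k] + C @ (Ps[k + 1] - Pps[k + 1]) @ C.T

smoothed = [float(s[0, 0]) for s in xs]
```

Because the backward pass uses information from later epochs, early estimates are corrected retrospectively, which is what makes the smoothed series suitable as cluster input.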

Conclusion
We presented a novel method for the analysis of 4D point clouds for monitoring of Earth surface dynamics. The application of a Kalman filter allows informed temporal smoothing, which decreases uncertainty and enables interpolation as well as extrapolation of the time series. As M3C2-EP, which spatially aggregates and smooths data, is used as the point cloud distance metric, the full 4D domain is exploited to find optimal estimates of change value, velocity and acceleration.
Many real-world time series datasets contain gaps or are (by design) irregular. With our approach, the time series can be both temporally resampled and interpolated. The regularity can subsequently be utilised by algorithms relying on a constant time step in the time series. We showed this by performing clustering of the spatial locations using the estimated change values as a feature vector, yielding 2D maps that show groups of similar surface change history.
In a second clustering approach, we used features extracted from the time series to segment the point cloud. By engineering features that either depend on single events in the time series, or on the full change history at a location, we obtained multiple new information layers for interpreting physically meaningful properties of the surface changes.
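Such engineered features can be sketched as follows; the specific features and the toy series are illustrative, not the exact feature set used here.

```python
import numpy as np

def engineer_features(series, dt_h=2.0):
    """Per-location features from a smoothed change time series.
    Some depend on a single event (time of maximum change), others
    on the full history (total displacement)."""
    diffs = np.diff(series)
    return {
        "total_change": float(series[-1] - series[0]),
        "max_abs_change": float(np.max(np.abs(series))),
        "time_of_max_h": float(np.argmax(np.abs(series)) * dt_h),
        "max_step": float(np.max(np.abs(diffs))),  # largest single-epoch event
    }

# Toy smoothed series: quiet, then a sudden event, then slight relaxation.
series = np.array([0.0, 0.0, 0.01, 0.06, 0.06, 0.05])
feats = engineer_features(series)
```

Stacking such per-location dictionaries into a feature matrix yields the information layers that feed the clustering.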
Our approach further allows the prediction of future states, e.g., in an online monitoring setup. This facilitates future adaptive sensing strategies, where new measurements can be triggered by the estimated uncertainty crossing a pre-defined threshold value. Such strategies would allow improved use of available resources (time, energy) in permanent remote sensing setups.
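A triggering rule of this kind can be sketched as follows; the variance growth model, prior variance and threshold are illustrative assumptions, not derived from our setup.

```python
def epochs_until_trigger(P0, q, threshold):
    """Count predict-only epochs until the predicted variance exceeds
    a threshold; simplified model where process noise q inflates the
    variance additively each epoch."""
    P, n = P0, 0
    while P < threshold:
        P += q          # predict-only step: variance grows, no measurement
        n += 1
    return n

# With an assumed prior variance, per-epoch noise, and trigger threshold,
# this yields the number of epochs a sensor could safely stay idle.
n = epochs_until_trigger(P0=1e-4, q=5e-5, threshold=4.6e-4)
```

In a real deployment the variance growth would come from the filter's own predict step, and crossing the threshold would schedule the next scan.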
Overall, the combination of an unsupervised machine learning approach with smoothed time series and the automatic extraction of physically meaningful and interpretable parameters is an important tool for the interpretation of topographic 4D point clouds. Our work can be used to detect locations and points in time where significant change occurs, and to group these locations into areas or subsets with similar properties. The extraction of the smoothed time series then allows the interpretation of individual trajectories in which the influence of random noise is largely suppressed, which in turn allows more precise statements about the significance of quantified change values. 4D data analysis using a Kalman filter and clustering techniques facilitates easy interpretation and allows extraction of the relevant information from the data stack.
Code and data availability. The code used for processing the point clouds, including M3C2-EP and the Kalman filter, is available on GitHub (https://github.com/lwiniwar/kalman4d, v0.0.2) and is indexed with Zenodo (cf. Winiwarter, 2021). The data of the Vals rockfall are available upon reasonable request to Daniel Schröder at <daniel.schroeder@dmt-group.com>.
Appendix A: Selected time series characteristics