Rapid and objective characterization of channel morphology in a small, forested stream

Forested, gravel bed streams possess complex channel morphologies which are difficult to objectively characterise. The spatial scale necessary to adequately capture variability in these streams is often unclear, as channels are governed by irregularly spaced features and episodic processes. This issue is compounded by the high cost and time-consuming nature of field surveys in this type of environment. In larger stream systems, remotely piloted aircraft (RPAs) have proven to be effective tools for characterizing channels at high resolutions over large spatial extents, but to date their use in small, forested streams 5 with closed forest canopies has been limited. This paper seeks to demonstrate an objective method for characterizing channel attributes over large areas, using easily extractable data from RPA imagery collected under the forest canopy in a small (width = 10 to 15 m) stream, and to provide information on the spatial scale necessary to capture the dominant spatial morphological variability of these channels. First, the accuracy and coverage of RPAs for extracting channel data was investigated through a sub-canopy survey. From this survey data, relevant cross-sectional variables were extracted and used to characterize channel 10 unit morphology using a principal component analysis-clustering (PCA-clustering) technique. Finally, the length scale required to capture dominant morphological variability was investigated from analysis of morphological diversity along nearly 3 km of channel. The results demonstrate that sub-canopy RPA surveys provide a viable alternative to traditional survey approaches for characterizing these systems, with 87% coverage of the main channel stream bed. The PCA-clustering analysis provided a more objective means of classifying channel morphology with a correct classification rate of 85%. Analysis of morphological 15 diversity suggests that reaches of at least 15 bankfull width equivalents are required to capture the channel’s dominant heterogeneity. Altogether, the results provide a precedent for using RPAs to characterize the morphology and diversity of forested streams under dense canopies.

Detailed channel morphology data have been collected through annual topographic surveys in eight study sections (SAs 2-9), seven of which (SAs 2-8) are located downstream of a canyon (termed the 'canyon reach', see Fig. 1). The eighth study section (SA9) is located away from the others, upstream of the canyon. The lower study sections are 300-500 m apart and 5-10 w b (50-150 m) in length (Reid et al., 2019). An additional site (SA1) located in the channel estuary was active until the late 1980s, but has since been abandoned and was not included in this survey. Note that the RPA survey also included coverage of SA9, upstream of the other sites.

Remotely piloted aircraft survey
In July of 2018, approximately 3.0 km of channel was surveyed, with coverage extending from just upstream of the river mouth to the downstream limit of the canyon reach ( Fig. 1), as well as over most of the SA9 study section. SA9 is farther upstream and possesses smaller channel dimensions, and therefore serves as a challenging test site to evaluate the coverage attainable with the RPA. Total survey time was approximately 12 full days, including flights over SA9. The RPA survey involved low-100 level flights conducted in tandem with placement of Ground Control Points (GCPs) that were surveyed with a Leica TPS 1100 total station. Flights were undertaken with a DJI Phantom 4 Advanced RPA, a consumer-grade RPA which contains a camera with a focal length of 8.8 mm (24 mm in 35 mm format equivalent) and a field of view of 84 • . To avoid view obstruction reach. As riparian vegetation often obstructed parts of the channel bed and introduced errors when digital elevation models are generated from point clouds (Tamminga et al., 2015), the Cloth Simulation Filter (Zhang et al., 2016) from the open source software Cloud Compare (Cloud Compare, 2017) was employed. This tool inverts the point cloud and generates an interpolated surface analagous to 'draping' a simulated cloth over the ground surface to approximate the terrain of an obscured area (Zhang et al., 2016). Following visual inspection of the filtered result, a cloth resolution of 0.1 m and maximum distance between 0.5 130 to 1.0 m was found to adequately filter the bed points.
The elevations of submerged channel bed areas are often overestimated due to the refractive effect of overlying water (Dietrich, 2017). To correct for this effect and to develop accurate bathymetry, a corrective script developed by Dietrich (2017) was employed. By determining the distance from a generated water surface mesh to the estimated bed elevations in the point cloud below, the corrected water depth for a location could be calculated as a function of the multiple viewing angles used 135 to observe each point. Prior to applying this method, the clouds were sub-sampled to a spacing of 0.02 m while retaining the minimum height in each cell to further reduce cloud noise and ensure anomalous points from overhanging vegetation were removed in Cloud Compare.
Grain size estimates of the exposed bed were acquired by establishing a relationship between the roughness of the point cloud for 22 training sites and their median grain size (D 50 ) (see method described by Woodget and Austrums, 2017), a metric 140 often of interest to river managers. Each roughness sampling site was approximately 1 m 2 and was photographed by hovering the RPA approximately 2 m above ground level. Using an in-house photo-sieving Matlab-based GUI (Matlab, 2017), the grain size distributions of each training site were determined and a linear model then fit between their D 50 (Fig. 3) and their mean roughness value. Using a moving window analysis, grain size was then estimated across the exposed bed as described by Woodget and Austrums (2017).

150
To classify the channel along the longitudinal profile, the thalweg was first identified using the River Bathymetry Toolkit (RBT), an ArcMap add-in (McKean et al., 2009). The thalweg was used as a standardized location along which observations would be extracted at fixed intervals. To characterize patterns in channel morphology, five variables were extracted: the hydraulic radius (R h ), median grain size (D 50 ), local bed (S l ) and water surface slope (S ws ), and the reach bed slope (S r ). These variables were chosen as they are straightforward to extract from the data (a key requirement for a rapid classification scheme), and 155 because they reflect larger basin-scale variables relevant to channel form, such as geology, climate and land-use. To provide a measure of grain roughness across the channel, the average D 50 of the dry exposed bars in a 0.5 m buffer around each sampling location's cross-section was extracted. The local slopes of the bed and water surface were calculated for each sampling location by fitting a linear model through observations in a 15 m window around each sample site. This was repeated for the reach-scale bed slope using a 45 m window. Together these variables summarize the channel form (R h and S) and roughness of each 160 cross-section. Cross-sections where the channel banks were not discernible (due to channel obstructions or dense low-lying vegetation) were excluded from the analysis. Exclusion of these cross-sections, along with segments of the channel the RPA could not access, comprised approximatley 25% of the channel's thalweg.

Analysis
Following the extraction of the five channel variables, a principal component analysis (PCA) was applied to determine which 165 variables were important for characterizing channel morphology, and a k-means clustering approach was then used to classify the PCA results into channel types. To implement the PCA and k-means clustering, the package 'stats' in R was employed (R Core Team, 2018). The general objective of a PCA is to reduce the number of dimensions in a dataset that contains interrelated variables while describing the maximum amount of variation present (Jolliffe, 2002). Because the dataset was multi-dimensional with five variables over 2,362 sampling sites, a PCA was an appropriate tool to help simplify and extract 170 patterns in the data, a prerequisite for k-means clustering. The PCA was run using three of the five components, which together explained approximately 79.0% of the variation in the dataset, an appropriate cut-off according to Jolliffe (2002).
Following the PCA, the k-means clustering algorithm was run to identify groupings in the data along the first three components. A k-means clustering algorithm is an unsupervised classification that assigns observations from n dimensions to clusters that allow the within-cluster sum of squares to be minimized (Hartigan and Wong, 1979). Following guidelines for the method 175 described by Flynt and Dean (2016), six clusters were chosen to group the dataset, a value which is in reasonable agreement with the number of morphologies one may expect at Carnation Creek.
Following clustering of the cross-sectional variables, the mean values of each channel variable for each cluster were examined and a morphology type attributed to each cluster. Morphologies were assigned to clusters based on obvious features (e.g. shallow water slopes and greater depth for pools) and criteria presented in Church (1992), Anonymous (1996), and Buffington 180 and Woodsmith (2003). The resulting assignment of morphologies to clusters leads to a continuous classification of channel types found along the study reach at 1 m intervals, and provides insight into the survey extents necessary to adequately capture the heterogeneity of the system.
To characterize the diversity of channel morphology across the stream, a moving analysis using the Shannon diversity index (Shannon and Weaver, 1964) was conducted. This index provides a measure of the abundance and evenness of a property in 185 an area (Lloyd and Ghelardi, 1964). While this index is often calculated with regard to species types in ecology, the approach can also be applied to channel morphology types, similar to the work of Harris et al. (2009). To calculate index values, the proportion of each channel type in an area is multiplied by the natural logarithm of the proportion. These values are then summed for all the channel types present in an area. In order to apply the method to the Carnation Creek data, the index values are first calculated by iteratively dividing the channel into segments based on window sizes ranging from 15-750 m in length 190 (at 15 m intervals). For each iteration, the abundance of each channel morphology in each channel segment was determined.
Using the 'vegan' package in R, the Shannon's diversity index of each channel segment was then calculated.
To determine the spatial scale required to capture the heterogeneity of the channel, the standard deviation of the diversity metrics across the channel was calculated for each iteration (using the increasing window size ranging from 1-50 w b in length).
For example, for the first iteration, a standard deviation value was calculated from all the diversity metrics across the channel (root-mean-square-error (RMSE) and mean error (ME)) are provided for both the exposed and sumberged checkpoints.
that were based on 15 m channel segments. As sample size increases, the standard deviation of the channel segments tends towards an asymptote. The length scale required to approach this asymptote can therefore be interpreted as the scale beyond which diminishing returns arise in variability captured.

Survey accuracy 200
The channel-averaged vertical survey error was estimated by calculating the root-mean-square-error (RMSE) and the mean error (ME) of differences between the elevations of check points collected with the total station survey and those estimated from the DEMs. The RMSE provides a measure of the spread of the squared residuals whereas the ME provides a measure of any potential positive or negative bias to the data, and are similar to other metrics used to evaluate RPA survey performance (e.g. Tamminga, 2016). The overall spread of this error and summary statistics are illustrated in Fig. 4. Vertical errors of the 205 exposed bed points were found to be 0.093 m and 0.025 m for the RMSE and ME, respectively (n = 1,203), and similar values were obtained for the submerged bed points (RMSE = 0.11, ME = 0.025m, n = 521).

Survey coverage
In order to evaluate the coverage extent obtainable with the sub-canopy RPA survey, the RPA-based results were compared to channel boundaries delineated with a total station in the eight established study sections (see example in Fig. 5). When 210 including side channels, which were generally difficult to access with the RPA due to dense sub-canopy vegetation, it was possible to capture approximately 80% of the delineated study sections, a value which increased to 87% when side channels are excluded. When examining individual study sections that contained side channels, coverage ranged from a low of 54% in SA4, to a high of 89% in SA9. Generally, narrow (width < 3 m) side channels could not be effectively surveyed, but oblique imagery was advantageous in situations where a clear flight path was present alongside an obscured channel area (Fig. 6).

215
Similarly, bank top elevations were difficult to capture in most locations due to understory vegetation obscuring the ground surface. The inclusion of bathymetric calibration greatly increased the area over which bed topography could be estimated (e.g. Fig. 6).

Principal component analysis, clustering analysis, and channel classification
The first three components from the PCA explained approximately 80% of the variation in the data, with components one, two 220 and three reflecting 45.11%, 19.3% and 14.6% of the variation, respectively. The first component is dominated by S r , D 50 and S ws , the second by R h , and the third by S l and D 50 . After running the k-means clustering algorithm using six groupings on the first three components, these patterns were evident along the axis of the biplot (Fig. 7). For each cluster, the mean of each variable was calculated and the likely morphology corresponding to the cluster estimated from these values (Table   1). Moving from left to right along the first dimension (Fig. 7) there is a transition from morphologies with lower bed and 225 water surface slopes and finer bed sediment, to those with steeper gradients and coarser material. This appears to represent a transition from pool to riffle morphologies along the first component. Overall, distinctions between most channel attributes arising from the clustering are clear and lead to relatively unambiguous classification of morphology types (Table 1). Within the riffle channel type, the classification also captures a distinction between riffle morphologies with slightly coarser bed material, defined here as 'Riffle-coarse' (Riffle C , see Anonymous, 1996). When examining the second component (y axis of Fig. 7), 230 hydraulic radius (R h ) decreases from top to bottom, as indicated by the transition from lower-velocity pool to higher-velocity glide morphologies, with remaining morphologies possessing intermediate R h (Fig. 7).
Pools, riffles, glides and runs are relatively well distributed along the surveyed length of channel (Fig. 8). However, planebed and coarse riffle morphologies are mostly located near the upstream limit of the survey extent in this region. This area represents the outlet and downstream entrance of the canyon reach, where steeper gradients and coarser sediment are found.

Classification accuracy assessment
To assess the accuracy of the clustering algorithm, 100 locations along the surveyed length of channel were randomly selected and visually assigned to either glide, pool, run, riffle, riffle C , or plane-bed morphologies. These values were then compared to the morphologies predicted by the PCA. A summary of agreement between the PCA and visual classification approach is shown 11 https://doi.org/10.5194/esurf-2020-33 Preprint. Discussion started: 11 May 2020 c Author(s) 2020. CC BY 4.0 License.

Camera
Camera direction SA8 C C Bathymetry raster Figure 6. Coverage of a deep pool in SA8 under dense riparian vegetation. Note that the photo was taken in the autumn prior to the RPA survey, when the water level was higher than it was during the RPA survey. Photo courtesy of Iain Reid.
in Table 2. On average, 85% of sampled locations received the same morphology assignment between the two approaches, with 240 riffle areas showing the lowest agreement (72%) and plane-bed areas the highest (100%). Overall, the classification matches the typical expected progression of channel morphologies in a pool-riffle system, as is shown in Fig. 9 . The exit of the pool is classified as a glide, with negative bed surface gradients. As gradient increases we see shallow riffle morphologies that meld into a deeper run at the entry of the pool (Fig. 9). It is likely that much of the disagreement can be attributed to 'transition' morphologies, which most classification schemes are unable to capture or define.

Sub-canopy RPA surveys
The results of this study provide a precedent for using RPAs to characterize channel morphology in small, forested streams below the forest canopy. This approach provides several advantages over traditional ground-based surveys. Over twelve field   Table 2. Accuracy assessment of morphology classification.

Plane-bed 100
Glide 97 Run 85 Pool 80 All 85 days, nearly three kilometers of channel were surveyed with an estimated coverage rate of 80% (including side channels) Oblique imagery appears to provide good coverage of near-bank areas traditionally difficult to capture with vertical imagery, enabling the characterisation of low-velocity, near-bank channel areas which serve as critical fish habitat (Bjornn and Reiser, 260 1991). This additional imagery is generally straightforward to collect, but adds to the RPA power requirements and also increases survey time as a result of the need for additional flight passes. However, should repeat surveys be undertaken, a major reduction in survey time would be achieved through the installation of permanent ground control points. New RTK-GPS systems providing centimeter-level accuracy are also becoming available for consumer-grade RPAs, though signal attenuation through dense trees may reduce survey accuracy and limit their applicability for sub-canopy surveys.

265
While sub-canopy RPA surveys appear promising, certain environmental conditions and aspects of the survey approach continue to present limitations. First, the techniques for extracting the bathymetry may not be suitable for streams with turbid water that prevent observation of the submerged bed. While oblique imagery aided in characterisation of some bank areas, low elevation and dense riparian vegetation still pose a challenge for capturing bank topography in some locations, information which is necessary should the resulting survey be used for hydrodynamic modeling (Cienciala and Hassan, 2013) or to quantify 270 bank erosion (Reid et al., 2019). In addition to bank vegetation causing obstructions, low-hanging branches (predominantly from riparian deciduous species) occasionally led to flight difficulties. Therefore, these techniques may be most suited to small channels in relatively mature forests that have an open understory, and flights in winter months when foliage is absent may prove beneficial. In certain circumstances, a hybrid survey with both RPA and total station data could provide complete coverage, even in locations highly obscured by dense understory foliage. In spite of these limitations, however, the sub-canopy 275 RPA survey approach appears to offer substantial improvements over traditional survey methods. 16 https://doi.org/10.5194/esurf-2020-33 Preprint. Discussion started: 11 May 2020 c Author(s) 2020. CC BY 4.0 License.

Classification approach
The PCA-clustering classification approach appears to present a viable and less-subjective method for evaluating morphology at the channel-unit scale, and incorporates a larger number of key variables than traditional methods. While some subjectivity remains in the interpretation of the k-means-derived clusters, examination of the classification from the PCA-clustering analysis 280 revealed that there was good agreement between the characteristics of the morphologies derived from the clustering approach and morphologies identified visually (Table 2), with at least some remaining disagreement attributable to transition areas between morphological units. As shown in Table 3, the mean values of the variables for each assigned morphology are similar to reference values found for the slope, depth and grain size characteristics of similar channels classed in a number of other studies.

285
Including frequently measured channel metrics in a PCA-clustering analysis, as was conducted in this study, provides a sophisticated means not only for relating physical conditions to channel form (as descriptive schemes tend to do), but for identifying which key variables impact the relationship. Such an analysis may provide a precursory understanding of key variables worthy of investigation in the development of process-based classification schemes. A challenge encountered by many classification schemes is that they often lack the generality to be applied in environments outside of which they were devel-290 oped. For example, although Whiting and Bradley (1993) provided a strong process-based classification of channel form, it was intended for headwater channels, limiting its wider applicability (Buffington and Montgomery, 2013). Similarly, the approach to classifying channels proposed by Montgomery and Buffington (1997) has a clear process basis where the channel is partitioned into source, transport and deposition zones, but was developed for mountain drainage basins. While the classification approach proposed here is also based in a mountainous environment, the PCA-clustering technique allows for the 295 identification of morphologies in any fluvial environment where sufficient variation in bed topography is present. Unlike most classification schemes, identified clusters must be interpreted after the analysis to situate them within our conceptual understanding of river systems. While this consists of an additional step, it can provide opportunities to confirm our understanding of field observations in river systems or to guide further investigation when unexpected patterns appear.
Finally, it should be noted that in order to characterize the geometry of the channel, the PCA approach relies on wetted 300 variables, in contrast to flow-independent features like bankfull width or depth. When considering things such as the needs of salmonids, the low flow conditions observed in late-summer may be of concern and will determine the connectivity and distribution of certain channel types across the riverscape. Depending on the application, however, consideration of flowindependent variables may be required, like the bankfull width or depth, which are less dependent on the particular wetted conditions observed at the time of the survey.

Insight into scales of spatial variability
The results of calculating the standard deviation of the diversity metric for channel types (Fig. 10 a) suggest that a window size of approximately 13-15 w b (175-200 m in length) is necessary to capture the dominant variability along the channel.
Beyond this scale, additional variability is captured, but at a decreasing rate. The 3.0 km of channel over which this analysis a Slope values published from Church (1992) b Slope values published from Anonymous (1996) c Slope values published from Buffington and Woodsmith (2003) d Relative roughness values published from Church (1992) e Relative roughness values published from Anonymous (1996) was conducted would likely be considered a relatively homogeneous riffle-pool reach under traditional channel classification 310 schemes, such as that of Montgomery and Buffington (1997). The 15 w b length scale is shorter than the 30-50 w b equivalent often suggested for characterizing channel form (Bisson et al., 2006), and equivalent to 2-3 sets of pool-riffle units as defined by Keller and Melhorn (1978). This value fits in with the range of recommended study reach lengths that have been reported in the literature, though it is at the lower end (see Trainor and Church, 2003). For example, Montgomery and Buffington (1997) considered reaches 10-20 w b in length for their research while Woodsmith and Buffington (1996) considered reaches 20 w b 315 in length. At the higher end, Hogan (1986) and Trainor and Church (2003) consider reaches greater than 30 w b and reaches between 50-70 w b to be conservative lengths for their research, respectively. Given that additional variability is still captured with a greater spatial survey extent, the 15 w b value should be considered a minimum.
The explanation for the 15 w b domain over which a threshold in variability is reached may be related to the spacing of major sediment storage areas in the system. Previous work in Carnation Creek by Reid et al. (2019) suggests that non-random spatial 320 patterns in sediment storage are present along the channel (see Fig. 10 b and c). Both autocorrelation and spectral analysis methods applied to four sediment storage datasets collected between 1991 and 2017 revealed a periodicity in the data in the order of 12-20 w b , providing information on the spacing of major sediment storage areas. Given the similarity in length scales between Figs. 10 a-c, it is possible that these storage zones (mainly large bars) serve as end members between which the typical progression of channel-unit morphologies would be expected.

325
The bar-to-bar spacing represented by length scales shown in Fig. 10 is within the range, but close to the upper limit, of values reported for gravel bed streams in Thompson (2013). The explanation for the relatively large feature spacing may be