Automated riverbed material analysis using Deep Learning on underwater images
Abstract. The sediment of alluvial riverbeds plays a significant role in river systems, both in engineering and in natural processes. However, the sediment composition can show great spatial and temporal heterogeneity, even on the river reach scale, making it difficult to sample and assess representatively. Indeed, conventional sampling methods in such cases cannot describe the variability of the bed surface texture well because of the amount of energy and time they would require. In this paper, an attempt is made to overcome this issue by introducing a novel image-based Deep Learning algorithm and a related field measurement methodology, with the potential to become a complementary technique for bed material sampling and to significantly reduce the necessary resources. The algorithm was trained, using semantic segmentation, to recognise the main sediment classes in videos that were taken underwater along cross-sections in a large river with mixed bed sediments. The method is fast, i.e., the videos of 300–400 meter long sections can be analysed within minutes, with a very dense spatial sampling distribution. The goodness of the trained algorithm is evaluated mathematically and via intercomparison with other direct and indirect methods. Suggestions for performing proper field measurements are also given; furthermore, possibilities for combining the algorithm with other techniques are highlighted, briefly showcasing the multi-purpose nature of underwater videos for hydromorphological adaptation. The paper aims to show the potential of underwater videography and Deep Learning through a case study.
Alexander Anatol Ermilov et al.
Status: final response (author comments only)
RC1: 'Comment on esurf-2022-56', Anonymous Referee #1, 12 Jan 2023
- AC1: 'Reply on RC1', Alexander Ermilov Anatol, 25 Feb 2023
RC2: 'Comment on esurf-2022-56', Anonymous Referee #2, 28 Jan 2023
- AC2: 'Reply on RC2', Alexander Ermilov Anatol, 25 Feb 2023
The research you have presented in this manuscript, a visual recognition method for riverbed composition, is very interesting and deserves to be published. I believe that the authors have successfully demonstrated that the presented method can offer significant advantages over point sampling, most notably the wider area coverage, which allows potential sediment sorting to be identified.
Please find below a few remarks that I believe should be addressed before the paper is published.
The paper is in general well structured, and the methods are clearly presented. The presentation of the results of the analyses, and their discussion, could be improved.
The authors have done significant work, processing large volumes of video data manually and using wavelet and AI methods. Since extensive data preparation and analyses are required for these methods, the authors have decided to present the results as bar plots over the sampling points or over the boat transect. As a consequence, the majority of the data processing remains hidden, and readers have little insight into the detailed functioning of each automated method. The section of the paper dedicated to the results and the comparison of the methods (4.2) contains only four paragraphs. In my opinion, the results should be discussed in more detail, as they are key to the validation of your approach.
I suggest that the authors additionally present the complete results of the data processing (on an image-by-image scale) in the form of a frequency distribution curve. In this way, more metrics about the noise in the results would be available to complement the final result.
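To illustrate what such an image-by-image summary could look like, the following is a minimal sketch and not the authors' processing chain. It assumes that each analysed frame yields one number per sediment class, the fraction of pixels assigned to that class, and it computes relative and cumulative frequencies of that fraction over all frames. The function name `frequency_distribution` and the sample values are hypothetical.

```python
import numpy as np

def frequency_distribution(fractions, n_bins=10):
    """Relative and cumulative frequencies of per-image class fractions.

    `fractions` holds one value in [0, 1] per analysed frame (assumed input).
    """
    fractions = np.asarray(fractions, dtype=float)
    counts, edges = np.histogram(fractions, bins=n_bins, range=(0.0, 1.0))
    rel_freq = counts / counts.sum()   # share of frames falling in each bin
    cum_freq = np.cumsum(rel_freq)     # cumulative frequency curve
    return edges, rel_freq, cum_freq

# Hypothetical per-image gravel fractions along one transect
gravel_fraction = [0.05, 0.10, 0.12, 0.41, 0.45, 0.50, 0.55, 0.90]
edges, rel_freq, cum_freq = frequency_distribution(gravel_fraction, n_bins=5)

for lo, hi, f, c in zip(edges[:-1], edges[1:], rel_freq, cum_freq):
    print(f"{lo:.1f}-{hi:.1f}: relative {f:.3f}, cumulative {c:.3f}")
```

The spread of such a curve, rather than only the mean shown in a bar plot, would indicate how noisy the per-frame classification is along a transect.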
Similarly, in their description of the results the authors mention advantages and disadvantages of each method, such as the bed armoring effect, sand detection, isolated gravel patches, etc. However, each of these is described in only a sentence and lacks a visual supplement that would show the concrete sources of uncertainty, which are crucial for alleviating the shortcomings of the presented approach and maximizing its use. This is especially noticeable for the wavelet method, which has related data in the main body of the manuscript.
The authors propose that their setup be moored from the boat on a line and weighted down, assuming that the setup remains horizontal throughout the deployment. However, depending on the line length and the influence of drag on the setup geometry (which is highly irregular), it is inevitable that the setup will be tilted during measurement. The authors have not presented any details on how this affects the video data, which might be crucial since there are no reference points on the bed.
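The size of this effect can be estimated with simple geometry. The sketch below is an illustration under stated assumptions, not an analysis of the authors' setup: it assumes an ideal pinhole camera looking straight down at a flat bed, with the camera centre staying at the same height while the housing tilts. Under these assumptions, the bed length covered by one pixel at the image centre grows by 1/cos θ across the tilt direction and by 1/cos² θ along it. The function name `tilt_scale_factors` is hypothetical.

```python
import math

def tilt_scale_factors(tilt_deg):
    """Factor by which one pixel's footprint at the image centre grows
    when a downward-looking pinhole camera over a flat bed is tilted.

    Returns (across_tilt, along_tilt), both relative to the untilted case.
    """
    if abs(tilt_deg) >= 90:
        raise ValueError("tilt must be smaller than 90 degrees")
    c = math.cos(math.radians(tilt_deg))
    across_tilt = 1.0 / c        # longer slant distance to the bed
    along_tilt = 1.0 / (c * c)   # slant distance plus foreshortening
    return across_tilt, along_tilt

for tilt in (0, 5, 10, 20):
    across, along = tilt_scale_factors(tilt)
    print(f"{tilt:>2} deg: x{across:.3f} across tilt, x{along:.3f} along tilt")
```

Without a size reference in the frame, apparent grain sizes would be underestimated by the same factors, so reporting the range of tilt observed during the surveys would help to bound this uncertainty.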
Lastly, the authors have addressed implementation challenges in Section 4.3, which will be helpful in the further development of this approach. However, the challenges they have identified result from the selected approach and are not universal, and therefore need to be addressed accordingly. For example, a vessel speed lower than 0.2 m/s is claimed not to be feasible for a straight transect across the river. Since the proposed method aims to cover a broader river section and to produce 2D maps, why is it important to transect the river? Might a longitudinal approach allow lower velocities while covering the same area? Implementation challenges should be addressed universally, and the shortcomings of your selected approach (boat type, camera type, illumination, transects, etc.) should be presented together with the discussion of the data. This is one of the reasons why this section repeats text from earlier, and why some of the findings are repeated again in the following Section 4.4, Novelty and future work.
Overall, the presentation of the results (figures and graphs) could be upgraded to match the quality of the conducted research.
The keywords are relatively general and do not convey the specific contents of the paper. I suggest dropping the keywords “rivers”, “sedimentology”, “underwater” and “mapping”, and using the following more distinctive keywords instead: “riverbed texture” and “underwater mapping” (or something along these lines).
There are several issues with the referencing approach used in the paper:
In the goal of the paper, the contribution is highlighted as “…through improved (continuous, quick, covering larger areas) data collection.”. I would suggest that you rephrase this as “…through more extensive data collection.”, since it is hard to argue that the method is:
I suggest that the authors address these advantages in the discussion and drop them from the goal itself, since they are not straightforward.
Considering the size of the paper and volume of the work conducted, the goal of the paper is a bit short. I suggest that you expand the goal of the paper to reflect the specific contribution to the field.
The Methodology section briefly describes the three locations on the Danube River where data were collected, introducing flow rate data, SSC, etc. These data might be useful for someone familiar with the river, which most readers will not be. Please put the presented data in context, e.g. by providing complementary duration data, long-term average data, etc., which would allow the conditions under which the surveys were performed (low flow, average flow, flood) to be estimated.
The part of the Methodology that focuses on the equipment lacks information that would help the reader understand the data quality, specifically regarding how the distance between the setup and the bed was maintained. Please expand the current description with the desired height above the bed and what it depends on (presumably the illumination of the FOV). Similarly, you initially state that no size reference was used in the images (ln253), and later make the contrary statement that laser pointers were used to provide scale (ln294). The laser pointers probably do not offer a constant distance because of the bed irregularities, but a clarification of how they were used would be helpful.
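For context, one common way of using laser pointers for scale is sketched below. The manuscript does not say whether this is what was done, so every detail here is an assumption: two parallel beams with a known spacing (100 mm in the example) project two dots onto the bed, and the pixel distance between the dots gives a millimetre-per-pixel scale for that frame. The function names and all numbers are hypothetical.

```python
import math

def mm_per_pixel(dot_a, dot_b, laser_spacing_mm):
    """Image scale from two laser dots with a known physical spacing.

    `dot_a` and `dot_b` are (x, y) pixel coordinates of the dots (assumed
    to come from parallel beams, so their spacing on the bed is constant).
    """
    pixel_distance = math.hypot(dot_b[0] - dot_a[0], dot_b[1] - dot_a[1])
    if pixel_distance == 0:
        raise ValueError("laser dots coincide; scale is undefined")
    return laser_spacing_mm / pixel_distance

def grain_size_mm(size_px, scale_mm_per_px):
    """Convert a grain axis measured in pixels to millimetres."""
    return size_px * scale_mm_per_px

# Hypothetical frame: dots 500 px apart, beams 100 mm apart
scale = mm_per_pixel((100, 200), (400, 600), laser_spacing_mm=100.0)
print(f"scale: {scale:.3f} mm/px, 80 px grain: {grain_size_mm(80, scale):.1f} mm")
```

Because the scale is recomputed per frame from the dot positions, such an arrangement would not require a constant camera height, which is why a description of the actual procedure matters.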
In Section 4.1, which explains the training of the Deep Learning algorithm, the authors present an example of erroneous particle detection by the user. Although this is a good and informative example, in my opinion it should not be presented as “ground truth” but rather included as a side note explaining that the training data need to be carefully selected, since you noticed errors in user judgement (n.b., who was the user: one of the authors or trained personnel?).
On ln 31 “fluvial navigation” -> “fairway placement” would be more appropriate, but the connection is loose, so I suggest that you replace it with a use more relevant to grain size instead.
On ln 31 “riverbed structure” -> unclear, does this represent morphology?
“Riverbed” and “river bed” are used interchangeably throughout the manuscript. Please proofread the manuscript.
The goal (aim) of the paper is combined with the Introduction, making it indistinct. Please state the aim of the paper in a separate paragraph.
Figure 1 is very simple and lacks detail (the Danube is not highlighted and is therefore indistinguishable from the other rivers in the figure).
Can background orthophoto data be added to Figures 2 and 3?
The names of the sections (transects) are very hard to follow since they do not follow any logical order (I suppose they do for the authors, but I suggest that you rename them for the sake of clarity for the readers).
“Streamlined weight” -> isokinetic suspended sediment sampler?
The Methodology section would benefit from an added flow chart (process chart), since video decomposition and enhancement are carried out in several steps.