Articles | Volume 14, issue 3
https://doi.org/10.5194/esurf-14-391-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
OrthoSAM: multi-scale extension of the Segment Anything Model for river pebble delineation from large orthophotos
Download
- Final revised paper (published on 12 May 2026)
- Preprint (discussion started on 29 Aug 2025)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-4003', David Mair, 20 Sep 2025
  - AC3: 'Reply on RC1', Vito Chan, 07 Dec 2025
- RC2: 'Comment on egusphere-2025-4003', Zoltan Sylvester, 07 Oct 2025
  - AC1: 'Reply on RC2', Vito Chan, 07 Dec 2025
  - AC2: 'Reply on RC2', Vito Chan, 07 Dec 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Vito Chan on behalf of the Authors (29 Dec 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (12 Jan 2026) by Giulia Sofia
RR by David Mair (26 Jan 2026)
RR by Zoltan Sylvester (26 Jan 2026)
ED: Publish subject to minor revisions (review by editor) (12 Feb 2026) by Giulia Sofia
AR by Vito Chan on behalf of the Authors (12 Mar 2026)
ED: Publish as is (21 Mar 2026) by Giulia Sofia
ED: Publish as is (21 Mar 2026) by Wolfgang Schwanghart (Editor)
AR by Vito Chan on behalf of the Authors (24 Mar 2026)
Post-review adjustments
AA – Author's adjustment | EA – Editor approval
AA by Vito Chan on behalf of the Authors (05 May 2026)
EA: Adjustments approved (11 May 2026) by Giulia Sofia
General Comments:
The authors present a novel method and proof of concept for pebble segmentation in orthoimages by adapting the popular and widely used Segment Anything Model (SAM; Kirillov et al., 2023). They identify important but often unaddressed weaknesses of SAM, such as its reduced performance in dense segmentation tasks (where many instances of the same object class should be segmented) and its limited capability to segment objects of a single class that vary significantly in size. To test their approach, the authors use 1) synthetic images with circles as a proxy for pebbles and 2) ortho-mosaics of real pebbles created with handheld cameras and photogrammetric processing. In experiment 1, they test the effect of a variety of image perturbations on segmentation quality and find that shadow effects in particular have some negative impact on SAM’s segmentation performance. In experiment 2, they apply their workflow to real-world images, showcasing the improvement achieved by their multi-scale segmentation with SAM. In this scenario, they evaluate segmentation performance categorically through manual counting, owing to the lack of ground truth masks. Both experiments show that their approach is up to the task and has the potential to mitigate some of SAM’s segmentation shortcomings in such applications.
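To illustrate the synthetic case described above, where ground truth masks are known by construction, the following is a minimal sketch of how predicted instance masks could be scored against circle masks. The IoU threshold of 0.5, the greedy one-to-one matching, and all function names are assumptions made for illustration; they are not taken from the manuscript and do not reproduce its metrics or code.

```python
import numpy as np

def circle_mask(shape, center, radius):
    """Boolean mask of a filled circle; center is (row, col). Hypothetical helper."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2

def iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def match_instances(truth, pred, threshold=0.5):
    """Greedy one-to-one matching of predicted masks to true masks by IoU.

    Returns (true positives, false positives, false negatives).
    The threshold is an assumed value, not one from the manuscript.
    """
    unmatched = list(range(len(pred)))
    tp = 0
    for t in truth:
        scores = [(iou(t, pred[j]), j) for j in unmatched]
        if scores:
            best, j = max(scores)
            if best >= threshold:
                tp += 1
                unmatched.remove(j)
    return tp, len(unmatched), len(truth) - tp

# Two synthetic "pebbles"; one is predicted exactly, the other is missed,
# and one spurious prediction does not overlap any true circle.
shape = (200, 200)
truth = [circle_mask(shape, (50, 50), 20), circle_mask(shape, (140, 120), 35)]
pred = [circle_mask(shape, (50, 50), 20), circle_mask(shape, (20, 170), 10)]

tp, fp, fn = match_instances(truth, pred)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, fn, precision, recall)
```

Counting matches per instance rather than per pixel keeps small objects from being outweighed by large ones, which matters when object sizes vary widely within one class.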
I find the method well conceived and thought through, the data rigorously tested and clearly reported, and the manuscript well structured. In particular, I consider the balance between technical details in the main manuscript and the appendices well struck, which makes the manuscript very readable without omitting relevant information. The presented results generally support the findings and conclusions. I have only two suggestions: calculating additional scores and using an additional image dataset to test the approach (see specific comments below), which might allow for a better evaluation of some aspects of the segmentation performance of SAM/OrthoSAM. However, these are just suggestions, not concerns. Currently, the manuscript has many small figures; combining some of them into larger figures (e.g., Figures 10 and 11) might be helpful. Additionally, some minor/technical comments are included as in-line comments in the attached pdf.
In summary, I find the work of very high quality, with only a few minor points where the manuscript could be further improved. I suspect the authors will have no problems in addressing these points, and I look forward to seeing the manuscript published soon.
Kind regards,
David Mair (Uni Bern)
Specific comments:
References (including in-line comments):
Chen, Y., Bao, J., Chen, R., Li, B., Yang, Y., Renteria, L., Delgado, D., Forbes, B., Goldman, A. E., Simhan, M., Barnes, M. E., Laan, M., McKever, S., Hou, Z. J., Chen, X., Scheibe, T., & Stegen, J. (2024). Quantifying Streambed Grain Size, Uncertainty, and Hydrobiogeochemical Parameters Using Machine Learning Model YOLO. Water Resources Research, 60(11). https://doi.org/10.1029/2023WR036456
Huang, Y., Yang, X., Liu, L., Zhou, H., Chang, A., Zhou, X., Chen, R., Yu, J., Chen, J., Chen, C., Liu, S., Chi, H., Hu, X., Yue, K., Li, L., Grau, V., Fan, D. P., Dong, F., & Ni, D. (2024). Segment anything model for medical images? Medical Image Analysis, 92. https://doi.org/10.1016/j.media.2023.103061
Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A. C., Lo, W.-Y., Dollár, P., & Girshick, R. (2023). Segment Anything. http://arxiv.org/abs/2304.02643
Mair, D., Witz, G., Do Prado, A. H., Garefalakis, P., & Schlunegger, F. (2024). Automated detecting, segmenting and measuring of grains in images of fluvial sediments: The potential for large and precise data from specialist deep learning models and transfer learning. Earth Surface Processes and Landforms, 49(3), 1099–1116. https://doi.org/10.1002/esp.5755
Pachitariu, M., Rariden, M., & Stringer, C. (2025). Cellpose-SAM: superhuman generalization for cellular segmentation. https://doi.org/10.1101/2025.04.28.651001
Padilla, R., Netto, S. L., & da Silva, E. A. B. (2020). A Survey on Performance Metrics for Object-Detection Algorithms. 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), 237–242. https://doi.org/10.1109/IWSSIP48289.2020.9145130
Stringer, C., Wang, T., Michaelos, M., & Pachitariu, M. (2021). Cellpose: a generalist algorithm for cellular segmentation. Nature Methods, 18(1), 100–106. https://doi.org/10.1038/s41592-020-01018-x
Zegers, G., Hayashi, M., & Garcés, A. (2025). Distributed estimation of surface sediment size in paraglacial and periglacial environments using drone photogrammetry. Earth Surface Processes and Landforms, 50(7). https://doi.org/10.1002/esp.70093