This paper presents a new approach to estimating confidence intervals for grain size distributions. I agree with the authors’ overall message that we should consider confidence intervals when analysing grain size distributions, and think that this paper and the associated toolbox will be a useful prompt in getting researchers to do this. This is a revised paper, though I did not see the original version. Overall I found the paper to be fairly clear and easy to understand, though there were some areas (identified below) that might benefit from a bit more signposting.
One comment I had was about the assumptions made in the model. It is assumed that the probability of sampling grains larger than D50 is 0.5, with the statement that “half the surface grains are smaller” than D50. Neither is necessarily true. D50 is the median grain size sampled using whatever sampling technique is applied; however, if the grains are sampled on a grid, then more than half of the surface grains (by number) could be smaller than D50. The discrepancy arises because larger grains are more likely to be sampled than smaller ones: they take up more space, so there is a higher probability of a grid node or foot landing on them. In the case of equal numbers of large and small grains, the larger grains would occupy more than half the surface area and so would be more likely to be sampled. I don’t think that this sampling bias affects your analysis, but it needs more careful wording in the section around page 5, line 5. (See the Bunte and Abt 2001 technical report, section 4.3, for converting between distributions collected using different sampling techniques, e.g. grid to area.)
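To illustrate the point about grid-sampling bias, the following is a minimal simulation (my own sketch, not taken from the manuscript): grid sampling selects grains with probability roughly proportional to their exposed area (~D²), so the grid-based D50 can exceed the number-based median even for a simple two-size mixture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bed: equal numbers of small (10 mm) and large (50 mm) grains.
diameters = np.concatenate([np.full(5000, 10.0), np.full(5000, 50.0)])

# Number-based median sits exactly between the two classes.
number_median = np.median(diameters)  # 30.0 mm

# Grid sampling: the chance of a grid node landing on a grain is
# proportional to the area it occupies (~ D**2).
weights = diameters**2
weights /= weights.sum()
sample = rng.choice(diameters, size=400, p=weights)
grid_d50 = np.median(sample)

print(number_median, grid_d50)
```

Here the large grains cover 25 times the area per grain, so the vast majority of grid points land on them and the grid-based D50 is 50 mm, even though only half the grains by number are larger than that.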
From a quick look through the response to reviewers, it looks like the authors have worked on incorporating previous literature. There were some places where this could have been developed a bit further, for example by demonstrating the range of different recommendations currently in the literature. Could you also have compared the results of your bootstrapping with the findings of Rice and Church, rather than just saying that you used the same method?
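For concreteness, the kind of percentile-bootstrap confidence interval on D50 that such a comparison would involve can be sketched as follows (variable names and sample data are illustrative, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical pebble count: 100 b-axis measurements in mm.
grains = rng.lognormal(mean=np.log(30), sigma=0.6, size=100)

# Percentile bootstrap: resample the count with replacement, recompute
# D50 each time, and take the 2.5th/97.5th percentiles of the results.
n_boot = 2000
boot_d50 = np.empty(n_boot)
for i in range(n_boot):
    resample = rng.choice(grains, size=grains.size, replace=True)
    boot_d50[i] = np.median(resample)

lo, hi = np.percentile(boot_d50, [2.5, 97.5])
print(f"D50 = {np.median(grains):.1f} mm, 95% CI = [{lo:.1f}, {hi:.1f}] mm")
```

Reporting the interval width as a function of sample size would make the comparison with Rice and Church's recommendations direct.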
Comments by page/line:
1/15: Make it clear that the spreadsheet is available with this paper?
2/19: A problem with image-based analysis is that you don’t know whether you are seeing the b-axis, which could introduce a different bias into the data.
3/3: Can you demonstrate how different the results from the different empirical analyses are, e.g. in terms of % error in D50?
3/11: It might be helpful to summarise what Fripp and Diplas presented, e.g. the sample size suggested for a given level of precision.
3/17: Not clear who ‘they’ is referring to.
3/28: This issue of overlapping intervals that don’t include both estimates will not be unique to analysing grain size data. Why can’t we use methods that have been developed by other disciplines to address it?
4/13: Move bracket to before 1993.
5/16: It took me a couple of reads to get my head round this; it might be useful to add an additional statement (as you do later on) along the lines of ‘In the case that 60 stones are smaller than D50, then d60 = D50’.
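The binomial logic behind this point could be made explicit with a generic order-statistic sketch (this uses a normal approximation to the binomial and my own variable names, not the authors' exact formulation): for n grains, the count falling below the true D50 is Binomial(n, 0.5), so the order statistics at the binomial 2.5% and 97.5% quantiles bracket D50.

```python
import numpy as np

n = 100  # hypothetical sample size

# Normal approximation to Binomial(n, 0.5): mean n/2, sd sqrt(n)/2.
k_lo = int(np.floor(n / 2 - 1.96 * np.sqrt(n) / 2))  # ~40 for n = 100
k_hi = int(np.ceil(n / 2 + 1.96 * np.sqrt(n) / 2))   # ~60 for n = 100

rng = np.random.default_rng(1)
grains = np.sort(rng.lognormal(np.log(30), 0.6, size=n))

# The 40th and 60th sorted measurements bound an approximate 95% CI
# for D50 -- i.e. "if 60 stones are smaller than D50, then d60 = D50".
ci = (grains[k_lo - 1], grains[k_hi - 1])
print(ci)
```

Spelling the mapping out this way (count of stones → order statistic → grain size) would save readers the double take.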
7/3: After a clear overview, I wasn’t sure where these subsections (2.1.1 onwards) were going. Can you add a bit more signposting? In particular, I wasn’t sure what the aim of this paragraph was.
7/10: It seemed a bit odd to be referring to a confidence interval, when you hadn’t yet addressed how it was calculated.
7/13: I needed a bit more help with comparing the two different approaches in 2.1.2 and 2.1.3. The differences seem to be that 2.1.2 gives asymmetric intervals and is mapped to specific grain size measurements. 2.1.3 seems to be symmetric, and allows interpolation between grain size measurements. When would you use the different approaches? Could you have a version that was asymmetric, but allowed interpolation?
11/17: Change to ‘measurements’.
14/Fig.7: How have the polygons been calculated, i.e. which percentiles have confidence intervals been calculated for? Also, add something to the caption to explain why 22.6 mm is highlighted.
14/16: Some of this explanation was a bit confusing, because the earlier analysis only refers to comparisons between two GSDs, but there are three in Fig. 7. Are all the significance values for a three-way comparison? I was surprised that the text didn’t report more differences between the bars and the other two units as there seems to be almost no overlap in the figure.
18/9: In this line and the next, clarify that you are referring to the mean ϵ. It might be helpful to give the range as well.
18/12: Change to ‘wide range of’.

Thanks a lot for the revised manuscript. I have received two further reviews, one by an original reviewer and one by a new one. Both agree that the manuscript is useful and easy to read. They have a few more suggestions for improvements that I ask you to consider. I do not think that it will be necessary to send the paper out for review again.

Please prepare a revised manuscript and a rebuttal to the comments. I am looking forward to reading the paper again.

Thanks and best wishes, Jens Turowski