Theoretical and experimental work (Furbish et al., 2021a, b) indicates that the travel distances of rarefied particle motions on rough hillslope surfaces are described by a generalized Pareto distribution. The form of this distribution varies with the balance between gravitational heating, due to conversion of potential to kinetic energy, and frictional cooling, due to particle–surface collisions; it varies from a bounded form associated with rapid thermal collapse to an exponential form representing isothermal conditions to a heavy-tailed form associated with net heating of particles. The generalized Pareto distribution in this problem is a maximum entropy distribution constrained by a fixed energetic “cost” – the total cumulative energy extracted by collisional friction per unit kinetic energy available during particle motions. That is, among all possible accessible microstates – the many different ways to arrange a great number of particles into distance states where each arrangement satisfies the same fixed total energetic cost – the generalized Pareto distribution represents the most probable arrangement. Because this idea applies equally to the accessible microstates associated with net cooling, isothermal conditions and net heating, the fixed energetic cost provides a unifying interpretation for these distinctive behaviors, including the abrupt transition in the form of the generalized Pareto distribution in crossing isothermal conditions. The analysis therefore represents a novel generalization of an energy-based constraint in using the maximum entropy method to infer non-exponential distributions of particle motions. Moreover, the energetic costs of individual particle motions follow an extreme-value distribution that is heavy-tailed for net cooling and light-tailed for net heating. 
The relative contribution of different travel distances to the total energetic cost is reflected by the product of the travel distance distribution and the cost of individual particle motions – effectively a frequency–magnitude product.

In two companion papers (Furbish et al., 2021a, b) we examine a theoretical formulation of the probabilistic physics of rarefied particle motions and deposition on rough hillslope surfaces. The formulation is based on a description of the kinetic energy balance of a cohort of particles treated as a rarefied granular gas and a description of particle deposition that depends on the energy state of the particles. The formulation predicts a generalized Pareto distribution of particle travel distances whose form varies with the balance between gravitational heating, due to conversion of potential to kinetic energy, and frictional cooling, due to particle–surface collisions. Specifically, the generalized Pareto distribution varies from a bounded form associated with thermal collapse and rapid deposition to an exponential form representing isothermal conditions to a heavy-tailed form associated with net heating of particles and decreased deposition. The transition to a heavy-tailed form likely involves an increasing conversion of translational to rotational kinetic energy leading to larger travel distances with decreasing effectiveness of collisional friction. As described in Furbish et al. (2021b), these varying forms of the generalized Pareto distribution are consistent with laboratory measurements of particle travel distances reported by Gabet and Mendoza (2012) and Furbish et al. (2021b) and with field-based measurements of travel distances reported by DiBiase et al. (2017) and Roth et al. (2020).
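As a point of orientation, the three forms of the generalized Pareto distribution described above can be sketched numerically. The following is a minimal sketch rather than the formulation of Furbish et al. (2021a): it assumes the standard statistical parameterization of the distribution, and the names `a` (shape), `b` (scale), `gpd_pdf`, `gpd_survival` and `upper_bound` are hypothetical labels introduced here for illustration. Under this assumed convention a negative shape gives the bounded form, a zero shape gives the exponential form and a positive shape gives the heavy-tailed form.

```python
import math


def gpd_pdf(x, a, b):
    """Density of a generalized Pareto distribution (standard form).

    a is the shape and b > 0 is the scale; both names are assumptions
    made for this sketch, not notation taken from the paper.
    """
    if x < 0:
        return 0.0
    if a == 0:
        # Exponential form (isothermal conditions in the text).
        return math.exp(-x / b) / b
    base = 1.0 + a * x / b
    if base <= 0:
        # Beyond the upper bound -b/a, which exists only when a < 0.
        return 0.0
    return base ** (-1.0 / a - 1.0) / b


def gpd_survival(x, a, b):
    """Exceedance probability P(X > x) for the same parameterization."""
    if x < 0:
        return 1.0
    if a == 0:
        return math.exp(-x / b)
    base = 1.0 + a * x / b
    if base <= 0:
        return 0.0
    return base ** (-1.0 / a)


def upper_bound(a, b):
    """Largest attainable travel distance: finite only for a < 0."""
    return -b / a if a < 0 else math.inf


if __name__ == "__main__":
    for shape in (-0.5, 0.0, 0.5):
        print(shape, upper_bound(shape, 1.0), gpd_survival(1.0, shape, 1.0))
```

With a fixed scale, the exceedance probability at a given distance increases as the shape passes from negative to positive values, which mirrors the progression from rapid deposition to decreased deposition described above.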

Here we highlight a key point in Furbish et al. (2021a). Namely, the generalized Pareto distribution is not selected in an empirical manner based on goodness-of-fit criteria applied to data sets. Rather, this distribution is dictated by the physics of the problem, just as, for example, the Boltzmann distribution (an exponential distribution) emerges in classical statistical mechanics from consideration of the accessible energy microstates of a gas system. In this problem the versatile form of the generalized Pareto distribution – specifically its apparent success in describing three distinctive energetic behaviors of rarefied particle motions – is enigmatic. Although the different energetic behaviors have a clear mechanical explanation, the transition from a bounded form to a heavy-tailed form in crossing isothermal conditions is abrupt. The basis of this transition, including the upper bound on travel distances prior to transition, is unclear – whether it represents a fundamental change in mechanical behavior or is simply a mathematical curiosity of the generalized Pareto distribution.

The purpose of this third companion paper therefore is to further elaborate the probabilistic physics of particle motions as represented by the generalized Pareto distribution. To do this we appeal to the principle of maximum entropy as outlined in the pioneering work of Jaynes (1957a, b). We specifically demonstrate that in this problem the generalized Pareto distribution is a maximum entropy distribution constrained by a fixed total energetic “cost” – the total cumulative energy extracted by collisional friction per unit kinetic energy available during particle motions. The relative energetic cost locally increases with increasing travel distance for net particle cooling and rapid thermal collapse, it is uniform for isothermal conditions, and it decreases with increasing travel distance for net particle heating. The cumulative cost involves integrating the local cost over the particle travel distance, and the total cumulative cost is then obtained by summing over all particles. This fixed total cost unifies the interpretation of the three energetic behaviors, where the upper bound on travel distances prior to transition is a probabilistic mechanical outcome.
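The qualitative behavior of the local and cumulative costs described above can be illustrated with a short sketch. Here we assume, for illustration only, that the local relative cost takes the form of the hazard function of a generalized Pareto distribution in its standard parameterization, 1/(a x + b), with hypothetical names `a` (shape) and `b` (scale); the functions `local_cost_rate` and `cumulative_cost` are likewise introduced here and do not reproduce the numbered equations of the paper.

```python
import math


def local_cost_rate(x, a, b):
    """Assumed local relative cost at distance x: 1 / (a*x + b).

    This is the hazard function of the standard generalized Pareto
    distribution; treating it as the local cost is an assumption.
    """
    denom = a * x + b
    if denom <= 0:
        raise ValueError("x lies at or beyond the upper bound -b/a")
    return 1.0 / denom


def cumulative_cost(x, a, b):
    """Integral of local_cost_rate from 0 to x (closed form)."""
    if a == 0:
        # Uniform local cost: the cumulative cost grows linearly.
        return x / b
    arg = 1.0 + a * x / b
    if arg <= 0:
        raise ValueError("x lies at or beyond the upper bound -b/a")
    return math.log(arg) / a


if __name__ == "__main__":
    for shape in (-0.5, 0.0, 0.5):
        rates = [local_cost_rate(x, shape, 1.0) for x in (0.0, 0.5, 1.0)]
        print(shape, rates, cumulative_cost(1.0, shape, 1.0))
```

Under these assumptions the local rate increases with distance for a negative shape, is uniform for a zero shape and decreases with distance for a positive shape. The exponential of the negative cumulative cost recovers the exceedance probability of the standard generalized Pareto distribution, which is a general property of any hazard function.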

As a point of reference, the canonical example of a maximum entropy distribution is the Boltzmann distribution of the energy states of the particles composing an ordinary gas at thermal equilibrium. Similarly, the Maxwell–Boltzmann distribution of particle speeds, which is derived from the Boltzmann distribution, is a maximum entropy distribution. Here we are referring to the Gibbs entropy of statistical mechanics. A maximum entropy distribution then is the unique distribution that maximizes the Gibbs entropy, subject to constraints imposed on the system. In the canonical case these constraints consist of a fixed number of particles and a fixed total energy, which together guarantee a fixed average energy equal to

Jaynes (1957a, b) elaborated the significance of the fact that the Gibbs entropy in statistical mechanics and the Shannon entropy in information theory are essentially one and the same, differing only by a constant. This similarity inspired Jaynes to champion the use of a maximum entropy criterion in choosing a probability distribution, leading to what is now known as the maximum entropy method (a.k.a. MaxEnt or MEM). The key idea of the maximum entropy method, whether viewed as a method of statistical mechanics or as one of inferential statistics, is that it provides an unbiased choice of a distribution by honoring only what is known mechanically about a system. That is, this unbiased choice is a maximally noncommittal choice that is faithful to what we do not know; it is therefore the most reasonable choice in the absence of additional information (Jaynes, 1957a; Williamson, 2010, pp. 25 and 51). Importantly, mechanical constraints imposed on the system are part of the choice of the distribution, as opposed to empirical fitting without regard to such constraints. The maximum entropy method has been applied in a remarkable variety of fields (Shore and Johnson, 1980; Ramirez and Carta, 2006; Verkley and Lynch, 2009; Singh, 2011; Peterson et al., 2013), including sediment transport (Furbish and Schmeeckle, 2013; Furbish et al., 2016).

In using the maximum entropy method, constraints imposed on the system normally translate to constraints imposed on the moments of the distribution. In this case the method leads to a distribution that is among the exponential family (e.g., exponential, Gaussian). However, applications of the maximum entropy method to non-exponential distributions, including heavy-tailed distributions, are of particular interest in many problems (Peterson et al., 2013). As described below, applying this method to heavy-tailed distributions presents a special challenge in that the first or second moment, or both of these moments, may be undefined for such distributions, including the generalized Pareto distribution (Pickands, 1975; Hosking and Wallis, 1987).
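The difficulty with moments can be made concrete with a small sketch. It assumes the standard parameterization of the generalized Pareto distribution, for which the well-known results (e.g., Hosking and Wallis, 1987) are that the mean is finite only for a shape less than 1 and the variance is finite only for a shape less than 1/2. The names `a`, `b`, `gpd_mean` and `gpd_variance` are hypothetical and introduced only for this illustration.

```python
import math


def gpd_mean(a, b):
    """Mean of the standard generalized Pareto distribution.

    Finite only when the shape a < 1; otherwise the defining integral
    diverges, which is reported here as math.inf.
    """
    return b / (1.0 - a) if a < 1 else math.inf


def gpd_variance(a, b):
    """Variance of the standard generalized Pareto distribution.

    Finite only when the shape a < 1/2; reported as math.inf otherwise.
    """
    if a < 0.5:
        return b ** 2 / ((1.0 - a) ** 2 * (1.0 - 2.0 * a))
    return math.inf


if __name__ == "__main__":
    for shape in (-1.0, 0.0, 0.25, 0.5, 1.0):
        print(shape, gpd_mean(shape, 1.0), gpd_variance(shape, 1.0))
```

A constraint expressed as a fixed mean or a fixed variance is therefore unavailable over part of the heavy-tailed range, which is why a different kind of constraint is needed there.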

In Sect. 2 we provide background material, namely, the essential elements of the formulation of Furbish et al. (2021a) leading to the generalized Pareto distribution of particle travel distances, and a summary of the properties and derivation of a maximum entropy distribution. In Sect. 3 we describe how the energetic cost associated with collisional friction is expressed as a constraint used in the maximization method. In Sect. 4 we show how the generalized Pareto distribution is obtained as a maximum entropy distribution. In Sect. 5 we describe the probabilistic properties and significance of the energetic cost. We consider the implications of the analysis in the final section. In the fourth companion paper (Furbish and Doane, 2021) we step back and examine the philosophical underpinning of the statistical mechanics framework for describing sediment particle motions and transport.

With reference to Fig.

Definition diagram of surface inclined at angle

The particle energy balance formulated in Furbish et al. (2021a) leads to the result that for a given particle size and shape the disentrainment rate on an inclined surface with uniform slope and roughness is

In mechanical terms the shape and scale parameters

For plotting purposes we define a characteristic particle cooling distance

Plot of dimensionless probability density

We note that the definition of the differential entropy given in the next section involves the logarithm of the probability density function. In a strict sense this is acceptable only if the density is expressed in dimensionless form as in Eq. (13) or if the definition involves a discrete probability mass function. Nonetheless, the maximization method removes this logarithm such that the outcome is dimensionally the same whether one starts with the dimensional form or the dimensionless form of the density. For simplicity we use the dimensional form, Eq. (4). In addition, for simplicity in plotting we set the scale parameter

Following Furbish et al. (2021b) we calculate the quantities

Plot of modified exceedance probability

If

As a point of reference, a fixed mean with

The canonical example of the Boltzmann distribution of particle energy states is obtained in this manner as a maximum entropy distribution, where the mean is independently determined to be

Our next task is to adapt these ideas to the generalized Pareto distribution, which is not among the exponential family of distributions. We note that there is a continuing effort given to this topic, notably in relation to heavy-tailed (non-exponential) distributions. Peterson et al. (2013) summarize the basis of these efforts and note that one approach for inferring non-exponential distributions is to appeal to nontraditional definitions of the entropy, for example, the Tsallis entropy (Tsallis, 1988), rather than the canonical BGS entropy. The procedure is the same: to maximize the defined entropy subject to an extensive constraint that scales with the system size. Here, however, we adopt the view of Peterson et al. (2013), who highlight the conclusions of Shore and Johnson (1980). Namely, because the BGS definition of entropy uniquely ensures addition and multiplication rules of probability, any other definition of entropy yields a bias in the fitting of data. Peterson et al. (2013) suggest that this offers a “compelling first-principles basis for defining a proper variational principle for modeling distribution functions”. Like these authors in their analysis of the energetics associated with the economics of scale, we retain the BGS definition of entropy and seek a non-extensive energy constraint aligned with the mechanics of the rarefied particle motion problem.

In the canonical example of the Boltzmann distribution, the particle energy state is an instantaneous quantity. Similarly, in the example of bed load particle velocities (Furbish and Schmeeckle, 2013; Furbish et al., 2016), the velocity state is an instantaneous quantity. The state of a particle changes from one instant to the next, and this state can be reached from smaller or larger state values. In these cases, the total particle energy and the total streamwise momentum are well-defined extensive quantities such that the moments of the distributions are fixed. In the absence of additional information, the maximum entropy distribution must be among the exponential family.
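The claim that a fixed mean alone selects the exponential distribution can be checked numerically. The sketch below compares the differential entropy of three densities on the non-negative real line that share the same mean. The comparison densities (uniform and half-normal), the value of `MEAN` and the function names are choices made here for illustration and are not part of the original analysis; the entropy is approximated with a simple midpoint rule.

```python
import math

MEAN = 1.0  # common mean imposed on all three candidate densities (assumed value)
SIGMA = MEAN * math.sqrt(math.pi / 2.0)  # half-normal scale giving that mean


def exponential(x):
    """Exponential density with mean MEAN."""
    return math.exp(-x / MEAN) / MEAN


def uniform(x):
    """Uniform density on [0, 2*MEAN], which also has mean MEAN."""
    return 1.0 / (2.0 * MEAN) if x <= 2.0 * MEAN else 0.0


def half_normal(x):
    """Half-normal density whose mean SIGMA*sqrt(2/pi) equals MEAN."""
    return math.sqrt(2.0 / math.pi) / SIGMA * math.exp(-x * x / (2.0 * SIGMA ** 2))


def differential_entropy(pdf, upper, n=50_000):
    """Approximate -integral of f ln f over [0, upper] by the midpoint rule."""
    dx = upper / n
    total = 0.0
    for i in range(n):
        f = pdf((i + 0.5) * dx)
        if f > 0.0:  # cells with zero density contribute nothing
            total -= f * math.log(f) * dx
    return total


if __name__ == "__main__":
    for name, pdf in (("exponential", exponential),
                      ("half-normal", half_normal),
                      ("uniform", uniform)):
        print(name, differential_entropy(pdf, 40.0))
```

Among these candidates the exponential density yields the largest entropy, as expected when the mean is the only constraint. This is a numerical illustration for three particular densities and not a proof of the variational result.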

In contrast to instantaneous quantities, the particle travel distance

The disentrainment rate

In turn, the relative energy extracted within a small interval

Plot of cumulative energetic cost

Consider first the isothermal case to illustrate the significance of the cost

With non-isothermal conditions and net heating, it is easier to achieve larger state values than with isothermal conditions. Among all accessible microstates, an increasing proportion will have particles in larger states than would be predicted with a uniform cost rate. In contrast, with net cooling a smaller proportion of microstates will have particles in large states

The energetic cost

Focusing on the generalized Pareto distribution, as above we start with the constraint given by

Starting with isothermal conditions (

For isothermal conditions, using Eqs. (27) and (30) maximization leads to (Appendix A)

More generally, using Eqs. (27) and (31) maximization leads to (Appendix A)

Because of the importance of the energetic cost as a constraint in the maximum entropy method, here we examine the properties of this cost. The cumulative energetic cost

Whereas the generalized Pareto distribution of travel distances

Plot of probability density

Plot of mean energetic cost

The total cumulative cost

Plot of the total cost per unit travel distance

The energetic cost outlined above pertains to the conversion of translational kinetic energy into other forms, including rotational energy, surface deformation and heat – all under the heading of collisional friction. This cost, however, is not the same as the total energy conversion to heat.

Consider the total energy extracted by friction and ultimately converted to heat. Note first that the quantity

This result offers an example of how application of the maximum entropy method can be misleading. Namely, suppose we assume that a total fixed quantity of heat generated by particle motions, because this is an energetic “cost”, provides a constraint on the maximization procedure. In this situation, and with no further constraints, the maximum entropy method leads to an exponential distribution

Let us acknowledge that a distribution identified as a maximum entropy distribution based on empirically constraining one or more of its moments is not necessarily a special outcome. For example, we frequently fit data to exponential and Gaussian distributions based on estimates of the mean and variance of these distributions – assuming these moments exist and are finite – without reference to maximum entropy. In other words, asserting that a random variable possesses a finite expected value (mean or variance) and then using this assertion to choose the distribution based on the maximum entropy method has no meaningful mechanical significance if the mechanical basis of the constraint is not specified. In this situation a maximum entropy criterion is just one among numerous inferential methods – albeit with the decided merit of being maximally indifferent in the choosing of the distribution. Only when the constraining moment has independent mechanical meaning, and in the absence of additional information, does the label of maximum entropy carry mechanical significance. The example of heat states

For example, Furbish et al. (2016) suggest the following:

In focusing on the mechanical side of the duality of Jaynes's principle [of maximum entropy], it becomes important to distinguish between a “strong” mechanical constraint, a “weak” mechanical constraint, and an empirical constraint, as these inform confidence in the resulting choice of a distribution … A strong mechanical constraint is one that derives directly from a dynamics argument … A weak constraint is one that derives from a mechanical definition, for example, an appeal to mass conservation … An empirical constraint is one that appeals to our confidence in suggesting a general behavior from experiments or dimensional analysis but lacks a clear dynamics underpinning.

For rarefied bed load particles transported under equilibrium conditions, Furbish et al. (2016) show that the condition of fixed total particle momentum provides a strong mechanical constraint. In this situation the maximum entropy method predicts an exponential distribution of particle velocities in the absence of any additional mechanical information – consistent with measurements of particle velocities based on high-speed imaging (e.g., Lajeunesse et al., 2010; Roseberry et al., 2012; Furbish and Schmeeckle, 2013; Fathel et al., 2015; Wei et al., 2015). We suggest that the total cumulative energetic cost used herein to constrain the maximum entropy method similarly represents a strong mechanical constraint.

As a point of reference, the analysis presented herein is akin to the energetics associated with the economics of scale as examined by Peterson et al. (2013). To illustrate this idea we start with a binomial expansion of the disentrainment rate, Eq. (3), to give

When rearranged, the “cost-minus-benefit” function proposed by Peterson et al. (2013) yields a cost function (Appendix C) whose form is identical to that of the disentrainment rate, Eq. (3). In the economics of scale problem the costs are nominally absolute energetic costs. In the problem of rarefied particle motions the cost function (i.e., the disentrainment rate) represents the local relative energetic cost. Nonetheless, the formalism involving a fixed total cumulative cost is essentially the same. With net particle heating it becomes easier for particles to achieve larger states

In this problem the maximum entropy method in effect considers all possible accessible microstates – the many different ways to arrange a great number of particles into distance states

Here we return to Eq. (2), the standard formulation of the probability density

We now have the interesting result that, for this problem, determining the distribution

The analysis presented here represents an unusual situation. Namely, the generalized Pareto distribution of travel distances and its parametric values are known a priori, and this distribution is then shown to be a maximum entropy distribution consistent with the constraint imposed by a fixed energetic cost. In contrast, normally the distribution is not known and the maximum entropy method is used to choose the distribution in an unbiased manner based on known constraints – as exemplified by the Boltzmann distribution. As emphasized by many, starting with Jaynes (1957a), the maximum entropy method represents a compelling strategy for choosing a distribution. Nonetheless, it is important to highlight the fact that a distribution thus chosen is not necessarily the “correct” distribution (Furbish et al., 2016). Rather, a distribution derived from a maximum entropy criterion is unbiased in that it is faithful to what is known mechanically, but no more; it is the most reasonable choice in the absence of additional information. In this sense the maximum entropy method is a formal application of Occam's razor – an explanation involving the fewest possible assumptions. Thus, the value of showing that the generalized Pareto distribution is a maximum entropy distribution is this: the analysis represents a novel generalization of an energy-based constraint in using the maximum entropy method to infer non-exponential distributions – to include the versatile properties (forms) of the generalized Pareto distribution as applied to the rarefied particle motion problem. Importantly, the analysis uses the BGS definition of entropy rather than a nontraditional definition. We suggest that this result offers promise for examining particle motions in other systems, including particles transported as bed load, where insights involving particle energetics might become useful as we learn more about the physics involved.

The maximization method involves the calculus of variations (Cover and Thomas, 1991), of which a version closer to the original analysis of Boltzmann is presented in Furbish and Schmeeckle (2013) and Furbish et al. (2016). Using the BGS definition of entropy given by Eq. (15) together with the constraints

For isothermal conditions with

Let

For isothermal conditions (

For non-isothermal conditions (

The total cumulative cost

The total cumulative cost

Plot of total cumulative cost

Peterson et al. (2013) focus on discrete systems where the state

the quantity on the left side of Eq. (C1) is the total cost-minus-benefit when a particle joins a

The data plotted in Fig.

DJF wrote the paper with critical review and input from SGWW and THD.

The authors declare that they have no conflict of interest.

We appreciate continuing discussions with Peter Haff regarding entropy in Earth-surface systems. Nakul Deshpande offered useful reactions to an earlier draft. We appreciate reviews of our work provided by Joris Heyman and an anonymous referee.

This research has been supported by the National Science Foundation (grant nos. EAR-1420831 and EAR-1735992).

This paper was edited by Eric Lajeunesse and reviewed by Joris Heyman and one anonymous referee.