Rarefied particle motions on hillslopes – Part 4: Philosophy

Theoretical and experimental work (Furbish et al., 2021a, b, c) indicates that the travel distances of rarefied particle motions on rough hillslope surfaces are described by a generalized Pareto distribution. The form of this distribution varies with the balance between gravitational heating due to conversion of potential to kinetic energy and frictional cooling by particle–surface collisions. The generalized Pareto distribution in this problem is a maximum entropy distribution constrained by a fixed energetic “cost” – the total cumulative energy extracted by collisional friction per unit kinetic energy available during particle motions. The analyses leading to these results provide an ideal case study for highlighting three key elements of a statistical mechanics framework for describing sediment particle motions and transport: the merits of probabilistic versus deterministic descriptions of sediment motions, the implications of rarefied versus continuum transport conditions, and the consequences of increasing uncertainty in descriptions of sediment motions and transport that accompany increasing length scales and timescales. We use the analyses of particle energy extraction, the spatial evolution of particle energy states, and the maximum entropy method applied to the generalized Pareto distribution as examples to illustrate the mechanistic yet probabilistic nature of the approach. These examples highlight the idea that the endeavor is not simply about adopting theory or methods of statistical mechanics “off the shelf” but rather involves appealing to the style of thinking of statistical mechanics while tailoring the analysis to the process and scale of interest. Under rarefied conditions, descriptions of the particle flux and its divergence pertain to ensemble conditions involving a distribution of possible outcomes, each realization being compatible with the controlling factors. When these factors change over time, individual outcomes reflect a legacy of earlier conditions that depends on the rate of change in the controlling factors relative to the intermittency of particle motions. The implication is that landform configurations and associated particle fluxes reflect an inherent variability (“weather”) that is just as important as the expected (“climate”) conditions in characterizing system behavior.


Introduction
In three companion papers (Furbish et al., 2021a, b, c) we examine a theoretical formulation of the probabilistic physics of rarefied particle motions and deposition on rough hillslope surfaces. As noted by Furbish et al. (2021a), such motions include the ravel of particles following disturbances (Roering and Gerber, 2005;Doane et al., 2019;Roth et al., 2020) or release from obstacles (e.g., vegetation) follow-interactions dominate over effects of particle-particle interactions in determining the behavior of the particles -akin to granular shear flows at high Knudsen number (Risso and Cordero, 2002;Kumaran, 2005Kumaran, , 2006. The formulation  is based on a description of the kinetic energy balance of a cohort of particles treated as a rarefied granular gas and a description of particle deposition that depends on the energy state of the particles. The formulation predicts a generalized Pareto distribution of particle travel distances whose form varies with the balance between gravitational heating due to conversion of potential to kinetic energy and frictional cooling by particlesurface collisions. Specifically, the generalized Pareto distribution varies from a bounded form associated with thermal collapse and rapid deposition to an exponential form representing isothermal conditions to a heavy-tailed form associated with net heating of particles and decreased deposition. As described in Furbish et al. (2021b), these varying forms of the generalized Pareto distribution are consistent with laboratory measurements of particle travel distances reported by Gabet and Mendoza (2012) and Furbish et al. (2021b), as well as with field-based measurements of travel distances reported by DiBiase et al. (2017) and Roth et al. (2020). Moreover, as described in Furbish et al. (2021c), the generalized Pareto distribution in this problem is a maximum entropy distribution (Jaynes, 1957a, b) constrained by a fixed energetic "cost" -the total cumulative energy extracted by collisional friction per unit kinetic energy available during particle motions. That is, among all possible accessible microstatesthe many different ways to arrange a great number of particles into distance states where each arrangement satisfies the same fixed total energetic cost -the generalized Pareto distribution represents the most probable arrangement. The analyses of rarefied particle motions in these companion papers collectively provide an ideal case study for highlighting key elements of a statistical mechanics framework for describing sediment particle motions and transport. Indeed, as noted in the second companion paper : We enjoy eating our favorite tortilla chips, and mostly we enjoy them with a well-prepared dip, for example, spicy guacamole. But . . . [t]he experience then is no longer about the chips -it's about the dip. The chips are just the guacamole delivery system . . . Similarly, these companion papers nominally concern particle motions on inclined rough surfaces. But these particles are just the delivery system. The dip consists of the coherent statistical mechanics framework for describing the particle motions, and a demonstration that such a framework . . . is possible. This represents a solid basis for . . . fresh ideas concerning particle motions more generally.
To wit, the purpose of this fourth companion paper is to elaborate the italicized (added) part of the paragraph above. Specifically, we consider three framework elements: (1) the purpose and merits of probabilistic versus deterministic descriptions of particle motions, (2) the implications of rarefied versus continuum transport conditions in defining the particle flux and its divergence, and (3) the consequences of increasing uncertainty in descriptions of sediment transport that accompany increasing length scales and timescales.
We suggest that the timing is ideal for offering perspective on these elements of a statistical mechanics framework. Amidst echoes from the pioneering work of Einstein (1937) on bed load particle motions and that of Culling (1963) on hillslope soil creep, there is a reemerging interest in probabilistic descriptions of sediment motions and transport. For example, recent efforts involve descriptions of (1) bed load particle motions and transport; (2) bed load tracer particle motions, including effects of particle-bed exchanges; (3) nonlocal sediment transport on hillslopes; (4) particle motions in soils, including tracer particles; and (5) rain splash transport (Appendix A). (We also note that there is a parallel interest in describing the statistical physics of relatively dense granular materials, e.g., Bi et al., 2015.) However, this effort is a patchwork of approaches and methods, and to date it mostly has involved kinematic descriptions of sediment motions and transport with limited elucidation of the associated mechanics. We believe that it is important for the philosophical underpinning of this growing effort to be part of the conversation, adding to recent perspectives on bed load transport offered by Ancey (2020a, b). This includes attention to commonalities in the formalism used to describe transport in different settings, for example, in relation to transport on hillslopes and within rivers. The conversation also must include an honest assessment of the expectations and limitations of probabilistic descriptions of transport.
In Sect. 2 we summarize key material from the three companion papers for reference in later sections. In Sect. 3 we step back and consider in general terms the philosophical basis of a statistical mechanics framework for describing sediment motions and transport. In Sect. 4 we return specifically to the problem of rarefied particle motions on hillslopes and use the analysis to illustrate elements of the framework. In Sect. 5 we consider implications of the statistical mechanics description of rarefied particle motions on hillslopes. In several sections we provide historical background on the technical material covered.

Background
With reference to material in Furbish et al. (2021a, b, c), the problem of describing rarefied particle motions on hillslopes is motivated by the entrainment forms of the flux and the Exner equation. Namely, let f r (r; x, t) denote the probability density function of the travel distances r of particles whose motions start at x, and let R r (r; x, t) denote the associated exceedance probability function. Assuming motions are only in the positive x direction and noting that x = x − r, the particle volumetric flux q(x, t) may be written as (Furbish and Roering, 2013;Furbish et al., 2017) q where E s (x, t) denotes the particle volumetric entrainment rate at position x and time t. In turn, letting ζ (x, t) denote the local land-surface elevation, the entrainment form of the Exner equation is (Tsujimoto, 1978;Nakagawa and Tsujimoto, 1980) where c s = 1 − φ s is the particle volumetric concentration of the surface with porosity φ s . The central elements of Eqs. (1) and (2) are the probability density function f r (r; x, t) and the associated exceedance probability function R r (r; x, t). These are related to the disentrainment rate function defined by P r (r; x, t) = f r (r; x, t) R r (r; x, t) , which, when multiplied by dr, is interpreted as the probability that a particle will become disentrained within the small interval r to r + dr, given that it "survived" travel to the distance r. The disentrainment rate, Eq. (3), connects the descriptions of the flux and the rate of change in the landsurface elevation, Eqs. (1) and (2), to the physics of particle motions and disentrainment. We also note that Eqs. (1) and (2) are nonlocal in form and scale independent. The appearance of time t in Eqs. (1) and (2) is examined further below. Suffice it to say here that this is in reference to time variations associated with ensemble expected values of the flux q(x, t) and the rateζ (x, t) or to variations in appropriately defined time averages of these quantities (Furbish and Haff, 2010). In addition we are neglecting particle travel times throughout. For reference below the left side of Eq. (2) also may be written in terms of the divergence form of the Exner equation, namely, c sζ (x, t) = c s ∂ζ (x, t)/∂t = −∂q(x, t)/∂x when the flux q(x, t) is specified by Eq. (1) (Appendix B).
The analysis presented in Furbish et al. (2021a) describes the mechanical basis of the disentrainment rate P r (r; x, t) and the associated probability distribution f r (r; x, t). This involves a consideration of the kinetic energy balance of rarefied particle motions and how this balance determines the deposition of particles in relation to their energy state. In particular the analysis leads to the result that for a given particle size and shape the disentrainment rate on an inclined surface with uniform slope and roughness is P r (r; x, t) = 1 Ar + B , which in turn leads to the generalized Pareto distribution, where A ∈ R is a shape parameter and B > 0 is a scale parameter (Pickands, 1975;Hosking and Wallis, 1987) (Fig. 1). The cumulative distribution is and the exceedance probability is For A < 1 the mean is and for A < 1/2 the variance is .
The mean is undefined for A ≥ 1 and the variance is undefined for A ≥ 1/2. Note that because the density f r (r; x, t) may vary (slowly) with time t, the parameters A and B also may vary with time.
In mechanical terms the shape and scale parameters A and B are A = α γ S µ − 1 + 1 α (γ − 1) and (10) Here, S is the magnitude of the slope inclined at an angle θ , m is particle mass, g is acceleration due to gravity, µ is a friction factor due to extraction of particle kinetic energy E p = (m/2)u 2 where u is the surface-parallel particle velocity, E a = E p is the arithmetic average particle energy so that E a0 is the initial average energy at r = 0, γ = E a /E h where E h is the harmonic average particle energy, and α = α 0 /(1 − µ 1 Ki) where α 0 and µ 1 are factors of order unity and Ki is the Kirkby number defined by which represents the ratio of gravitational heating to frictional cooling. Here we emphasize that mg cos θ in Eq. (11) Figure 1. Plot of probability density f r (r; x) versus travel distance r for scale parameter B = 1 and different values of the shape parameter A for (a) A < 0 and (b) A ≥ 0 with associated exceedance probability plots (insets). Figure reproduced from companion paper (Furbish et al., 2021a). Compare with Fig. 1 in Hosking and Wallis (1987).
is not to be interpreted as the static normal weight of the particle, and µ is not interpreted as a Coulomb-like friction coefficient. Rather, µ ∼ β x , where β x denotes the expected proportion of particle kinetic energy extracted per particlesurface collision during downslope motion. Details are provided in Furbish et al. (2021a, b). Following Furbish et al. (2021b) we calculate the quantities R * = R A r and r * = Based on Eq. (7), values of the modified exceedance probability R * and the dimensionless travel distance r * should collapse to a straight line in a log-log plot with slope of −1 (Fig. 2). The data in this figure, spanning more than 3 orders of magnitude of the dimensionless travel distance r * , are compiled from Furbish et al. (2021b;Fig. 16 therein). Values of A and B are estimated from laboratory measurements of particle travel distances reported by Gabet and Mendoza (2012) and Furbish et al. (2021b) and from fieldbased measurements of travel distances reported by DiBiase et al. (2017) and Roth et al. (2020). This plot does not prove, but nonetheless supports, the idea that the generalized Pareto distribution correctly describes the energetics of the behavior of rarefied particle motions for a variety of slope and surface roughness conditions. The data fits for individual experiments with detailed explanation are presented in Furbish et al. (2021b). We refer to elements of the analysis summarized here in several sections below. Meanwhile we turn to the philosophy of the statistical mechanics framework.

Philosophy of the statistical mechanics framework
Although our companion papers are focused on rarefied particle motions on hillslopes, here we purposefully step back and initially consider the broader topic of probabilistic descriptions of sediment motions and transport. Borrowing ideas and wording from a book in preparation, we briefly consider elements of three foundational concepts in the natu-ral sciences that repeatedly appear in the study of complex systems, focusing here on sediment systems: (1) the relation between mechanistic descriptions of sediment behavior and probabilistic versus deterministic formulations of this behavior, (2) differences in rarefied versus continuum conditions;, and (3) the treatment of uncertainties in descriptions of system behavior that grow with increasing length scales and timescales considered. The point of this brief overview is to ask ourselves, at least momentarily, to step out of our comfort zones as informed and conditioned by our different backgrounds in, say, particle mechanics, continuum mechanics, probability and statistics, and, by the length scales and timescales with which we are most familiar. Following this overview, we return to the problem of rarefied particle motions and use it to illustrate the philosophical points involved.

Probabilistic versus deterministic descriptions
In learning how to describe the behavior of mechanical systems, mostly we are initially exposed to deterministic examples. We study Newton's laws as these pertain to simple particle systems and then move on to the behavior of solids and fluids treated as continuous materials, wrapping our heads around Lagrangian versus Eulerian perspectives. The formalism is unambiguous, and describing the behavior of a well-constrained system is in principle straightforward. Indeed, much of the legacy of geophysics resides in the determinism of continuum mechanics. Perhaps it is therefore natural that we might envision that a mechanistic description of the behavior of a system implies that such a description ought to be, or perhaps only can be, a deterministic one. Such a perception represents a lost opportunity. The most elegant counterpoint example is the field of classical statistical mechanics -devoted specifically to the probabilistic (i.e., non-deterministic) treatment of the behavior of gas particle systems in order to justify the principles of thermodynamics -yet which is no less mechanical in its conceptualization of this behavior than, say, the application of Newton's laws to the behavior of a deterministic system consisting of the interactions of a few billiard balls or involving the motion of a Newtonian fluid subject to specific initial and boundary conditions.
Once steeped in the language of mechanics, we understandably take comfort in mechanistic descriptions of system behavior. Specifically, we invest trust in the underlying foundation, and implied rigor, provided by classical mechanics. This is a good thing. But given the complexity and the uncertainty in describing the behavior of sediment systems, here it is essential to consider the idea that the concepts and language of probability are well suited to the problem of describing this behavior -precisely because of the complexity and uncertainty involved. This involves relaxing our expectations, for example, that a deterministic-like relationship exists between the flux of bed load sediment and the fluid stress imposed on the streambed or between the flux of sed-iment on a hillslope and the local land-surface slope -particularly when these involve noise-driven processes, as described below. This idea of leaning on probability to describe the behavior of sediment systems is not as straightforward as describing the behavior of idealized gas particle systems. Nonetheless, the objective is the same: to be mechanistic, yet probabilistic. These worldviews are entirely compatible.
To be sure, the extent to which the tools of probability can be fruitfully brought to bear to characterize particle motions and transport varies with the specific process considered and the information we have available to constrain any particular probabilistic description of motions. For example, we know far more about the probabilistic qualities of bed load sediment transport in shear flows based on flume experiments than, say, soil particle transport and mixing associated with bioturbation and granular creep (Appendix A). The objective therefore is to aim at probabilistic descriptions of sediment particle motions and transport that lean on the style of thinking of statistical mechanics, recognizing that this endeavor is not simply about adopting established theory or methods "off the shelf". Rather, such efforts involve tailoring descriptions of transport to the process, the scales of interest, and the techniques of observation and measurement used. The examples covered below illustrate these points.

Rarefied versus continuum conditions
The continuum hypothesis -the essential basis of continuum mechanics -stands as a triumph of the physical sciences. (Let us be clear that we are referring to the version of this hypothesis as applied to descriptions of real material systems rather than to the related mathematical idea posed by Georg Cantor, that there is no set of numbers whose size falls between the two infinities associated with the natural numbers and the real numbers.) This hypothesis allows us to envision many solid and fluid materials at our ordinary macroscopic scale of observation as being continuous things whose properties and behavior can be described using that part of the calculus given to continuously differentiable functions -even though when we focus our attention on the scale of the elements of a "continuous" material, that is, at the particle scale, we discover that it is decidedly discontinuous. Indeed, many of the definitions of basic, familiar quantities describing the properties, rheology and motion of real materials -their intensive properties, thermodynamic state variables, rheological coefficients, discharges and fluxes, the divergence of these fluxes, etc. -at the outset assume continuous substances and continuum behavior that involve smooth changes with respect to space and time. That said, this lovely continuum siren is to be avoided as a de facto starting point in descriptions of sediment motions and transport. Many sediment particle motions on Earth's surface are patchy, intermittent and demonstrably rarefied (Furbish et al., , 2018cAncey, 2020a, b) -conditions that are at odds with continuum formulations of these motions. For these reasons an appropriate strategy involves constructing descriptions of the collective behavior of sediment particles without assuming a continuum behavior at the outset. Indeed, precise definitions of the sediment particle flux and its divergence do not assume continuum conditions Furbish et al., 2012aFurbish et al., , 2016bFurbish et al., , 2017Ancey and Pascal, 2020). Instead, the idea is to develop more general (probabilistic) formulations of this behavior and then ask the following question: under what conditions does a continuum formulation of behavior make sense? As a point of reference, when continuum behavior is assumed at the outset, the Navier-Stokes momentum equation is derived from the Cauchy momentum equation. But when viewed with respect to particle (molecular) behavior, the Navier-Stokes equation is derived from the Boltzmann equation -which is decidedly probabilistic and entirely agnostic to continuum versus rarefied conditions. That is, the Boltzmann equation is equally applicable to both conditions. If the continuum hypothesis is satisfied, then it is natural to adopt the Navier-Stokes formalism. On the other hand, rarefied conditions must be treated probabilistically using methods of statistical mechanics. As described below with respect to sediment particles, this can include the use of continuum-like equations -noting that "continuum-like" means continuously differentiable, not that the particles behave as a continuum, and also noting that such equations apply to ensemble expected conditions, not to individual realizations. This means that we must be careful in interpreting the use of continuous probability distributions and related functions to describe attributes of particle motions (e.g., entrainment rates, travel distances) as in Eqs. (1) and (2).
One of the most important consequences of rarefied transport conditions is this: one cannot expect to predict a welldefined single value of the particle flux from specified, fixed controlling factors. Even under the ideal circumstances of a "perfect" model of the particle flux, such a prediction must be probabilistic. That is, a given set of controlling factors yields a probability distribution of fluxes rather than a single value. Any individual realization therefore can involve a value that may or may not coincide with a statistically expected value, whether this expected value is an empirical outcome or is predicted by a mechanical argument.

Uncertainty with growing scales
Our interest in sediment particle systems spans timescales of less than milliseconds to hundreds of thousands of years. The shortest timescales are represented by, say, observations of the details of particle motions in controlled experiments measured by high-speed imaging. Intermediate timescales are represented by, say, measurements of transport on hillslopes and in rivers on human timescales pertaining to the erosion and deposition of sediment in relation to land-use and river management. Long timescales are represented by our interest in understanding the evolution of hillslope and river systems at geomorphic timescales. Similarly, our interest in sediment systems spans length scales of less than a millimeter to at least tens or hundreds of kilometers. The smallest length scales are represented by differential particle motions during granular creep that are a fraction of a particle diameter or in relation to the initial jiggling of bed load particles prior to entrainment from their microtopographic "pockets". Intermediate length scales are represented by particle motions involved in the dynamics of river and eolian bedforms -ripples to dunes to megadunes -thence to scales involving, say, intermittent sediment motions from the crest of a hillslope to its base or within the extent of one or two river bends. The largest length scales are represented by the erosion and deposition of sediment over the scale of a hillslope-channel network or a depositional basin.
With increasing scale (length and time) goes increasing uncertainty in our descriptions of sediment motions and the behavior of sediment systems. The essential reasons for this increasing uncertainty reside in the increasing complexity, including heterogeneity, of sediment systems as their size increases, and in the increasing stochasticity, including the patchiness and intermittency, of factors that influence sediment motions and transport as both the system size and the timescale of interest increase. Equally important, with increasing scale our uncertainty grows in relation to the increasing difficulty, and the loss of resolution, associated with observing and measuring quantities that enter into our descriptions of sediment motions and transport -whether these quantities involve features of the sediment itself (e.g., particle sizes and arrangements, details of particle motions), or attributes of the factors influencing sediment transport (e.g., changing fluid motions, surface roughness). Moreover, in approaching climate-change timescales and longer, we can only imagine in probabilistic terms how many of the ingredients of sediment transport might vary (Benda and Dunne, 1997).
In relation to the uncertainty that grows with scale, we also must consider the consequences of differences in our ability to observe and measure quantities representing the dynamics of "fast" versus "slow" systems as viewed relative to the human experience. Focusing specifically on the configuration of a sediment system -a bedform, a river reach, a soil mantled hillslope -a fast system is one for which we can observe and measure attributes of the particle fluxes and associated changes in the system configuration over human timescales. A slow system is one for which the fluxes and changes in configuration are largely imperceptible over these timescales. In simple terms, for a system consisting of N particles whose configuration changes due to a characteristic particle flux q [T −1 ], the ratio T r ∼ N/q is akin to a particle residence time. In turn, let T e denote a characteristic observation time -an experimental run time, the duration of a research project, the time record of satellite images, a researcher's career. Then the ratio T e /T r is a rough measure of our capacity to observe, although not necessarily measure, the "completeness" of the dynamics of the system, that is, its full dynamical behavior.
Note that individual particle motions might be fast, but their contribution to changes in system configuration may be slow, as in the case of rarefied particle transport on hillslopes. From this perspective the stark contrasts between the capabilities and strategies of studies of transport in hillslope and river systems and their evolution become clear. For example, the ability to create a small version of a river reach in a laboratory and then measure long time series of bed load flux in order to fully characterize the fluctuations and ensembleaveraged behavior of such series in relation to bedform dynamics (Dhont and Ancey, 2018;Ancey and Pascal, 2020) is simply not a possibility in studies of natural soil creep and its long-term consequences. Indeed, we have only recently achieved the ability to measure small particle motions involved in soil creep (Deshpande et al., 2021), yet the particle residence times of soil mantled hillslopes may be 10 000 years or more, thus requiring indirect measures of particle behavior such as tracer particle mixing (Furbish et al., 2018c;Gray et al., 2020).
These ideas support a strong case for incorporating concepts and methods of probability -the natural language of uncertainty -into our descriptions of sediment particle motions and transport, tuning the specifics to the demands of different scales. This is as much a philosophical choice as a technical one; it is a choice to make the treatment of uncertainty a key part of the problem at the outset (Ancey and Pascal, 2020;Korup, 2020). Of course the strategies and methods vary with scale, as do the sources of uncertainty, in relation to the transport processes involved and the techniques of observation and measurement used. The purpose of explicitly incorporating probabilistic concepts in describing transport is to use this as a framework to explore, for example, the consequences of patchiness and intermittency of sediment motions in formulations of transport rates or how a predicted transport rate at a specified position and time within a real system actually represents a statistically expected behavior associated with a distribution of possible transport rates. The objective is to illustrate that this approach to uncertainty, combined with aiming at mechanistic, albeit probabilistic, descriptions of sediment particle behavior -avoiding a continuum description at the outset -will move us toward a deeper understanding of sediment particle motions and transport in both experimental and natural systems. We now step through the elements of the probabilistic framework outlined in the preceding sections with specific reference to the problem of rarefied particle motions on hillslopes.

Probabilistic versus deterministic descriptions
Here we consider three elements of the formulation presented in Furbish et al. (2021a, b, c) to highlight the purpose and merits of a probabilistic description of particle motions and disentrainment. The first concerns our treatment of the ex-traction of kinetic energy of a particle during particle-surface collisions as a random variable versus appealing to the deterministic idea of fixed coefficients of restitution. The second concerns our use of the Fokker-Planck equation to describe the changing energy states of the particles during their downslope motions, leading to deposition, versus considering only average particle energy conditions. The third concerns our efforts to demonstrate that the generalized Pareto distribution in this problem is a maximum entropy distribution. The objective is to highlight the mechanistic yet probabilistic nature of the analyses.

Particle energy extraction
We start with some background. In classical statistical mechanics the starting point for describing the motions and collective behavior of particles undergoing conservative collisions is the Boltzmann equation. This equation describes the evolution of the joint probability density function of particle positions and velocities in relation to the forces acting on the particles. Depending on the formulation of particle collisions (e.g., Chapman-Enskog theory), the Boltzmann equation leads to the Navier-Stokes equation in the continuum limit of vanishing Knudsen number. For dissipative collisions in granular materials, however, one must incorporate effects of energy losses during particle collisions. In pioneering work, Haff (1983) formulated the hydrodynamic analogue of the Navier-Stokes equations for granular flows. In this formulation he envisioned simple inelastic collisions and appealed to the normal coefficient of restitution to characterize energy losses during collisions, neglecting the details of particle collisions in the scalar treatment. The thermodynamic temperature is replaced with the granular temperature, and the hydrodynamic equations are supplemented with the mechanical energy equation in order to characterize granular flow behavior. One of the key outcomes of this work is Haff's cooling law (Brito and Ernst, 1998;Nie et al., 2002;Brilliantov and Pöschel, 2004;Dominguez and Zenit, 2007;Brilliantov et al., 2018;Yu et al., 2020), which predicts that when the external source of energy is removed the granular temperature decays with time as ∼ t −2 in the homogeneous cooling state for a velocity-independent coefficient of restitution (Brilliantov and Pöschel, 2005;Yu et al., 2020). Now consider particle-surface collisions in the rarefied particle motion problem. Of interest are downslope motions and travel distances. The energy balance described in Furbish et al. (2021a) thus focuses on the particle kinetic energy state E p = (m/2)u 2 involving the surface-parallel velocity u. Energy is extracted during particle-surface collisions, and analogous to collisions in the granular gas problem we define the proportion of energy extracted during a collision as The following was noted by Williams and Furbish (2021): The quantity β x is nominally related to a coefficient of restitution x as β x = 1 − 2 x . However, the change in translational energy E p is partitioned between deformational friction, rotational energy and transverse motion, so the coefficient x (and therefore the factor β x ) cannot simply represent a coefficient of restitution -although particle collision theory suggests that this coefficient includes effects of normal and tangential coefficients of restitution as normally defined (Brach, 1991;Stronge, 2000). This means that β x must be treated formally as a random variable rather than a fixed deterministic quantity as in granular gas theory.
Indeed, unlike idealized conditions often envisioned in collision theory, moving particles "see" a rough irregular surface rather than a smooth planar surface such that, for any incident trajectory angle measured relative to the x axis, the actual incident collision angle may vary significantly; this includes angular incidence measured in the transverse direction depending on the local configuration of the surface. The geometrical irregularity of natural particles further increases the geometrical complexity of possible collisions, and collision histories during downslope motion are unique and highly variable. Rather than attempting to consider the mechanical details of these motions and collisions in a deterministic manner (see Appendix E in Furbish et al. 2021a), it instead becomes defensible to more simply pool the partitioning of energy into different forms and treat this energy extraction as a random variable, as in Eq. (14). In effect we are asking ourselves to blur our eyes to the myriad details of surface roughness texture, particle shape, particle trajectories and collision mechanics and instead aim at a granularity suited to the task. This is not dissimilar to granular gas theory in which details of collisions are neglected, and energy dissipation is assumed to be adequately characterized by a single coefficient of restitution, either fixed or velocity dependent, although recent efforts have treated this quantity as a random variable (Gunkelmann et al., 2014;Serero et al., 2015). With this description of particle energy extraction in place, it then becomes straightforward to characterize the number of particle-surface collisions per unit distance and in turn the collisional energy loss per unit distance in relation to the surface-parallel velocity u . These descriptions are entirely analogous to the particle collision frequency and the rate of energy loss due to collisions in granular flows, as described by Haff (1983). This provides the basis for defining the characteristic deposition length as described in the next section.
What might an alternative approach look like? One possibility involves using discrete element methods (DEMs) to directly mimic particle motions on rough surfaces, extracting information from the simulations to characterize particlesurface collisions and energy extraction. Such simulations essentially represent numerical analogues of physical experi-ments. An obvious advantage is speed in examining different conditions, for example, surface slopes, roughness configurations and particle sizes, using the motions of a great number of particles versus the relatively small numbers of particles used in experiments. Add to this the capability to readily extract information on details of motions and collisions that are not accessible in physical experiments, except possibly via high-speed imaging. Disadvantages include the difficulty of mimicking irregular particle shapes and creating realistic surface-roughness textures. (And let us note that informed use of DEMs, for example LAMMPS and LIGGGHTS, is not a plug-and-play endeavor.) Nonetheless, one could potentially learn much in a generic sense about particle-surface interactions from DEMs, particularly if conducted in concert with carefully designed physical experiments. This includes assessing how sensitive macroscopic measures of particle motions (e.g., travel distances) are to variations in controlling factors -for example, particle shape and roughness texture -as these factors are varied. But rather than imagining the need to mimic all details of realistic conditions associated with a hillslope surface prototype, simulations should be designed to examine particle-surface interactions in a manner that allows for generalization of elements of collisional friction.
Herein arises a need for pause in using DEMs. Namely, these methods can quickly generate enormous amounts of numerical information on the details of particle motions and collisions. At risk is using a big numerical hammer to pound on the problem, mimicking particle motions without regard to elucidating general underlying principles of particle behavior, defaulting to descriptions of outcomes without gaining a deeper understanding of the systematics producing them -for example, without learning the mechanical basis of the deposition length scale that sets the pattern of deposition (Sect. 4.1.2) or without learning the form of the distribution of travel distances, the mechanical basis of its parametric values, or its maximum entropy properties (Sect. 4.1.3). This speaks to the relevance of the cliché that much of the power of numerical simulations resides in their use in concert with theory and experiments (Emanuel, 2020). Indeed, the success of numerical simulations that have revolutionized the study of granular gas dynamics grows directly from the grounding of this topic in kinetic and statistical mechanics theory (e.g., Brilliantov and Pöschel, 2004), which both motivates and guides the design of numerical treatments of granular gases. Similar remarks pertain to parallel efforts focused on the behavior of relatively dense granular materials.

Energy states and the Fokker-Planck equation
Consider a second element of the formulation, which concerns our use of the Fokker-Planck equation to describe the changing energy states of the particles during their downslope motions, leading to deposition, versus considering only average particle energy conditions. We start with some background.
The Fokker-Planck equation represents a triumph of classical statistical mechanics. Although originally developed to describe the time evolution of the distribution of velocities of particles subjected to viscous forces and random forces associated with collisions, the name of this equation now is more generally associated with other observable quantities whose distributions evolve according to an equation having the same form. For example, with reference to the distribution of particle positions rather than velocities, the Fokker-Planck equation historically is referred to as the Smoluchowski equation. It also is referred to as the Kolmogorov forward equation in the context of Markov processes. Although the Fokker-Planck equation can be derived in several ways, perhaps the most general treatment starts with a master equation, a general probabilistic expression of conservation of probability associated with observable states. Like the Fokker-Planck equation, there are several versions of the master equation, sometimes referred to as the Chapman-Kolmogorov equation, depending on the field and application. Here we focus on a continuous form of the master equation as described by Chandrasekhar (1943) and Risken (1984). We start with two familiar examples to develop the essential concepts before turning to the unfamiliar problem of rarefied particle motions addressed in Furbish et al. (2021a). Our objective is to illustrate the statistical mechanics framework of the analysis.
Let f x (x, t) denote the probability density function of particle positions x at time t. In turn let r = x(t + dt) − x(t) denote a small particle displacement during the interval dt, and let f r (r; x, t) denote the probability density function of displacements r occurring during this interval. At time t +dt the density of particle positions x is then given by This is one form of the master equation. It says that the probability density of particles at position x at time t +dt involves particle displacements r during dt to this position x from all possible starting locations x−r as well as motions away from x. This expression is nonlocal and scale independent. It is examined in more detail by Furbish et al. (2012a) with respect to bed load particle motions. Now, assuming the density f r (r; x, t) is peaked near r = 0 with finite first and second moments, we may expand the integrand in Eq. (15) as a Taylor series to second order (referred to as a Kramers-Moyal expansion), subtract f x (x, t) from both sides, then divide by dt and take the limit as dt → 0 to obtain a Fokker-Planck equation (or the Smoluchowski equation), namely, In the language of statistical mechanics, k 1 ( and it is physically interpreted as the ensemble-averaged particle velocity. The quantity k 2 (x, t) = lim dt→0 (1/2dt) r 2 [L 2 T −1 ] is a diffusion coefficient equal to one-half the rate of change in the variance of particle positions, namely, a diffusivity. The diffusive term in Eq. (16) is a mathematically local approximation of the nonlocal behavior embodied in the integral form of the master equation, Eq. (15). Note that the Fokker-Planck equation, Eq. (16), is like an ordinary advection-diffusion equation, although it describes the time evolution of the probability distribution f x (x, t) of particle positions x. This form of the Fokker-Planck equation is the basis of numerous descriptions of sediment particle transport, including tracer particles, in rivers and soils and by rain splash (see Appendix A). As a point of reference, the Fokker-Planck equation also can be obtained from the Langevin equation, a stochastic differential equation originally used to describe the behavior of Brownian particles. Moreover, in the specific context of bed load sediment transport, Ancey and Heyman (2014) and Heyman et al. (2014) show that for a birth-death Markov process describing the number of active particles within a streambed control volume, the evolution of the probability distribution of this number can be described as a Fokker-Planck equation obtained from the master equation representing transitions among number states. This makes use of the idea (Gardiner, 1983) that the probability distribution of number states can be represented as a mixture of Poisson distributions with varying rates and represents an alternative to the Kramers-Moyal expansion for conditions involving small particle numbers with large relative fluctuations.
Here is a particularly important sidebar. As probabilistic expressions the master equation and the Fokker-Planck equation are entirely agnostic to continuum versus rarefied conditions; they are equally applicable to both. If in an individual system (realization) the continuum hypothesis is satisfied -a condition that is independent of the probabilistic basis of the master equation or the Fokker-Planck equation -then the probabilistic formulation based on ensembleexpected conditions and its continuum counterpart are essentially one and the same. If, however, the continuum hypothesis is not satisfied, then one must appeal to a probabilistic formulation of ensemble-expected conditions (in order to justify the use of continuously differentiable equations), with the proviso that any prediction of the behavior of an individual (rarefied) system is probabilistic in nature. In other words, using the Fokker-Planck equation, Eq. (16), to describe the behavior of a system under rarefied conditions is in effect the 638 D. J. Furbish and T. H. Doane: Rarefied particle motions -Part 4: Philosophy same as using a continuum-like equation, where "continuumlike" means continuously differentiable, not that the particles behave as a continuum. Because of the significance of these points, we have reproduced in Appendix C key material from Appendix A presented in Furbish et al. (2018c). We return to the idea of rarefied versus continuum conditions in Sect. 4.2. Now let f u (u, t) denote the probability density function of particle velocities states u at time t. In turn let q = u(t + dt) − u(t) denote a small change in the velocity state during the interval dt, and let f q (q; u, t) denote the probability density function of the changes q occurring during this interval. At time t + dt the density of particle velocity states u is then given by a master equation having the same form as Eq. (15). Again assuming the density f q (q; u, t) is peaked near q = 0 with finite first and second moments, we follow the developments above to obtain a Fokker-Planck equation describing the time evolution of the density f u (u, t) over the velocity domain, namely, is interpreted as an acceleration, or the ensemble-averaged force per unit particle mass acting on particles with velocity u. The diffusion coefficient is the rate of change in the variance of velocity fluctuations about u, akin to a change in kinetic energy. Note that in going from a Fokker-Planck equation involving particle positions, Eq. (16), to one involving particle velocities, Eq. (17), the drift coefficient k 1 transitions from a velocity to an acceleration, and the diffusion coefficient k 2 transitions from a rate of change in the variance of positions to a rate of change in the variance of velocities. As a point of reference, Furbish et al. (2012b) start with the Fokker-Planck equation given by Eq. (17) to examine the statistical equilibrium distribution of the streamwise velocities of bed load particles. Wu et al. (2020) elaborate this idea by demonstrating that a large proportion of long particle hops experiencing relatively large velocities "results in a Gaussian-like velocity distribution, while a mixture of both short and long hop distance particles leads to an exponentiallike velocity distribution." With these ideas in place, we now turn to the problem of rarefied particle motions on rough hillslopes. Here is where we highlight the idea that the endeavor is not simply about adopting theory or methods "off the shelf" but rather involves appealing to the style of thinking of statistical mechanics while tailoring the description to the process.
Because particle motions in this problem are highly rarefied and intermittent, we appeal to the idea of a cohort of particles -a Gibbs-like ensemble of systems, each containing one particle that is subjected to the same physics during downslope motion (Appendix B in Furbish et al., 2021a). Moreover, because the formulation is centered on particle travel distances and therefore on deposition over space rather than time, it aims at describing the evolution of the ensemble distribution of particle energy states with respect to downslope position x. Thus, let f E p (E p , x) denote the probability density function of particle energy states E p at position x. In turn let q = E p (x + dx) − E p (x) denote a small change in the energy state over the interval dx, and let f q (q; E p , x) denote the probability density function of the changes q occurring over this interval. At position x + dx the density of particle energy states E p is then given by a master equation having the same form as Eq. (15), where the particle position (state) x is replaced with the energy state E p and time t is replaced with position x (Appendix C in Furbish et al., 2021a). Again assuming the density f q (q; E p , x) is peaked near q = 0 with finite first and second moments, we follow the developments above to obtain a Fokker-Planck equation describing the spatial evolution of the density f E p (E p , x) over the energy domain, namely, In this problem, unlike the examples above, the drift coefficient k 1 (E p , x) [M L T −2 ] has two parts: a fixed spatial rate k 1h = mg sin θ of gravitational heating due to conversion of potential to kinetic energy and a spatial cooling rate k 1c (E p , x) ≈ mgµ cos θ due to collisional friction. These are interpreted as average spatial rates of change in particle energy per unit distance; each is a force. That the drift coefficient involves two parts is similar to the description of particle motions in soils provided by Furbish et al. (2009b), where gravitational settling motions are distinguished from scattering motions. The quantity k 2 (E p , x) is a diffusion coefficient equal to one-half the spatial rate of change in the variance of particle energy states. Not shown in Eq. (18) is an energy loss term due to deposition . With this description of the spatial evolution of the density f E p (E p , x) in place, it then becomes straightforward to recast Eq. (18) into dimensionless form in order to define a characteristic cooling length X cA associated with collisional friction. (Note that the analysis does not actually involve solving the Fokker-Planck equation, Eq. (18).) This quantity is entirely analogous to the advective timescale associated with an advection (or advection-diffusion) equation where, instead of time, it refers to a distance. Namely, in the absence of gravitational heating, X cA is a characteristic distance over which thermal collapse by advective cooling occurs due to collisional friction. This allows us to associate the spatial deposition rate with the advective cooling rate. When the cooling term is integrated over all particle energy states (see Furbish et al., 2021a), the resulting characteristic length scale L c then has the role of an e-folding length of deposition, which connects the energy balance to the particle mass balance. Namely, for N(x) particles, with deposition length L c = αE a (x)/γ mgµ cos θ . Moreover, this formulation has an important probabilistic attribute. From Furbish et al. (2021a), Note that the formulation does not involve specifying a threshold energy for deposition . . . Whereas low-energy particles are on average more likely to become disentrained than are high-energy particles, a set of particles with precisely the same low energy will for probabilistic reasons not be disentrained simultaneously. Each particle experiences a unique set of conditions that disentrain it, and because of this uniqueness of conditions a particle with energy below an arbitrarily assigned threshold can with finite probability be gravitationally reheated to a higher-energy state. For given particle and surface roughness conditions, the formulation treats this aspect of disentrainment as a probabilistic process . . . [in relation] to the distribution of particle energy states and the probabilistically expected extraction of energy during collisions.
What might an alternative formulation of deposition look like? (Having approached the problem as above, we admit that a description of this sort represents a straw person to criticize. Nonetheless, we also can admit that our early efforts involved thinking of alternatives, so this criticism is not blind.) Suppose we start with the assumption that the deposition rate is inversely proportional to the average particle kinetic energy E a , namely, dN(x)/dx = −λN with rate constant λ ∼ 1/E a [L −1 ]. This might be motivated heuristically by the idea that the likelihood of deposition decreases with increasing particle energy as characterized by the average energy E a (x) at position x. That is, for given slope and roughness conditions, the motions of high-energy particles are less likely to be arrested during collisions than are those of low-energy particles, and the average energy E a (x) is a measure of the overall energetic state of the particles. According to Eq. (19), this assumption actually would provide a very good start on the problem! (This assumes all elements of λ = 1/L c could be deduced.) However, this approach would still require a separate formulation of how the energy E a (x) varies with position x. And it would risk missing a critical step revealed in the full analysis, that a key average is the harmonic average energy E h = E a /γ , which determines how deposition influences the energy balance. Moreover, the full energy balance is needed to specify the disentrainment rate as in Eq. (19), thence leading to the derivation of the generalized Pareto distribution of travel distances. We suggest that this example, details of which are provided in Furbish et al. (2021a), offers a clear view of the value of a statistical mechanics approach involving the Fokker-Planck equation, highlighting the mechanistic yet probabilistic nature of the analysis.

The generalized Pareto distribution as a maximum entropy distribution
Our third example highlighting the purpose and merits of a probabilistic description of particle motions and transport is centered on demonstrating that the generalized Pareto distribution is a maximum entropy distribution. We again start with some background, briefly outlining the origin of the idea of a maximum entropy distribution.
The canonical example of a maximum entropy distribution is the Boltzmann distribution of the energy states of gas particles at thermal equilibrium. For a great number N of gas particles with a fixed total energy E, the original derivation of the Boltzmann distribution involves enumerating the total number of accessible microstates -the many different ways to arrange N particles into energy states where each arrangement possesses the same fixed total energy Ethen determining the most probable arrangement. (This idea is illustrated in Fig. 3 in Furbish and Schmeeckle, 2013.) Schrödinger (1946, p. 6) succinctly describes the matter. Using his notation, consider the energy states 1 , 2 , 3 , . . ., l , . . .. Then let a 1 , a 2 , a 3 , . . ., a l , . . . denote the number of independent systems in each energy state. (Here it suffices to imagine that a "system" consists of an individual particle.) We then imagine a set of macrostates ("classes") where each macrostate involves a set of N systems arranged into energy states. Namely, all [micro]states of the assembly are embracedwithout overlapping -by the classes [macrostates] described by all different admissible sets of numbers a l . . . The number of single [micro]states, belonging to this class [macrostate], is obviously The set of numbers a l must, of course, comply with the conditions l a l = N , l l a l = E .
The present method [of the most probable distribution] admits that, on account of the enormous largeness of the number N, the total number of distributions (i.e., the sum of all P 's) is very nearly exhausted by the sum of those P 's whose number sets a l do not deviate appreciably from the set which gives P its maximum value (among those, of course, which comply with [Eq. (21)]). In other words, if we regard this set of occupation numbers as obtaining always, we disregard only a very small fraction of all possible distributions -and this has "a vanishing likelihood of ever being realized".
The procedure thus amounts to choosing the macrostate containing the greatest number of microstates, each consistent with fixed N and E. This is achieved by applying Stirling's approximation to Eq. (20), taking the differential of the resulting expression and of Eq. (21) with respect to a l and setting these to zero, then using Lagrange multipliers to determine that This is effectively the Boltzmann distribution. The Lagrange multiplier λ = 1/k B T , where k B is the Boltzmann constant and T is temperature, is determined independently. A key point of the derivation is that Stirling's approximation of Eq. (20) yields a term representing the Gibbs entropy of the system. So in choosing the macrostate containing the greatest number of microstates, the procedure is equivalent to choosing the distribution whose entropy is a maximum relative to all other possible choices. From Furbish et al. (2021c), Jaynes (1957aJaynes ( , 1957b) elaborated the significance of the fact that the Gibbs entropy in statistical mechanics and the Shannon entropy in information theory are essentially one and the same, differing only by a constant. This similarity inspired Jaynes to champion the use of a maximum entropy criterion in choosing a probability distribution, leading to what is now known as the maximum entropy method . . .
[W]hether viewed as a method of statistical mechanics or as one of inferential statistics . . . it provides an unbiased choice of a distribution by honoring only what is known mechanically about a system. That is, this unbiased choice is a maximally noncommittal choice that is faithful to what we do not know; it is therefore the most reasonable choice in the absence of additional information . . .
Within this context, there are three notable elements in our effort  to demonstrate that the generalized Pareto distribution as applied to the rarefied motion problem is a maximum entropy distribution. First, in this work we noted that "constraints imposed on the system normally translate to constraints imposed on the moments of the distribution. . . . [in which] case the method leads to a distribution that is among the exponential family (e.g., exponential, Gaussian). However, applications of the maximum entropy method to non-exponential distributions, including heavy-tailed distributions, are of particular interest in many problems (Peterson et al., 2013)." Moreover, recall that the generalized Pareto distribution has three forms: it is bounded for shape parameter A < 0, heavy-tailed for A > 0 and exponential for A = 0 (Fig. 1). If we are to suggest that the generalized Pareto distribution is for mechanical reasons a maximum entropy distribution, then it becomes essential to show that all three of its forms are constrained in the same manner -as opposed to appealing to separate constraints for each of the three forms. Indeed, the energetic basis of the disentrainment rate, Eq. (4), provides this common constraint. It allows us to calculate an energetic "cost" -the total cumulative energy extracted by collisional friction per unit kinetic energy available during particle motions. This energetic cost is entirely analogous to that associated with the economics of scale (Peterson et al., 2013), where net heating contributes to an energetic "discount" that allows particles to achieve larger distance states x, and net cooling imposes a "penalty" that suppresses long-distance motions. In effect "the analysis represents a novel generalization of an energybased constraint in using the maximum entropy method to infer non-exponential distributions -to include the versatile properties (forms) of the generalized Pareto distribution as applied to the rarefied particle motion problem." Stepping back, we suggest that similar considerations of particle energetics may be useful for clarifying the behavior of particles in other systems.
Second, the versatile form of the generalized Pareto distribution is rather unusual. Numerous well-known distributions of course take the form of a related distribution for certain parametric values. Nonetheless, the generalized Pareto distribution is distinctive in that it has a bounded form (A < 0) that decays faster than an exponential distribution with the triangular and uniform distributions as special cases, a heavytailed form (A > 0) with undefined mean or variance for sufficiently large values of the shape parameter A, and an exponential form (A = 0) separating its bounded and heavy-tailed forms. Moreover, that this distribution seems to correctly describe the energetics of rarefied particle motions for varying slope and surface roughness conditions representing all three forms (Fig. 2) is at first glance astonishing -notably including the abrupt mathematical transition between bounded and heavy-tailed forms. That the constraint provided by a fixed energetic cost relative to the available kinetic energy provides a unifying explanation of the three behaviors lends confidence that each of the three forms of the distribution represents the most probable arrangement of distance states -just as the Boltzmann distribution represents the most probable arrangement of energy states of gas particles at thermal equilibrium. Nothing special or unusual changes in the physics of disentrainment in the transition from the bounded form to the heavy-tailed form of the distribution in crossing isothermal conditions -a point of clarity provided by the maximum entropy analysis.
This result also adds clarity to the idea of nonlocal versus local transport (Metzler and Klafter, 2000;Schumer et al., 2009;Foufoula-Georgiou et al., 2010;Furbish and Roering, 2013). In studies of tracer particle transport, and setting aside the effects of particle rest times, local behavior is associated with a light-tailed distribution of particle displacements during a small interval dt, leading to Gaussian dispersion. Nonlocal behavior is associated with a heavy-tailed distribution of displacements leading to non-Gaussian (anomalous) dispersion as represented by, say, a fractional advectiondiffusion equation with respect to space (Metzler and Klafter, 2000;Schumer et al., 2009). In comparison, consider the situation in which the shape parameter A of the generalized Pareto distribution is small, positive or negative. With small A < 0 the light-tailed form of the distribution has an upper bound at x = B/|A| (Fig. 1), and as A approaches zero from below, this upper bound may become exceedingly large but nonetheless remains finite. With small positive A > 0 close to zero the distribution has "flipped" to an unbounded heavytailed form (with finite mean and variance). Yet any difference in the physics of particle motions up to x = B/|A| for A positive or negative (and small) is imperceptible. Indeed, except near the upper bound x = B/|A| and beyond, the two forms of the distribution for A close to zero are essentially indistinguishable -the difference representing a mathematical precision far beyond what one might be capable of detecting from measurements of particle travel distances (Furbish et al., 2021b). Thus, within the context of tracer particle behavior, whereas local versus nonlocal behavior defined in terms of the distribution form may be an important mathematical distinction -notably if the distribution involves undefined moments -this distinction offers limited insight regarding the mechanical interpretation of particle motions. Hence, in the problem of rarefied particle motions on hillslopes, and consistent with physical interpretations of nonlocal behavior (Bocquet et al., 2009;Brantov and Bychenkov, 2013;Henann and Kamrin, 2013), the idea of nonlocal transport as embodied in Eqs. (1) and (2) reminds us that the flux or its divergence at position x is determined by factors influencing entrainment and particle motions "far" upslope from this position (Furbish and Roering, 2013;Furbish et al., 2016bFurbish et al., , 2021aDoane et al., 2018).
Third, focusing on the second part of the quotation above, the maximum entropy method reminds us of the value of the principle of parsimony -appealing to the simplest explanation consistent with available evidence -in the presence of uncertainty. Boltzmann did not know a priori the distribution of gas particle energy states, Eq. (22); he imposed only the constraints of a fixed number of particles and a fixed total energy. The maximum entropy derivation thus honored his understanding of the system, but no more. In effect the derived distribution of energy states -including the foundational assumption that each accessible microstate is equally probable -became a hypothesis to be tested against experimental observations (Tolman, 1938). With respect to applications of the maximum entropy method to sediment particle motions, we "highlight the fact that a distribution thus chosen is not necessarily the 'correct' distribution  . . . [I]t is the most reasonable choice in the absence of additional information . . . [and in] this sense the maximum entropy method is a formal application of Occam's razor" . We therefore suggest that this represents one viable element of a strategy to deepen our mechanical understanding of attributes of particle motions that we observe, measure and describe statistically. As noted by Ancey (2020b) in relation to bed load transport, "One strength of entropy-based methods is their use of the physical information conveyed by data, thereby enforcing physical consistency . . . [opening] new avenues of research combining statistical information and physics-based models." On this point we note that a distribution selected according to a maximum entropy criterion may serve as an ideal prior hypothesis in subsequent analysis, including Bayesian analysis (Jaynes, 1988).

Motivation
Perhaps it is obvious that in this problem a description of the physics of particle motions cannot meaningfully start from the idea of continuum behavior. Particle motions are patchy and highly intermittent, and in most settings these motions are far from conditions that could be considered continuum-like granular flows. Particle behavior is dominated by particle-surface interactions rather than particleparticle interactions, and the conventional idea of appealing to a Knudsen number to ascertain continuum behavior is irrelevant. Moreover, aside from descriptions of the physics of particle motions, in the absence of continuum conditions we cannot justifiably appeal to familiar continuum-like definitions of the particle flux and its divergence that are based on the assumption that particle number densities and locally averaged velocities are well-defined quantities that vary smoothly over space and time. Yet the particle flux and its divergence are of particular interest in many problems, and we therefore turn our focus to a thorough description of these quantities.
As noted above, precise definitions of the sediment particle flux and its divergence do not assume continuum conditions at the outset Furbish et al., 2012aFurbish et al., , 2016bFurbish et al., , 2017Ancey and Pascal, 2020). For rarefied conditions these definitions are translated into probabilistic expressions, of which the entrainment forms of the flux and the Exner equation, Eqs. (1) and (2), are examples. Of concern, then, is the meaning and use of continuously differentiable functions in these nonlocal expressions, namely, the entrainment rate E s (x, t), the probability density f r (r; x, t) of travel distances r and the associated exceedance probability function R r (r; x, t). At risk is misinterpreting these continuous probabilistic functions as implying a continuum-like description of transport such that the flux q(x, t), if defined as a time-averaged quantity, may be envisioned as varying continuously in space and time akin to continuum-like expressions of transport -the canonical examples being the process-response models introduced by Kirkby (1971) and Carson and Kirkby (1972) involving expressions of the flux that vary with surface configuration and semi-empirical rate constants. Similar comments pertain to the rate of changė ζ (x, t).
Consider the nonlocal expressions, Eqs. (1) and (2). For simplicity of illustration we focus on a single particle size and rewrite these expressions as follows. Let q n (x, t) [L −1 T −1 ] denote the particle number flux at position x and time t. Dividing Eq. (1) by the particle volume V p then gives where E n (x, t) [L −2 T −1 ] denotes the particle number entrainment rate. In turn, let n(x, t) [L −2 ] denote the areal particle number density. Dividing Eq.
(2) by V p then giveṡ In these expressions the flux q n (x, t), the number density n(x, t) and the entrainment rate E n (x, t) are treated as continuous functions of position x and time t. Moreover, the density f r (r; x, t) and the exceedance probability R r (r; x, t) are continuous functions of the travel distance r, and the forms of these functions are considered to vary smoothly with x and t. However, Eqs. (23) and (24) do not imply that transport may be envisioned as a continuum-like behavior or that the flux and its divergence vary smoothly with position x and time t in any particular setting. Rather, these are probabilistic expressions that represent ensemble expected conditions, not the outcome of any individual realization (Appendix C). In fact, both the flux q n (x, t) and the rateṅ(x, t) = ∂n(x, t)/∂t are to be considered random variables due to rarefied transport conditions. To illustrate these points, here we consider an idealized situation in which the entrainment rate E n (x, t) is Poisson in time and space but modified to include effects of intermittency and patchiness. The cases presented next, although involving approximations of the entrainment process, nonetheless suffice to illustrate the consequences of a noise-driven process. This includes an explicit definition of ensembleexpected conditions versus the outcome of an individual realization, as well as the relation between these and timeaveraged conditions. The presentation reflects elements of the analysis of Ancey and Pascal (2020) concerning bed load transport.

Line source
The idea of a line source of sediment particles delivered to a hillslope is embodied in the experimental and field-based work of Kirkby and Statham (1975) and Statham (1976) concerning the motions and downslope sorting of particles on scree slopes. Here we consider a simple version of this problem.
Flux with Poisson delivery rate. We start by envisioning a planar hillslope at the base of a cliff. Particles are delivered from the cliff to the top of the hillslope as a line source at x = 0, and we neglect particle entrainment on the hillslope. In this situation we may simplify the notation. Namely, the travel distance r → x, so the density f r (r; x) → f x (x) and the exceedance probability R r (r; x) → R x (x). In turn, the entrainment rate E n is reinterpreted as a boundary flux, the expected number n of particles delivered to the top of the hillslope per unit width per unit time. For a width y, the expected number of particles delivered per unit time is η = E n y. The number n of particles delivered during an interval τ is then given by the Poisson distribution, namely, with mean µ n = ητ and variance σ 2 n = ητ . More generally, the expected number of particles reaching position x per unit time is ηR x (x), and the number n of particles reaching x during τ is again given by a Poisson distribution, with mean µ n = R x (x)ητ and variance σ 2 n = R x (x)ητ . For reference below we note that the distribution of waiting times w between successive events is exponential with mean µ w = 1/η in the case of Eq. (25) and µ w = 1/R x (x)η in the case of Eq. (26). (Note that the waiting time w should not be confused with a particle rest time.) These results illustrate that the number n of particles reaching x during τ involves a distribution of possible outcomes whose variance increases linearly with the elapsed time τ . For plotting we normalize this number by the expected number reaching x, namely,n(x) = n(x)/η t with t = 1 (Fig. 3). Note that the numbers used to generate these plots are somewhat arbitrary, as the width y is not explicitly specified. That is, for given E n , the numbers (and thus η) increase with y.
These plots may be interpreted several ways. First, for a specified delivery rate η at the line source (x = 0), the panels (a) to (d) depict a decreasing exceedance probability R x (x) representing an increasing distance from the source for a fixed form of the distribution f x (x) of travel distances x. Second, the plots (a) to (d) depict a decreasing delivery rate η at the source viewed at x = 0. In this situation, downslope locations would display an increasing variability relative to the  Fig. 3a could represent conditions at a position x 0 with R x (x) = 0.01 for a rate η that is 10 2 larger. Third, if in Fig. 3a the rate η = 1000 represents the expected number of events per year, then the signals in Fig. 3d could represent this same rate but plotted in terms of expected events per "milli-year" (or one event per 8.8 h). Note that individual realizations never converge to ensemble-expected values, although the relative variability decreases with large η. That is, the coefficient of variation decreases slowly as [R x (x)ητ ] −1/2 , but this is not a mean-reverting process.
The plots in Fig. 3 also may be reinterpreted in terms of variations in the form of the generalized Pareto distribution with respect to hillslope positions x. That is, each plot may represent different positions of x coinciding with the same exceedance probability R x (x) for different values of the shape and scale parameters A and B. For example, assuming fixed B, each plot may represent a relatively small value x > 0 with small A and a relatively large x > 0 with large A. Or, for a fixed position x > 0, Fig. 3c might represent a distribution with small R x (x) coinciding with small A, whereas Fig. 3a might represent a distribution with large R x (x) coinciding with large A. In general the position x associated with a specific value of R x (x) increases with A for given B. Note that no particles move past a position x beyond the upper bound B/|A| for A < 0.
In turn, we compute the normalized time-averaged particle number flux asq n (x) =n(x)/τ (Fig. 4). This flux also is a random variable. It exhibits relatively large variability with small elapsed times τ and then converges to the ensemble expected value with increasing τ . The rate of convergence decreases with decreasing delivery rate η or with decreasing exceedance probability R x (x) representing an increasing distance from the line source.  Fig. 3. Note that initial values start at τ > 0. Ancey and Pascal (2020) examine the more general question of estimating the time-averaged flux associated with a Poisson process (compare their Fig. 2 with our Fig. 3). Within the context of measurements of bed load sediment transport, they show how the variability in estimates of the time-averaged flux varies with the measurement interval, and they present a re-sampling (bootstrap) protocol for assessing how the variance of the flux varies with the sampling interval based on an individual realization. As noted below, however, we rarely if ever have time series needed to support this type of analysis when describing slow systems.
With this example in place we offer an explicit definition of the idea of ensemble-expected conditions for a Poisson process. The word "ensemble" refers to a great number N e of nominally identical but independent systems, each subject to the same physics of random events characterized by the rate constant η (Appendix D) for specific values of A and B. In this view, each realization plotted in Fig. 3 represents the outcome of one system in the ensemble. (Note that this meaning of "ensemble" is quite different from the oft used description of an ensemble of particles.) For any individual system there is one possible outcome n at time τ with probability given by Eqs. (25) or (26). For an ensemble of systems all possible outcomes n = 0, 1, 2, 3, . . . exist at time τ in the proportions given by Eqs. (25) or (26). Thus, ensemble conditions in this example refer to the distribution of possible outcomes with a well-defined ensemble average and variance. In turn, the particle flux q n (x, t) given by Eq. (23) represents the ensembleexpected flux, not the flux associated with any particular realization. Similarly, as described further below the rate of change in particle numbersṅ(x, t) given by Eq. (24) represents the outcome defined by ensemble-expected conditions, not the rate associated with any realization.
Flux with intermittent delivery rate. The idea of a Poisson delivery rate nicely illustrates the growing variance of particle numbers associated with a simple noise-driven process. However, of particular interest are effects of an intermittent delivery rate -recognizing that this rate likely involves fluctuations in particle numbers with seasonal to longer-term variations in factors influencing particle entrainment. Again consider a line source of particles at x = 0. We assume that events are Poisson with an expected rate η, where each event, rather than representing one particle, instead involves n particles described by a specified distribution. Results described below are qualitatively insensitive to the choice of this distribution, so for simplicity we use an (integer) exponential distribution with mean µ n , which provides skew in the number of particles per event. In this situation the delivery of particles at x = 0 is no longer a Poisson process; increasing intermittency is represented by a decreasing rate η.
As a point of reference, Benjamin et al. (2020) provide an assessment of efforts to observe and measure rockfall events contributing to cliff erosion and thus to downslope delivery of particles. The frequency and magnitude of these events may vary widely, from the chronic activity of small rockfall events to large infrequent events, depending on the geological and environmental factors that influence the mechanisms of weathering and failure (Luckman, 2013;Strunden et al., 2015;Mair et al., 2020). The frequency of occurrence of rockfall volume typically varies as an approximate inverse power function of volume, where the specific relation depends on the spatial coverage and temporal duration of the data set (Benjamin et al., 2020). Rockfall volumes do not translate directly to particle numbers, both of which are influenced by the geometry of cliff rock fracturing and fragmentation (Domokos et al., 2020;Verdian et al., 2020), and impact shattering (Luckman, 2013). Nonetheless, these observations point to the inherent stochasticity of rockfall over many scales, including variations in intermittency with time (Sect. 4.3.2).
Like a Poisson particle delivery rate (Fig. 3), the number of particles n reaching x during τ involves a distribution of possible outcomes whose variance increases with the elapsed time τ (Fig. 5). Increasing the average number of particles per event, µ n , tends to decrease the variability about the ensemble-expected values. However, this variability is far more influenced by the degree of intermittency. Namely, the variability increases with decreasing η and is notably larger than that associated with a purely Poisson process (Fig. 3). For comparison, each of the plots in Fig. 5 involves 100 times fewer events per year than the corresponding plot in Fig. 3 but 10 times the number of expected particles per year. The plots may be interpreted in a manner similar to that given above for Fig. 3. Although not shown to save space, the normalized time-averaged particle fluxq n (x) is similar in appearance to that associated with a Poisson delivery rate (Fig. 4), but with larger variability and slower convergence to the ensemble-expected value with increasing τ .
Steady deposition. Consider the rateṅ(x) = ∂n(x)/∂t in Eq. (24). In the idealized situation involving a line source of particles arriving at x = 0 with rate η, and in the absence of particle entrainment, the expected rate of deposition within the small interval x at position x isṅ(x) ≈ f x (x) xη. However, for an individual realization this rate is discontinuous in time. Let n denote the number of particles deposited within a specified interval t. The time-averaged rate of deposition during t is then n/ t. The number of particles deposited with x at x during t is then again distributed as a Poisson distribution, with mean µ n = f x (x) xη t and variance σ 2 n = f x (x) xη t. Thus, as above, the variance of the numerator of n/ t increases indefinitely with the interval t.
For a Poisson process, the deposition events n within any successive interval t are independent. Letting k = 1, 2, 3, . . . denote successive intervals, then τ = k t. This means that summing the number of particles deposited during successive intervals t is the same as summing over the total elapsed time τ . Thus, deposition at x proceeds as a random process whose appearance is qualitatively similar to the examples above (Fig. 3). The rate n/ t similarly converges slowly to the ensemble valueṅ with increasing time interval τ . Similar conclusions apply to the case of an intermittent delivery rate of particles. The expected deposition rateṅ(x) thus represents the ensemble average, not the rate of individual realizations except in the limit of τ = k t → ∞.

Distributed entrainment
The idea of distributed entrainment is embodied in the work of Doane (2018) and Doane et al. (2018) concerning nonlocal sediment transport. This work involves numerical simulations of the time evolution of the profiles of steep lateral moraines in the Sierra Nevada, California, for comparison with field-based measurements. It examines entrainment that occurs over the entire moraine profile due to disturbances and the role of vegetation in sediment capacitance -the capture, storage and release of sediment (Furbish et al., 2009a;Lamb et al., 2011Lamb et al., , 2013DiBiase and Lamb, 2013;Doane et al., 2018). Here we consider a simple version of this problem involving uniformly random entrainment.
Flux with Poisson entrainment. In contrast to a line source, here we envision a uniformly random entrainment rate E n over the domain x. We return to our original notation involving travel distances r with probability density f r (r; x) and exceedance probability R r (r; x), neglecting variations with time t.
For a uniformly random entrainment rate E n (x) = E n the expected number of particles reaching x per unit time is The integral in Eq. (28) is equal to the mean travel distance µ r , if this moment exits. Thus, η = E n µ r , which is identical to the definition of the particle flux provided by Ein- stein (1950) for steady uniform bed load transport. The expected number n of particles reaching position x during τ is ητ , so with mean µ n = ητ and variance σ 2 n = ητ . Note that Eq. (29) is identical in form to Eqs. (25) and (26). This means that the number of particles reaching position x varies in a manner that is qualitatively the same as depicted in Fig. 3, and the time-averaged flux is qualitatively the same as depicted in Fig. 4. For a given mean travel distance µ r , convergence to the expected value η = E n µ r therefore strongly depends on the entrainment rate E n . This convergence decreases with increasingly rarefied conditions.
If the integral in Eq. (28) does not converge -that is, the mean travel distance is undefined -then the expected particle flux is undefined. This coincides with a shape parameter A ≥ 1 for the generalized Pareto distribution. This also means that the expected divergence of the flux (see below) is undefined. However, we must be cautious to not over-interpret this result, as Eq. (28) assumes the upslope integration of the exceedance probability R r (r; x) is unbounded. In reality, the integration associated with any position x extends only to the hillslope crest, thus truncating R r (r; x) such that the integral in Eq. (28) is finite. Nonetheless, numerical simulations confirm that the expected flux increases indefinitely as the upslope distance of integration increases. The implication is this: if the distribution of particle travel distances is heavytailed with undefined mean, and if the heavy-tailed distribution applies to much or all of the hillslope, then the expected flux at the base of the hillslope depends on its length. Indeed, if the mean travel distance is undefined and disentrainment is negligible (i.e., few if any particles stop), then the expected flux at the base of the hillslope is essentially equal to the entrainment rate integrated over the entire hillslope. This represents the "entrainment-limited" analogue of "detachmentlimited" conditions. Furbish et al. (2021b) report estimates of A ≥ 1 for several of the field-based experiments reported by DiBiase et al. (2017) and Roth et al. (2020).
Flux with patchy entrainment. Consider uniformly random entrainment events, where each event, rather than represent-ing one particle, instead involves n particles. As above we use an (integer) exponential distribution with mean µ n . In this situation the random location of events involving different numbers of particles yields a spatial patchiness in particle entrainment. If E n now denotes the event rate, then the expected rate at which particles reach position x is η = E n µ n µ r . However, unlike the previous example, this is not a Poisson process. Nonetheless, similar to the conclusions above, numerical simulations reveal that the number of particles reaching position x varies in a manner that is qualitatively the same as depicted in Fig. 3, and the time-averaged flux is qualitatively the same as depicted in Fig. 4. For a given mean travel distance µ r and mean number of particles µ n , convergence to the expected value η = E n µ n µ r therefore strongly depends on the rate E n . This convergence decreases with increasingly rarefied conditions. As in the preceding example, if the integral in Eq. (28) does not converge -that is, the mean travel distance is undefined -then the expected particle flux is undefined. The implications of this are similar to those described above regarding Poisson entrainment.
Divergence with Poisson entrainment. Consider the changing number n(t) of particles within an element x as particles arriving from upslope are deposited and particles entrained within the element move downslope. Assuming that uniformly random entrainment is a Poisson process with expected entrainment rate E n , then the expected particle number flux into the element x is equal to the expected flux out of this element. Specifically, the rate at which particles moving into x become deposited within this element is The first term in Eq. (30) is the rate at which particles reach x from upslope. The second term is the rate at which particles reaching x from upslope move past the interval x. The rate at which particles are entrained within x and leave this element is Adding Eqs. (30) and (31) then leads to the conclusion that, per unit width, the immigration rate η I is equal to the emigration rate η E . That is, η I = η E = E n µ r , if the mean travel distance µ r is defined. Alternatively, for positions x such that x ≤ x ≤ x + x the expected deposition rate D n within x, including particles that are entrained within this interval, is The negative rate at which particles are entrained within x, including those that remain within this interval and those that leave it, is The integrals involving x and r are equal to unity, so upon summing Eqs. (32) and (33) the expected rateṅ = −E n + D n = 0 with D n = E n . That is, the local expected rate of deposition is equal to the expected rate of entrainment. One might anticipate that the balance between expected immigration and emigration rates involving a uniform expected flux, or between the expected deposition and entrainment rates, yields a particle number n(t) within the element x that fluctuates with time due to the randomness of the process, but which nonetheless is centered on a fixed mean value -a decidedly deterministic (continuum) point of view. This anticipated result, however, is incorrect. In fact, the number of particles within the element does not possess a stable distribution, and n(t) is not a mean-reverting process. Instead, the number n(t) undergoes an uncorrelated random walk over the n domain, and with finite probability it may "wander" to an arbitrarily large or small value. If the condition that n(t) cannot become negative is imposed, then this is the well-known M/M/1 queuing problem (Stewart, 2009).
With n(t) = 0, 1, 2, 3, . . . and η I < η E , the stationary ensemble distribution f n (n) of states n is a geometric distribution (the discrete version of an exponential distribution) with mean ρ/(1 − ρ) and variance ρ/(1 − ρ) 2 where ρ = η I /η E . In contrast, with η I > η E individual realizations n(t) increase indefinitely, similar to the example of steady deposition above. Note that this problem is just the beginning of a rich theory of dynamical systems falling under the headings of queuing theory, birth-death processes, Markov processes and generalized elastic models. We mention specific cases below, but otherwise the example above suffices to illustrate a basic, perhaps counterintuitive, outcome of noise-driven processes.
There is little evidence that numbers n(t) (i.e., local elevations ζ (t)) on natural hillslopes exhibit unbounded (randomwalk) behavior as in the example above. Entrainment and deposition do not lead to arbitrarily rough surfaces absent spatial correlation in surface elevation. This reflects that additional physics becomes involved in the entrainment and deposition processes. Before addressing this point, first consider a related problem involving bed load transport, initially focusing on the number n a (t) of moving particles rather than the state of the streambed. The Markov birth-death formulation provided by Ancey et al. (2008) for rarefied transport conditions posits that if both the expected deposition rate and the expected emigration rate associated with an interval x depend on the system state -the number n a (t) -then n a possesses a stable (ensemble) binomial distribution f n a (n a ) with mean µ n a . If in addition collective entrainment proportional to n a (t) is involved Heyman et al., 2014;Lee and Jerolmack, 2018;Pierce and Hassan, 2020a), then n a is described by a negative binomial distribution with well-defined mean and variance. The key lesson is this. With this noise-driven process the distribution f n a (n a ) of states n a is as important as the expected state µ n a in characterizing the process. Moreover, whereas the value of this expected state can be specified in terms of the rate constants associated with immigration, deposition and entrainment, the expected state has no more mechanical significance than other values of n a ; the expected (or modal) value is just more probable than other values. With regard to the state n(t) of the bed, because both deposition and collective entrainment depend on n a (t), these have a counterbalancing effect. In addition, Pierce and Hassan (2020a) modify the formulation of Ancey et al. (2008) to couple erosion and deposition with the bed state n(t). This explicitly includes a stabilizing feedback where entrainment preferentially occurs with aggradation and deposition preferentially occurs with degradation (Sawai, 1987;Wong et al., 2007). This coupling ensures that the state n possesses a stable distribution with finite mean and variance.
Returning to transport on hillslopes, stabilizing effects may involve local changes in the entrainment rate due to effects of unstable particle configurations, collective entrainment of surface particles by moving particles, and preferential "capture" of moving particles within local low spots or by roughness elements Roth et al., 2020). Note that the rules of deposition in the particle-based transport model of Tucker and Bradley (2010) inherently provide this sort of stabilizing effect. Moreover, although attention has been given to the role of vegetation in sediment capacitance Lamb et al., 2011Lamb et al., , 2013DiBiase and Lamb, 2013;Doane et al., 2018), this topic otherwise is largely unexplored in relation to modulating rates of entrainment and deposition over long timescales. In addition we may imagine a configuration involving (in a Fourier sense) a small-amplitude sinusoidal variation in surface elevation or roughness. The effect of this -including preferential deposition at certain locations -almost certainly would influence the behavior of particles whose motions start upslope, thereby leading to a distribution of travel distances that deviates from an idealized form associated with uniform conditions. We may imagine similar effects on planar surfaces with nominally homogeneous roughness, but with local variations in roughness at the particle and slightly larger scale. However, whether these local effects could be distinguished from the inherent randomness of deposition is an open question. Note also that other processes may operate on hillslope surfaces such that stabilizing -or "smooth-ing" -influences do not need to be related just to rarefied particle motions as envisioned above. As an unusual example, "the impacts by small distal ejecta fragments. . . . is the largest contributor to the diffusive [topographic] degradation which controls the equilibrium [size-frequency distribution] of small craters" of the lunar maria (Minton et al., 2019). More generally, the formalism of generalized elastic models used to describe the macroscopic dynamics of fluctuating surfaces due to the competition between processes of surface roughening and relaxation (Pelletier and Turcotte, 1997;Turcotte, 2007) is now being extended to erosional landscapes . Whether involving stabilizing effects or not, fluctuations in entrainment and deposition and accompanying variations in the land-surface state about expected conditions are just as important as the expected conditions in characterizing surface behavior. Moreover, expected conditions may not represent a stable attractor in the sense of a mean-reverting process, akin to the stable basic state associated with slope-dependent soil creep (Furbish and Fagherazzi, 2001).
Here is the key lesson. In the presence of noise-driven processes with rarefied conditions, one must be cautious about predicting behavior in response to fixed continuum-like rates that do not acknowledge noise effects. Individual realizations associated with these effects can involve rich behavior that is not anticipated from a simple deterministic perspective.
Divergence with patchy entrainment. Consider a similar situation in which entrainment events are Poisson in time and uniformly random in space, but the number of particles associated with each individual event is represented by an exponential distribution. Deposition and entrainment within an element x are no longer a Poisson processes. Nonetheless, variations in the number of particles n(t) in an element x with time (not shown) qualitatively exhibit the same unstable behavior as Poisson entrainment. Because individual events may involve numerous particles, the variability generally is larger. This reinforces the point made above, that a stable distribution of particle numbers with finite mean requires additional physics, for example, where the entrainment and deposition rates depend on the state n(t).

Uncertainty with increasing scales
Here we consider uncertainty associated with rarefied particle motions on hillslopes viewed as a "slow" system, where changes in hillslope configuration are largely imperceptible over the human timescale (Sect. 3.3). We highlight results from above, that with rarefied transport conditions our descriptions of the particle flux and its divergence pertain to ensemble conditions involving a distribution of possible outcomes, each realization being compatible with the controlling factors. When these factors change over time, individual outcomes reflect a legacy of earlier conditions that is influenced by the rate of change in the controlling factors relative to the intermittency of particle motions. The implication of this result together with preceding material is that landform configurations reflect an inherent variability that is just as important as the expected (average) conditions in characterizing system behavior.

Ensemble-expected conditions
We start by returning to a key starting point described in Sect. 4.2. Namely, despite the continuous forms of the entrainment rate E s (x, t), the probability density function f r (r; x, t) and the exceedance probability function R r (r; x, t) in the expressions of the flux, Eq. (1), and the Exner equation, Eq.
(2), these do not imply that the flux q(x, t) and the rate of change in the land-surface elevationζ (x, t) = ∂ζ (x, t)/∂t may be considered as varying smoothly with space and time. Rather, for rarefied conditions these quantities are random variables. As a consequence the expressions Eqs. (1) and (2) specifically represent ensemble-expected conditions. Individual realizations may vary significantly from these expected conditions. The simple Poisson processes described in the examples above suffice to illustrate the consequences of rarefied conditions, where intermittency and patchiness add variability about expected conditions. Relative to the expected conditions, this variability may be large when viewed over small timescales. Only in the limit of a large number of particles with averaging over long timescales do predictions of the flux and its divergence approach expected (deterministiclike) values.
For any realization, the flux q(x, t) and the rateζ (x, t) do not vary as continuously differentiable functions of position x or time t. However, these quantities are welldefined continuously differentiable functions when applied to ensemble-expected conditions. In other words, when we write ∂ζ (x, t)/∂t = −∂q(x, t)/∂x as a "model" of landsurface evolution in which q(x, t) is specified by Eq. (1), we in fact are imagining a smoothly varying land-surface configuration that would occur if and only if q(x, t) at all times coincides with the ensemble-expected flux. A similar assessment applies to the entrainment form of the Exner equation, Eq. (2). Indeed, this description of the flux or the changing land-surface configuration is an idealization that does not acknowledge noise effects. It is "continuum-like" in the sense that the described behavior proceeds in a continuously differentiable manner according to the (continuous) entrainment rate E s (x, t) convolved with either the smooth (continuous) exceedance probability R r (r; x, t) or the smooth probability density f r (r; x, t). In effect E s (x, t), R r (r; x, t) and f r (r; x, t) are treated as deterministic functions rather than probabilistically expected quantities.
More generally, Eqs. (1) and (2) are probabilistic algorithms in which E s (x, t), R r (r; x, t) and f r (r; x, t) are statistically expected quantities. Each generated realization of q(x, t) orζ (x, t) is entirely compatible with the controlling quantities and is no less likely to occur than the expected value -although certain values are more probable than others according to the ensemble distribution of possible values. Thus, the ensemble distribution of possible values (Appendix D) is as important as the expected value in characterizing the behavior of the system. In the examples above involving Poisson events (delivery rate, entrainment rate), the behavior of q(x, t), n(x, t) orṅ(x, t) (or ζ (x, t) orζ (x, t) is not mean-reverting. The expected (average) state of the system therefore has no more mechanical significance than other state values.
As outlined in Sect. 4.2.3, however, additional factors may provide stabilizing effects. Focusing on the rateζ (x, t), consider time-varying conditions as the land surface changes (slope, surface roughness, etc.). One way to conceptualize this involves defining a zeroth-order configuration that changes slowly and first-order fluctuations about the zerothorder state that change relatively rapidly. Following Sweeney et al. (2020), let ζ 0 (x, t) denote a zeroth-order land-surface elevation and let ζ 1 (x, t) denote a first-order deviation about the zeroth-order state. Then ζ (x, t) = ζ 0 (x, t) + ζ 1 (x, t) anḋ ζ (x, t) =ζ 0 (x, t) +ζ 1 (x, t). The zeroth-order rateζ 0 (x, t) may be interpreted as representing ensemble-expected conditions as described above. The first-order rateζ 1 (x, t) then is akin to the behavior of an individual realization if this is conceptualized as a mean-reverting process. (The laboratoryscale experiments of Sweeney et al., 2020, indicate that this rate can be described as a first-order autoregressive process, the discrete version of a mean-reverting Ornstein-Uhlenbeck process.) In effect, the slowly varying zeroth-order behavior is akin to the "climate" of the land surface and first-order fluctuations are akin to its "weather" (Sweeney et al., 2020).
The analyses above focus on one-dimensional downslope transport. For completeness we note that the particle flux and its divergence more generally involve two-dimensional transport. For example, Williams and Furbish (2021) consider elements of the two-dimensional forms of Eqs. (1) and (2). They show how transverse diffusion of particles arises from particle-surface collisions during downslope travel and how transverse motions influence the downslope particle flux. Clarifying the consequences of two-dimensional rarefied particle transport remains an interesting, open topic.

Legacy of realizations
The factors that control particle delivery rates and entrainment, as well as the conditions that influence particle motions and deposition, change with time at different scales. For example, particle entrainment and surface-roughness texture associated with vegetal sediment capacitance may vary at fire recurrence timescales. Over longer timescales, continuing entrainment with downslope particle motions and deposition may contribute to changes in surface roughness and local land-surface slopes, thus changing the distribution of particle travel distances. At climate-change timescales, particle delivery rates to scree slopes may vary in relation to changing weathering rates and particle release from bedrock. Thus, we must acknowledge an adaptable view of particle delivery and entrainment. Namely, an intermittent "event" during an episode of fire might be represented by the release of sediment from a vegetation capacitor. At the climate-change timescale, in contrast, an "event" may be viewed as consisting of the entirety of the release (or entrainment) of sediment associated with a fire and the period of post-fire recovery to vegetated conditions. Here we consider one element of the consequences of changes in the factors controlling particle motions, in particular the possible mismatch between the timescale over which expected rates of delivery or entrainment change relative to the scale of intermittency in these rates.
Recall that the time-averaged flux eventually converges to the ensemble-expected value with increasing elapsed time and that the rate of convergence decreases with increasing intermittency in the delivery or entrainment of particles (Sect. 4.2.2). However, over intervals that are much shorter than the time required for convergence, the time-averaged flux in an individual realization may differ significantly from the ensemble-averaged value. This is the same as saying that the number of particles moving past a position x over a specified interval may be much different from the expected number, where the likely difference increases with increasing intermittency.
Consider for illustration the situation where particles are delivered intermittently to the top of a hillslope as a line source (Sect. 4.2.2). For illustration we specify the ensembleexpected rate η of events as an inhomogeneous rate that declines as an exponential function with e-folding time T η . A small value of T η implies a relatively rapid change in η, and a large value of T η implies a slow change. Recall that a decreasing value of η coincides with increasing intermittency and that the variability in particle numbers reaching a downslope position x is more strongly influenced by this intermittency than by the expected number of particles per event. Then, a relatively large value of the dimensionless ratio ηT η = T η /µ w implies the expected delivery rate changes sufficiently slowly that the variability of individual realizations about ensemble-expected conditions is small. A relatively small value of ηT η implies that the expected delivery rate changes faster than the rate at which the time-averaged flux converges to the expected rate (Fig. 4). Thus, for decreasing T η and decreasing exceedance probability R x (x), individual realizations of the number of particles reaching position x exhibit increasing variability about the ensemble-expected values (Fig. 6). Note that numerous other scenarios are possible. For example, a linear change in the expected rate (not shown) illustrates the same idea depicted in Fig. 6. Specifically, these plots illustrate the growing effects of legacy in previous controlling conditions with increasing intermittency and increasingly rarefied conditions. Namely, what occurs by chance under these conditions in the early part of the time series during rapidly changing expected conditions is inher-ited in later stages of the series as the rate of change in expected conditions decreases. It is only in the limit of vanishing intermittency relative to the e-folding time T η with large particle numbers near the source (x = 0, R x = 1) that individual realizations track the ensemble-expected conditions. With rapidly changing expected conditions, and far from the source, uncertainty in particle numbers increases with time.
Specifically, and with reference to Fig. 6d, if by chance during the early part of the series a relatively large number of events occur, then this preconditions the total numbern(x, t) as η decreases with time t. The realization thus overshoots the ensemble-expected state. If by chance during the early part of the series only a small number of events occur, despite an initially large rate η, then this again preconditions the total number as η decreases. The realization thus undershoots the expected state.
Consider the slopes of the individual realizations in Fig. 6 estimated by projecting lines of varying duration through different parts of the stepped curves. These slopes represent estimates of the particle flux. With increasingly rarefied conditions, notably when the expected rate η rapidly changes, such estimated rates may be markedly different from the local expected rate associated with η. Further note that the idea of convergence of a time-averaged flux to an ensemble-expected value with increasing averaging interval, as in the situation depicted in Fig. 4, is not relevant in Fig. 6. The ensemble-expected value continuously changes over a timescale T η that is shorter than the timescale required for convergence, particularly with increasingly rarefied conditions.

Discussion and conclusions
In keeping with our philosophical objectives, we begin this section at a high level. This is to reinforce our view that it is important for growing efforts centered on probabilistic descriptions of sediment transport to include the philosophical underpinnings of this work within the conversation.
In 1943 Kurt Lewin first offered his oft quoted maxim that "there is nothing so practical as a good theory" (Lewin, 1943) for providing a framework to guide analyses of complex systems. This basic, lasting principle appears to resonate in many fields, particularly the social sciences (Mc-Cain, 2016). More recently, Deutsch (2009 built from ideas of Karl Popper to strongly argue for the essential guiding role of theory in the development of scientific explanation -that compelling explanations of natural phenomena are "theory laden". He forcefully rejects the idea of empiricism, that observation of the world alone can suggest which ideas to adopt. In addition, Eugene Wigner provides an important elaboration of Lewin's maxim for the natural sciences. In his classic essay entitled "The unreasonable effectiveness of mathematics in the natural sciences", Wigner (1960) notes that the triumph of physics resides in principles of invariance  (Wigner, 1985) -that the laws of nature are invariant with any suitable transformation of space or time, thereby rendering them independent of initial conditions, position and history yet holding true for all time. He suggests that it is precisely the existence of this invariance that gives us the confidence and inspiration -what he calls the "empirical law of epistemology" -for continuing the endeavor of discovery with growing complexity and uncertainty. Without this invariance, we would lose trust in our use of the laws of physics in different problems and settings -just as we would lose confidence and interest in playing the game of chess if the rules continually changed from one match to the next, precluding any gain in expertise from experience with fixed rules.
Turning these ideas toward sediment systems, we suggest that the statistical mechanics framework outlined herein offers a compelling strategy for examining particle motions and transport, particularly with rarefied conditions and in view of the uncertainty that goes with describing slow systems. This framework has two key elements that embody the points above. First, this framework is grounded in principles and methods for dealing with particle systems, continuum and rarefied, that have been rigorously scrutinized for more than a century. Its principles rest on Wigner's views of invariance, and its familiarity lends confidence for investing trust in its mechanical basis when examining unfamiliar problems outside of classical statistical mechanics. Second, this framework embraces uncertainty at the outset in its use of probability. It offers established ways to formulate expressions of conservation, clear rules for counting and averaging particle states, and the foundational concept of a Gibbs ensemble Bi et al., 2015;Furbish et al., 2018c). Again, we are inspired to invest trust in this formalism applied to unfamiliar systems. In particular, this framework points us in the right direction for examining the physics of rarefied particle motions on hillslopes, wherein we see the behavior of the particle system precisely for what it is -an unusual granular gas. The effort then consists of elucidating a micro-view of the mechanical behavior of the particles during their downslope motions, which, when described probabilistically, leads to a macroscopic view of their collective (emergent) behavior. The theoretical analysis of particle motions involves threading together elements of statistical mechanics, concepts from granular gas theory, particle collision mechanics, and probability distribution theory (Furbish et al., 2021a, c). Importantly, the analysis leans on the style of thinking of statistical mechanics while recognizing -as a delightfully challenging twist -that it is not about simply adopting, off the shelf, theory and methods from this field. Instead, the work must be tailored to the transport process and scale of interest.
Each of the examples used in Sect. 4.1 to highlight the merits of a probabilistic description of particle motions and disentrainment -particle energy extraction, energy states and the Fokker-Planck equation, and the generalized Pareto distribution as a maximum entropy distribution -represents a direct extension of established concepts in statistical mechanics as applied to both ordinary gases and granular gases. The analyses are not as straightforward as describing the behavior of ideal gas particle systems. Nonetheless, they nicely illustrate the transferability of basic principles, for example, the treatment of dissipative collisions as a random process, the value of appealing to a Gibbs ensemble as applied to a cohort of particles, and the use of an energetic cost to constrain the entropy maximization method. Thus, these examples illustrate elements of a coherent statistical mechanics framework for describing sediment particle motions -that a mechanistic yet probabilistic analysis is possible. Moreover, the maximum entropy analysis specifically offers clarity on particle behavior that is not otherwise accessible. Namely, that all three forms of the generalized Pareto distribution are constrained in the same manner demonstrates that nothing special or unusual changes in the physics of disentrainment in the transition from the bounded form to the heavy-tailed form of the distribution in crossing isothermal conditions. The analyses thus rest on a solid foundation of statistical mechanics. Nonetheless, it is essential that these results be challenged, and, if necessary, culled and replaced with fresh ideas.
With respect to consequences of rarefied versus continuum conditions, herein we focused on descriptions of the particle flux and its divergence (Sect. 4.2). Inasmuch as the particle delivery rate (as a line source) or the entrainment rate can be approximated as a Poisson or intermittent Poisson-like process, then the analysis clearly points to the idea that the flux or its divergence involves a distribution of possible outcomes, not just a single expected value -an idea that is decidedly different from conventional continuum descriptions of these quantities. Note that the descriptions of the flux and its divergence do not depend on the results described above concerning the physics of particle motions. Indeed, the probabilistic nonlocal expressions of the flux and its divergence, Eqs. (1) and (2), are independent of the form of the probability density f r (r; x, t) of particle travel distances r and the associated exceedance probability function R r (r; x, t). Nonethe-less, these expressions are firmly grounded in the methods of classical statistical mechanics, albeit specialized to sediment motions.
Here we reinforce the idea that Eqs. (1) and (2) are probabilistic algorithms. For rarefied conditions the entrainment rate E s (x, t), although expressed as a continuously differentiable function of position x and time t, is actually an expected rate constant. Particle entrainment "events" are decidedly discontinuous (Figs. 3 and 5). Similarly, the continuous forms of the density f r (r; x, t) and the exceedance probability function R r (r; x, t) indicate that these represent ensemble-expected conditions, not the outcome of any individual realization (Appendixes C and D). During an interval of time t a finite number n of particles is entrained at the expected rate E s (x, t). This is equivalent to saying that a sample of size n is drawn from the density f r (r; x, t). Such a sample, if plotted as a histogram of distances r, would have an irregular (discontinuous) form that only roughly mimics the smooth form of f r (r; x, t). Similar irregular histograms involving different values of n, no two alike, would occur during successive intervals t. As a consequence the flux q(x, t) and the rateζ (x, t), although expressed as continuous functions, are decidedly discontinuous. All realizations are distinct, and none matches the ensemble-expected state (Figs. 3,4,5 and 6).
As written, then, the flux q(x, t) and the rateζ (x, t) are physically imagined quantities -as if the rate constant E s (x, t) represented a time-continuous "stream" of particle material distributed instantly and smoothly over space according to f r (r; x, t) or R r (r; x, t). It is only in this sense that Eqs. (1) and (2) yield a single value of the flux or its divergence for specified controlling factors embodied in E s (x, t), f r (r; x, t) and R r (r; x, t). More precisely, these expressions describe how ensemble-expected values of the flux q(x, t) and the rateζ (x, t) vary smoothly with position and time. That is, these ensemble-expected quantities are well-defined continuously differentiable functions. Then, as noted in Sect. 4.3.1, if we writeζ (x, t) = −∂q(x, t)/∂x as a "model" of land-surface evolution in which q(x, t) is specified by Eq. (1), we in fact are describing an imaginary landsurface configuration that changes in a continuously differentiable manner if and only if q(x, t) at all times coincides with the ensemble-expected flux, neglecting any noise effects. Viewed in this manner, simulations of hillslope evolution based directly on Eq. (2) (Furbish and Haff, 2010;Furbish and Roering, 2013;Doane et al., 2018) represent ensemble-expected behavior. In contrast, simulations of hillslope evolution based on particle-based models Bithell et al., 2014) The examples involving Poisson or intermittent Poissonlike processes described in Sect. 4.2 and 4.3 highlight the inherent variability that goes with noise-driven processes and point to an important consideration in interpreting landform configurations. Namely, for rarefied transport conditions a landform at any instant represents one of many possible realizations for the same controlling factors, whether fixed or varying with time. This view is distinctly different from the perspective offered by a smoothly varying deterministic model prediction that is based on fixed or slowing varying controlling factors without acknowledging noise effects that lead to a distribution of possible outcomes. The implication is that landform configurations reflect an inherent variability that is not simply attributable to perturbations about a deterministically expected state as in a mean-reverting behavior (Furbish and Fagherazzi, 2001). This variability is just as important as an expected state in characterizing the landform behavior. Unfortunately, unlike fast systems, we cannot necessarily constrain the values of the controlling factors by direct measurements or in the manner of controlled experiments. We therefore must embrace the uncertainty that goes with rarefied transport conditions and enjoy the weather of landforms as much as we do their imagined climate.
This perspective also induces us, while acknowledging consequences of the noisiness of rarefied systems, to examine the dynamics of competition between roughening and smoothing processes . As mentioned in Sect. 4.2.3, this may involve collective entrainment, the sediment capacitance of vegetation and other roughness elements, preferential entrainment and deposition in relation to surface geometry and roughness, effects of particle size sorting, or "smoothing" processes that are not related to rarefied particle motions per se. We suggest that there is value in taking cues from current work on noise-driven bed load transport, including the coupling between moving particles and the streambed state Ancey and Heyman, 2014;Pierce and Hassan, 2020a). Whereas we have focused on local consequences of noisy delivery rates and entrainment, there is a need to systematically examine land-surface behavior in relation to rarefied particle transport.
In their examination of experimental time series of bed load flux, Ancey and Pascal (2020) provide an interesting lesson for considering slow systems. They show that for a noisedriven process the time-averaged flux calculated from an individual realization (Figs. 3 and 4) may differ significantly from the (known) sediment feed rate. We can imagine having information, for example from a sediment deposit, that allows us to estimate the time-averaged sediment delivery rate associated with a slow, noisy system. However, this estimated rate, representing an individual realization, may not coincide with the ensemble-expected rate associated with the extant controlling conditions. In the absence of a high-fidelity time series of the delivery rate analyzed by re-sampling methods (Ancey and Pascal, 2020), the result is unavoidable uncertainty in this averaged rate.
We end with an anecdote to reinforce the starting point of this section. Last fall we wandered into Guilherme Gualda's graduate class on phase transformations in magmatic systems and, to our delight, discovered on the chalkboard a derivation of the Boltzmann distribution of the energy states of atoms in a crystal lattice -complete with a pictorial rendering of the energy macrostates and microstates of a simple example system. The derivation continued with a description of the particle diffusion coefficient containing the Gibbs activation energy, as a direct consequence of the Boltzmann distribution, thence to the Arrhenius equation. We then enjoyed the discussion surrounding the idea that, in practice, one would experimentally determine the diffusion coefficient for a real (i.e., not ideal) system rather than predict it from the statistical mechanics theory for specified atomic constituents and thermodynamic conditions. The students were quick! "So what is the value of the theory?" "Aha!", Gualda responded with delight, "the theory provides an unambiguous framework for interpreting our experimental data in view of the uncertainties of real systems! For example, in addition to providing a coherent, testable explanation of the phenomenon, the theory points to the appropriate functional form -the logical basis -of the expected relationship in curve fitting. This in turn provides the basis of error assessment -either by classic propagation using the calculus or by Monte Carlo methods -which is particularly valuable given that experimental data often are sparse and of variable quality. And it assigns clear meaning to estimated parameters for comparison with other work." It is such a lovely, simple lesson: "There is nothing so practical as a good theory . . . " and we would add in the case of sediment systems, ". . . that pays as much attention to fluctuations as it does to expected (mean) values."

Appendix B: Divergence form of Exner equation
Consider the entrainment forms of the flux and the Exner equation, Eqs. (1) and (2). Here we show that Eq. (2) is consistent with the divergence expressed as c sζ (x, t) = −∂q(x, t)/∂x when the flux q(x, t) is given by Eq. (1).
We start by writing h( Taking the derivative of Eq. (B1) and applying Leibniz's rule, With r = x − x we observe that the operation ∂/∂x = ∂/∂r. Evaluating the derivative in Eq. (B2), noting that f r (r; x, t) = −dR r (r; x, t)/dr and R r (0; x, t) = 1, then yields Reverting to partial notation the entrainment form of the Exner equation, Eq.
Consider an alternative formulation. With r = x − x we note that dx = −dr. We now use this change of variable to rewrite Eq. (1) as This form of the convolution may be expanded as a Taylor series to show that the flux consists of advective and diffusive parts so long as the integral of R r (r; x, t) converges (Furbish and Roering, 2013;Furbish et al., 2017). In turn we use the communicative property of convolutions to write Eq. (B4) as which has the same form as Eq.
(1). We then take the derivative of Eq. (B5) and proceed as above to obtain the entrainment form of the Exner equation, Eq. (2).

Appendix C: Rarefied versus continuum conditions
The material in this appendix is mostly extracted directly from Appendix A in Furbish et al. (2018c). Our aim is to further illustrate the significance of rarefied versus continuum conditions, and the interpretation of the Fokker-Planck equation applied to these conditions. We focus on the familiar example of Brownian motion, the initial formal description of which is separately attributable to Einstein (1905) and von Smoluchowski (1906). For additional background, Schumer et al. (2009) provide a particularly clear description of the Lagrangian perspective of particle motions and its relation to the Eulerian perspective of particle behavior as embodied in the Fokker-Planck equation.
With reference to Fig. C1, let x denote a coordinate along which Brownian particles take one-dimensional random walks, where x extends indefinitely in the positive and negative directions about the origin x = 0. Suppose that a particle starts at the origin at time t = 0 and with equal probability moves in the positive or negative direction during successive small intervals dt. By the definition of a random walk, the motion of the particle -specifically its expected position x after an interval of time t > 0 -can be predicted only in a probabilistic sense. Namely, letting f x (x, t) denote the probability density function of possible positions x, then this density satisfies a Fokker-Planck equation involving only its diffusion term: where the particle diffusivity k 2 is assumed to be constant. The solution of Eq. (C1) is the Gaussian distribution with mean µ x = 0, namely, For this highly rarefied system involving a single particle, we can only offer probabilistic predictions of its position at time t. For example, we may confidently state that with probability p = 1/2 the particle is either at a position x < 0 or at a position x > 0. Or we may state that with probability p ≈ 0.68 the particle is within the domain defined by plus 1 and minus 1 standard deviations about the mean position, namely, − √ 2k 2 t < x < + √ 2k 2 t. For this single-particle system (realization), the actual particle position x a is represented by a Dirac distribution δ(x a − x, t) (Fig. B1), but this cannot be predicted deterministically ).
Let us now imagine an arbitrarily great number N of identical, independent particles that start at the origin x = 0 at time t = 0, each undergoing a random walk during t > 0. When viewed together, the distribution of these particles at time t = 0 is given by the Dirac distribution, namely, f x (x, 0) = δ(x). At any time t > 0 these particles are distributed according to Eq. (C2). That is, because N is arbitrarily large, the proportion of particles within any small interval x to x + dx closely matches what is predicted by Eq. (C2), namely f x (x, t)dx, such that in the limit of dx → 0 the actual distribution of positions x varies smoothly (continuously) and converges to Eq. (C2) (Fig. C2). In contrast to the highly rarefied single-particle system in the previous example, we may thus assume that this great number of particles, occurring in one system (realization), satisfies the continuum hypothesis. Nonetheless, upon randomly selecting a single particle from this system, we still can only offer probabilistic predictions of its position at time t -as in the example above involving a system with a single particle. Moreover, note that the continuous distribution of positions x realized at time t for this one system involving a great number N of particles is identical to the distribution that would be realized upon pooling the x positions at time t associated with a great number N of independent systems, each involving a single particle. Now select a system with a modest number N of particles such that conditions are rarefied. By this we mean that, after some time t, the actual distribution of particle positions x is at best represented by an irregular histogram that roughly appears Gaussian but is decidedly discontinuous (Fig. C2). Moreover, any realization involving N particles possesses a similar irregular form at time t, and no two are the same. In effect, each realization represents a sample of size N drawn from an imagined population represented by Eq. (C2). Also note that each realization involving N particles at time t is the same as N realizations, each involving a single particle, when viewed collectively at time t.
Let us now consider a great number N e of independent but nominally identical systems -an ensemble -at any fixed time t, where each system contains N particles, large or small. We now wish to describe the ensemble-expected conditions. To envision this, consider any small interval x to x+dx. If N = 1 as in the first example above, then f x (x, t)dx is just the proportion of the N e systems containing a particle within x to x + dx at time t. Note that this is identical to the result above involving an individual system containing a great number N = N e of particles. If instead each system involves a great number N of particles, then f x (x, t)dx simply becomes the expected proportion of the N particles within x to x + dx at time t, where the expectation is calculated over the N e systems. And note that this outcome is identical to the proportion of N × N e independent systems, each involving a single particle, which contain a particle within x to x + dx at time t. In either case, the expected proportion within the interval is the same. Moreover, we reach the same conclusion in considering a great number N e of systems, each involving a modest number N of particles. Thus, when calculated over a great number of systems for all intervals dx, then in the limit of dx → 0, the continuous function, Eq. (C2), is retrieved. The key points are these: first, whether N is relatively small (representing a rarefied condition) or N is large (representing a continuum condition), the ensemble-expected behavior represented by Eq. (C2) applies equally to both conditions in a probabilistic sense. Second, if N is small, then Eq. (C2) represents the ensemble-expected behavior, not the actual behavior of any one system (realization); if N is large, then the actual behavior of the system is expected to converge to the smooth ensemble behavior represented by Eq. (C2).
To complete the picture, suppose that the x domain in Fig. C1 is bounded such that −1 < x < 1. Particles that reach these boundaries are "reflected" and remain within the domain, continuing their random walks. In the limit of t → ∞, the probability density of particle positions x reaches a steady-state form, that is, ∂f x (x, t)/∂t → 0 such that f x (x, t) → f x (x). In this limit, Eq. (C1) becomes d 2 f x (x)/dx 2 = 0. Moreover, the probability flux q x = −k 2 df x (x)/dx = 0 at all positions x, which means that df x (x)/dx = 0. These constraints together with the fact that the distribution f x (x) must integrate to unity yield the result that f x (x) = 1/2 over the bounded domain (Fig. C1). That is, the expected distribution f x (x) is uniform. As with the unsteady problem described above, a modest number N of particles representing rarefied conditions in any one realization is at best represented by an irregular histogram that roughly appears uniform but is decidedly discontinuous (Fig. C3). Moreover, at an arbitrary later time, the resulting distribution (histogram) would be just as irregular; it does not become smoother with increasing time. As above, the expected continuous steady-state distribution is retrieved when expected values are calculated over a great number N e of systems. Figure C3. Histogram of particle positions x at time t → ∞ for one system showing that with a modest number of particles representing a rarefied condition this histogram is irregular and discontinuous; in this example N = 500. With respect to developments in the text, the Fokker-Planck equation describes the time evolution of the probability density f x (x, t). The formulation does not assume either rarefied or continuum conditions. It is indifferent to these conditions yet equally applicable to both.

Appendix D: Ensemble expected conditions of a Poisson process
Here we provide an explicit definition of an ensembleexpected value or state and its relation to a time-averaged value. Recall that the Poisson distribution, Eq. (25), describes the probability that n events will occur within a specified interval of time t when these events occur randomly with a fixed rate constant η [T −1 ]. Here we write this distribution as f n (n, t) = (ηt) n n! e −ηt , with mean µ n = ηt and variance σ 2 n = ηt. As written, Eq. (D1) looks like a description of the time evolution of a system involving Poisson events. However, note that for an individual realization of a Poisson process over a specified interval (0, t) there is only one outcome, that is, precisely n events. As usually presented, we are to imagine such an interval of time and use Eq. (D1) to assess the likelihood that n events will occur during the interval. Here we alternatively consider an ensemble of systems.
Let us imagine, as did Gibbs (1902), a great number N e of nominally identical but independent systems (an ensemble), each subject to the same physics of random events characterized by the rate constant η. (One may consider each realization plotted in Fig. 3, for example, as representing the outcome of one system in the ensemble.) Let the subscript i = 1, 2, 3, . . ., N e denote the individual systems composing the ensemble. We now imagine starting each system at time t = 0 with the initial conditions, f n (0, 0) i = 1 and f n (n, 0) i = 0 , n ≥ 1 .
That is, at time t = 0 each of the N e systems has a value of unity at n = 0 and a value of zero for all n ≥ 1. Taking the ensemble expectations, f n (0, 0) i = N e N e = 1 and (D3) Thus, f n (0, 0) is just the proportion of the N e systems with n = 0 events and f n (n, 0) is the proportion of the N e systems with n ≥ 1 events at time t = 0. Now, at any time t > 0 each system has precisely n events with probability one. Those that have n = 0 events are represented by f n (0, t) i = 1 with f n (n, t) i = 0 , n = 0 , those that have n = 1 events are represented by f n (1, t) i = 1 with f n (n, t) i = 0 , n = 1 , those that have n = 2 events are represented by f n (2, t) i = 1 with f n (n, t) i = 0 , n = 2 , and so on for systems with n = 3, 4, 5, . . . events. Now let N(n) denote the number of systems with n events. Taking ensemble expectations, and so on for systems with n = 2, 3, 4, . . . events. That is, each probability f n (n, t) for n = 0, 1, 2, . . . is just the proportion of the N e systems with n events at time t. More generally, f n (n, t) i = (ηt) n n! e −ηt , n ≥ 0 , which is our starting point. At this juncture the form of Eq. (D10) now may be interpreted as describing the time evolution of the ensemble distribution f n (n, t) (Fig. D1). To reiterate, upon starting each member of the ensemble at t = 0 according to Eq. (D2) then letting time proceed, Eq. (D10) describes the distribution of the values of n associated with the N e members of the ensemble when viewed at any instant. Whereas at time t = 0 all members of the ensemble have n = 0 events with a probability of 1, the proportion f n (0, t) decays as e −ηt with t > 0. The proportion f n (1, t) initially grows, reaches a peak, and then decays as ηte −ηt . The proportion f n (2, t) likewise initially grows, reaches a peak, and then decays as (1/2)(ηt) 2 e −ηt . This pattern continues for proportions involving n = 3, 4, 5, . . .. The essential idea is this. For any individual system there is one possible outcome n at time t with probability given by Eq. (D1). For an ensemble of systems all possible outcomes n = 0, 1, 2, 3, . . . exist at time t in the proportions given by Eq. (D1). In fact, these ensemble proportions constitute the formal, classic definition (Hájek, 2012) of the probabilities f n (n, t) of n events at time t.
Consider the situation where particles are delivered as a line source (x = 0) at the expected rate η. In this problem the ensemble-expected particle flux at position x > 0 is R x (x)η, which by definition is the expected time-averaged flux for any averaging interval t. This generally is not the same as the time-averaged flux of any realization. Namely, if n i (x, t) denotes the total number of particles moving past x during the interval t for the ith realization, then the time-averaged flux is n i (x, t)/t. This time average converges to the expected rate R x (x)η only in the limit of t → ∞. To gain a sense of this convergence we may consider the value of the timeaveraged flux coinciding with 1 standard deviation in the expected number of particles moving past x during t, which is √ R x (x)ηt/t = [R x (x)η/t] 1/2 . This suggests that the flux converges as ∼ t −1/2 , which is slower than exponential convergence.
Whereas the ensemble distribution represented by Eq. (D10) evolves with time t, for completeness we comment here on the idea of a "stable", or stationary, ensemble distribution that is independent of time. The geometric distribution associated with the M/M/1 queuing problem with η I < η E (Sect. 4.2.3) is a stable distribution. Individual realizations of n(t) may fluctuate over the domain n = 0, 1, 2, 3, . . ., but at any instant the ensemble distribution f n (n) is independent of time. Similarly, the binomial and negative binomial distributions f n a (n a ) describing the number n a of active particles Heyman et al., 2014) are stable distributions. The uniform distribution associated with a Brownian particle as described in Appendix D is a stable (time-independent) distribution. The exponential distribution of particle velocities described by Furbish et al. (2012b is considered to be a time-independent ensemble distribution. Figure D1. Plot of time evolution of (a) Poisson ensemble distribution f n (n, t) of the proportion N(n)/N e of systems with n events at time t = 0, 1, 2, 3, 4 and 5, and (b) proportion N(n)/N e of systems with n = 0, 1, 2, 3, 4 and 5 events. Plots are based on η = 1. Lines connecting the probability values (circles) in (a) are to aid visualization of successive states of the discrete distribution and do not imply the presence of non-integer values of n.