Although in situ measurements of modern, frequently occurring turbidity currents have been performed, the flow characteristics of turbidity currents that occur only once every 100 years and deposit turbidites over a large area have not yet been elucidated. In this study, we propose a method for estimating the paleo-hydraulic conditions of turbidity currents from ancient turbidites by using machine learning. In this method, we hypothesize that turbidity currents result from suspended sediment clouds that flow down a steep slope in a submarine canyon and into a gently sloping basin plain. Using inverse modeling, we reconstruct seven model input parameters including the initial flow depth, the sediment concentration, and the basin slope. Repeating numerical simulations a reasonable number of times (3500) with a one-dimensional layer-averaged model under various input parameters generates a dataset of the characteristic features of turbidites. This artificial dataset is then used for supervised training of a deep-learning neural network (NN) to produce an inverse model capable of estimating paleo-hydraulic conditions from data on the ancient turbidites. The performance of the inverse model is tested using independently generated datasets. The NN successfully reconstructs the flow conditions of the test datasets. In addition, the proposed inverse model is quite robust to random errors in the input data. Judging from the results of subsampling tests, inversion of turbidity currents can be conducted if an individual turbidite can be correlated over 10 km at approximately 1 km intervals. These results suggest that the proposed method is adequate for analyzing field-scale turbidity currents.

Turbidity currents are sediment-laden density flows that occur intermittently in deep-sea environments

Through recent development of observational instruments, the velocity and flow depth of deep-sea currents can be measured directly

What caused the difference in observed frequency of modern turbidity currents compared with the records of ancient turbidites? One of the possibilities is that most of the turbidity currents observed in the present day may be very small in magnitude or are diluted and leave little or no deposits in large areas. If this is the case, turbidites several tens of centimeters thick observed in geologic records can be interpreted as deposits of extraordinarily large-scale events that occurred once every several hundred years. This hypothesis implies that turbidites in the strata resulted from low-frequency but very high-risk events such as large tsunamis and earthquakes

The inverse analysis of turbidites in strata may fill the gap between observations of turbidity currents and geologic field observations of ancient turbidites. The reconstruction of past conditions by inverse analysis has been a major tool in several research fields including sedimentology and geomorphology. For example, several studies have reconstructed the magnitudes of past tsunamis from tsunami deposits

However, no practical methodology for the inverse analysis of turbidity currents applicable on a field scale has yet been established. Early attempts to obtain hydraulic parameters of turbidity currents were based on the grain size distribution of turbidites

To obtain reasonable flow characteristics from turbidites, inverse analysis using a numerical model should be performed.

Here, we propose a new methodology using an artificial neural network (NN) for obtaining flow characteristics of turbidity currents from their deposits (Fig.

In this study, we implement an NN-based inverse analysis and examine its effectiveness for turbidites at the field scale. The focus of this study is on rapidly decelerating sedimentary turbidity currents, and normally graded turbidites are considered to be deposited from such decaying flows. This approach has already proven to be effective for the inverse analysis of tsunami deposits by

Schematic diagram of the inversion process of turbidity currents from deposits. The method is composed of three steps: (1) generation of training datasets by the forward model using random values for model input parameters, (2) training of the NN based on the artificial datasets, and (3) application of the trained inverse model to unknown field datasets.

Here we describe the formulation of the forward model used for producing training datasets for the inverse model (Fig.

In a turbidity current flowing over hundreds of kilometers,

However, the focus of this study is on rapidly decelerating sedimentary turbidity currents. Normally graded turbidites are considered to be deposited from such decaying flows. In this study, the distribution of turbidites is assumed to be limited to several tens of kilometers at most, and the separation of the lower and upper layers that occurs in sustained turbidity currents after flowing tens of kilometers does not need to be considered when calculating such relatively small-scale turbidity currents. In fact, the model of

Explanation of model parameters. The turbidity current exchanges suspended sediment with the active layer (

Let

The mass conservation of the sediment in the active layer and the deposit (historical layer) takes the form

To solve Eqs. (

For computational efficiency and numerical stability, a deformed grid approach was adopted to solve Eqs. (

In this study, a turbidity current was assumed to occur from a cloud of suspended sediment (height

Model input parameters. The initial conditions of the turbidity current are assumed to be the suspended sediment cloud that is

In this study, numerical simulation of a turbidity current is repeated under various random initial conditions to produce a dataset of the characteristic features of turbidites. Then, this artificial dataset of turbidites is used for supervised training of a deep-learning NN. The values of the turbidite characteristics, i.e., distribution of volume per unit area of all grain size classes, in the training dataset are input to the NN, and the estimated initial conditions (e.g., initial flow height and concentration) of the turbidity current are obtained from the output nodes of the NN. The output values of the NN are compared with the true conditions. The optimization of weight coefficients of the NN is then conducted to reduce the mean square of the difference between the true conditions and the output values of the NN. If the number of training datasets is sufficiently large, the trained NN should be able to estimate the paleo-hydraulic conditions from the data on the ancient turbidites (Fig.

The local conditions of a turbidity current (velocity, concentration, etc.) at any location and time can be estimated from the reconstructed initial conditions. The flow parameters are obtained by calculating the time evolution of the forward model from the initial conditions. In this way, we can obtain the behavior of the flow with a relatively small number of parameters. This approach has already been tried successfully by

The details of these procedures are described below.

We conducted iterative calculations using the forward model and accumulated data to train and validate the inverse model. To investigate the appropriate amounts of data for training the inverse model, we conducted 500–3500 iterations of the forward model calculations. To verify the performance of the trained model, 300 test datasets were also generated numerically, independent of the training data.
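The generation of random model input parameters for the training and test runs can be sketched as follows. This is a minimal illustration, not the authors' code: the parameter names, the numerical ranges in `PARAMETER_RANGES`, and the use of a uniform distribution are all assumptions made for the example. Only the count of seven parameters and the dataset sizes (3500 and 300) come from the text.

```python
import random

# Hypothetical ranges for the seven model input parameters.
# The actual ranges used in the study are not reproduced here.
PARAMETER_RANGES = {
    "initial_height_m": (50.0, 600.0),
    "initial_length_m": (50.0, 600.0),
    "concentration_class_1": (0.0001, 0.01),
    "concentration_class_2": (0.0001, 0.01),
    "concentration_class_3": (0.0001, 0.01),
    "concentration_class_4": (0.0001, 0.01),
    "basin_slope": (0.0, 0.01),
}


def sample_parameters(rng):
    """Draw one set of input parameters (uniform sampling is an assumption)."""
    return {name: rng.uniform(low, high)
            for name, (low, high) in PARAMETER_RANGES.items()}


def generate_parameter_sets(n_sets, seed):
    """Draw n_sets independent parameter sets from a seeded generator."""
    rng = random.Random(seed)
    return [sample_parameters(rng) for _ in range(n_sets)]


# Different seeds keep the test sets independent of the training sets.
training_parameters = generate_parameter_sets(3500, seed=1)
test_parameters = generate_parameter_sets(300, seed=2)
```

Each parameter set would then be passed to one run of the forward model to produce the corresponding artificial turbidite.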

Model input parameters that are subject to inversion are required to produce the training and test data by the forward model calculation (Fig.

Each run of the forward model calculation is initiated with the given model input parameters and is terminated when the flow head reaches the downstream end or a sufficiently long time period (

Before the model input parameters are input to the NN, all values are normalized between 0 and 1 using the following equation:
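A rescaling to the range between 0 and 1 is commonly done by min-max normalization. The sketch below assumes that form, with `lower` and `upper` standing for the bounds of the range from which each parameter was generated; both the functional form and the argument names are assumptions for illustration.

```python
def normalize(value, lower, upper):
    """Map a value from [lower, upper] to [0, 1] (min-max form, assumed)."""
    return (value - lower) / (upper - lower)


def denormalize(scaled, lower, upper):
    """Map a value from [0, 1] back to the original range."""
    return lower + scaled * (upper - lower)


# Example with a hypothetical parameter range of 50 to 600 m.
scaled_height = normalize(325.0, 50.0, 600.0)
restored_height = denormalize(scaled_height, 50.0, 600.0)
```

The inverse mapping is needed to convert the NN outputs back to physical values.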

The artificial NN is used as the inverse model to reconstruct flow conditions from the depositional architecture. We input the spatial distribution of volume per unit area of multiple grain size classes of a turbidite in the NN, which outputs the values of the flow initial conditions and the basin slope. In this study, we use a fully connected NN that has four hidden layers. The volume per unit area of

The rectified linear unit (ReLU) activation function is adopted for all NN layers

The NN is expected to output the model input parameters (i.e., the initial flow conditions and the basin slope), and therefore the number of nodes in the output layer is equal to the number of input parameters for the forward model, which is seven here (the initial flow length, depth, sediment concentrations, and the basin slope).
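The structure of such a network can be illustrated with a forward pass written in plain Python. This is a sketch under stated assumptions: the number of grid points (20), the hidden layer width (32), and the weight initialization are hypothetical, and dropout and training are omitted. The four hidden layers, the ReLU activation on all layers, the four grain size classes, and the seven output nodes follow the text.

```python
import random


def relu(vector):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, a) for a in vector]


def dense(vector, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output node."""
    return [sum(w * a for w, a in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def build_network(n_inputs, n_hidden, n_outputs, rng, n_hidden_layers=4):
    """Create random weights for an input, hidden, and output layer stack."""
    sizes = [n_inputs] + [n_hidden] * n_hidden_layers + [n_outputs]
    network = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = (2.0 / n_in) ** 0.5  # assumed initialization scale
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        network.append((weights, [0.0] * n_out))
    return network


def forward(network, inputs):
    """Propagate the inputs through all layers, with ReLU after each layer."""
    activations = list(inputs)
    for weights, biases in network:
        activations = relu(dense(activations, weights, biases))
    return activations


rng = random.Random(0)
N_GRID_POINTS = 20   # hypothetical number of sampling points
N_GRAIN_CLASSES = 4  # four grain size classes, as in the text
network = build_network(N_GRID_POINTS * N_GRAIN_CLASSES, 32, 7, rng)
deposit = [rng.random() for _ in range(N_GRID_POINTS * N_GRAIN_CLASSES)]
estimated = forward(network, deposit)  # seven (untrained) output values
```

With untrained random weights the outputs carry no meaning; the example only shows how the input and output dimensions relate to the deposit data and the seven model input parameters.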

To develop the inverse model, supervised training is conducted using the artificial dataset produced by the forward model calculation. First, the artificial dataset is randomly split into training and validation datasets to detect overfitting during the training process. The ratio of the validation dataset is set to 0.2 so that 80 % of the artificial dataset is used for training. The model input parameters used for producing training and validation sets were regarded as the teacher data to train and evaluate the model.
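A random split with a validation ratio of 0.2 can be sketched as below. The function name and the seeded shuffle are assumptions for illustration; only the ratio and the dataset size are taken from the text.

```python
import random


def split_train_validation(samples, validation_ratio=0.2, seed=0):
    """Shuffle the samples and split them into training and validation sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_ratio))
    return shuffled[n_validation:], shuffled[:n_validation]


# Indices stand in for the 3500 artificial datasets.
train_set, validation_set = split_train_validation(range(3500))
```

For 3500 artificial datasets, this gives 2800 training sets and 700 validation sets.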

The methodology applied for training the NN is as follows. The mean squared error (MSE) is adopted as the loss function because the supervised training of the NN in this study is classified as a regression problem

Several hyperparameters should be specified for the training of an NN. Specifically, the dropout rate, the learning rate, the batch size, the number of epochs, and the momentum are adjusted manually after repeated trial and error. To perform an optimization calculation with SGD, the batch size and the learning rate were set to 32 and 0.02, and the value 0.9 was chosen for the momentum. The dropout rate for regularization was 0.5.
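The role of these hyperparameters can be illustrated with a toy problem. The sketch below fits a straight line by mini-batch stochastic gradient descent with momentum, minimizing the MSE, using the batch size (32), learning rate (0.02), and momentum (0.9) quoted above. The linear model, the synthetic data, the classical (non-Nesterov) momentum update, and the number of epochs are assumptions for illustration and do not represent the NN of this study.

```python
import random


def sgd_momentum_fit(xs, ys, lr=0.02, momentum=0.9, batch_size=32,
                     epochs=200, seed=0):
    """Fit y = w * x + b by mini-batch SGD with momentum on the MSE loss."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    velocity_w, velocity_b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                error = (w * xs[i] + b) - ys[i]
                grad_w += 2.0 * error * xs[i]
                grad_b += 2.0 * error
            grad_w /= len(batch)
            grad_b /= len(batch)
            # Classical momentum: accumulate a velocity, then take the step.
            velocity_w = momentum * velocity_w - lr * grad_w
            velocity_b = momentum * velocity_b - lr * grad_b
            w += velocity_w
            b += velocity_b
    return w, b


data_rng = random.Random(1)
xs = [data_rng.random() for _ in range(320)]
ys = [2.0 * x + 0.5 for x in xs]  # noise-free synthetic targets
w_fit, b_fit = sgd_momentum_fit(xs, ys)
```

On this noise-free toy problem, the fitted coefficients approach the values used to generate the data (2.0 and 0.5).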

The performance of the inverse model is tested using 300 datasets that are produced independently of the training and validation datasets. The inversion precision for each model input parameter is evaluated by the root mean square error (RMSE) and the mean absolute error (MAE) of the prediction. These error metrics are computed both for the raw values and for values normalized by the true values. Moreover, the bias of prediction (i.e., the mean deviation of the model predictions from the true input parameters) is used to describe the accuracy of the inversion.
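These metrics can be computed as in the following sketch. The function names are hypothetical, and the normalization shown here (dividing each prediction error by the corresponding true value) is one plausible reading of "normalized by true values" rather than a confirmed definition.

```python
import math


def error_metrics(true_values, predicted_values):
    """Return RMSE, MAE, and mean bias of predictions for one parameter."""
    errors = [p - t for t, p in zip(true_values, predicted_values)]
    n = len(errors)
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        "bias": sum(errors) / n,
    }


def normalized_error_metrics(true_values, predicted_values):
    """Same metrics for errors divided by the true values (assumed form)."""
    ratios = [p / t for t, p in zip(true_values, predicted_values)]
    # (p / t) - 1 equals (p - t) / t, so compare the ratios against 1.
    return error_metrics([1.0] * len(ratios), ratios)


# Hypothetical true and predicted values of one parameter.
metrics = error_metrics([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])
normalized = normalized_error_metrics([1.0, 2.0, 4.0], [1.1, 1.8, 4.4])
```

A bias close to zero with a nonzero RMSE indicates scatter without systematic over- or underestimation.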

Three additional tests are conducted to verify the robustness of the inverse model, which is important for the applicability of the model to field datasets. The results of these tests are evaluated by the average of the normalized RMSE, which is defined as

First, noise is artificially added to the test data to evaluate the robustness of the inversion results against the measurement error. Under natural conditions, measurement errors in the thickness and grain size analysis of turbidites as well as the local topography affect these results. If the results of the inverse analysis change significantly due to such errors, our method is not suitable for application to field data. To investigate this, we add normal random numbers to the volume per unit area at each grid point in the test data at various rates, and we observe how much the noise influences the inverse analysis results.
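The noise test can be sketched as follows, assuming that the standard deviation of the added normal random number is proportional to the original value at each grid point. The function name, the clipping of negative volumes to zero, and the example values are assumptions for illustration.

```python
import random


def add_relative_noise(volumes, rate, rng):
    """Add zero-mean normal noise with standard deviation rate * value."""
    noisy = []
    for volume in volumes:
        perturbed = volume + rng.gauss(0.0, rate * volume)
        # Assumption: a volume per unit area cannot be negative.
        noisy.append(max(0.0, perturbed))
    return noisy


rng = random.Random(0)
# Hypothetical volume per unit area along a transect.
volumes = [0.30, 0.25, 0.18, 0.10, 0.04]
noisy_volumes = add_relative_noise(volumes, rate=2.0, rng=rng)
unchanged_volumes = add_relative_noise(volumes, rate=0.0, rng=rng)
```

Repeating the inversion on many such noisy copies and averaging the resulting errors gives a measure of robustness at each noise rate.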

The second test on the inverse model is a subsampling of the grid points in the test data. Outcrops are not continuous over tens of kilometers, so the thickness and the grain size distribution of a turbidite in the interval between outcrops can only be obtained by interpolation. To simulate this situation, the grid points in test datasets are randomly removed in this test, and the volume per unit area at the removed grid points is linearly interpolated. By varying the rate at which grid points are removed, this test also allows us to estimate the average interval of the outcrops necessary for conducting the inverse analysis. For example, if 90 % of the grid points set at 5 m intervals are removed and the inverse analysis is conducted on the remaining 10 %, the average distance between the grid points is 50 m. Estimating the outcrop spacing required to obtain reasonable inversion results is necessary before applying the method to the actual field.
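The subsampling test can be sketched as below. The function name is hypothetical, and always retaining the two end points (so that no extrapolation is needed) is an assumption made for this example. The grid follows the 5 m spacing and the 10 % sampling rate mentioned in the text, applied to a 10 km window.

```python
import random


def subsample_and_interpolate(x, values, keep_rate, rng):
    """Keep a random subset of grid points and interpolate the rest linearly."""
    n = len(x)
    n_keep = max(2, round(n * keep_rate))
    # Assumption: both end points are always kept to avoid extrapolation.
    interior = rng.sample(range(1, n - 1), n_keep - 2)
    kept = sorted([0, n - 1] + interior)
    result = list(values)
    for left, right in zip(kept, kept[1:]):
        for i in range(left + 1, right):
            t = (x[i] - x[left]) / (x[right] - x[left])
            result[i] = values[left] + t * (values[right] - values[left])
    return result, kept


rng = random.Random(0)
x = [5.0 * i for i in range(2001)]       # 10 km window with 5 m spacing
values = [1.0 - xi / 20000.0 for xi in x]  # hypothetical linear profile
interpolated, kept = subsample_and_interpolate(x, values, 0.1, rng)
mean_spacing = (x[kept[-1]] - x[kept[0]]) / (len(kept) - 1)
```

Keeping 10 % of the points on a 5 m grid gives a mean spacing of roughly 50 m, in line with the arithmetic in the text.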

Finally, the influence of the length of the upstream slope was examined. In this study, it is assumed that a steep slope (10 %) of a submarine canyon with a length of 5 km exists upstream, and a basin plain with a gentle slope exists downstream of the steep slope. Although the topography and deposits of the upstream slope are not the subject of the inverse model analysis, the length of the slope potentially affects the results of the inverse analysis. As a test, we set a slope of 10 km length instead of 5 km upstream and deposited a turbidite bed from the turbidity current flowing down from the uppermost part of the slope. The turbidite was then analyzed using a model trained on the assumption of a 5 km slope to compare the reconstructed values with the original conditions.

Here, we describe the properties of the artificial turbidite data generated for training and testing the inverse model. Several artificial datasets of turbidites are produced using a 1D shallow-water equation model. Figure

Examples of turbidites calculated by the forward model.

We trained the NN inverse model with various numbers of artificial data and lengths of the sampling window, and the best result in terms of the value of the loss function for the validation sets and the practical usage of the model can be obtained with 3500 training datasets and a 10 km long sampling window (Fig.

Results of training of the NN with different numbers of training datasets and lengths of the sampling window.

Hereafter, we further investigate the performance of the inverse model trained on 3500 datasets with a 10 km long sampling window. The history of training indicates that the values of the loss function improved significantly in the first 1000 epochs, and the results are improved up to 15 000 epochs (Fig.

Training history of the NN; 3500 datasets and a 10 km long sampling window were used for this training.

Result of the inverse analysis compared with the true parameters. The

Using 300 test datasets, the performance of the inverse model trained with 3500 datasets and a 10 km long sampling window is evaluated. The estimated parameters match well, with slight deviations (Figs.

Histograms indicating the deviation of the predicted values from the true values.

Errors and bias of the predicted parameters. Prediction errors are exhibited by the root mean squared error (RMSE) and the mean absolute error (MAE), and the mean bias is also described. Normalized values of RMSE, MAE, and mean bias by true values are also shown.

The forward model is calculated again using the reconstructed values to examine the influence of the estimation error of the model input parameters on the predicted flow behavior (Fig.

The predicted and true parameters used for an example calculation of the time evolution of the flow characteristics.

Example of forward model calculation with reconstructed and true parameters. The solid line indicates the calculation result using the predicted parameters, and the dashed line exhibits the results using the true parameters.

The test data with various normal random values are analyzed to verify the robustness of the inverse model. Consequently, even when the standard deviation of the normal random numbers given as measurement errors was set to approximately 200 % of the value of the original data, only a small effect was observed in the normalized root mean square (rms) of the results of the inverse analysis (Fig.

Result of inverse analysis of the test datasets with artificial noise. The values of RMSE are averaged over 20 iterations. Error bars indicate standard errors of RMSE values.

Similarly, using subsampling data obtained by extracting some of the spatial grids from the original data, we conducted an inverse analysis of the test datasets. The results show that there is little influence on the RMSE values of the inverse analysis of the test datasets when the sampling rate of grids is greater than 1 % (Fig.

Results of the inverse analysis for the subsampled test datasets. The values of RMSE are averaged over 20 iterations. Error bars indicate standard errors of RMSE values.

Here, a turbidite deposited in a different topographic setting was analyzed to determine the influence of the topographic assumptions on the inversion results. A 10 km long slope, instead of a 5 km long slope, was set at the upstream end of the calculation domain. The initial conditions for this test assuming a 10 km slope were a suspended sediment cloud 359 m high and 227 m long, with concentrations of 0.13 %, 0.15 %, 0.38 %, and 0.65 % for the four grain size classes. The gradient of the downstream slope was set to 0.69 %.

As a result, the initial conditions estimated by the inverse model trained on the assumption of a 5 km upstream slope were a suspended sediment cloud 117 m high and 587 m long, with concentrations of 0.33 %, 0.38 %, 0.48 %, and 0.53 % for each grain size class, and the downstream slope was estimated to be 0.96 %. Then, these initial conditions were given to the forward model to calculate their time development, and the obtained parameters were compared on a basin plain where the turbidite was deposited (Fig.

The results showed that the model with a 5 km slope predicted values relatively close to the original results for the flow velocity (Fig.

In contrast, the concentration of the turbidity current was significantly overestimated in the model reconstruction assuming a 5 km slope (Fig.

Influence of the length of the upstream slope on the result of inverse analysis. A 10 km long slope was used to produce a turbidite, and the bed was analyzed by the inverse model trained with a 5 km long upstream slope. Solid lines are values for currents producing the bed, and the dashed lines are reconstructed values.

Evaluation of the inverse model for turbidity currents with the test dataset indicates that this model can accurately reconstruct the flow characteristics of the turbidity currents from the spatial distribution of the thickness and grain size of turbidites (Figs.

The inverse model accurately reconstructed not only the initial conditions of turbidity currents but also the time evolution of the flow behavior. In the results of the forward model calculations using the predicted model input parameters that relatively deviate from the true values (Table

Turbidity currents have a mechanism called self-acceleration, which is caused by erosion and associated increase in the flow density

The relationship between turbidity currents and characteristics of turbidites is nonlinear. Especially when the flow is self-accelerating, a small difference in the initial conditions can result in very different sedimentary characteristics. This means that it is easy to find the initial conditions of the flow by inverse analysis because even if the characteristics of the deposits are very different, the initial conditions of the flow should not be so different. Thus, the inverse results in this case are expected to be robust even if there are some measurement errors in the characteristics of deposits. In other words, there is a trade-off between the robustness of the forward and inverse modeling.

This property of the inversion can be understood when we consider the opposite case. If the initial conditions of the flow are different but the characteristics of the turbidites are exactly the same, it is impossible to estimate the flow conditions from the turbidites. The inverse analysis of hydraulic conditions is possible because the depositional characteristics are sensitive to conditions of turbidity currents. The self-acceleration of turbidity flow is an extreme example of the sensitivity of turbidites to the flow initial conditions.

To apply this method to outcrops, the extent of the area that should be surveyed to collect data and the interval between outcrops should be determined. The tests with different sizes of sampling windows suggest that the survey region should be located more than 10 km from the proximal region (Fig.

These requirements for accurate inversion are attainable in the actual field. For example,

Besides these outcrop conditions, measurement errors in the field are another important factor for application. The test results suggest that the proposed inverse model of this study is very robust against random noise; random errors in the measured data have little effect on the results (Fig.

Perhaps the most significant drawback to analyzing actual turbidites is the assumption about the topography of the upstream submarine canyon. In this study, we tested doubling the length of the upstream slope and found that the predicted values for the concentration were different from the original values (Fig.

In existing inverse analysis methods for turbidity currents, the difference in depositional characteristics between the outputs of the forward model and the field observation is quantified as the objective function, and the initial and boundary conditions of the forward model are determined by conducting optimization calculations to minimize the objective function

Another potential approach to optimization is the Markov chain Monte Carlo (MCMC) method, but even with this method, repetition of the forward model calculation is unavoidable, since MCMC usually requires repetition of calculations of the objective function, which cannot be parallelized more than the order of

The approach proposed in this study has a clear advantage over existing methods in terms of applicability to the field, as it allows computationally demanding models to be applied as forward models. The general relationship between the bed and the input parameters is learned by an NN rather than adjusting the input parameters of the numerical model to reproduce the characteristics of specific individual beds. The objective function used in the training of this NN is not the difference between the features of the sediment, but the precision of the inverse analysis results themselves. The most computationally demanding part of the inverse analysis method proposed here is the generation of the training data for the NN. However, since the computations of the forward models are completely independent of each other, the generation of the training data can be conducted in parallel. Thus, our method enables us to easily prepare a large number of training data by using PC clusters, even for very computationally demanding forward models. In addition, the number of forward model calculations required for training is lower than in other methods: only approximately 3000. It is also advantageous that the proposed method enables us to perform various tests for robustness or precision of inversion before application to field examples because the NN outputs the results of inverse analysis extremely fast. For these reasons, we consider this study to have successfully generated an inverse model using the layer-averaged model for unsteady turbidity currents that can be applied to the field.

The inverse model proposed in this study has several limitations. Inevitably, the accuracy of the inverse analysis is governed by the validity of the forward model that generates the training data. The present implementation of the inverse model uses the one-dimensional layer-averaged model as the forward model, but this model is likely to be applicable only to sedimentary basins that are laterally constrained or to the inside of the submarine channels. The layer-averaged model of

Although

The problems described above are relatively easy to solve. Without changing the framework of the proposed method, we can adapt it to any situation by changing the forward model that generates the training data. For processes such as sediment transport, it is easy to revise the model to incorporate state-of-the-art knowledge. By adopting computationally demanding models, inverse analysis using 2D and 3D forward models may be possible. In future research, these issues should be addressed, and the methodology should be applied to actual field examples.

The analysis of ancient turbidites will be an important issue in the future. However, even if ancient turbidites are analyzed, it is not possible to verify that the results obtained are correct because the hydraulic conditions for ancient turbidity currents are unknown. Another way to verify the validity of the method is to reconstruct the hydraulic conditions of experimental turbidity currents from the turbidites deposited in the flume and compare them with the measured values. The turbidity currents measured in the modern submarine canyons and their deposits would be another candidate to be used for model verification.

This study implemented an inverse model that reconstructs the flow characteristics of turbidity currents from their deposits using an NN and verified its effectiveness at the field scale. In this study, we assumed that turbidity currents occur from suspended sediment clouds, which flow down from the steep slope in a submarine canyon to a gently sloping basin plain. The inverse model attempts to reconstruct seven model input parameters (height and length of the initial suspended sediment cloud, sediment concentration of four grain size classes, and slope of the basin plain) from the thickness and grain size distribution of the turbidite deposited on the basin plain. The forward model, using one-dimensional layer-averaged equations, was used to produce training datasets with random conditions in prescribed ranges. The NN was trained using the generated data to develop the inverse model. Thereafter, the test data generated independently from the training data were analyzed to verify the performance of the inverse model.

As a result of the training and tests conducted on the inverse model, the following was found.

More than 2000 datasets were required for the training to avoid overfitting. An increase in the number of training datasets results in improved performance of the inverse model; however, the degree of improvement becomes smaller if more than 3000 datasets are used.

The hydraulic conditions and basin slopes were precisely reconstructed from the test datasets. The thickness and grain size distribution of the turbidites deposited over a 10 km long interval in a sedimentary basin were sufficient to reconstruct the flow conditions.

The inverse model of this study is quite robust to random errors in the input data. The addition of a normal random number with about the same magnitude of the standard deviation as the original data had little effect on the results of the inverse analysis.

Judging from the results of subsampling tests, the inversion of turbidity currents can be performed if an individual turbidite can be correlated over 10 km at approximately 1 km intervals.

These results imply that the inverse model of turbidity currents proposed in this study is promising for analyzing field-scale turbidites. This method is expected to be applied to actual turbidites in the future.

All codes used in this study are deposited in the Zenodo repository (

HN was responsible for conceptualization, methodology, software, writing, and review and editing. KN developed the software.

The authors declare that they have no conflict of interest.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This paper has benefited greatly from discussions at the workshops of the Earthquake Research Institute, University of Tokyo Joint Usage/Research Program 2018-B01.

This work was supported by JSPS KAKENHI under grant nos. 26287127 and 20H01985 as well as a research grant of the Earthquake Research Institute, University of Tokyo Joint Usage/Research Program 2018-B01.

This paper was edited by Paola Passalacqua and reviewed by two anonymous referees.