Short Communication: Topotoolbox 2 – Matlab-based Software for Topographic Analysis and Modeling in Earth Surface Sciences

TopoToolbox is a MATLAB program for the analysis of digital elevation models (DEMs). With the release of version 2, the software adopts an object-oriented programming (OOP) approach to work with gridded DEMs and derived data such as flow directions and stream networks. The introduction of a novel technique to store flow directions as topologically ordered vectors of indices enables calculation of flow-related attributes such as flow accumulation ∼ 20 times faster than conventional algorithms while at the same time reducing memory overhead to 33 % of that required by the previous version. Graphical user interfaces (GUIs) enable visual exploration and interaction with DEMs and derivatives and provide access to tools targeted at fluvial and tectonic geomorphologists. With its new release, TopoToolbox has become a more memory-efficient and faster tool for basic and advanced digital terrain analysis that can be used as a framework for building hydrological and geomorphological models in MATLAB.


Introduction
An increasing number of research studies use digital elevation models (DEMs) for automated spatial analysis and advanced process-based modeling.Although geographical information system (GIS) software packages such as ESRI's ArcGIS include various interfaces for implementing userspecific codes and models, many users prefer programming environments such as MATLAB that already include large libraries for different computational tasks.Since its release in 2010, TopoToolbox, a code library for MATLAB, has been used in various studies of the Earth's surface and surficial water and material fluxes.Applications include topics such as supraglacial meltwater dynamics (Clason et al., 2012), contaminant transport in streams and groundwater (Messier et al., 2012), tectonic geomorphology (Shahzad and Gloaguen, 2011;Scherler et al., 2013), soil nutrient dynam-ics (Schwanghart and Jarmer, 2011), karst hydrology (Borghi et al., 2011) and flood modeling (de Moel et al., 2012).
Despite broad attention and positive feedback, the lack of a user-friendly graphical user interface (GUI) may also have discouraged interested users who favor a visual approach towards DEM processing and analysis.Moreover, processing very large DEMs, or, within a modeling framework, processing certain computations a large number of times, is a common challenge in modern geomorphological studies and requires efficient computational techniques.While very large DEMs are most easily handled with software that implements efficient input/output (I/O) communication between internal and external memory, a strategy pursued by ArcGIS, GRASS (Jasiewicz, 2011) and more specialized DEM processing software such as terraflow (Arge et al., 2003), the memory constraint exerted by MATLAB's primary use of the main memory is an obstacle even for medium-sized Published by Copernicus Publications on behalf of the European Geosciences Union.
DEMs.Although the spread of 64 bit computers has allayed the memory problem somewhat, MATLAB's heavy use of double-precision data imposes limits that are too rigid for larger analyses and modeling tasks.
Here we introduce several important modifications and various extensions that come with the release of version 2 of the TopoToolbox.Specifically, we report the development of an object-oriented design towards the representation of grids, flow routing, and stream networks and give account of a suite of GUIs that provide easy access to a variety of new and existing methods frequently used in topographic studies.Our strategy to reduce memory constraints, which have been particularly pertinent to flow-related algorithms, enables much faster and memory-efficient work flow and thus the processing of much larger DEMs than with previous versions of TopoToolbox.We believe that these advances will promote the development of tools and models in Earth surface sciences.

Object-oriented design
Gridded DEMs and terrain attributes incorporate a variety of information other than a gridded set of points.A DEM has physical units and is georeferenced; that is, it is associated with physical space by a set of geographical or map coordinates.In addition, DEM derivatives (or terrain attributes) are tightly linked to algorithms and models and lend themselves to specific types of analysis (Pike et al., 2009).These task-and object-specific properties provide the basis for an object-oriented approach to DEMs.In object-oriented programming (OOP), data, attributes, and procedures are coupled into objects that encapsulate the model of what each object represents (Register, 2007).Typical objects in DEM analysis are surface flow paths and the resulting stream networks.In TopoToolbox the aim of using OOP is to organize data and functions into classes with specific object-related attributes that organize and encapsulate information on spatial referencing, flow topology, and geometry and thereby facilitate data handling and analysis.
TopoToolbox 2 includes three main object classes: GRIDobj, a class to store and analyze gridded data (grid objects); FLOWobj, a class for flow-path networks (flow objects) that is derived from an instance of GRIDobj; and STREAMobj, which represents the channelized fraction (stream objects) of the flow-path network from which it is derived.Additional object classes, such as SWATHobj, which contains swath profiles that are derived along directed linear features (e.g., an instance of STREAMobj), are more tool-specific and not as tightly coupled within TopoToolbox as the three main object classes.These classes are templates for the creation of specific instances of each class.Whereas an instance of GRIDobj simply contains the data grid as well as georeferencing data, instances of FLOWobj and STREAMobj implement a novel storage approach that represents the directed network of flow paths and stream networks, respectively.

A graph-theoretical representation of flow directions
Topography governs the transport mode and the magnitude of lateral material and water fluxes on the Earth's surface and in the near-surface underground.Among topographic attributes, the flow direction probably belongs to the most frequently used, because it forms the basis for many other hydrologyrelated variables that explain or control many key environmental processes (Wilson and Gallant, 2000).Algorithms to derive flow directions are classified into single-neighbor and multiple-neighbor flow algorithms (Gruber and Peckham, 2009).Single-neighbor algorithms pass the flow from each cell to its lowest neighboring cell (usually referred to as the SFD or D8 algorithm; O' Callaghan and Mark, 1984).Multiple-neighbor algorithms, such as MFD (Quinn et al., 1991), D∞ (Tarboton, 1997), DEMON (Costa-Cabral and Burges, 1994) or the mass-flux method (Gruber and Peckham, 2009), encompass a number of different approaches to model dispersive flow, which is particularly important for simulating water distribution on hillslopes.TopoToolbox supports the SFD and MFD algorithms.In contrast to other GIS software, in the first version of Topo-Toolbox, flow directions were stored in a sparse, weighted adjacency matrix to represent the flow network.While this strategy allows for calculation of flow accumulation and other flow-related terrain attributes using MATLAB's builtin sparse matrix routines (Schwanghart and Kuhn, 2010), it also requires large memory space.The MFD and SFD matrix require approximately six and two times the memory space of the DEM, respectively, if all variables are stored in in the same data formats.
In the newly introduced flow direction class (FLOWobj), the storage of flow directions takes advantage of the fact that flow networks can be modeled as directed acyclic graphs (DAGs).Our approach is similar to MATLAB's sparse matrix storage organization but uses the "tuple" format, which is a collection of row, column and value 3-tuples of the nonzero elements of the flow direction matrix (Kepner, 2011).The important difference is that row and column indices are stored in topologically descending order, an approach that was recently adopted in a numerical landscape evolution model (LEM) (Braun and Willett, 2012) and a dynamic overland flow-routing model (Huang and Lee, 2013).Topological sorting is only possible for DAGs and refers to the permutation of nodes that brings the flow direction matrix to a lower triangular form (Chen and Jacquemin, 1988).The storage of flow directions in topological order in a FLOWobj speeds up the computation of flow-related topographic attributes by about 500 % compared to the previous version of TopoToolbox.Furthermore, the FLOWobj requires 89 % of the memory of the associated multiple-flow-direction matrix and 33 % of the single-flow-direction matrix.Because the nodes can be divided into independent subsets according to drainage basin affiliation, they can be distributed to different processors, so that flow-related terrain attributes can be evaluated in parallel, an important aspect for reducing computation times in LEMs (Braun and Willett, 2012).
In a topologically sorted FLOWobj, each grid cell's index ix appears before its downstream neighbor ixc if there is a directed link from ix to ixc.For routing algorithms that simulate flow through the directed network, the topological order entails that each link or edge is only traversed once, which simplifies flow computations and decreases running time to O(n).For example, flow accumulation is calculated by sequentially updating an initial weight raster A with rows x cols = n cells connected by a number of nrEdges directed links.The links connect upstream cells addressed with the vector of linear indices ix with their respective downstream neighbors indexed by the vector ixc.A = ones (row, cols) ; (1) end Note that the calculation is performed by in-place operations (that is, the function input A is directly overwritten) which significantly reduces overhead memory of the algorithm.The flow direction object (FLOWobj) can be created from an existing flow direction matrix (SFD or MFD) or a DEM.The latter is currently supported only for D8 flow routing, but allows users to simultaneously define if and how flow directions across flat terrain and topographic depressions are derived.Sinks and contiguous, flat areas in a DEM can be traversed using an auxiliary topography calculated by a grayweighted distance transform algorithm (Soille et al., 2003;Metz et al., 2011;Schwanghart et al., 2013).The resulting flow directions tend to correspond more closely to actual flow paths compared to those that are derived from a DEM that was hydrologically conditioned using "flood filling" (Poggio and Soille, 2011).

Performance comparison
To evaluate the performance of TopoToolbox 2, we compared the runtimes of tools implemented in TopoToolbox with those of similar software.We emphasize that times to perform these computations fail to provide a concise, comparable measure since different programs pursue different strategies to store, process and access the data.TAUDEM 5, for example, does not hold the entire DEM in the main memory but partitions it into domains that are processed in parallel (Tesfa et al., 2011) and thus shows its particular strength with data sets that would exceed main memory capacity.Unlike TopoToolbox 2 and most other programs that calculate flow accumulation based on a data set that contains flow directions, SAGA GIS calculates flow accumulation directly from the DEM.
www.earth-surf-dynam.net/2/1/2014/Earth Surf.Dynam., 2, 1-7, 2014 We compared the performance between TopoToolbox 2 and ArcGIS 10.1, TauDEM 5, SAGA GIS 2.1 and Whitebox GAT 3.0.8.We used a 6 • × 6 • tile of the 3 arcsec DEM made available by Ferranti (2013) covering the Alps and their foreland, which we reprojected into a Universal Transverse Mercator projection.We chose this region since it includes a high diversity of landscapes ranging from high-mountain areas to flat and low-lying topography and encompasses frequent sinks and flat areas that either arise from valley narrowing or lakes or as a result of the integer representation of elevations.The resulting DEM has 5537 rows and 5484 columns and a cell size of 79 m.We measured the times that each program requires (1) to hydrologically condition the DEM by filling sinks, (2) to derive D8 flow directions and (3) to calculate flow accumulation.We repeated this procedure after resampling the DEM to a cell size of 158 and 316 m, corresponding to 2769 × 2742 and 1385 × 1371 rows and columns, respectively.
Table 1 contains the benchmark results and shows that the elapsed times required by the software to perform the respective operations are highly variable.Sink processing is performed the fastest with TopoToolbox, which resorts to MATLAB's image-processing toolbox's built-in morphological reconstruction algorithm (Vincent, 1993).Besides the adoption of different algorithms by the other programs, reasons for not being as fast may include data swapping between the main memory and the hard disk to avoid memory problems (ArcGIS, TauDEM) or due to minimum slope imposition onto flat sections as implemented in SAGA GIS and Whitebox GAT.Whitebox GAT performs better in deriving D8 flow directions, which is likely attributed to TopoToolbox's computationally more demanding approach that enforces flow convergence in flat terrain (Soille et al. 2003).Finally, the new approach in TopoToolbox to represent flow directions leads to approximately 20 times faster calculation of flow accumulation than TauDEM 5, which we found to be second fastest among the programs that we compared.

Further changes and enhancements
The main goal of TopoToolbox is to provide access to numerical tools for DEM analysis in the form of functions that can be used as a library in custom scripts or functions for advanced analyses or modeling purposes.Most functions available with TopoToolbox 1 are now methods associated with the new classes GRIDobj and FLOWobj.Changes to calling syntaxes are few and were mainly made since the new classes encapsulate information such as cell size and spatial referencing in their properties, which previously had to be supplied to functions as additional input arguments.Thus, function syntaxes have become simpler and more intuitive.A new suite of functions was developed around the class STREAMobj that allows for creating, modifying and analyzing stream networks.Emphasis is placed on tools that facil-itate the study of drainage networks and their relation to the tectonic setting, e.g., the analysis of longitudinal river profiles using slope-area plots (Wobus et al., 2006) or χ plots (Harkins et al., 2007;Perron and Royden, 2013;Scherler et al., 2013).
Additional performance increases were achieved by writing many flow-related algorithms as subroutines in C programming language executed by MATLAB (C-MEX), which provides 5-10 times faster evaluation of algorithms such as flow accumulation or drainage basin delineation.Computation speed is particularly significant when repeatedly calculating flow networks and accumulation areas, such as in numerical LEMs (e.g., Willgoose, 2005;Pelletier, 2008;Tucker and Hancock, 2010).While rivers act as major pathways for sediment or upstream-migrating knickpoints in mountain environments, they encompass only a small portion of the entire landscape.Accordingly, in LEMs the heterogeneous distribution of upslope areas leads to numerical solutions that require very small time steps to be numerically stable.Though implicit methods exist that relax the time step constraints (Fagherazzi et al., 2001;Braun and Willet, 2013), they fail to include the planform changes of river and flow network development, thus making frequent updating of the flow directions necessary.With its new implementation, TopoToolbox 2 can serve as a function library for the development of LEMs in MATLAB by providing fast-executing codes in a programming language that already supports a large library of numerical tools.
Visual exploration and interpretation of DEMs and data derived during their analysis constitute an integral work step in most studies.Furthermore, certain tasks are often easier and more rapidly accomplished through direct interaction with DEMs and their derivatives, as compared to writing and running scripts.Thus, TopoToolbox 2 now comes with several applications that allow usage and exploration of various toolbox functions with a GUI (Fig. 1).
The application topoapp is an explorative tool for visual analysis of DEMs and provides rapid access to many basic and several advanced functions of the TopoToolbox 2. Importantly, this application is defined as an own object class that, when working on an instance of topoapp with the GUI, remains in the workspace and is constantly updated with every action of the user.Additionally, it was designed in a way that allows users to add their own functions or modules to it and therefore provides an expandable platform for displaying the graphical outcomes of DEM analysis with the TopoToolbox and other MATLAB tools.With the currently embedded functions, users can quickly and easily extract and plot simple profiles; entire drainage networks, or subsets of them; watershed boundaries; and individual channel reaches (Fig. 1).The created features can be used to build swath profiles, to subset a DEM, to compute basic catchment statistics and to create χ plots; furthermore, all features can also be exported to common formats, amongst other options.Manual editing of DEMs with spurious sinks is often required to produce hydrologically correct DEMs.Although automated ways to distinguish between actual and spurious sinks exist (Lindsay and Creed, 2006), choosing which sinks should be filled, carved or retained may be preferred and requires visual interpretation of the data and results.The new application preprocessapp enables a combination of automatic and manual hydrological correction of a DEM by providing interactive tools to fill topographic depressions or carve along least-cost paths to breach features that obstruct digital flow paths.The application flowpathapp is a tool to map individual streams or a stream network by interactively setting channel head locations, and slopeareatool provides an interactive tool to study slope-area relationship for individual streams or stream networks.

Conclusions
With the release of version 2, the TopoToolbox for MATLAB has become a more memory-efficient and faster tool for digital terrain analysis.The main difference to the previous ver-sion is the adoption of an object-oriented representation of grids, flow direction and stream networks.Topological ordering reduces the memory bottleneck of the previous, sparse matrix approach to flow direction while increasing computation speed and allowing parallel computing.Software quality and particularly reusability are improved so that Topo-Toolbox can be used as a framework for implementing dynamic, hydrological and geomorphological models.Finally, newly introduced, interactive tools will not only be valuable for researchers to perform rapid analysis but also for conveying and exploring the richness of Earth surface processes in classrooms or labs.
TopoToolbox 2 is platform-independent, free and open software, but requires MATLAB 2011b or later as well as the Image Processing Toolbox.Version 2.0 is included in the supplementary material.The software, future releases and updates are hosted by the Community Surface Dynamics Modeling System (CSDMS) server (http://csdms.colorado.edu/).

Figure 1 .
Figure 1.Example layout of topoapp, a graphical user interface that enables access to the majority of TopoToolbox functions.

Table 1 .
Performance comparison between TopoToolbox 2 and similar software.Performance is evaluated as computation time in seconds required to run a specific tool on a computer with an Intel Core i5 CPU (M540) with 2.53 GHz and 4 GB RAM and a solid-state drive.http://hydrology.usu.edu/taudem/taudem5/index.html; version 5; we used the command line interface to run the application with 8 parallel processes.Numbers in parentheses refer to computation time without overhead time for reading and writing. 3http://www.saga-gis.org/;version 2.1; fill sinks algorithm proposed by Wang and Liu (2006) (XXL) with minimum of slope of 0.001 • ; recursive catchment area algorithm.