Supporting Information for GERALDINE (Google earth Engine supRaglAciaL Debris INput dEtector): A new Tool for Identifying and Monitoring Supraglacial Landslide Inputs

1. Randolph Glacier Inventory (RGI) v6.0 errors
The Randolph Glacier Inventory v6.0 (Pfeffer et al., 2014) is a global dataset of digitised glacier outlines, excluding the ice sheets of Greenland and Antarctica. These outlines were digitised from images acquired between 1943 and the present day. This wide temporal range of source imagery introduces errors into the dataset, because rapid glacier thinning and retreat in response to climatic change over the last century mean that many outlines no longer match current glacier extents. These discrepancies affect GERALDINE’s delineation of new debris additions on glaciers.

Figure S1: Retreat of the Columbia Glacier, Alaska, and its impact on RGI v6.0 outline accuracy. A) RGI v6.0 glacier outlines (green) and the area that is no longer glaciated (orange). B) GERALDINE new debris results in this area for 2018. Landsat 8 background image from 2019-08-20.

Cloud mask threshold validation
To determine the optimum threshold for cloud masking, we chose 13 rock avalanches (RAs) from the validation dataset against which to validate different thresholds. The high prevalence of RAs in Alaska meant that all events chosen for cloud mask threshold validation occurred in that region, with nine in Glacier Bay National Park (Coe et al., 2018) and four in the eastern Alaska Range, in particular the area around Mt Hayes (Jibson et al., 2006) (Table 1). This selection incorporated two areas with different climatic regimes (marine vs continental) and a wide temporal coverage spanning all Landsat satellites, with particular focus on the current constellation because the tool's main use is to aid RA detection in the present day. Five cloud thresholds were tested (10%, 20%, 30%, 50% and 90%) to investigate their influence on highlighting new RA events. GERALDINE was run for the year of the event or, in the case of the eastern Alaska Range RAs, the year after the event, because these RAs occurred in November 2002 and therefore appeared in no Landsat imagery during that year. New debris layers generated by GERALDINE were downloaded, and the area of new debris detected at the location of each RA was compared to digitised RA outlines from the same year. A cloud mask threshold of 20% highlighted the largest area of new RAs, delineating 60.6% of RA area (Figure S2). The 10% cloud threshold masked too much of each image, inhibiting the tool's ability to highlight new debris. Higher thresholds left too much cloud unmasked, so cloud was misclassified as debris in the previous-year debris extent. This misclassification prevented GERALDINE from highlighting new debris, because debris already appeared to be present in the previous year. GERALDINE therefore utilises a 20% cloud threshold by default.
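The threshold comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function names are hypothetical, the areas are placeholder numbers rather than measured values, and it assumes the reported percentage is total detected area divided by total digitised area (the text does not state how the figure was aggregated).

```python
from typing import Dict, List, Tuple


def percent_delineated(detected_km2: List[float], digitised_km2: List[float]) -> float:
    """Percentage of the total digitised RA area that was flagged as new debris."""
    if len(detected_km2) != len(digitised_km2):
        raise ValueError("one detected area is needed per digitised outline")
    total_digitised = sum(digitised_km2)
    if total_digitised <= 0:
        raise ValueError("total digitised area must be positive")
    # Detected debris is measured inside each outline, so cap it at the outline area.
    total_detected = sum(min(d, o) for d, o in zip(detected_km2, digitised_km2))
    return 100.0 * total_detected / total_digitised


def best_threshold(
    detected_by_threshold: Dict[int, List[float]], digitised_km2: List[float]
) -> Tuple[int, float]:
    """Return the cloud threshold (%) with the highest delineated percentage."""
    scores = {
        threshold: percent_delineated(areas, digitised_km2)
        for threshold, areas in detected_by_threshold.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Placeholder areas (km^2) for three hypothetical RAs. These are NOT study values.
digitised = [2.0, 5.0, 3.0]
detected = {
    10: [0.5, 2.0, 1.0],
    20: [1.5, 3.0, 2.0],
    50: [1.0, 2.0, 1.5],
}
threshold, score = best_threshold(detected, digitised)
print(f"best threshold: {threshold}% ({score:.1f}% of RA area delineated)")
```

With real data, `detected` would hold the new-debris area inside each digitised RA outline for every threshold tested.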

NDWI mask
To evaluate NDWI thresholds we utilised the 2018 glacial lake inventory of Wang et al. (2020). We chose two glacial regions with a high frequency of glacial lakes: the Bhutan Himalaya and Sagarmatha National Park. Lakes were clipped to RGI boundaries for analysis and any lakes <0.005 km² were omitted, resulting in 82 and 98 lakes respectively (Figure S3). The average NDWI values of all lakes were mapped across 2018 (Figure S4), along with average NDWI values of supraglacial debris (NDSI debris masks from Scherler et al., 2018) and of clean ice/snow, calculated by differencing debris masks from RGI boundaries. Lake NDWI values follow a seasonal trend over 2018, with lower values between February and May (Figure S4), potentially due to the ephemeral state of the water within them, i.e. lakes freeze in winter months, which shifts their spectral signature towards that of clean ice and debris. These large fluctuations, between -0.056 and 0.717, prevent any single threshold from masking these features year-round. GERALDINE employs a 0.4 threshold to mask water bodies, because debris and clean ice do not approach this value (the highest observed was 0.287, in Sagarmatha National Park on 02/12/2018) and because it has been widely used in other studies in mountainous regions (Acharya et al., 2018; Zhao et al., 2018). We recognise that a single threshold is sub-optimal (Li and Sheng, 2012), particularly for global-scale analysis, but the user can tune this value (code lines 245 and 255) for their specific ROI if there is a high prevalence of water bodies.
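A minimal Python sketch of the thresholding logic follows. It is an illustration under stated assumptions rather than the tool's code: it assumes the common green/near-infrared formulation NDWI = (Green - NIR) / (Green + NIR), which the text does not specify, and it assumes pixels at or above the threshold are masked. The function names are hypothetical; the NDWI values in the example are the ones quoted above.

```python
from typing import List, Optional

NDWI_WATER_THRESHOLD = 0.4  # default quoted in the text; tune per ROI


def ndwi(green: float, nir: float) -> Optional[float]:
    """Assumed formulation: (Green - NIR) / (Green + NIR); None if undefined."""
    total = green + nir
    if total == 0:
        return None
    return (green - nir) / total


def is_water(green: float, nir: float, threshold: float = NDWI_WATER_THRESHOLD) -> bool:
    """True when the pixel's NDWI reaches the water threshold (assumed >=)."""
    value = ndwi(green, nir)
    return value is not None and value >= threshold


def masked_as_water(ndwi_values: List[float], threshold: float = NDWI_WATER_THRESHOLD) -> List[bool]:
    """Apply the threshold to precomputed NDWI values."""
    return [value >= threshold for value in ndwi_values]


# Values quoted in the text: lowest lake average, highest debris/ice average,
# and highest lake average.
flags = masked_as_water([-0.056, 0.287, 0.717])
print(flags)
```

The lowest lake value is not masked at 0.4, which is the seasonal limitation described above, while the highest debris/ice value stays safely below the threshold.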
Average water body size within RGI boundaries from Wang et al. (2020)  RGI boundaries due to glacial retreat since RGI boundaries were digitised. Updated glacier margins would omit proglacial lakes from analysis and can be utilised if available (see section 2.0 Method), but the RGI is the best worldwide inventory. Because water bodies have low reflectance in the visible spectrum, they are often classified as debris in the previous-year debris extent. They therefore do not register as new debris additions, which prevents false detections.

GERALDINE User Guide
The tool is freely available to use at (https://code.earthengine.google.com/79a0e5b27c2d7824559314e09f5971a9?hideCode=true) but requires a Google account authorised to use Google Earth Engine (GEE), which is free of charge if used for research and educational purposes (sign up for a Google account here: https://accounts.google.com/signup/v2/webcreateaccount?flowName=GlifWebSignIn&flowEntry=SignUp and register for GEE access here: https://earthengine.google.com/). Exporting tool outputs requires a Google Drive account, which is included with the Gmail account required to sign up for GEE. The tool is open access and GUI (graphical user interface) driven. Tutorials on how to use Earth Engine are available at https://developers.google.com/earth-engine/ but here we provide instructions on how to use our tool to detect supraglacial debris inputs.
Step 1: Open v1.0 of GERALDINE (the version described in the manuscript) by clicking on this link: https://code.earthengine.google.com/79a0e5b27c2d7824559314e09f5971a9?hideCode=true or access the latest version of GERALDINE at https://doi.org/10.5281/zenodo.3524414 (if using the latest version, these instructions may differ slightly).
Step 2: You will be greeted by the start page shown below. Click 'New project' to start analysis.
Step 3: Draw a region of interest (ROI) by zooming in and clicking around an area to draw a polygon (Note: ROIs should not exceed 5000 km² due to GEE memory capacity). Alternatively, upload a shapefile of your ROI to Google Earth Engine (see: https://developers.google.com/earth-engine/importing for more information) and specify the GEE file path, which can be found by sliding down the top panel and navigating to the 'Assets' tab in the top left-hand panel (highlighted by the red box in the image below). Click the OK button when your ROI is defined.
Step 4: Specify the date range over which you want the tool to detect new debris additions, and select whether you would like to use tier 2 and/or real time Landsat imagery in addition to the default tier 1 imagery. Tier 2 imagery is useful if minimal tier 1 imagery is available, e.g. in Antarctica, and real time imagery should only be used if the event has occurred in the previous 16 days (Note: for real time imagery 'End date' must be set to today's date). Tool accuracy and speed are optimal if date ranges are annual or sub-annual and only tier 1 imagery is utilised. Dates must be in the format Year-Month-Day, e.g. 2018-12-22. Press OK once the start and end dates are defined.
Step 5: The tool should display results on the map (it can take up to 3 minutes for layers to load if your ROI is large). Two layers are created: a previous year maximum debris cover layer and a new debris additions layer. The user can view and toggle these layers by hovering the mouse over the 'Layers' button in the top right-hand corner of the map viewer (highlighted by the red rectangle in the image below). To export the data, click on the Export data button.
Step 6: Instructions are displayed detailing how to export data from GEE. Navigate to the Task tab in the top right-hand panel and click 'Run' next to the layer you wish to download (note: you do not need to wait for layers to load within GEE before you export). The following window will then be displayed (see image below), prompting the user to confirm or alter the filename, confirm the export format (GeoJSON is strongly recommended because it decreases export time), and confirm the save location. Once data is exported, it can be used in a GIS of your choice. Alternatively, you can save your files as an Earth Engine asset. This is particularly useful for your ROI, because it enables you to call it in during Step 3 instead of redrawing it every time you use GERALDINE.
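The date rules in Step 4 can be checked before entering values into the tool. The Python sketch below is a hypothetical helper, not part of GERALDINE: the function names and the `today` parameter are assumptions for illustration. It encodes only the rules stated above (Year-Month-Day format, and an end date equal to today's date when real time imagery is selected) plus the assumption that the start date must precede the end date.

```python
from datetime import date, datetime
from typing import Optional, Tuple


def parse_tool_date(text: str) -> date:
    """Parse a Year-Month-Day string such as '2018-12-22'.

    Raises ValueError if the text is not a valid date in that order.
    """
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


def check_date_range(
    start: str,
    end: str,
    real_time: bool = False,
    today: Optional[date] = None,
) -> Tuple[date, date]:
    """Validate a start/end pair against the rules described in Step 4."""
    start_date = parse_tool_date(start)
    end_date = parse_tool_date(end)
    if start_date >= end_date:
        # Assumption: the tool expects the start date to precede the end date.
        raise ValueError("start date must be before end date")
    if real_time:
        reference = today if today is not None else date.today()
        if end_date != reference:
            raise ValueError("real time imagery requires the end date to be today's date")
    return start_date, end_date


start_date, end_date = check_date_range("2018-01-01", "2018-12-22")
print(start_date.isoformat(), end_date.isoformat())
```

Passing `today` explicitly keeps the real time check reproducible when the helper is tested; in normal use it can be omitted.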