An open access geospatial database for the sub-Antarctic Prince Edward Islands

of invasive species. However, much of the geospatial data that currently exist have limitations in spatial coverage and/or resolution, are outdated, or are not readily available. To address these issues, we present an online geospatial database for the Prince Edward Islands (both islands) produced from a high-resolution digital surface model and satellite imagery. This database contains vector files, raster data sets, and maps of topographical and hydrological parameters. It is freely available to download from Figshare – an open access data repository. We encourage the South African polar science community to make use of similar platforms for improved data sharing practices.


Introduction
The sub-Antarctic Prince Edward Islands (47°S 38°E), consisting of Prince Edward Island and Marion Island, are sentinels for terrestrial and marine research in the southern Indian Ocean (Figure 1). 1 Located just north of the present-day Antarctic Polar Front and dominated by a hyper-maritime climate, the Islands provide a unique opportunity to study ecosystem responses to climate change. [2][3][4][5][6][7] The research projects conducted at the Prince Edward Islands cover a range of botanical, geological, geomorphological, and biological studies. 8,9 On the larger Marion Island, scientific research has been continuous for the last five decades, whereas on Prince Edward Island, access has been restricted to a single contingent of 10 people every four years 10 (Figure 1). Terrestrial science, therefore, occurs predominantly on Marion Island, while on Prince Edward Island it is limited in scope, with most progress in botanical and ornithological studies. 9 Geospatial information on Marion Island's topography has aided scientific investigations not only by providing the backdrop for site selection and the planning of sampling strategies, but also by supporting the interpretation and modelling of landscape and ecosystem evolution. [11][12][13][14] Since the introduction of handheld Global Positioning Systems (GPS) in the early 2000s, terrestrial multi- and transdisciplinary research on Marion Island has increasingly included assessments of fine-scale interactions within the landscape to understand ecosystem responses to past and present climate change, as well as the impacts of invasive species. [15][16][17][18] Individual-based population studies focusing on various animal species have, by necessity, also been structured around specific geographical localities at the Islands to aid in experimental design. 19
Not only do scientific endeavours depend on accurate geospatial information 20 , but conservation efforts, such as the planned mouse eradication programme on Marion Island 21 , also require precise geospatial data to support the planning phase of what would be (if successful) the world's largest eradication of mice from an island. However, much of the geospatial information for the Prince Edward Islands was previously only available in hardcopy format 22,23 or, where such data have become available in digital format 24 , they have typically had limited spatial coverage and/or resolution, particularly for Marion Island's west coast 25 . Generally, such geospatial data are shared informally among the scientists who work on the Prince Edward Islands or are reproduced ad hoc from existing publications. Yet the circulating data are rarely curated or updated, and are sometimes lost entirely as researchers retire or move on to other research sites. Some of the geospatial data needs have been addressed by the production of data on the Islands' geology 25 , but fine-scale topographical and hydrological data are still outstanding. Furthermore, the naming process of the Prince Edward Islands' features remains unfinished. 26 Since the first attempt to register Prince Edward Islands place names with the South African Geographical Names Council (Act 118 of 1998) in 2001 27 , only a select few features (e.g. Umkhombe, Mascarin and Resolution Peaks) have thus far been approved 26 . Most of the Prince Edward Islands' place names are still considered 'provisional' 28 and are practically absent from the gazetteer of South African geographical names 26,29 . Nonetheless, these names are widely used in scientific spheres 9 and official policy documents 10 .
However, these names are used piecemeal in subject-specific or region-specific works and there is, therefore, a need for a complete list of place names for the Prince Edward Islands, whether officially recognised, provisionally accepted, or colloquial, as these are currently not readily available in the public domain.
Data sharing issues are not unique to the South African National Antarctic Programme (SANAP) science community. Globally, the focus on data sharing practices or 'open science' is increasing 30,31 and has already been discussed from specific African 32,33 and South(ern) African perspectives. The push from government (through the South African Spatial Data Infrastructure Act, 54 of 2003), funding agencies, publishers, and institutions for improved data availability 33 has encouraged sharing practices in several scientific fields, among others ecology 34,35 , geomatics 36 , and soil science 37 . Therefore, a geospatial database for both of the Prince Edward Islands is provided here, which includes topographical data (e.g. contour lines and aspect, slope, and hillshade rasters) and hydrological data (e.g. streams and lakes) that were produced from a 1 m x 1 m digital surface model (DSM). In addition, topographical and locality maps of both Prince Edward and Marion Islands are provided in downloadable PDF format. We augment these contributions with marine mammal research-linked coastline codes/names that have been in long-term use for experimental design, and have consequently been adopted by the larger scientific community working at the Islands. Lastly, we provide a collated list of all the place names in use on Prince Edward and Marion Islands. The geospatial database, maps and the record of place names are available to download from Figshare (https://figshare.com/), an open access data repository.

Methods and results
All the geospatial data were generated in Esri ® ArcGIS ® Desktop 10.6, where the 'WGS 84 datum' and 'Transverse Mercator projection' with longitude 37°E as the central meridian (CM37E) were selected (Figure 2). The mapping process for both Prince Edward and Marion Islands was based on a DSM with a 1 m x 1 m cell size resolution and 0.7 m vertical accuracy as the primary data source (Figure 2A). This DSM was produced photogrammetrically by the Chief Directorate: National Geospatial Information of the South African Department of Agriculture, Land Reform and Rural Development, using stereo Pléiades imagery and accurate ground control points captured in 2016, and was completed in 2017. All the topographical data were generated directly from the DSM. A hillshade raster was generated from the DSM using the 'hillshade' tool (Figure 2D). Slope (in degrees) and aspect were calculated using the ArcGIS ® 'Slope' and 'Aspect' tools, respectively (Figure 2B and 2C). Contour lines were produced by first smoothing the DSM with the 'Focal Statistics' tool (statistic type = mean), using a 10-m vertical and 20-m horizontal cell-size neighbourhood, following the proposed methods of Price 38 . The 'Contour' tool was then used to generate 10-m contours from the smoothed DSM raster, and the final layer was cleaned by deleting all contours below sea level and all contour line segments less than 50 m in length, to overcome the potential interference of artefacts (Figure 2B and 2E). Drainage lines were generated using the Esri ® 'Hydrology' toolset's 'Fill' (z-limit=unspecified), 'Flow direction' (method=D8) and 'Flow accumulation' functions, following the procedures of Jenson and Domingue 39 . A flow accumulation threshold of 50 000 was applied with the 'con' (conditional) tool to determine the final drainage density. This threshold was considered sufficient to capture all the drainage lines previously mapped for Marion Island 28 , whereas a lower threshold would have produced excessive detail.
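The drainage workflow described above (D8 flow direction, flow accumulation, then a conditional threshold) can be illustrated with a minimal pure-Python sketch. It is not the ArcGIS implementation: the function names, the 4 x 3 elevation grid and the threshold values are hypothetical, the 'Fill' step is omitted, and the grid is assumed to contain no pits or flat areas.

```python
# Illustrative sketch only: D8 flow accumulation on a tiny, made-up elevation grid.
# All names and values are hypothetical; no depression filling is performed.

# Row/column offsets to the eight neighbouring cells.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_receiver(dem, r, c):
    """Return the (row, col) of the steepest downslope neighbour, or None if there is none."""
    rows, cols = len(dem), len(dem[0])
    best, best_drop = None, 0.0
    for dr, dc in NEIGHBOURS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            distance = (dr * dr + dc * dc) ** 0.5  # 1 for edge neighbours, sqrt(2) for diagonals
            drop = (dem[r][c] - dem[nr][nc]) / distance
            if drop > best_drop:
                best, best_drop = (nr, nc), drop
    return best


def flow_accumulation(dem):
    """Count, for every cell, how many upslope cells drain through it."""
    rows, cols = len(dem), len(dem[0])
    acc = [[0] * cols for _ in range(rows)]
    # Visit cells from highest to lowest so each cell's total is final before it is passed on.
    order = sorted(((dem[r][c], r, c) for r in range(rows) for c in range(cols)), reverse=True)
    for _, r, c in order:
        target = d8_receiver(dem, r, c)
        if target is not None:
            tr, tc = target
            acc[tr][tc] += acc[r][c] + 1
    return acc


def stream_mask(acc, threshold):
    """Conditional step: 1 where accumulation exceeds the threshold, otherwise 0."""
    return [[1 if value > threshold else 0 for value in row] for row in acc]


# A small valley that drains southwards along the middle column.
dem = [
    [9, 8, 9],
    [8, 6, 8],
    [7, 4, 7],
    [6, 2, 6],
]
acc = flow_accumulation(dem)
streams = stream_mask(acc, 2)  # hypothetical threshold, chosen only for this toy grid
```

In this toy grid the outlet cell `acc[3][1]` collects flow from all 11 other cells, and the cells flagged in `streams` trace the valley axis; the threshold plays the same role as the 50 000 value used on the 1 m DSM.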
The stream order of each drainage line was determined according to Strahler's classification, using the 'stream order' tool. The output raster was converted to a polyline feature (drainage lines) and smoothed at 30 m using the PAEK smoothing algorithm of the 'smooth line' tool (Figure 2D). The stream order, as well as the names of well-known 22,23,28 stream channels, are included in the attribute data of the final 'drainage line' dataset layer (Table 1). All these geospatial layers were clipped to coastline polygons, sourced from National Geospatial Information in 2019. Waterbodies or 'lakes' were mapped with the combined use of the hillshade raster and satellite imagery from Earth Observing 1 - Advanced Land Imager (EO1-ALI), QuickBird (QB), WorldView 1 (WV1) and WorldView 2 (WV2). The EO1-ALI image has a 10-m cell-size resolution, was captured on 5 May 2009, and is orthorectified and georeferenced. The resolutions, production dates and limitations to the spatial coverage of the imagery sets from QB, WV1 and WV2 have already been covered by Rudolph et al. 25,40 The QB, WV1 and WV2 imagery sets are not orthorectified, only georeferenced. The outlines of waterbodies were digitised at a scale of 1:1000 using the QB, WV1 and WV2 images, and then repositioned spatially using the EO1-ALI image and hillshade raster as reference. Minor artefacts exist in the reflectance data of the DSM in regions typically associated with cloud cover, scoria substrate or snow cover (Figure 2E). These interferences invariably affect the accuracy of the hillshade raster in, for example, Marion Island's Central Highland or on the west coastal plains, where such surfaces are widespread and cloud cover is common (Figure 2F). In such cases, verification was done using available satellite imagery, existing maps 22,23,28 and over two decades of field observations 8,19 , which allowed for some lakes to be mapped for the first time.
Alternative data validation is not possible at this time, as the reference data used in this study form the most complete, up-to-date and highest-resolution spatial dataset that is currently available. Still, geospatial data from these regions should be used with some caution. The attributes and use limitations of the final raster and vector layers are presented in Table 1.
[Figure captions, partially recovered: ... (Table 1). Refer to Supplementary tables 1 and 2 for the coordinates of these feature points. ... (Table 1). Projection: Transverse Mercator CM37E.]
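For readers who want to see what the slope and aspect layers described in the Methods represent, the following is a minimal sketch of a 3 x 3 moving-window calculation using weighted finite differences (the approach commonly attributed to Horn). It assumes a planar surface and a square cell; the function name and the example windows are hypothetical, and the sketch is not claimed to reproduce the ArcGIS tools exactly.

```python
import math


def slope_aspect(window, cell_size):
    """Slope (degrees) and aspect (compass degrees) for the centre cell of a 3 x 3 window.

    `window` holds three rows listed from north to south, each running west to east.
    Aspect is returned as None for a flat cell.
    """
    (a, b, c), (d, _, f), (g, h, i) = window
    # Weighted finite differences across the window.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)  # change towards the east
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)  # change towards the south
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None
    # Aspect is the compass bearing of the downslope direction (0 = north, 90 = east).
    aspect = math.degrees(math.atan2(-dz_dx, dz_dy)) % 360
    return slope, aspect


# Hypothetical 1 m cells on a surface that rises 1 m per cell towards the east.
rising_east = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
slope_e, aspect_e = slope_aspect(rising_east, cell_size=1.0)

# Hypothetical surface that rises towards the north.
rising_north = [[2, 2, 2], [1, 1, 1], [0, 0, 0]]
slope_n, aspect_n = slope_aspect(rising_north, cell_size=1.0)
```

The first window gives a slope of about 45 degrees facing west (aspect 270), and the second faces south (aspect 180), since aspect points in the downslope direction.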

Discussion
The geospatial database we have produced provides a valuable online resource for researchers working on the Prince Edward Islands. Prior to this database, geospatial data of the Prince Edward Islands either existed exclusively in hardcopy form [22][23][24] , had limited spatial resolution, or were not readily available 28 . A digital dataset such as this, which provides fine-resolution geospatial data of both islands, will facilitate multi- and transdisciplinary research and allow for a more comprehensive assessment of biotic-abiotic interactions on an island scale, as well as improve modelling capabilities. More specifically, scientific investigations that consider slope, aspect or elevation as key variables can assess these relationships at a finer scale, using the topographical data provided here. For example, our understanding of the development of geomorphic features through aeolian 47 , soil frost 4 and freeze-thaw 7,48 or mass movement processes 49 has been limited to point or site-specific datasets. Similarly, studies that focus on indigenous or invasive species can now investigate the potential control of topographical and/or hydrological factors on their distribution at a larger scale. This can be applied to, for example, burrow-nesting bird species 13 , microorganisms 12,14 or plant communities. The effect of these topographic controls on variations in [...] 16 , temperature 15 or precipitation 6 can also be explored at a higher resolution. Furthermore, the study of long-term landscape development, such as the islands' geological and geomorphological evolution 44 , deglaciation 11 and island responses to climate change 50,51 , relies heavily on knowledge of topographical controls, which can be readily achieved by (accurate) mapping 25,40,50,52 . In addition, as the dataset also incorporates Prince Edward Island at the same spatial resolution as that for Marion Island, it provides a unique opportunity to model and predict processes (e.g. geomorphic) or ecosystem interactions (e.g. vegetation assemblages, species population distribution) for the less frequented Prince Edward Island. The combined use of satellite imagery and the DSM allowed for the mapping of numerous waterbodies (lakes), including some along Marion Island's west and southwest coasts which have never been mapped before. In addition, a compilation of commonly used (official, provisional and colloquial) place names for both Prince Edward Islands and their feature descriptions is presented here. This record provides a much-needed summary or baseline of current 'local knowledge' and can facilitate the process of presenting these names, or alternatives, to the South African Geographical Names Council to ratify their use.
The availability of published data (or lack thereof), and particularly spatial data, is an issue not unique to the Prince Edward Islands' scientific community, but one that exists in general scientific practice. 30,31,53 The South African government rightly recognises the need to encourage better data sharing practices through ratification of the South African Spatial Data Infrastructure Act (Act 54 of 2003). There are numerous advantages of data sharing 33,53 and successful practices have been realised by several scientific disciplines [34][35][36]54 . The increasing demand for data sharing has sparked the emergence of numerous online data repositories such as Figshare 55 , Mendeley Data 56,57 , and Zenodo 58 . The Registry of Research Data Repositories (https://www.re3data.org/) makes it possible to find a digital repository to suit the specific needs of a research lab or project. Most of these repositories enable the user to publish data under a Creative Commons Attribution Licence, which allows for the data to be used, shared and/or adapted, as long as proper credit is given. In other words, authors retain copyright of the dataset. This practice is further supported by sharing platforms through assigning digital object identifiers (DOIs) to datasets, making them fully citable. Alternatively, dedicated universal data hosting infrastructures, such as those of the Group on Earth Observations (e.g. AfriGEOSS 59 ) and the South African Earth Observation Network (SAEON), also provide the opportunity for earth science data to be curated. Martínez-López et al. 60 provide an overview of some of these platforms and we encourage scientists to explore and make use of these to improve access and curation of geospatial data. We further recommend that such practices form an integral part of the SANAP scientific community's mandate to foster open science.

Conclusion
Geospatial information provides the necessary geographical data for terrestrial scientific investigations. We provide here a topographical and hydrological geospatial database, produced from a 1 m x 1 m DSM of the Prince Edward Islands. These geospatial data will facilitate the consideration of finer-scale spatial variables in terrestrial scientific investigations at the Prince Edward Islands, and especially on Marion Island, from the data collection phase through to analysis and modelling. Updated topographical maps of both islands are also available for download, along with locality maps and lists of the Islands' place names and their localities. The geospatial database is downloadable from an open access data repository and the file formats ensure wide use across platforms. A more comprehensive integrated terrestrial and marine geospatial dataset is still needed to effectively monitor climate change impacts at the Prince Edward Islands and for the successful management of the Islands as a Marine Protected Area. For example, high-resolution bathymetry data of the ocean floor will facilitate an integration of terrestrial and oceanic studies to better understand ocean-land interactions. We encourage research endeavours in the wider South African scientific community to support open science practices and make similar geospatial data readily available through open access data repositories, as has been done here.