Modelling water temperature in the lower Olifants River and the implications for climate change

Statistical models require less data input than physical models, which is particularly useful in data-deficient regions. We validated a statistical water temperature model in the lower Olifants River, South Africa, and verified its spatial applicability in the upper Klaserie River. Calibrations and validations were conducted at monthly and daily temporal scales. The simulated water temperatures closely mimicked the observed data at both temporal resolutions and across sites (NSE>0.75 for the Olifants River and NSE>0.8 for the Klaserie). Overall, the model performed better at a monthly than a daily scale, while generally underestimating the observed temperatures (indicated by negative percentage bias values). Such statistical models can be used to predict water temperature variance from air temperature, which has implications for future climate projections and for assessing the effects that climate change will have on aquatic species.


Introduction
Freshwater systems face the compound effects of direct anthropogenic disturbances and climate change, making them among the most vulnerable ecosystems. [1][2][3][4][5] Climate change and the consequent rise in water temperature have had many adverse effects on freshwater fish communities, including disrupted trophic inter-dependencies, changed phenology, losses in species richness and diversity, mass mortality events, and extinctions. 1,[6][7][8][9][10] In subtropical southern Africa, warming is predicted to occur at more than double the global rate, and annual-average near-surface temperatures are predicted to rise by 6 °C by 2100. 11 The sixth assessment report of the Intergovernmental Panel on Climate Change (IPCC) forecasts future temperature changes under the RCP8.5 scenario: a mean air temperature rise of 4-7 °C is anticipated, while maximum air temperatures are predicted to rise by 4-8 °C in southern Africa by the end of the century. 12 This rise is compounded by a forecast of up to 40% less summer rainfall in southern Africa, where evaporation rates can be as high as 65%, which will decrease effective rainfall. 13 Higher temperatures and lower effective rainfall, in conjunction with an increase in associated extreme drought events and the increasing demand for fresh water from a growing human population, are a concern for the persistence of freshwater ecosystems and their associated fauna. 11,14 For example, freshwater fish inhabit the upper limits of their thermal tolerance and will not be able to move or evolve fast enough to track climate change; therefore, the effects of rising temperatures will be detrimental to these taxa. 1 Forecasts of water temperature in freshwater rivers and streams have drawn on physical, statistical, and ensemble water temperature models. 
[15][16][17][18][19][20] An example of a physical model is the semi-Lagrangian River Basin Model (RIC) developed by Yearsley 16 to solve time-dependent equations for the thermal energy budget in rivers. It can be used to model climate change in rivers and can integrate a macro-scale hydrological model, the Variable Infiltration Capacity model. 15 Both models require large amounts of data and many parameters, including solar and long-wavelength radiation, humidity, soil type, elevation, land cover, precipitation and various river channel parameters, making them data-intensive and constrained by the availability of model parameters. [15][16][17]
Statistical water temperature models use variables such as air temperature to estimate current and/or future water temperatures. Although both linear and non-linear regression models have been developed to model water temperature from air temperature 21,22, linear models are more accurate and produce a better fit 23. These types of statistical models require less data input than physical models and are easier to execute. 17
The aim of this study was to use statistical models based on historical data to calibrate and validate water temperature models in the lower Olifants River, South Africa. The lower Olifants River is a higher-order river that runs through South Africa's largest national park, the Kruger National Park, and supplies water to both South Africa and Mozambique. 24 This region is water stressed, as the Olifants River Basin has been heavily exploited and over-abstracted. 25 Southern African rivers have unique thermal and morphological characteristics, and the use of statistical models developed on northern-hemisphere rivers is problematic. 26 We follow the framework of a statistical linear regression model developed by Rivers-Moore et al. 20, which has been used to simulate water temperature in four other freshwater rivers in South Africa. 
The framework uses four options with varying parameters: (1) air temperature parameters only, (2) air temperature parameters and flow, (3) air temperature parameters and relative humidity, and (4) air temperature parameters, flow, and relative humidity. Previous applications of this approach found that air temperature had the most significant influence, and that flow and relative humidity reduced model accuracy. 20 We validated the model using a second river within the Olifants River Basin: the upper Klaserie River.
This site is at a higher altitude, and observed water temperatures are expected to be lower than those of the Olifants River. We aimed to test whether the statistical model was equally effective for the Olifants and Klaserie Rivers, and predicted that the simulated outputs would be similar for both rivers.

Calibrating the model
Air temperature data from the Hoedspruit weather station were primarily used, supplemented by data from Phalaborwa and Giyani. Mean daily, mean monthly, minimum daily, and minimum monthly temperatures were calculated. The general regression model from Rivers-Moore et al. 20, based on correlations between minimum and average air temperatures and the average water temperature, was adapted for this study (Equation 1):

$$WT_{max} = a \cdot AT_{avg} + b \cdot AT_{min} + c \qquad (1)$$

where $WT_{max}$ = maximum water temperature, $AT_{avg}$ = mean air temperature, $AT_{min}$ = minimum air temperature, $a$ = mean air temperature coefficient, $b$ = minimum air temperature coefficient, and $c$ = regression constant.
Both monthly and daily data sets were calibrated using water temperature data from Mamba Weir for August 2015 to November 2017. Periods without observed data were excluded to obtain the best model fit. The parameters were derived by keeping b and c constant while changing the value of a, and then repeating this process for b and for c. Parameter a relates mean air temperature to mean water temperature, while b reduces the effects of high diurnal air temperatures (minimum and maximum) on $WT_{max}$. The set of constants was chosen based on the appearance of the hydrothermograph, the calculated residuals (i.e. the difference between simulated and observed water temperature) and the highest Nash-Sutcliffe efficiency (NSE), which indicates how well the plot of observed versus simulated data fits the 1:1 line. 27
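The one-parameter-at-a-time calibration described above can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: it assumes the additive linear form WT = a·AT_avg + b·AT_min + c for Equation 1, it selects parameters on NSE alone (the study also considered the hydrothermograph and the residuals), and the function names, search grids and temperature values are all hypothetical.

```python
def simulate_water_temp(at_avg, at_min, a, b, c):
    """Assumed form of Equation 1: WT = a * AT_avg + b * AT_min + c."""
    return [a * avg + b * low + c for avg, low in zip(at_avg, at_min)]


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency; 1.0 means simulated equals observed."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / spread


def calibrate(at_avg, at_min, wt_obs, start, grids, sweeps=3):
    """Vary one parameter at a time over its grid, holding the other two
    fixed, and keep the value that gives the highest NSE."""
    params = dict(start)

    def score(p):
        return nash_sutcliffe(wt_obs, simulate_water_temp(at_avg, at_min, **p))

    for _ in range(sweeps):
        for name in ("a", "b", "c"):
            candidates = [dict(params, **{name: value}) for value in grids[name]]
            # Including the current set guarantees the NSE never decreases.
            params = max(candidates + [params], key=score)
    return params, score(params)


# Hypothetical monthly air and water temperatures (deg C), illustration only.
at_avg = [16.0, 19.5, 23.0, 26.5, 27.0, 22.0]
at_min = [8.0, 11.5, 16.0, 19.5, 20.5, 14.0]
wt_obs = [17.3, 20.4, 24.6, 27.9, 28.8, 23.1]

# Hypothetical search grids for each parameter.
grids = {
    "a": [i / 100 for i in range(50, 121)],
    "b": [i / 1000 for i in range(0, 301)],
    "c": [i / 10 for i in range(0, 51)],
}
start = {"a": 0.5, "b": 0.0, "c": 0.0}
best_params, best_nse = calibrate(at_avg, at_min, wt_obs, start, grids)
print(best_params, round(best_nse, 3))
```

Because each step only accepts a parameter value that does not lower the NSE, the search can stall at a local optimum; repeating the sweep from different starting values is one simple check.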

Model evaluation statistics
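The statistical analyses of Moriasi et al. 28 were used in addition to the hydrothermographs to evaluate model performance. These analyses included: the NSE (Equation 2); the coefficient of determination (R²), which ranges between 0 and 1 and shows the degree of variance between the simulated and observed data sets (Equation 3); the percentage bias (PBIAS), which measures the average tendency of the simulated data to be higher or lower than the observed data (Equation 4); and the root mean square error (RMSE; Equation 5), which is used to calculate the RMSE-observations standard deviation ratio (RSR; Equation 6). The RSR combines error index statistics and a scaling factor by standardising the RMSE using the standard deviation of the observed data. The statistics are calculated as follows:

$$NSE = 1 - \frac{\sum_{i=1}^{n} \left(Y_i^{obs} - Y_i^{sim}\right)^2}{\sum_{i=1}^{n} \left(Y_i^{obs} - Y^{obs,mean}\right)^2} \qquad (2)$$

$$R^2 = \frac{\left[\sum_{i=1}^{n} \left(Y_i^{obs} - Y^{obs,mean}\right)\left(Y_i^{sim} - Y^{sim,mean}\right)\right]^2}{\sum_{i=1}^{n} \left(Y_i^{obs} - Y^{obs,mean}\right)^2 \, \sum_{i=1}^{n} \left(Y_i^{sim} - Y^{sim,mean}\right)^2} \qquad (3)$$

$$PBIAS = \frac{\sum_{i=1}^{n} \left(Y_i^{obs} - Y_i^{sim}\right) \times 100}{\sum_{i=1}^{n} Y_i^{obs}} \qquad (4)$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(Y_i^{obs} - Y_i^{sim}\right)^2} \qquad (5)$$

$$RSR = \frac{RMSE}{STDEV_{obs}} = \frac{\sqrt{\sum_{i=1}^{n} \left(Y_i^{obs} - Y_i^{sim}\right)^2}}{\sqrt{\sum_{i=1}^{n} \left(Y_i^{obs} - Y^{obs,mean}\right)^2}} \qquad (6)$$

where $Y_i^{obs}$ = the observed temperature, $Y_i^{sim}$ = the simulated temperature, $Y^{obs,mean}$ = the mean of observed data for the constituent being evaluated, $Y^{sim,mean}$ = the mean of the simulated data for the constituent being evaluated, and $n$ = the total number of observations.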
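The evaluation statistics listed above can be computed with a few lines of Python. This is a sketch under stated assumptions rather than the code used in the study: the formulas follow the standard definitions in Moriasi et al., the function name and sample values are hypothetical, and PBIAS is written in the Moriasi et al. form (observed minus simulated); if the opposite sign convention is intended, swap the two terms in that one line.

```python
import math


def evaluation_statistics(observed, simulated):
    """Return NSE, R2, PBIAS, RMSE and RSR for paired observed/simulated data."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n

    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    obs_spread = sum((o - mean_obs) ** 2 for o in observed)
    sim_spread = sum((s - mean_sim) ** 2 for s in simulated)
    covariation = sum(
        (o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated)
    )

    return {
        "NSE": 1.0 - squared_error / obs_spread,
        "R2": covariation ** 2 / (obs_spread * sim_spread),
        # Moriasi et al. convention: observed minus simulated (assumption).
        "PBIAS": 100.0 * sum(o - s for o, s in zip(observed, simulated))
        / sum(observed),
        "RMSE": math.sqrt(squared_error / n),
        "RSR": math.sqrt(squared_error) / math.sqrt(obs_spread),
    }


# Hypothetical water temperatures (deg C), illustration only.
observed = [10.0, 20.0, 30.0]
simulated = [9.0, 19.0, 29.0]
stats = evaluation_statistics(observed, simulated)
print(stats)
```

With these definitions RSR² = 1 − NSE, which gives a quick consistency check on any reported pair of values.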

Results
Hydrothermographs were generated for calibration and validation data at daily and monthly timescales for Mamba Weir (Figure 2A-D) using Equation 1 with the constants a = 0.900, b = 0.132, and c = 1.600, and at daily and monthly timescales for the Klaserie River (Figure 3A,B) using Equation 1 with the constants a = 0.600, b = 0.132, and c = 1.700.
Figure 2A shows the calibration hydrothermograph for Mamba Weir using monthly mean water temperature from August 2015 to November 2017. The mean observed water temperature was 23.70±3.32 °C, while the mean simulated water temperature was 23.95±2.96 °C. Both observed and simulated hydrothermographs showed a strong seasonal water temperature pattern (Figure 2A). The model evaluation statistics performed for each model ( As with the monthly time-step, the graphical representations of the simulated and observed water temperatures are very similar, which is also supported by a low PBIAS of -0.17%. There is more variation in the daily simulated and observed temperatures between October and January. Once again, the NSE and R² are slightly lower than for the monthly time-step at 0.78 and 0.80, respectively, while RSR remains relatively low at 0.47 (Table 1).
Figure 3A shows the hydrothermograph for the Klaserie River using monthly mean water temperature from March 2011 to April 2013. The mean observed water temperature was 16.00±2.80 °C, while the mean simulated water temperature was 16.18±2.56 °C. Both observed and simulated hydrothermographs showed a strong seasonal water temperature pattern. The residuals for monthly mean water temperature are on average 0.19±0.57 °C (Table 1). The NSE and R² are very good at 0.95 and 0.96, respectively (Table 1). PBIAS is -1.18% (Table 1), indicating simulated data plot below observed data, which can also be seen in Figure 3A. The RMSE and RSR values are low at 0.59 and 0.21, respectively (Table 1). 
Figure 3B shows the hydrothermograph for the Klaserie River using daily mean water temperature from March 2011 to April 2013. The mean observed water temperature was 16.09±2.95 °C, while the mean simulated water temperature was 15.76±2.92 °C. Both observed and simulated hydrothermographs showed a strong seasonal water temperature pattern. The residuals for daily mean water temperature are on average 0.32±1.23 °C (Table 1). The NSE and R² are 0.81 and 0.83, respectively (Table 1). PBIAS is -2.05% (Table 1), indicating simulated data are lower than observed data, which can also be seen in Figure 3B. The RSR value is low at 0.44 (Table 1).

Discussion
The model predicts water temperature variance based on air temperature with a degree of accuracy in the seasonal and diurnal time frames that is biologically relevant, for both the Mamba Weir and the upper Klaserie sites. The NSE is one of the most widely used statistics for validating water models, and many studies consider NSE values of ≥0.6 satisfactory and values of ≥0.75 very good. [28][29][30][31][32][33][34][35] The NSE determines how closely the plot of observed versus simulated data fits the 1:1 line and, similarly, the R² describes how much of the variance in the observed data is reproduced by the simulated data, which indicates the fit of the model. 28 For comparison, the Hydrological Simulation Program FORTRAN (HSPF) had NSE values of between 0.6 and 0.7 in an analysis of monthly water temperatures in tropical rivers of southern Malaysia. 23 Our model produced NSE and R² values above 0.75 for both monthly and daily models, with monthly models performing slightly better.
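The performance thresholds quoted in this Discussion (for NSE here, and for PBIAS and RSR in the next paragraph) can be applied mechanically, as in the sketch below. The function name and the labels "below satisfactory" and "not rated in this sketch" are hypothetical; only the thresholds stated in the text are encoded, so RSR values of 0.5 or more are left unrated. The inputs are the daily statistics reported in the Results.

```python
def rate_performance(nse, pbias, rsr):
    """Classify each statistic with the thresholds quoted in the Discussion."""
    if nse >= 0.75:
        nse_rating = "very good"
    elif nse >= 0.6:
        nse_rating = "satisfactory"
    else:
        nse_rating = "below satisfactory"

    # The PBIAS thresholds are symmetric, so only the magnitude matters here.
    if abs(pbias) < 10.0:
        pbias_rating = "very good"
    elif abs(pbias) < 25.0:
        pbias_rating = "satisfactory"
    else:
        pbias_rating = "below satisfactory"

    # The text only states that RSR values lower than 0.5 are very good.
    rsr_rating = "very good" if rsr < 0.5 else "not rated in this sketch"

    return {"NSE": nse_rating, "PBIAS": pbias_rating, "RSR": rsr_rating}


# Daily statistics reported in the Results for each site.
mamba_daily = rate_performance(nse=0.78, pbias=-0.17, rsr=0.47)
klaserie_daily = rate_performance(nse=0.81, pbias=-2.05, rsr=0.44)
print("Mamba Weir:", mamba_daily)
print("Klaserie River:", klaserie_daily)
```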
The PBIAS measures the average tendency of the simulated data to differ from the observed data, and also shows whether the model is under- or overestimating temperatures. 28,34 The results show that the simulations for both daily and monthly data sets resemble the observed data closely. The PBIAS indicates that the model tends to slightly underestimate water temperatures, with this underestimation being more pronounced at the daily time-step, likely because the model is unable to predict anomalously hot days. Our PBIAS values are between -0.17% and -2%, whereas satisfactory PBIAS values are within ±25% and very good values within ±10%; the underestimation relative to the observed data is therefore almost negligible. 28 RSR incorporates the benefits of error index statistics and includes a scaling/normalisation factor. 28 A perfect model would have an RSR value of 0, indicating no residual variation; therefore, low RSR and RMSE values are considered good indicators of model performance. 32 The RSR values produced are all lower than 0.5 and are considered very good. 28
The model had a tendency to underestimate water temperatures, which must be considered in future projections. Although this underestimation is very small, a conservative model is preferable for climate predictions to a more aggressive model that would give a false representation of the increase in water temperatures. The underestimation may be due to the model being over-simplistic and not incorporating variables such as river channel metrics, geology, groundwater metrics, vegetation, humidity, solar radiation, evaporation and various other parameters that may drive or influence water temperature. 26,36 However, it has been demonstrated that adding variables such as relative humidity, rainfall and flow to a multiple regression model had little effect on the model and, in the case of flow, even reduced accuracy. 
20 Conversely, air temperature has been shown to be the most important driver of water temperature and, in the absence of additional data, produces a simple model that accurately predicts water temperatures. 20,26 Our study also demonstrates that, for the two river sites studied, a simple statistical model can simulate water temperature variance with biologically relevant accuracy and precision. This is particularly important in data-deficient regions, such as Africa, where climate change studies on freshwater systems are important given the alarming rise in air temperature. 11
An important caveat is that, while air temperature is the only input variable to the models, the parameterisation differs between sites. This means that air temperature alone does not account universally for water temperature, and models need site-specific calibration. This shortcoming is perhaps relevant at large spatial or temporal scales, but a critical implication of the models is that diurnal and seasonal variances (as opposed to absolute values) in water temperature are strongly driven by variance in air temperature. As a first-order approximation of the impact of long-term water temperature drivers, such as climate change, on river biology, this is very useful, but a universally applicable solution requires more complex models.
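The effect of site-specific parameterisation can be illustrated with the constants reported in the Results. The sketch below assumes the additive linear form WT = a·AT_avg + b·AT_min + c for Equation 1; the air temperatures, the function names and the uniform 1 °C change are hypothetical, and applying the constants to a warmer climate assumes they remain valid outside the calibration period, which this study does not test.

```python
# Constants reported in the Results for each site.
SITE_CONSTANTS = {
    "Mamba Weir": {"a": 0.900, "b": 0.132, "c": 1.600},
    "Klaserie River": {"a": 0.600, "b": 0.132, "c": 1.700},
}


def water_temp(site, at_avg, at_min):
    """Assumed form of Equation 1: WT = a * AT_avg + b * AT_min + c."""
    p = SITE_CONSTANTS[site]
    return p["a"] * at_avg + p["b"] * at_min + p["c"]


def warming_response(site, delta_avg, delta_min):
    """Change in simulated water temperature for a given change in mean and
    minimum air temperature; the constant c cancels out of the difference."""
    p = SITE_CONSTANTS[site]
    return p["a"] * delta_avg + p["b"] * delta_min


# Hypothetical air temperatures (deg C): the same input gives a different
# simulated water temperature at each site.
for site in SITE_CONSTANTS:
    simulated = water_temp(site, at_avg=20.0, at_min=10.0)
    per_degree = warming_response(site, delta_avg=1.0, delta_min=1.0)
    print(site, round(simulated, 2), round(per_degree, 3))
```

Under these assumptions, the simulated response to a uniform change in air temperature is simply a + b per degree, so the two sites respond differently to the same air temperature signal, which is why the calibration is not transferable between them.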
Complex models, such as multiple regression models, typically have more input parameters, making them susceptible to equifinality. Equifinality is common in hydrological models, and in this context refers to the likelihood that multiple sets of parameters will produce equivalent models. [37][38][39][40] There are two sources of equifinality in models, namely over-parameterisation and errors in the observational or input data of parameters. 39,40 Errors in observational data not only cause equifinality, but also reduce accuracy 40: the more parameters that are added, the more observer bias or collection errors are introduced into the model.
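Equifinality from over-parameterisation can be shown with a deliberately artificial example. In the sketch below, a hypothetical linear model has two inputs that are perfectly collinear (the second is always the first minus 8), so two different parameter sets produce the same output and no amount of fitting can distinguish them. The inputs, the offset of 8 and both parameter sets are invented for illustration and do not come from the study data.

```python
def linear_model(x1, x2, a, b, c):
    """Generic two-input linear model: y = a * x1 + b * x2 + c."""
    return [a * u + b * v + c for u, v in zip(x1, x2)]


# Hypothetical inputs: the second input is an exact offset of the first,
# so it carries no independent information.
x1 = [16.0, 19.5, 23.0, 26.5]
x2 = [value - 8.0 for value in x1]

# Two different parameter sets. Algebraically,
# 0.9 * x + 0.132 * (x - 8) + 1.6 == 1.032 * x + 0.544.
output_one = linear_model(x1, x2, a=0.9, b=0.132, c=1.6)
output_two = linear_model(x1, x2, a=1.032, b=0.0, c=0.544)

largest_gap = max(abs(p - q) for p, q in zip(output_one, output_two))
print(largest_gap)
```

Real inputs are rarely perfectly collinear, but strongly correlated inputs lead to the same problem in a weaker form: many parameter sets fit almost equally well.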
The study prediction that the model would successfully simulate water temperature for both rivers was correct, despite the difference in river order and altitude. Both Rivers-Moore et al. 20 and our results support the use of simple, linear statistical models to simulate water temperature from air temperature in South African rivers, and this approach can be applied at other river study sites where data are scarce. Future studies should focus on the effects of global climate change on freshwater systems and include both the physical and biological impacts. Currently, studies on invertebrates and fish in South African rivers are showing the potential impacts of rising water temperatures on species' thermal tolerances [41][42][43][44]; modelling the future water temperature of these rivers is vital for understanding when these impacts will take effect and for guiding mitigation actions.