Model inter-comparison for short-range forecasts over the southern African domain

FUNDING: African Academy of Sciences Climate Research for Development (grant no. CR4D-19-11)

Numerical weather prediction (NWP) models have been increasing in skill, and their capability to simulate weather systems and provide valuable information at convective scales has improved in recent years. Much effort has been put into developing NWP models across the globe. Representation of physical processes is one of the critical issues in NWP, and it differs from one model to another. We investigated the performance of three regional NWP models used by the South African Weather Service over southern Africa, to identify the model that produces the best deterministic forecasts for the study domain. The three models – Unified Model (UM), Consortium for Small-scale Modelling (COSMO) and Weather Research and Forecasting (WRF) – were run at a horizontal grid spacing of about 4.4 km. Model forecasts for precipitation, 2-m temperature, and wind speed were verified against different observations. Snow was evaluated against reported snow records. Both the temporal and spatial verification of the model forecasts showed that the three models are comparable, with slight variations. Temperature and wind speed forecasts were similar for the three models. Accumulated precipitation was mostly similar, except that WRF captured small rainfall amounts from a coastal low but over-estimated rainfall over the ocean. The UM showed a bubble-like shape towards the tropics, while COSMO cut off part of the rainfall band that extended from the tropics to the sub-tropics. The COSMO and WRF models simulated a larger spatial coverage of precipitation than UM and snow-report records.


Introduction
Weather and climate impact everyday life, while extreme events can cause loss of life and injuries as well as damage to property. The impacts of adverse weather events can be reduced if effective early warnings exist. Numerical weather prediction (NWP) is an integral part of early weather warnings because it provides weather forecasters with a longer lead time than is available on the now-casting timescale. NWP models are based on the laws of physics that govern atmospheric dynamics and thermodynamics, and they use observations as inputs to forecast the future state of the atmosphere. [1][2][3] This process has improved significantly over recent years due to several factors. These factors include an improved understanding and representation of physical and dynamical processes in models, advances in observation technology and data assimilation techniques, and improved computational resources and capabilities 2-4 that make it possible for NWP models to run at high resolutions, with grid spacings from a few kilometres down to hundreds of metres. The advantages of running such NWP models include improved forecast skill, such as accurate numerical prediction of near-surface weather conditions (e.g. clouds, fog, frontal precipitation) and simulation of severe weather events triggered by deep moist convection (supercell thunderstorms, intense mesoscale convective complexes, prefrontal squall line storms, tornadoes and heavy snowfall from wintertime mid-latitude cyclones) 5,6 as well as heatwaves 7. In addition, NWP models allow for formation of structures recognisable as convective clouds and can simulate cloud microphysics such that updraught life cycle and downdraught generation are addressed fairly. 1 Depending on several factors, such as the design of a forecasting system, model configuration, initial conditions for the model, and lateral and surface boundary conditions, different NWP models produce variable forecasts across different parts of the globe. 1,6,8
Southern Africa is characterised by numerous climatic regions that range from arid to temperate and Mediterranean winter-rainfall regions. [9][10][11] These regions are affected by different weather systems during different seasons, and have variable characteristics. In addition, the southern hemisphere has a larger ocean area than land area, which also has an effect on weather systems. 12 As a result, the simulation of these systems requires specific model parameterisation configurations which are compatible with the region, and which take into account localised influences such as land cover, soil moisture, and cloud processes. 13,14 The weather systems may vary in horizontal scale, duration, and intensity, resulting in different effects on communities. Therefore, it is critical to identify a highly skilled NWP model that can capture the weather events that affect the study domain.
Representation of physical (sub-grid) processes differs in NWP models, such that different inter-comparison studies have been developed. 8,[15][16][17] Dabernig et al. 8 conducted a model inter-comparison study to identify the most suitable model for predicting wind power over parts of Europe. Four NWP models or configurations were used: global deterministic European Centre for Medium-Range Weather Forecasts (ECMWF-DET), regional deterministic of the Austrian weather service (ALARO), ECMWF global ensemble prediction system (ECMWF EPS), and ECMWF global ensemble hindcast and reforecast of global ensembles (ECMWF-HC and GEFS-RF, respectively). For this comparison study, the ECMWF-DET was found to have the highest skill.
Mahlobo 17 performed a model inter-comparison of three configurations of the Unified Model (UM), namely, 12-km UM with data assimilation (DA), 12-km UM without DA and 15-km UM for a number of weather variables over South Africa. The overall results showed that the 12-km UM with DA had better and more reliable forecasts than its counterparts. Results further showed that the 15-km UM was more accurate and reliable than the 12-km without DA in simulating minimum and maximum temperatures, while the 12-km UM without DA performed better in rainfall forecasts than the 15-km UM.
The skill and capability of NWP models to provide valuable high-resolution weather information are continuously improving. 6,18 This information includes improved location and timing of weather systems. 1 High-quality weather forecasts and information are important for saving lives, protecting the environment, assisting in the prevention and mitigation of weather-related hazards, as well as preventing economic loss in agriculture, energy, and other weather-sensitive sectors. 2,19,20 Developments in convective-scale NWP have been made in recent years because of improved computational resources, to the extent that a grid spacing of less than 5 km can be used for a large domain. 1,2,19 Convection-permitting models simulate deep convection explicitly, and use convection schemes either in a limited way or not at all. 1 The United Kingdom's national weather service (Met Office) is amongst the model developers that have made massive advances in high-resolution NWP modelling, with their global model running with a 10-km grid spacing. Woodhams et al. 21 conducted a convection-permitting model inter-comparison for convective storm prediction over east Africa using the UM. The study was done over a 2-year period using the 4.4-km UM convection-permitting model and the 17 x 25 km UM global model. The convection-permitting model performed better than the global model for sub-daily forecasts. However, within a 48-h forecast, both models showed little dependence on forecast lead time and large dependence on time of day. The authors recommended further research and consideration for ensemble forecasting.
Since 2016, the South African Weather Service (SAWS) has been using subsets from the 10-km global UM, and operationally running the UM at two convection-permitting resolutions, namely 4.4 km and 1.5 km. 22 Stein et al. 22 compared the three configurations in order to examine the benefits of increasing model resolution for forecasting convection over southern Africa. They identified benefits in using convection-permitting models in the timing of the diurnal cycle and precipitation amounts. However, the 4.4-km model showed a delayed onset of convection compared with the 1.5-km model, as well as an inconsistent bias for both convection-permitting models.
A model inter-comparison for nine NWP models, including the Weather Research and Forecasting (WRF), UM and the Consortium for Small-scale Modelling model (COSMO), was done for the simulation of the evolution of the coupled boundary layer-valley wind system. 23 The models were run using the same initial and boundary conditions, as well as basic physics settings. All the models depicted a similar performance for the evolution of the valley wind system, while significant differences in the simulations of the different aspects of the boundary layer and the along-valley wind were identified amongst the different models. The authors concluded that the source of differences was most likely differences in the simulated energy balance.
In this study, we investigated the performance of three NWP models – namely the UM, COSMO and WRF – over southern Africa to identify the model that produces the best forecasts for the study domain. These three models were selected for this study because they are already in use for generating operational forecasts in the Southern African Development Community (SADC) region. 12,24 The UM is run only in South Africa; however, it is the main operational model used by SAWS. COSMO and WRF are used operationally in other SADC countries including Botswana, Namibia and Tanzania. It may, however, be noted that these models are usually operationalised in the region with little to no proper evaluation of their performance.

Description of NWP models
High-resolution weather prediction models are essential for issuing suitable guidance for severe weather warnings as they can capture near-surface and small-scale severe weather events and resolve complex topography. 16,25 Each model is run at a horizontal grid spacing of 4.4 km over the SADC domain, i.e. between 5° and 55°E, and 40° and 0°S. The UM simulations were obtained from the operational simulations produced by SAWS for operational use and were initialised at 00 UTC. COSMO and WRF simulations were produced for selected case studies and were also initialised at 00 UTC for a 30-h simulation on the Centre for High Performance Computing server (Council for Scientific and Industrial Research, Pretoria, South Africa). 24 We were interested in analysing only the first 24 h of model forecasts, hence the choice of lead times for the COSMO and the WRF. Outputs for these models were written out hourly. The 00 UTC cycle for the UM has a 72-h forecast lead time.

The Unified Model
The UM is the name given to an atmospheric and oceanic numerical modelling software suite developed by the UK's Met Office. 26,27 The model is designed to run with both global and limited area configurations. In addition, the modelling system can run with the atmosphere-only component or be coupled with a dynamic land surface and the ocean. The UM is a seamless system which can be used for prediction across various spatial scales ranging from sub-kilometre to tens of kilometres and temporal scales which range from short term to multi-decades. 27 The radiation scheme employed was the radiative transfer by Manners et al. 28, which is suitable for use with the two-stream radiation code of Edwards and Slingo 29. Large-scale precipitation was parameterised using a scheme based on Wilson and Ballard's 30 mixed-phase precipitation scheme. The fall speeds of light rain and drizzle were parameterised based on Abel and Shipway 31. The warm rain processes (auto-conversion and accretion) were parameterised based on Khairoutdinov and Kogan's 32 scheme. Furthermore, the auto-conversion and accretion were bias corrected for sub-grid variability in cloud and rain water based on the parameterisations discussed in Boutle et al. 33 The ice particle distributions were parameterised using Field et al.'s 34 scheme, which calculates the microphysical transfer rates between ice and other water species.
Other parameterisations were convection, boundary layer, clouds and land surface. Convection was represented by mass flux from the turbulent alternatives convection scheme of Gregory and Rowntree 35 with the assumption of many clouds per grid box. The boundary layer was parameterised using the method described in Lock et al. 36 Clouds were parameterised using the mixed-phase scheme similarly to precipitation. 23 The separate values for cloud water and cloud ice mixing ratios were used for fractional cloud cover. The cloud fraction and condensate amounts calculated by the cloud scheme were then used as inputs to the radiation scheme. 27 Land surface was parameterised using the Joint UK Land Environment Simulator (JULES) scheme. 37,38 The UM is run operationally as the main NWP model for short-range forecasting at SAWS. SAWS runs the UM with a horizontal grid spacing of 4.4 km and 1.5 km over southern Africa and South Africa, respectively, and these are updated four times daily, for the 00 UTC, 06 UTC, 12 UTC and 18 UTC cycles. The locally configured UM obtains its initial and boundary conditions from the UK Met Office global UM run with a horizontal grid spacing of 10 km. The 4.4-km UM, hereafter referred to as the UM, is run with 70 vertical levels over the southern African domain. The assimilation of local observations (i.e. data assimilation, DA) at SAWS had not been concluded during the time of this study, hence the models were run without DA.

The Consortium for Small-scale Modelling model
The COSMO model is a non-hydrostatic limited area atmospheric prediction model, which is formulated in a rotated geographical coordinate system. 25 The COSMO was developed by a consortium of institutions, mainly European and Asian. 25 It obtains its initial conditions and lateral boundary conditions from the Icosahedral Non-hydrostatic (ICON) global model. 5 The ICON model employs a grid spacing of 13 km globally, which allows the COSMO to be simulated at meso-β and meso-γ scales where non-hydrostatic effects are more evident in the evolution of the atmosphere. 25 In this study, the COSMO model was run with a grid spacing of 4.4 km, 40 vertical levels and a shallow convection scheme (reduced Tiedtke scheme 39 for shallow convection only). A time step of 45 s was used for model simulations. The model runs used the mass-flux convection scheme 39 over a rotated geographical coordinate system with a generalised terrain-following height coordinate system and user-defined grid stretching in the vertical. 25,40 A two-stream radiation scheme by Ritter and Geleyn 41, which accounts for short- and long-wave radiation as well as full cloud-radiation feedback, and a multi-layer soil model by Jacobsen and Heise 42, which includes snow and interception storage, were used for model runs. Numerical schemes used for model runs include the Arakawa C-grid with Lorenz vertical grid staggering by Arakawa and Moorthi 43, second-order finite differences for spatial discretisation and the Runge-Kutta split explicit time integration scheme by Wicker and Skamarock 44. Other schemes employed in the running of the model were the FLake scheme, the sea-ice scheme and the finite differencing scheme. [45][46][47] The orography and land cover data for running the model were obtained from the US Geological Survey. 40

The Weather Research and Forecasting model
The WRF model serves as a back-up operational model for SAWS. The advanced research WRF model is a non-hydrostatic model. 48 The advanced research WRF model used was version 3.9.1, which was released in April 2017 (http://www2.mmm.ucar.edu/wrf/users/wrfv3.9/updates-3.9.html). The model was developed in the late 1990s through a collaborative partnership between the US National Center for Atmospheric Research (NCAR), the National Oceanic and Atmospheric Administration (represented by the US National Centers for Environmental Prediction (NCEP) and the Earth System Research Laboratory), the US Air Force, the Naval Research Laboratory, the University of Oklahoma, and the US Federal Aviation Administration.
The WRF was set up to run with a horizontal grid spacing of 4.329 km x 4.329 km using 1250 x 1000 grid points, with a Mercator projection applied, and is centred over the SADC region. The model topography and boundary conditions were obtained from the US Geological Survey, with a resolution of at least 2 arc minutes. The WRF is run for up to 30 h ahead, with the input data provided 3-hourly from the Global Forecast System, which is located on the University Corporation for Atmospheric Research (UCAR) Research Data Archive (RDA) webpage (https://rda.ucar.edu/datasets/ds084.1/index.html#sfol-wl-/data/) and has a horizontal grid spacing of 0.25°.
In the set-up, the time integration step was set at 15 s in order to capture high-resolution meteorological events, instead of the usual time step (dt=6*dx, with dt in seconds and dx in kilometres). The model top was set at 40 km, with 70 vertical levels, as well as four soil levels. The physics set-up for model runs was as follows:
• WRF Single-Moment 6-class scheme, which includes ice, snow and graupel processes suitable for high-resolution simulations 49;
• New Tiedtke scheme, used previously in REGCM4 and ECMWF cy40r1 models;
• RRTMG scheme, a new version of the Rapid Radiative Transfer Model added from WRF version 3.1, which includes the Monte Carlo Independent Column Approximation method of random cloud overlap and major trace gases 50;
• Yonsei University scheme, a non-local-K scheme with explicit entrainment layer and parabolic K profile in the unstable mixed layer 51;
• Chen-Zhang thermal roughness length over the land, which depends on vegetation height, whereby 0 is assigned for the original thermal roughness in each sfclay option;
• Noah land surface model: Unified NCEP/NCAR/AFWA scheme with soil temperature and moisture in four layers, fractional snow cover, and frozen soil physics. The modifications added improve the representation of snow and ice sheets. 52
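The relationship between the 15-s step and the usual dt=6*dx guideline can be checked with a little arithmetic. The sketch below is illustrative only: the function name and constants are hypothetical helpers, and the guideline is treated as an upper bound with dt in seconds and dx in kilometres.

```python
# Illustrative arithmetic for the WRF time-step choice described in the text.
# Names below are hypothetical; only the numbers 4.329 km, 15 s and 30 h
# come from the set-up description.

DX_KM = 4.329        # horizontal grid spacing (km)
DT_USED_S = 15.0     # time step actually used (s)
RUN_LENGTH_H = 30    # forecast length (h)


def rule_of_thumb_dt(dx_km: float, factor: float = 6.0) -> float:
    """Return the guideline time step in seconds (dt = factor * dx in km)."""
    return factor * dx_km


dt_usual = rule_of_thumb_dt(DX_KM)                    # about 26 s
steps_30h = int(RUN_LENGTH_H * 3600 / DT_USED_S)      # time steps in a 30-h run
cost_ratio = dt_usual / DT_USED_S                     # relative number of steps

print(f"guideline dt = {dt_usual:.1f} s, used dt = {DT_USED_S:.0f} s")
print(f"steps for a {RUN_LENGTH_H}-h run at the used dt: {steps_30h}")
print(f"about {cost_ratio:.2f} times as many steps as the guideline dt")
```

Under these assumptions the chosen step is a little over half the guideline value, so a run needs roughly 1.7 times as many time steps as it would with dt=6*dx.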

Validation data
A number of observational data sets were utilised to quantify the model forecasts to ensure a reliable outcome. It is essential to have high-quality and comprehensive observations to help assess and improve model performance, as well as to help communicate the level of confidence that model users should have in forecasts. 1,2

Ground observations
SAWS operates a network of over 200 automatic weather stations from which observational data, hereafter referred to as synops, are available on an hourly or 6-hourly basis. Wind speed and surface temperature are available hourly, while accumulated rainfall and total cloud cover are available 6 hourly. Wind speed is observed at 10 m above the ground, and surface temperature is taken at 2 m above the ground. For this study, these observations are available only over South Africa, hence analysis against synops was done only over the South African domain between 15° and 35°E, and 38° and 20°S.
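Comparing gridded forecasts with synops requires sampling each model field at the station locations. The text does not state how this was done, so the following is only a minimal sketch that assumes the forecasts have already been placed on a regular latitude-longitude grid and that the nearest grid point is taken; the function name and the toy grid are hypothetical.

```python
import numpy as np


def nearest_grid_values(field, grid_lats, grid_lons, stn_lats, stn_lons):
    """Sample a 2-D field (lat x lon) at the grid point nearest each station.

    Assumes a regular latitude-longitude grid, so the nearest point can be
    found independently along each axis.
    """
    field = np.asarray(field)
    grid_lats = np.asarray(grid_lats, dtype=float)
    grid_lons = np.asarray(grid_lons, dtype=float)
    stn_lats = np.asarray(stn_lats, dtype=float)
    stn_lons = np.asarray(stn_lons, dtype=float)
    # Index of the closest grid latitude/longitude for every station.
    i = np.abs(grid_lats[:, None] - stn_lats[None, :]).argmin(axis=0)
    j = np.abs(grid_lons[:, None] - stn_lons[None, :]).argmin(axis=0)
    return field[i, j]


# Toy 1-degree grid covering the South African verification domain.
grid_lats = np.arange(-38.0, -19.0, 1.0)   # -38 ... -20
grid_lons = np.arange(15.0, 36.0, 1.0)     # 15 ... 35
# Synthetic field whose value encodes the grid indices (1000 * i + j).
field = 1000 * np.arange(grid_lats.size)[:, None] + np.arange(grid_lons.size)[None, :]

# Two hypothetical station positions (lat, lon).
stn_lats = [-25.7, -33.9]
stn_lons = [28.2, 18.4]

sampled = nearest_grid_values(field, grid_lats, grid_lons, stn_lats, stn_lons)
print(sampled)
```

On the models' native projections (Mercator for the WRF, rotated coordinates for COSMO) a great-circle distance search or a prior regridding step would be needed instead.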

The Global Precipitation Measurement data
Global Precipitation Measurement (GPM) data are satellite-based precipitation estimates with a global coverage. 53,54 The GPM mission was launched in February 2014 as a successor for the Tropical Rainfall Measuring Mission. 55 Total precipitation estimate was downloaded in NETCDF format for this study. 53 The spacecraft used to collect GPM data has additional channels on both the dual-frequency precipitation radar and GPM microwave imager with capabilities to sense light rain and falling snow, with advanced observations of precipitation in the midlatitudes. 54 According to Skofronick-Jackson et al. 54 , GPM underestimates precipitation in the higher latitudes. The GPM data have been widely used over the African continent in order to bridge a gap in in-situ observations, which, for example, give a poor representation of observed amounts, intensities and locations of precipitation. [56][57][58] Suleman et al. 57 evaluated GPM data against ground observations over parts of South Africa. Their study showed that, although the data performance was variable from one location to the next, the GPM showed poor correlation with regard to rainfall magnitude and high accuracy with regard to rainfall volumes. According to Suleman et al. 57 , GPM data should be used in conjunction with ground observations for more accurate results. A study to evaluate GPM data over the African continent showed that the GPM data generally agree with rain gauges, although its performance is dependent on season, region and evaluation statistics. 56 The study further showed the limitations of GPM data over areas of high topography and high performance over Lake Victoria. The GPM data set has a 30-min time interval and a spatial resolution of 0.1°. 53 For the purposes of this study, the data were merged into hourly data sets and used in conjunction with ground observations.
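The merging of 30-min fields into hourly data can be sketched as follows. This is an illustration rather than the procedure used in the study: it assumes the half-hourly fields are precipitation rates in mm/h, so that each contributes its rate multiplied by 0.5 h, and the function name is hypothetical. If the fields were already accumulations, the two values in each hour would simply be summed.

```python
import numpy as np


def half_hourly_rate_to_hourly_total(rate_mm_per_h):
    """Convert half-hourly rates (mm/h) to hourly accumulations (mm).

    The first axis is time and must hold an even number of half-hour steps;
    any remaining axes (e.g. lat, lon) are preserved.
    """
    rate = np.asarray(rate_mm_per_h, dtype=float)
    if rate.shape[0] % 2 != 0:
        raise ValueError("expected an even number of half-hourly steps")
    # Group consecutive pairs of half-hour steps into one hour each.
    pairs = rate.reshape(rate.shape[0] // 2, 2, *rate.shape[1:])
    # Each rate applies for 0.5 h, so it contributes rate * 0.5 mm.
    return (pairs * 0.5).sum(axis=1)


# Two hours of synthetic half-hourly rates at a single point.
hourly = half_hourly_rate_to_hourly_total([2.0, 4.0, 0.0, 1.0])
print(hourly)  # hourly totals in mm
```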

Snow report data
An archive of snow events observed in South Africa since the late 19th century is made available by the Snow Report community. 59 These reports indicate the geographical location for snow occurrence, as reported by the community, and not the amount of snow observed. This record assists in identifying the spatial extent to which the models were able to capture snow occurrence.

Description of case studies: High impact weather events
NWP models perform differently across different parts of the globe and for different weather phenomena. Rainfall over South Africa occurs as a result of a number of weather systems, including those that occur primarily in the tropics and mid-latitudes. This is due to South Africa's location in the subtropics. For this study, three events that occurred in 2017 were selected due to their impact on communities, infrastructure and the economy, as well as for the amount of media coverage they generated. The events occurred on 15 July 2017, 10 October 2017 and 30 December 2017, and were respectively associated with a cold front and a coastal low, a cut-off low with a ridging high, and a surface trough.

Verification
The aim of this study was to identify the strengths and weaknesses of the UM, COSMO and WRF models, in order to make decisions for producing high-quality operational forecasts and NWP data. Understanding the quality (reliability, accuracy, skill, sharpness and uncertainty) of model forecasts is useful for gaining insight into the strengths and weaknesses of the model, for model users, decision-makers and model developers. 60 For the purpose of this study, several weather variables were selected for verifying the models and evaluating their performance against different observations, namely, wind speed, surface temperature, total precipitation and accumulated snow. We highlight only a few strengths and weaknesses of the models here; more variables and further verification would be required to reach a final conclusion.

Subjective verification
The spatial distribution of snow is displayed to study how well the models' deterministic forecasts capture the spatial distribution of these weather events. These model forecasts are displayed along with the corresponding observations for eye-ball verification over the South African domain.

Objective verification
An objective verification approach was employed to measure the models' skill in predicting different weather events. Spatial distribution of model bias against ground observations and GPM measurements for surface temperature and accumulated precipitation were plotted. The bias indicates model accuracy with regard to the observations. Time series for areal average mean error (ME), root mean squared error (RMSE) and Spearman's correlation (CORR) for 2-m temperature and 10-m wind speed were calculated for each model (Equations 1-3). 61 Probability of detection (POD), false alarm rate (FAR), BIAS score and threat score (TS) were computed for 6-hourly accumulated rainfall (Equations 4-7). 61 These statistics were computed for each model at about 200 station points. These are station locations that had valid synops at the forecast hour under investigation (point-to-point verification).

Cold front: 15 July 2017
The weather on 15 July 2017 was influenced by a cold front over the southwestern parts of South Africa, preceded by a coastal low along the east coast. 62 Cold temperatures with snow and heavy rainfall were observed in places over most of the western and southern parts of the country and over the Lesotho highlands. 59 A synoptic chart that indicates the observations at 12 UTC on the day can be viewed at www.weathersa.co.za/Documents/Publications/20170715.pdf. Figure 1 indicates that the 2-m temperature simulations for the three models have a similar pattern: the models generally have a cold bias. The highest cold bias for all the models is situated over the Lesotho highlands. The UM depicts the highest cold and warm bias (Figure 1a), followed by the COSMO (Figure 1b). The WRF shows the lowest cold and warm bias (Figure 1c). Most of the bias for all the models is within 5 °C of the observed temperatures.
The precipitation that occurred on 15 July 2017 resulted from a cold front over the southwestern parts of South Africa and a coastal low ahead of it. When compared with the GPM measurements, the COSMO model depicts the highest positive bias over South Africa and adjacent oceans (Figure 2b), while the UM (Figure 2a) and WRF (Figure 2c) have a similar pattern. Figure 2a further depicts a negative bias over the southeastern parts of South Africa, implying that the UM failed to capture the precipitation that resulted from a coastal low ahead of the cold front. The COSMO and WRF depict little to no bias in the coastal low area (Figure 2b and 2c). The UM has a larger spatial coverage for positive and negative bias towards the tropics (Figure 2a). The WRF has the lowest positive and negative bias values (Figure 2c).
The left column in Figure 3 shows the ME, RMSE and CORR for the 2-m temperature for all three models relative to the station observations.
The ME for the UM and the WRF closely correspond and generally show a small negative mean bias for most of the day, whereas the COSMO has a much higher positive bias during forecast hours 01-05, and again later in the day (Figure 3). The COSMO model shows almost no temperature bias at forecast hours 09 and 16. This is because errors of opposite sign cancel in the areal average: the COSMO has a much larger area over South Africa where the warmer temperatures over the northeastern parts of the country were forecast too warm and the cooler temperatures over the southern parts were forecast too cold, resulting in a small additive bias. The same applies to the early and late hours of the day when the UM and WRF have little to no bias. The UM has the lowest magnitude of errors (RMSE) in the early hours and later in the day, when its RMSE is equivalent to that of the WRF. The RMSE for the three models is almost equivalent during the sunlight hours of the day. The WRF has the lowest correlation during the early hours, but the three models have equivalent correlation throughout the rest of the diurnal cycle. The model forecasts for 2-m temperature show high correlation with ground observations.
The right column of Figure 3 shows areal average statistics for 10-m wind speed forecasts against ground observations. The three models are comparable and have a negative wind speed bias throughout the diurnal cycle. The UM and COSMO have an equivalent bias (ME) throughout the day, while the WRF has lower negative bias in the early hours and higher negative bias later in the day. The models generally have large magnitudes of error (RMSE); the UM and the COSMO correspond throughout, while the WRF is generally lower in the early hours and higher throughout the rest of the day. The models have a low positive correlation for wind speed, which fluctuates throughout the day. The correlation coefficient for the WRF is highest during the first two hours, when the correlation coefficients for the UM and COSMO are very low. The COSMO, sometimes along with the UM, shows the highest correlation coefficient during most sunlight hours, while the WRF performs better later in the day. Wind speed forecasts are less skilful than temperature forecasts (Figure 3).
The skill scores for precipitation show that the precipitation that occurred in the early hours of the day as a result of the coastal low was captured only by the WRF (Figure 4). A cold front made landfall over the southwestern tip of South Africa later in the day, resulting in significant amounts of rainfall. Only the WRF shows skill during the first half of the day, with near perfect bias scores (bias close to 1) at forecast hour 12 (Figure 4c). Both COSMO and WRF have higher POD during the last half of the day (Figure 4a). At the same time, the three models depict equivalent FAR (Figure 4b), while the COSMO and WRF show near perfect bias scores (Figure 4c) and higher TS (Figure 4d).
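The categorical scores discussed above are conventionally derived from a 2 x 2 contingency table of hits, misses and false alarms at each verification time. The sketch below uses the standard definitions; the 1 mm rain/no-rain threshold, the function name and the sample values are assumptions made for illustration, and FAR is taken here as false alarms divided by all forecast events.

```python
import numpy as np


def categorical_scores(forecast_mm, observed_mm, threshold_mm=1.0):
    """Return POD, FAR, BIAS and TS for rain >= threshold_mm at station points.

    threshold_mm is a hypothetical choice; the study's threshold is not stated.
    """
    f = np.asarray(forecast_mm, dtype=float) >= threshold_mm
    o = np.asarray(observed_mm, dtype=float) >= threshold_mm
    hits = int(np.sum(f & o))
    false_alarms = int(np.sum(f & ~o))
    misses = int(np.sum(~f & o))

    def ratio(num, den):
        # Undefined scores (no events) are reported as NaN.
        return num / den if den > 0 else float("nan")

    return {
        "POD": ratio(hits, hits + misses),                  # perfect score 1
        "FAR": ratio(false_alarms, hits + false_alarms),    # perfect score 0
        "BIAS": ratio(hits + false_alarms, hits + misses),  # perfect score 1
        "TS": ratio(hits, hits + misses + false_alarms),    # perfect score 1
    }


# Synthetic 6-hourly accumulations (mm) at six stations.
forecast = [0.0, 2.5, 5.0, 0.2, 3.0, 0.0]
observed = [0.0, 1.5, 0.0, 4.0, 6.0, 0.0]
scores = categorical_scores(forecast, observed)
print(scores)
```

In this synthetic case the BIAS score is exactly 1 even though one event is missed and one is a false alarm, so a perfect bias score on its own does not imply a perfect forecast; POD, FAR and TS have to be read alongside it.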

Ridging high and cut-off low: 10 October 2017
The weather on 10 October 2017 was characterised by a cut-off low, which is an upper-air disturbance that promotes uplift, coupled with a ridging high whose role is the advection of moist air from the ocean onto the land. 48 A synoptic chart that indicates the observations at 12 UTC on the day can be viewed at www.weathersa.co.za/Documents/Publications/20171010.pdf. Large parts of the southeastern half of South Africa experienced cold temperatures as a result. The models captured these temperatures; however, the extent of the area covered by lower temperatures differs across the three models. The models depict a similar pattern with areas for peak cold and warm biases located at the same points (Figure 5). UM (Figure 5a) and WRF (Figure 5c) are characterised by large areas of cold bias, while the COSMO (Figure 5b) generally over-forecasts surface temperature across the country. WRF depicts the lowest cold and warm bias compared to the other models.
Figure 6 shows the total 24-h precipitation as simulated by the three models compared to GPM rainfall estimates. The spatial coverage and intensities for the three models depict a close resemblance when compared with the GPM measurements (Figure 6). The UM shows the highest positive and negative bias (Figure 6a), followed by COSMO (Figure 6b). The WRF generally depicts the lowest positive and negative bias (Figure 6c). All the models captured the northwest to southeast rainfall pattern, while underestimating the amount in some areas. The rainfall over the southeastern part of South Africa was captured by all the models; however, WRF (Figure 6c) extended the rainfall area farther south and north along the coastal area. It may be noted that this event resulted in flooding over the east coast of South Africa and caused severe damage to property, and eight people lost their lives.

Surface trough: 30 December 2017
The weather on 30 December 2017 was characterised by a broad surface trough that extended from the central interior to the western parts of South Africa, with a high to the east of the country. 60 across most of the eastern and southern parts of the country. A warm tongue was observed in the Northern Cape which extends into Namibia. The three models were able to capture the general spatial pattern showing lower temperatures over the south and east of the country. The models' simulations depict a similar pattern for surface temperature simulations with peaks for cold and warm biases situated at the same locations (Figure 7). The UM and COSMO generally show a positive bias (Figure 7a,b). WRF depicts larger areas of cold bias within 5 °C.
A large band of rainfall, extending from the tropics to the southern coast of South Africa, was observed on 30 December 2017. The 6-hourly accumulated precipitation for models compared with ground observations was used in this case as it contains more detail than the 24-hour totals. All the time steps followed a similar pattern, hence one time step is depicted. The models depict a similarity in simulating precipitation (Figure 8). The UM depicts the highest positive and negative biases, followed by WRF.
Additive bias for 2-m temperature shows high accuracy for the UM throughout the day (Figure 9, left column). There is a general over-forecast by the COSMO throughout the day. The WRF shows an over-forecast in the early and late hours, and a gradual decline during sunlight hours, to as low as -1 towards midday. The models have equivalent magnitudes of error (RMSE) for temperature throughout the diurnal cycle. The correlation coefficient for the models is also equivalent, although it is low.
The right column of Figure 9 shows statistics for 10-m wind speed. The models under-forecast wind speed throughout the diurnal cycle, and they have equivalent ME, except during the warm hours of the day when the COSMO simulation is relatively more accurate. The models show equivalent RMSE throughout the diurnal cycle. The models show a low correlation throughout the diurnal cycle.
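The continuous statistics discussed above (additive bias or ME, RMSE and correlation) can be illustrated with a minimal sketch. It assumes paired forecast and station values for a single forecast hour and uses the standard textbook definitions; the function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def continuous_scores(forecast, observed):
    """Return (mean error, RMSE, Pearson correlation) for paired values.

    Hypothetical helper for illustration; not the study's verification code.
    """
    if len(forecast) != len(observed) or len(forecast) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    n = len(forecast)

    # Mean error (additive bias): positive means the model over-forecasts.
    errors = [f - o for f, o in zip(forecast, observed)]
    me = sum(errors) / n

    # Root mean square error: magnitude of the error, regardless of sign.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation: how well forecast and observed values co-vary.
    f_mean = sum(forecast) / n
    o_mean = sum(observed) / n
    cov = sum((f - f_mean) * (o - o_mean) for f, o in zip(forecast, observed))
    f_var = sum((f - f_mean) ** 2 for f in forecast)
    o_var = sum((o - o_mean) ** 2 for o in observed)
    denom = math.sqrt(f_var * o_var)
    corr = cov / denom if denom > 0 else float("nan")

    return me, rmse, corr


# Made-up 2-m temperatures (°C) at four stations, for illustration only.
model_t2m = [15.0, 17.0, 21.0, 24.0]
station_t2m = [14.0, 18.0, 20.0, 22.0]
me, rmse, corr = continuous_scores(model_t2m, station_t2m)
print(f"ME={me:.2f}  RMSE={rmse:.2f}  r={corr:.2f}")
```

In this sketch a positive ME corresponds to an over-forecast and a negative ME to an under-forecast, which matches the way the biases are described in the text.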
The POD for precipitation at station points over South Africa is higher for the COSMO during the first half of the day, and higher for the WRF for the rest of the day (Figure 10). The WRF has the highest FAR at forecast hours 06 and 18, while the COSMO has the highest at forecast hour 12 and the UM at forecast hour 24. The UM under-forecasts precipitation throughout the diurnal cycle. The WRF shows relatively high accuracy throughout (bias closest to the perfect score of 1). Although all the models have low TS throughout the diurnal cycle, WRF shows better skill than the other models.
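The categorical scores referred to above can be sketched from a 2x2 contingency table of forecast and observed rain events. The sketch below uses the standard definitions of POD, FAR, frequency bias and TS. The 1.0 mm event threshold, the function name and the sample values are assumptions for illustration only; they are not taken from the study.

```python
def categorical_scores(forecast, observed, threshold=1.0):
    """Return POD, FAR, frequency bias and TS for paired rainfall values.

    Hypothetical helper; the threshold (mm per accumulation period) is an
    assumed value, not the one used in the study.
    """
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_yes = f >= threshold
        o_yes = o >= threshold
        if f_yes and o_yes:
            hits += 1            # event forecast and observed
        elif o_yes:
            misses += 1          # event observed but not forecast
        elif f_yes:
            false_alarms += 1    # event forecast but not observed
        # correct negatives do not enter any of these four scores

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "BIAS": ratio(hits + false_alarms, hits + misses),
        "TS": ratio(hits, hits + misses + false_alarms),
    }


# Made-up 6-h rainfall totals (mm) at six stations, for illustration only.
model_rain = [0.0, 2.5, 5.0, 0.2, 1.5, 0.0]
gauge_rain = [0.0, 3.0, 0.5, 1.2, 2.0, 0.0]
scores = categorical_scores(model_rain, gauge_rain)
print(scores)
```

A frequency bias below 1 indicates that rain events are forecast less often than they are observed (under-forecasting), and a value above 1 indicates over-forecasting; POD and TS have a perfect score of 1, while FAR has a perfect score of 0.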

Discussions and conclusions
In this study, we evaluated the performance of three models, used within the SADC region for operational weather forecasting in South Africa, in simulating three high-impact weather events. These three events were associated with a cold front, a ridging high associated with an upper-air cut-off low, and a surface trough and tornado, respectively. The second event resulted in flooding over parts of the east coast of South Africa, with eight people reported to have lost their lives. The third event was reported to have damaged houses in parts of the central interior of South Africa. All three models were able to capture the events, with slight differences in performance.
All the models were able to capture the general temperature pattern with an acceptable areal average bias of about 1 °C. COSMO generally underestimated the size of the area associated with lower temperatures. The three models generally underestimated the temperature in the Limpopo Province, especially in the western parts of the province. The models failed to capture warmer temperatures over the western interior of South Africa that extended into Namibia. WRF captured this feature best; however, the area was overestimated. The three models simulated 10-m winds poorly, producing high bias and poor correlation with ground observations.
The three models were able to capture the general rainfall pattern associated with the three case studies: 24-h accumulated precipitation for the three models was comparable to the observations, whereas the 6-h accumulated rainfall was poorly simulated. This is most noticeable on 30 December 2017, when FAR is high and POD is poor. The spatial pattern of the rainfall differed across the three models and, in some instances, the WRF and the COSMO underestimated the rainfall amount. The COSMO did not capture all the rainfall associated with the rain bands that extended from the tropics to South Africa in a northwest to southeast pattern on 10 October 2017 and 30 December 2017. WRF was able to capture rainfall associated with the coastal low better than the other two models. COSMO captured the event, but underestimated the rainfall, while the UM did not capture the event at all. The UM generally depicted a similar or higher amount of rainfall than that observed, but had a much smaller spatial coverage (blobbiness). COSMO generally failed to capture small and residual rainfall, and also forecast slightly lower amounts of rainfall than the GPM observations. WRF performed well over land, and even captured the small (and residual) rainfall that the UM and COSMO did not capture. However, it overestimated rainfall over the ocean.
The study shows that the models are skilled in capturing the general details of large events similar to the ones analysed here. The differences in the simulations point to issues with small-scale processes and, in particular, with the simulation of rainfall. The poor performance of the models may be associated with the use of coarse-resolution observations and non-gridded data sets. 1,63 It is important that further research is conducted to understand the reasons for the differences in model performance. However, it may be noted, as also shown in this study, that there are shortcomings in the available observations, which make it difficult to verify the models in detail. Satellite estimates show different amounts of rainfall, as also shown here.