The SABAP2 legacy: A review of the history and use of data generated by a long-running citizen science project

AFFILIATIONS: 1 Centre for Functional Biodiversity, School of Life Sciences, University of KwaZulu-Natal, Pietermaritzburg, South Africa; 2 FitzPatrick Institute of African Ornithology, Department of Biological Sciences, University of Cape Town, Cape Town, South Africa; 3 BirdLife South Africa, Isdell House, Johannesburg, South Africa; 4 Department of Biological Sciences, University of Cape Town, Cape Town, South Africa; 5 Biodiversity and Development Institute, Cape Town, South Africa


• The project is the template for other established projects that now operate across the continent, collectively falling under the 'African Bird Atlas Project' umbrella.
• We show that since the initiation of SABAP2, there has been a three-fold increase in publications, with over 150 papers that can be attributed to SABAP2.
• The contribution of citizen scientists to the published scientific domain has been enormous.
One of the largest citizen science projects in Africa is the Second Southern African Bird Atlas Project (SABAP2). SABAP2 is a follow-up project of the Southern African Bird Atlas Project (now labelled SABAP1). The primary data collection period for the first bird atlas project was 1987 to 1991; it incorporated data from as far back as 1980, and in some regions included data until 1993, assembling a total of 7.2 million records of bird distribution. 1 SABAP1 generated the Atlas of Southern African Birds in two volumes. 2 Harrison et al. 3 demonstrated that the SABAP1 database had become a valuable resource to four main user constituencies: environmental consultants, conservationists, research scientists, and birders. Academic research output (theses and papers) was summarised by Underhill 4 ; most of the 102 papers and 19 postgraduate theses listed had been based on SABAP1 data.
The 'second' atlas project, SABAP2, was launched in 2007 and was ongoing in 2021. There is currently no planned end to the project, as the database is recognised as providing useful information in a changing world. 5 The BirdMap data collection protocol has been extended into Nigeria and Kenya, including bespoke websites and data curation, with data collected through these projects falling under the umbrella of the 'African Bird Atlas Project'. 6,7 The SABAP2 data are already extensively used: in scientific publications to inform conservation management; species conservation assessments; and in environmental impact assessments. We summarise this use here.
The initial principal aim of the bird atlas projects was to produce avian range maps from the sightings of volunteers contributing bird lists from various geographic locations. 2 However, the systematic data collection protocol allows investigation of a wide variety of conservation and academic questions. 8 Today, the continued strength of the project is the easy calculation of relative abundance, which is possible because multiple lists are contributed for each sampling area. Global range maps have more recently been better visualised using the global eBird database, which taps into a much larger pool of citizen science contributors, 9 although, for the southern African subregion, SABAP2 is still the best source of distributional information given the data vetting processes in place to check data quality. 10 Indeed, SABAP2 lists can be exported in eBird format for submission to that database through the BirdLasser bird recording software. 11 Because SABAP2 is a long-term undertaking, it is also becoming increasingly important for population trend analyses. 12 The objectives of this paper are to describe the background to the SABAP2 database and examine the use of the data in the publication record.

African Bird Atlas Project description
SABAP2 and the BirdMap protocol were the foundations of the African Bird Atlas Project. This project is now the umbrella for country-specific citizen science projects that collect bird list data submitted by the bird watching community using the 'BirdMap' protocol.

BirdMap protocol
African Bird Atlas Project data collection follows a simple protocol. 8 Lists are collected within a geographical pentad, which is a grid cell on a map corresponding to five geographical minutes of latitude north-south and five geographical minutes of longitude east-west.

A brief history of the African Bird Atlas Projects
The first Southern African Bird Atlas Project (SABAP1) took place from 1986 to 1997, with data collection representing the period 1987 to 1992. The initiative was based out of the Avian Demography Unit (now retired) at the University of Cape Town, building on various regional atlas projects conducted prior to this period. 13, 14 The methods and protocol are outlined in detail in The Atlas of Southern African Birds. 2 In essence, the birding community of southern Africa was encouraged to collect their sightings of birds in a standardised format by compiling their lists per quarter degree grid cell (QDGC; approximately 27 km long north-south and 23 km wide east-west), with larger half degree grid cells used in Botswana. Volunteers were sent introductory materials, including an instruction booklet and printed checklists. Lists were compiled by hand and sent to the University of Cape Town for data checking, entry, and upload. SABAP1 gathered 7.2 million peer-reviewed distribution records for 932 bird species in the southern African sub-region, contributed by more than 5000 birdwatchers. 3 It covered six southern African countries (Botswana, Lesotho, Namibia, South Africa, Swaziland, and Zimbabwe). Mozambique was excluded due to the civil war in that country at that time. It was the first time a biological survey had been attempted on anything like that scale in Africa. Indeed, SABAP1 remains one of the largest completed projects of its kind, even globally. 3 The resulting published atlas volumes contained contributions by 62 authors and seven editors. 3
The second atlas project (SABAP2) was directed as of 2006 by Les Underhill at the Animal Demography Unit. Data collection started in 2007 and is ongoing. SABAP2 is currently managed by the FitzPatrick Institute of African Ornithology.
The data collection protocol was similar to that used for SABAP1, but at a finer spatial and temporal resolution, using pentads (5 × 5 geographical minutes: there are nine pentads in a QDGC) and recording species over at most 5-day periods, compared to monthly lists in SABAP1. There was also an attempt to standardise the minimum effort for a list to count towards estimates of species reporting rates (at least 2 hours, and an effort to cover all major habitats, for lists to qualify as 'full protocol' lists). In addition, species were to be reported in the order sighted, on the assumption that, on average, more common species will appear earlier on species lists and rare species will be recorded last. 10 By 2009, the second full year of SABAP2, citizen scientists were submitting c. 17 000 checklists per year to the project; this remained stable until 2014. In 2015, a combination of the initiation of a series of Citizen Scientist Days and the introduction of mobile apps, especially BirdLasser, resulted in an increase in the rate of submission of checklists to c. 30 000 per year (Table 1). There was a decrease in submissions in 2020 due to the COVID-19 pandemic.
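The relationship between pentads and QDGCs can be made concrete with a short sketch. The Python below is illustrative only and is not part of the project's software: the function names, the row and column indexing counted from 0° latitude and longitude, and the example coordinates are all assumptions made for this example, and the indices it returns are not official pentad codes. It shows that a 15-minute QDGC holds 3 × 3 = 9 pentads and that the pentad containing a point nests inside the QDGC containing the same point.

```python
import math

PENTAD_MINUTES = 5   # pentad side length in geographical minutes (from the protocol)
QDGC_MINUTES = 15    # a quarter degree is 15 geographical minutes


def cell_index(lat_deg, lon_deg, cell_minutes):
    """Return (row, col) of the grid cell that contains a point.

    Rows and columns are counted in whole cells from 0 degrees latitude
    and longitude. This indexing is hypothetical, chosen for illustration;
    it is not the official pentad naming scheme.
    """
    row = math.floor(lat_deg * 60 / cell_minutes)
    col = math.floor(lon_deg * 60 / cell_minutes)
    return row, col


def pentads_per_qdgc():
    """Number of pentads that tile one quarter degree grid cell."""
    per_side = QDGC_MINUTES // PENTAD_MINUTES
    return per_side * per_side


# Example point (assumed coordinates, southern latitudes negative).
pentad = cell_index(-33.95, 18.46, PENTAD_MINUTES)
qdgc = cell_index(-33.95, 18.46, QDGC_MINUTES)

# Three pentads per QDGC side, so integer division by 3 maps a pentad
# index onto the index of its parent QDGC.
parent = (pentad[0] // 3, pentad[1] // 3)
```

Here `parent` equals `qdgc`, which is the nesting property that makes the finer pentad-scale data broadly comparable with the coarser QDGC scale used in SABAP1.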

Citizen scientists and their contributions
In August 2021 there were 3106 registered contributors to the African Bird Atlas Projects, with those registered with SABAP2 representing the majority of these: 2501 observers, followed by Kenya (348), Nigeria (196) and others (61). However, it is rare to have more than 850 observers contributing full protocol checklists in any one year to SABAP2. Branded to gainfully use time in a safe yet meaningful manner, as well as 'contributing to science', the nourishing effects on emotional well-being and mental health have been highlighted as benefits of birding 'with a cause'. 16 Volunteers in SABAP2 were satisfied and exhibited behaviours suggesting they act as advocates for the programme. 17 Atlasers (the term used to describe contributors to the African Bird Atlas Project) travel large distances to contribute to the atlas, often engaging with landowners on bird conservation issues. 3 Of great value to the project in terms of data generation, and also to atlas participants who gain a sense of camaraderie, are 'atlas bashes'. These can be once-off expeditions to target remote regions or encourage systematic repeated data collection over a defined geographical region, for example, the 'Four Degrees region of Greater Gauteng: the challenge to obtain at least 11 checklists in 576 pentads'. 18 The systematic atlasing coordinated by Johan van Rooyen in Stilbaai is a further exemplary case of how to maximise coverage with a small team of people. 19

Data availability
Publicly available data can be obtained for species or locations (pentads) via the project website (http://sabap2.birdmap.africa/) or via the R package rabm (https://github.com/davidclarance/rabm). For locations, this includes species lists at various temporal intervals (total, annual, monthly), allowing examination of trend data and annual patterns of occurrence. Species occurrence data are available either as reporting rates in pentads, allowing broadscale distribution modelling, or as records that include null counts, which allows for better modelling of factors influencing occurrence. Species reporting rate data are also available as GeoJSON files, which can be used in GIS software.
A comparison of SABAP2 vs SABAP1 reporting rates is also available. Bespoke data products are also available by arrangement with the project coordinators.
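As a minimal sketch of how reporting rates relate to checklist data, the Python below computes, for one species, the proportion of checklists in each pentad on which that species was recorded. The definition of reporting rate used here, the data layout, the pentad labels and the species names are assumptions made for illustration; this is neither the project's own code nor the rabm interface. Pentads that have checklists but no detections are retained with a rate of 0, in the spirit of the null counts mentioned above.

```python
from collections import defaultdict


def reporting_rates(checklists, species):
    """Proportion of checklists per pentad on which `species` was recorded.

    `checklists` is an iterable of (pentad_label, set_of_species) pairs.
    Both the layout and the labels are hypothetical.
    """
    totals = defaultdict(int)      # checklists submitted per pentad
    detections = defaultdict(int)  # checklists that include the species
    for pentad, recorded in checklists:
        totals[pentad] += 1
        if species in recorded:
            detections[pentad] += 1
    # Pentads with no detections keep a rate of 0.0 (a null count).
    return {pentad: detections[pentad] / totals[pentad] for pentad in totals}


# Made-up example data: four checklists in pentad "A", two in pentad "B".
example = [
    ("A", {"Cape Robin-Chat", "Hadeda Ibis"}),
    ("A", {"Cape Robin-Chat"}),
    ("A", {"Hadeda Ibis"}),
    ("A", {"Cape Robin-Chat", "Cape Sparrow"}),
    ("B", {"Hadeda Ibis"}),
    ("B", {"Cape Sparrow"}),
]
rates = reporting_rates(example, "Cape Robin-Chat")
```

With this example, pentad "A" has a reporting rate of 0.75 (three of four checklists) and pentad "B" a rate of 0.0. A real analysis would presumably restrict the input to full protocol checklists.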

Examining output and trends in publications referring to SABAP
Given the lack of a centrally citable resource for use of the SABAP2 database, tracking use and output from the available database is extraordinarily difficult because the data are free to download in various formats with no registration or declaration of use required. For instance, a set of the SABAP2 data has been shared with the GBIF global biodiversity database, which is used by global ecologists to model broad biological or ecological questions using multiple data channels. That set of the data alone had been cited 43 times as of 3 June 2021 according to the database description landing page (https://www.gbif.org/dataset/906e6978-e292-4a8b-9c39-adf6bb0f3323).
A set of publications brought to the attention of project coordinators is available on the project website (http://sabap2.birdmap.africa/media/bibliography#pgcontent). This set is based on the initial bibliography of peer-reviewed articles, theses and semi-scientific papers that make substantial use of the SABAP data. 4 As of 1 June 2021, the website contained 201 documents, including both peer-reviewed articles and non-peer-reviewed newsletters or reports.
To perform as comprehensive a survey as possible of wider use and recognition, we used the 'Publish or Perish' software 20 to implement a keyword search based on the search terms 'SABAP', 'Southern African Bird Atlas' and 'SABAP2' through the Google Scholar search engine, excluding patents and citations. Searches were saved as .csv files and imported into R 21 for further data cleaning and analysis. Searches by the previously mentioned GBIF DOI were also attempted but returned no results.
Search results were manually scanned for relevance. The 'SABAP' search term alone returned 1190 results; however, as 'sabap' has alternative meanings in other languages, many results were not relevant. After excluding these, combining search results across search terms, and excluding repeated and irrelevant results, 717 documents and publications (representing a mix of books, HTML documents, 145 environmental impact assessments, and peer-reviewed articles) referred to the atlas projects.
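The step of combining results across search terms and excluding repeats can be sketched as follows. The authors performed their cleaning in R; the Python below is only an illustrative stand-in, not their script. The column names `Title` and `Year`, the matching rule (normalised title plus year) and the function names are assumptions made for this example. Relevance screening, which the authors did manually, is not attempted here.

```python
import csv
import io
import re


def normalise_title(title):
    """Lower-case a title and collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def combine_results(csv_texts):
    """Merge several search exports, keeping the first copy of each record.

    Each item in `csv_texts` is the text of one exported .csv file with
    (assumed) 'Title' and 'Year' columns. Two rows are treated as the same
    record when their normalised title and year both match.
    """
    seen = set()
    combined = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            key = (normalise_title(row["Title"]), row["Year"])
            if key not in seen:
                seen.add(key)
                combined.append(row)
    return combined
```

For example, if one export lists "Example bird atlas paper" (2015) and another lists "Example Bird Atlas Paper." (2015), `combine_results` keeps a single copy. Matching on normalised titles is a deliberately simple design choice; it will miss repeats whose titles differ by more than case and punctuation, so a manual check remains necessary.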
Of 275 identified peer-reviewed articles, 186 were published after 2006, corresponding with the SABAP2 period. Separating articles that merely refer to SABAP from those that make use of the data was harder. For instance, the two most-cited articles referred to research related to SABAP, 22,23 but did not make use of the data. Of the 717 articles, 94 specifically mention SABAP2 in either title or abstract. However, many articles which made use of SABAP2 data (including all the GBIF articles) did not mention this in the title or abstract. 24,25 As a minimum estimate based on the above filters, SABAP2 data alone have contributed to at least 150 peer-reviewed articles, and likely many more.
In addition, the atlas projects are often referred to in publications specifically on the growing field of citizen science research: these publications do not actually use SABAP data (e.g. Wright et al. 17 ).
Many of the articles that refer to the atlas project or use the data are in themselves highly influential (Supplementary table 2).
Plotting the temporal pattern of publication data from the Google Scholar search results reveals a linear increase in publications per year from the initiation of SABAP2 in 2007 until about 2015, and a tripling of research output compared to the preceding period associated with SABAP1 (Figure 2). In both 2016 and 2020, more than 40 articles referred to the atlas projects; these articles were associated with a series in Biodiversity Observations (2016) and a special issue of Ostrich on the theme of citizen science. 26

A recipe for value and success
The SABAP2 project has been a success due to a mutually beneficial triumvirate of organisations: the South African National Biodiversity Institute (SANBI; a governmental organisation), the University of Cape Town (UCT; an academic institution) and BirdLife South Africa (a non-governmental organisation). SANBI initially sponsored the project, implementing the database vision of Les Underhill at the Animal Demography Unit of UCT, with the mobilisation of the key data contributors (birders) encouraged by BirdLife South Africa. Currently, the African Bird Atlas Project provides extraordinary value at no cost to data users. The entire project is run essentially on volunteers, both citizens and professionals, contributing time, money and resources. Provisional estimates suggest that the value of the in-kind contributions by citizen scientists exceeds ZAR40 million per year, more than 25 times the cost of maintaining the core team which runs the project.
In 2021 there were essentially two salaried positions at UCT: the database manager and a communications officer. After Les Underhill's retirement, the institutional support of the FitzPatrick Institute at UCT has been critical to maintaining the project, which provides the administrative envelope for delivering the current features. The partnership with BirdLife
[Figure note: BO = Biodiversity Observations; 'Not listed' represents missing data for titles or publisher, usually associated with web documents and reports.]
Here we have quantified academic use of the database, but the value extends into many more dimensions that are harder to quantify: social, economic and cultural. On a day-to-day basis, the data are used for an extraordinary cross section of purposes, from planning holidays to informing industrial development. BirdLife South Africa has used the data in a number of projects: the Important Bird Area Directory, 27 the 2015 Red List Assessment, 28 current environmental impact assessment site-screening tools, and within BirdLife South Africa to motivate for research projects.
Given the value of this project, and the ethos of open data (conditional for early SANBI support), the support of this position through government institutions makes sense: this is after all an area where citizen science taxpayers would be happy to see their money spent. Nonetheless, project funding has been a constant source of struggle for almost the entire history of the project. SANBI's annual investment has resulted in a product worth millions of rands because of the money spent by atlasers. If ever proof were needed of the value of the project, both to local conservation and to informing a wide spectrum of global scientific research, this review provides it by revealing the extraordinary publication output from SABAP2. Needless to say, this output is also only the tip of the iceberg in terms of the potential of this extensive and impressive database.