A framework to select a classification algorithm in electricity fraud detection

Sisa Pazi; Chantelle M. Clohessy; Gary D. Sharp

doi:10.17159/sajs.2020/8189

Authors

Sisa Pazi, Department of Statistics, Nelson Mandela University, Port Elizabeth, South Africa. https://orcid.org/0000-0002-8880-3881
Chantelle M. Clohessy, Department of Statistics, Nelson Mandela University, Port Elizabeth, South Africa. https://orcid.org/0000-0002-4612-2228
Gary D. Sharp, Department of Statistics, Nelson Mandela University, Port Elizabeth, South Africa. https://orcid.org/0000-0003-0321-8067

DOI:

https://doi.org/10.17159/sajs.2020/8189

Keywords:

electricity fraud detection, confusion matrix, classification algorithms

Abstract

In the electrical domain, a non-technical loss often refers to energy used but not paid for by a consumer. Identifying and detecting this loss is important because it reduces the electricity supplier's revenue. Several statistical and machine learning classification algorithms have been developed to identify customers who use energy without paying. These algorithms are generally assessed and compared using results from a confusion matrix. We propose that the data for the performance metrics from the confusion matrix be resampled to improve how the algorithms are compared. We use the results from three classification algorithms, namely a support vector machine, a k-nearest neighbour classifier and a naïve Bayes classifier, to demonstrate how the methodology identifies the best classifier. The case study uses electrical consumption data for a large municipality in South Africa.
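The abstract does not state which resampling scheme is used, so the following is only a minimal sketch of the general idea, assuming a nonparametric bootstrap over held-out predictions with labels coded 1 (fraud) and 0 (normal). The function names (`confusion_counts`, `metrics`, `bootstrap_metrics`, `summarize`) and the toy labels are hypothetical and are not taken from the article.

```python
import random


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating 1 as fraud and 0 as normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Compute a few confusion-matrix metrics; undefined ratios are set to 0.0."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


def bootstrap_metrics(y_true, y_pred, n_resamples=1000, seed=0):
    """Resample (label, prediction) pairs with replacement and recompute metrics."""
    rng = random.Random(seed)
    n = len(y_true)
    samples = {"accuracy": [], "recall": [], "precision": []}
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        yp = [y_pred[i] for i in idx]
        for name, value in metrics(*confusion_counts(yt, yp)).items():
            samples[name].append(value)
    return samples


def summarize(values):
    """Return (mean, 2.5th percentile, 97.5th percentile) of resampled values."""
    s = sorted(values)
    n = len(s)
    return sum(s) / n, s[int(0.025 * n)], s[min(n - 1, int(0.975 * n))]


if __name__ == "__main__":
    # Hypothetical toy data: true labels and the predictions of two classifiers.
    y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
    preds = {
        "classifier_a": [1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0],
        "classifier_b": [1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0],
    }
    for name, y_pred in preds.items():
        resampled = bootstrap_metrics(y_true, y_pred, n_resamples=500, seed=1)
        for metric_name, values in resampled.items():
            mean, low, high = summarize(values)
            print(f"{name} {metric_name}: {mean:.3f} [{low:.3f}, {high:.3f}]")
```

Comparing the resampled distributions, rather than single point estimates, shows whether the intervals for two classifiers overlap on a given metric. Which metric to prioritise depends on the objective of the analysis.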

Significance:

The methodology provides data analysts with a procedure for analysing electricity consumption in an attempt to identify abnormal usage.
The resampling procedure provides a method for assessing performance measures in fraud detection systems.
The results show that no single metric is best, and that the selected metric is dependent on the objective of the analysis.

Published

2020-09-29

Issue

Vol. 116 No. 9/10 (2020)

Section

Research Article

License

All articles are published under a Creative Commons Attribution 4.0 International Licence.

Copyright is retained by the authors. Readers are welcome to reproduce, share and adapt the content without permission provided the source is attributed.

Disclaimer: The publisher and editors accept no responsibility for statements made by the authors.

How to Cite

Pazi, S., Clohessy, C. M., & Sharp, G. D. (2020). A framework to select a classification algorithm in electricity fraud detection. South African Journal of Science, 116(9/10). https://doi.org/10.17159/sajs.2020/8189