Explore open access research and scholarly works from STORE - University of Staffordshire Online Repository

SALAD: A split active learning based unsupervised network data stream anomaly detection method using autoencoders

Nixon, Christopher, SEDKY, Mohamed, CHAMPION, Justin and HASSAN, Mohamed (2024) SALAD: A split active learning based unsupervised network data stream anomaly detection method using autoencoders. Expert Systems with Applications, 248. p. 123439. ISSN 0957-4174

nixon2023-salad-esa.pdf - AUTHOR'S ACCEPTED Version (default)
Restricted to Repository staff only until 7 February 2026.
Available under License Type All Rights Reserved.

Official URL: https://doi.org/10.1016/j.eswa.2024.123439

Abstract or description

Machine learning based intrusion detection systems monitor network data streams for cyber attacks. Challenges in this space include detecting unknown attacks, adapting to changes in the data stream such as changes in underlying behavior, the human cost of labeling data to retrain the machine learning model, and the processing and memory constraints of a real-time data stream. Failure to manage these factors could result in missed attacks, degraded detection performance, unnecessary expense, or delayed detection times. This research proposes a new semi-supervised network data stream anomaly detection method, Split Active Learning Anomaly Detector (SALAD), which combines our novel Adaptive Anomaly Threshold and Stochastic Anomaly Threshold with Fading Factor methods. A novel Reconstruction Error based Distance from Threshold strategy is proposed and evaluated as part of an active stream framework to demonstrate a reduction in labeling costs. The proposed methods are evaluated with the KDD Cup 1999 and UNSW-NB15 data sets, using the scikit-multiflow framework. Results demonstrated that the proposed SALAD method offered performance equivalent to fully labeled and alternative Naïve Bayes (NB) and Hoeffding Adaptive Tree (HAT) methods with a labeling budget of just 20%, significantly reducing the human expertise required to annotate the network data. Processing times of the SALAD method were demonstrated to be significantly lower than those of the NB and HAT methods, allowing for greatly improved responsiveness to attacks occurring in real time.
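
The abstract does not give the formulas behind the thresholds or the query strategy, so the following is only a rough illustration of the general idea and not the authors' SALAD implementation. It assumes the reconstruction errors have already been produced by some autoencoder, uses a faded mean plus k standard deviations as a stand-in adaptive threshold, and asks for a label only when an error lies close to the threshold and the labeling budget allows it. The names `FadingThreshold` and `run_stream`, and the parameters `fading`, `k`, and `margin`, are hypothetical.

```python
import math


class FadingThreshold:
    """Hypothetical adaptive threshold: faded mean + k * faded std of past errors."""

    def __init__(self, fading=0.99, k=3.0):
        self.fading = fading  # assumed fading factor; older errors decay by this per step
        self.k = k            # assumed multiplier on the faded standard deviation
        self.n = 0.0          # faded count
        self.s = 0.0          # faded sum of errors
        self.ss = 0.0         # faded sum of squared errors

    def update(self, err):
        self.n = self.fading * self.n + 1.0
        self.s = self.fading * self.s + err
        self.ss = self.fading * self.ss + err * err

    def value(self):
        if self.n == 0:
            return math.inf  # nothing seen yet, so nothing can be flagged
        mean = self.s / self.n
        var = max(self.ss / self.n - mean * mean, 0.0)
        return mean + self.k * math.sqrt(var)


def run_stream(errors, budget=0.2, margin=0.5, fading=0.99, k=3.0):
    """Flag anomalies and pick label queries over a stream of reconstruction errors."""
    thr = FadingThreshold(fading, k)
    flagged, queried = [], []
    for i, err in enumerate(errors):
        t = thr.value()
        is_anomaly = err > t
        flagged.append(is_anomaly)
        # Distance-from-threshold style query: only ask about uncertain instances,
        # and only while the fraction of queried instances stays within the budget.
        close = math.isfinite(t) and abs(err - t) <= margin * t
        if close and len(queried) < budget * (i + 1):
            queried.append(i)
        # Assumption: only instances treated as normal update the threshold.
        if not is_anomaly:
            thr.update(err)
    return flagged, queried
```

Calling `run_stream(errors)` on a list of per-instance reconstruction errors returns a list of anomaly flags and the indices selected for labeling. In a real active stream setting the returned labels would feed back into the model, which this sketch leaves out.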

Item Type: Article
Uncontrolled Keywords: Active learning; Online learning; Autoencoders; Anomaly detection; Intrusion detection system
Faculty: School of Digital, Technologies and Arts > Computer Science, AI and Robotics
Depositing User: Mohamed SEDKY
Date Deposited: 27 Nov 2025 16:40
Last Modified: 27 Nov 2025 16:40
URI: https://eprints.staffs.ac.uk/id/eprint/8989
