HASSAN, Akbar, NAWAZ, Tahir, ASADUZZAMAN, Md, HASAN, Mohammad, QURESHI, Waqar S and SHAFAIT, Faisal (2025) Lightweight Distilled Transformer-Based Vision Framework for Detection of Forest Fire and Smoke in Real-World Scenes. The Journal of Electronic Imaging (JEI), 34 (3). 033035. ISSN 1560-229X
Forest_Fire_Manuscript_Accepted_Version.pdf - Author's Accepted Version
Available under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Abstract
Forest fires have become a ravaging threat, with incidents growing rapidly across the globe. Although several approaches to forest fire detection have been presented over the years, the need remains for an effective, computationally efficient, and unified vision-based solution that can easily be deployed on edge devices for real-world applications. To this end, we present a lightweight model based on a distilled vision transformer (D-ViT) that classifies forest imagery into fire, smoke, and normal scenarios. We used ResNet50 as a teacher model trained on the target dataset and a compressed D-ViT as a student model trained using the knowledge distillation (KD) approach. Unlike existing approaches, the proposed D-ViT framework is computationally efficient, with fewer trainable parameters, and is unified in that it detects both fire and smoke (whichever is dominant) at longer ranges in visible imagery. For experimental validation, we deployed the model on a Jetson Nano board and performed an extensive evaluation and analysis of the proposed framework on data collected from public online sources, which we have made available on request for use by the research community. The proposed D-ViT model achieves encouraging performance, with a processing speed of 18.84 frames per second (FPS) and 94% accuracy using soft distillation, an improvement over the 90% accuracy obtained with the ViT without distillation. A comparison with several other standard deep classification models also shows encouraging results, with a better trade-off between accuracy and computational efficiency.
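The soft-distillation objective mentioned in the abstract can be sketched in plain Python as a weighted sum of hard-label cross-entropy and a temperature-softened KL term between teacher and student outputs. This is a minimal illustrative sketch of the standard soft-distillation loss, not the paper's actual implementation; the temperature, weighting factor, and three-way class ordering (fire, smoke, normal) are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of raw logits.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_distillation_loss(student_logits, teacher_logits, true_label,
                           temperature=3.0, alpha=0.5):
    """Standard soft-distillation loss (illustrative values only):
    alpha * cross-entropy(student, hard label)
    + (1 - alpha) * T^2 * KL(softened teacher || softened student).
    Assumed class order: [fire, smoke, normal]."""
    p_student = softmax(student_logits)
    ce = -math.log(p_student[true_label])        # hard-label cross-entropy
    p_t = softmax(teacher_logits, temperature)   # softened teacher targets
    p_s = softmax(student_logits, temperature)   # softened student outputs
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s))
    # T^2 scaling keeps the soft-target gradients on a comparable scale.
    return alpha * ce + (1 - alpha) * (temperature ** 2) * kl
```

When the student's logits match the teacher's exactly, the KL term vanishes and only the weighted cross-entropy remains, which is the intuition behind the student "inheriting" the teacher's output distribution.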
| Field | Value |
|---|---|
| Item Type | Article |
| Additional Information | © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI. [DOI: 10.1117/1.JEI.34.3.033035] |
| Faculty | School of Digital, Technologies and Arts > Engineering |
| Depositing User | Md ASADUZZAMAN |
| Date Deposited | 12 Jun 2025 13:11 |
| Last Modified | 13 Jun 2025 04:30 |
| URI | https://eprints.staffs.ac.uk/id/eprint/9077 |