基于 YOLOv8s 的红外道路交通目标检测算法

Translated title of the contribution: Infrared Road Traffic Target Detection Algorithm Based on YOLOv8s
  • Lei Lu
  • Qian Yu
  • Hong Zhang*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Objective Infrared target detection has significant application value in fields such as unmanned aerial vehicle monitoring, fire early warning, and military guidance. Infrared images mainly reflect the temperature distribution on object surfaces, so a target is characterized chiefly by its temperature difference from the background and by its thermal radiation contour. This property also creates several challenges. Infrared targets often lack clear edges and rich texture detail and appear as relatively blurry blobs. When targets overlap or lie close together, the lack of clear boundary information and fine structure makes their individual contours hard to distinguish, so they are easily detected as a single target or produce misaligned detection boxes. When the temperature difference between target and background is small, or the background contains many interfering heat sources, the pixel values of the target region become even harder to distinguish from the background. Traditional detection methods struggle to extract robust target representations under these conditions. Deep learning-based methods have addressed challenges such as insufficient feature information in infrared images, large variation in target scale, and difficult feature extraction. However, in infrared road traffic scenes they still show relatively high missed-detection and false-detection rates when target boundaries are blurred by overlap or proximity, and when a small temperature difference between the environment and the target reduces the discrimination between target and background. In this study, we propose an infrared road traffic target detection algorithm based on YOLOv8s.
By processing image features, the method becomes more robust and accurate when feature information is insufficient or feature semantics conflict, which helps to address the difficulties of target detection in infrared road traffic scenes.
Methods This paper proposes infrared traffic detect-YOLO (ITD-YOLO), an infrared road traffic target detection algorithm based on YOLOv8s. To address the low discrimination between background and target when the temperature difference is small, the algorithm includes a multi-scale feature extraction module. While keeping computational complexity under control, the module extracts complementary spatial information by fusing receptive fields of different sizes, which strengthens the model's perception of multi-scale features. To address missed detections caused by blurred target boundaries and by the limited ability of a convolutional backbone to model long-range dependencies, the algorithm adds a token statistics and context gated module at the end of the backbone network. The module integrates the token statistics self-attention (TSSA) mechanism, which models statistical features and is derived from a variational form of maximal coding rate reduction. TSSA improves the computational efficiency of self-attention and strengthens the modeling of both global context and fine-grained features. In addition, a convolutional gated linear unit replaces the single linear layer to promote the fusion of global and local features. To address false detections caused by semantic deviation or conflict between features during cross-layer fusion in the neck network, the convolutional block attention module is introduced before feature fusion. It adaptively reweights the features along the channel and spatial dimensions, which strengthens the expression of key information and reduces the influence of noise.
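As a rough illustration of the channel-then-spatial reweighting that the convolutional block attention module performs, the following NumPy sketch applies both attention steps to a single feature map. It follows the general CBAM formulation (a shared two-layer MLP over average- and max-pooled channel descriptors, then a 7×7 convolution over channel-wise average and max maps). The function names, the random weights, the tensor sizes, and the reduction ratio `r = 4` are all assumptions made for the example; the abstract does not describe the authors' implementation or where exactly the module sits in the neck.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """Reweight channels of x (C, H, W) using a shared two-layer MLP.

    The MLP is applied to the average-pooled and max-pooled channel
    descriptors, and the two outputs are summed before the sigmoid.
    """
    avg = x.mean(axis=(1, 2))  # (C,)
    mx = x.max(axis=(1, 2))    # (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    scale = sigmoid(mlp(avg) + mlp(mx))      # (C,), values in (0, 1)
    return x * scale[:, None, None]


def spatial_attention(x, kernel):
    """Reweight spatial positions of x (C, H, W).

    kernel has shape (2, k, k) and is convolved (zero padding, stride 1)
    over the stacked channel-wise average and max maps.
    """
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    logits = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            logits[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return x * sigmoid(logits)[None, :, :]


def cbam(x, w1, w2, kernel):
    """Channel attention followed by spatial attention."""
    return spatial_attention(channel_attention(x, w1, w2), kernel)


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
y = cbam(x, w1, w2, kernel)
```

Because both attention maps lie in (0, 1), the output keeps the input shape and never exceeds the input in magnitude; a trained module would learn weights that suppress noisy channels and positions rather than using random ones.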
Results and Discussions Compared with YOLOv11s and YOLOv12s, ITD-YOLO increases mAP50 (mean average precision at an intersection over union threshold of 0.50) by 3.4 and 4.0 percentage points and mAP5095 (mean average precision over intersection over union thresholds from 0.50 to 0.95) by 2.2 and 2.5 percentage points, while the number of parameters increases by 2.9×10⁶ and 3.0×10⁶, respectively. These results demonstrate the effectiveness of the structural optimizations in ITD-YOLO (Table 2). First, introducing the multi-scale feature extraction module lowers the precision rate by 0.3 percentage points, raises the recall rate by 1.2 percentage points, and raises mAP50 and mAP5095 by 1.0 and 1.5 percentage points, respectively. This shows that multi-scale information helps the model learn target features, which particularly benefits recall and improves the average detection accuracy. However, differences in the spatial alignment of multi-scale features cause a slight decrease in precision. In addition, the module adds only 0.1×10⁶ parameters, so it improves detection performance while remaining lightweight (Table 3). Second, the token statistics and context gated module uses global information to guide the model toward the key features of the target. It improves the recall rate by 2.0 percentage points over the baseline model, indicating that the missed-detection rate is significantly reduced. However, the self-attention mechanism is computed only on the smallest feature map, which introduces semantic deviation and conflict between features during cross-layer fusion in the neck network and lowers the precision rate by 0.5 percentage points (Table 3).
Finally, the convolutional block attention module is introduced to promote semantic fusion between features of different scales and improve detection accuracy. The precision rate is 1.8 percentage points higher than that of the baseline model, indicating that false detections are significantly suppressed (Table 3). Overall, although the total number of model parameters increases by 0.9×10⁶, precision, recall, mAP50, and mAP5095 all improve significantly (Table 3), and the false-detection and missed-detection rates are significantly reduced (Fig. 7).
Conclusions Detection performance in infrared road traffic images is poor when target boundaries are blurred by overlap or proximity and when a small temperature difference between the environment and the targets reduces the discrimination between background and target. To address this, the paper proposes ITD-YOLO, an infrared road traffic target detection algorithm based on YOLOv8s. The algorithm designs a multi-scale feature extraction module to strengthen the extraction of features at different scales and improve detection accuracy. A token statistics and context gated module at the end of the backbone network reduces the missed-detection rate and further improves detection accuracy. The convolutional block attention module, introduced before feature fusion in the neck network, reduces the false-detection rate by selecting the features to be fused at the channel and spatial levels. Experiments show that ITD-YOLO outperforms the baseline YOLOv8s and compares favorably with YOLOv11s and YOLOv12s in detection performance, at the cost of a slight increase in the number of model parameters. Future research should focus on making the model more lightweight without sacrificing precision.
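The mAP50 and mAP5095 figures quoted in the abstract both rest on the intersection over union (IoU) between a predicted box and a ground-truth box. The sketch below shows only that matching step in plain Python: it computes IoU for one pair of boxes and checks which thresholds from 0.50 to 0.95 the prediction would satisfy. The box coordinates, the `(x1, y1, x2, y2)` layout, and the 0.05 threshold step are assumptions chosen for the example; a full mAP computation additionally ranks predictions by confidence, builds precision-recall curves, and averages over classes.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Thresholds 0.50, 0.55, ..., 0.95 (assumed step of 0.05).
thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]

# Hypothetical prediction shifted 2 pixels to the right of the ground truth.
pred = (0.0, 0.0, 10.0, 10.0)
gt = (2.0, 0.0, 12.0, 10.0)

value = iou(pred, gt)  # intersection 80, union 120
passed = [t for t in thresholds if value >= t]

print(f"IoU = {value:.3f}")
print(f"counts as a match at thresholds: {passed}")
```

In this example the prediction counts as correct at the 0.50 threshold used by mAP50 but fails the stricter thresholds that also enter mAP5095, which is why loosely aligned boxes around blurry infrared targets penalize mAP5095 more than mAP50.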

Original language: Chinese (Traditional)
Article number: 0437005
Journal: Laser and Optoelectronics Progress
Volume: 63
Issue number: 4
State: Published - Feb 2026
