Lightweight Yolov5 object detection system based on a space-grade NPU
Abstract: Deep-learning object detection algorithms have complex network structures and heavy computational demands, which makes them difficult to deploy on on-board satellite image processing platforms with very limited resources. To address this problem, this paper proposes a convolutional neural network acceleration design based on an aerospace-grade neural-network processing unit (NPU), and uses an improved Yolov5s network to achieve fast image processing on the NPU. The design replaces the structures in the network that are poorly suited to the NPU and introduces an attention mechanism to compensate for the accuracy loss caused by making the network lightweight. The optimized network is trained iteratively on a GPU using the public VOC dataset. A CPU-NPU parallel collaborative processing design then executes the three parts of image processing in parallel, making full use of the limited computing and storage resources of the Yulong810A platform. Experiments show that the optimized network has 82% fewer parameters than the original Yolov5s network while also improving on its accuracy, reaching an mAP of 82.35%. Deployed on the Yulong810A on-board processing platform, the algorithm reaches a detection speed of 47.67 fps as stated in the Chinese abstract (the English abstract states 41.67 fps), more than twice the speed of the original Yolov5s network, yielding a lighter and faster object detection system.
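The idea of running three parts of image processing in parallel can be illustrated with a minimal three-stage pipeline sketch. The abstract does not name the three parts, so the split into pre-processing, inference, and post-processing is an assumption, and `preprocess`, `infer`, `postprocess`, and `run_pipeline` are hypothetical placeholders rather than the authors' code or any Yulong810A API.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the frame stream


def run_stage(fn, inbox, outbox):
    """Apply fn to each item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(frames, stages, depth=2):
    """Run each stage in its own thread, linked by bounded FIFO queues.

    While stage k handles frame i, stage k-1 can already handle frame i+1.
    One thread per stage plus FIFO queues keeps the output in input order.
    """
    queues = [queue.Queue(maxsize=depth) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=run_stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]

    def feed():
        for frame in frames:
            queues[0].put(frame)
        queues[0].put(_STOP)

    # Feed from a separate thread so bounded queues cannot deadlock the caller.
    workers.append(threading.Thread(target=feed))
    for w in workers:
        w.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for w in workers:
        w.join()
    return results


# Hypothetical stand-ins for the three parts (assumed, not from the paper).
def preprocess(frame):      # assumed CPU stage: normalize pixel values
    return [x / 255.0 for x in frame]


def infer(tensor):          # placeholder for the NPU inference call
    return sum(tensor)


def postprocess(score):     # assumed CPU stage: finalize the result
    return round(score, 3)


if __name__ == "__main__":
    print(run_pipeline([[255, 0], [255, 255]], [preprocess, infer, postprocess]))
```

In pure Python the global interpreter lock limits true CPU parallelism, so this only shows the pipeline structure. On real hardware the inference stage would hand work to the NPU, leaving the CPU free for the other two stages.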