Volume 44 Issue 1
Feb. 2024
TAN Pengyuan, XUE Changbin, ZHOU Li. Acceleration of Remote Sensing Image Filtering Based on Embedded CPU+GPU Heterogeneous Platform (in Chinese). Chinese Journal of Space Science, 2024, 44(1): 95-102 doi: 10.11728/cjss2024.01.2023-0033

Acceleration of Remote Sensing Image Filtering Based on Embedded CPU+GPU Heterogeneous Platform

doi: 10.11728/cjss2024.01.2023-0033 cstr: 32142.14.cjss2024.01.2023-0033
  • Received Date: 2023-03-02
  • Revision Received Date: 2023-04-26
  • Available Online: 2023-07-27
  • A method is proposed for accelerating remote sensing image filtering in real time on an embedded CPU + GPU heterogeneous platform for satellite-based image processing. The algorithm is first parallelized through data division and mapping to exploit the parallel computing capability of the GPU. GPU hardware resources such as the vector unit and cache are then used to speed up the algorithm further through vectorization, vector permutation, and workgroup tuning. The feasibility and efficiency of the accelerated design were validated on an embedded development board. Experiments show that GPU parallel processing alone yields a speedup of 4.08 to 16.92 times over the serial implementation on a single CPU, and that further optimization using GPU hardware resources raises the speedup to 15.38 to 56.41 times.
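
The data division and mapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 3x3 mean filter, the border clamping, the tile sizes, and all function names are assumptions chosen for illustration, and plain Python loops stand in for GPU work-items and work-groups. The point it shows is that each output pixel depends only on the input image, so the image can be divided into tiles and each pixel mapped to an independent work-item without changing the result.

```python
# Hypothetical sketch: a 3x3 mean filter written as independent "work-items",
# grouped into tiles the way a GPU runtime groups work-items into work-groups.
# Filter choice, border handling, and tile sizes are illustrative assumptions.

def mean3x3_pixel(img, y, x):
    """One work-item: average the 3x3 neighbourhood, clamping at the borders."""
    h, w = len(img), len(img[0])
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(max(y + dy, 0), h - 1)
            xx = min(max(x + dx, 0), w - 1)
            total += img[yy][xx]
    return total // 9


def filter_serial(img):
    """Reference: visit every pixel in row-major order on a single core."""
    h, w = len(img), len(img[0])
    return [[mean3x3_pixel(img, y, x) for x in range(w)] for y in range(h)]


def split_into_tiles(h, w, tile_h, tile_w):
    """Data division: yield (y0, y1, x0, x1) tiles that cover the image."""
    for y0 in range(0, h, tile_h):
        for x0 in range(0, w, tile_w):
            yield y0, min(y0 + tile_h, h), x0, min(x0 + tile_w, w)


def filter_tiled(img, tile_h=4, tile_w=4):
    """Mapping: each tile plays the role of a work-group, each pixel a work-item.

    The tiles are processed one after another here; on a GPU they could run
    concurrently because no work-item reads another work-item's output.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y0, y1, x0, x1 in split_into_tiles(h, w, tile_h, tile_w):
        for y in range(y0, y1):
            for x in range(x0, x1):
                out[y][x] = mean3x3_pixel(img, y, x)
    return out


# Small synthetic image; the tiled and serial results must agree.
image = [[(7 * y + 3 * x) % 256 for x in range(10)] for y in range(6)]
assert filter_tiled(image) == filter_serial(image)
```

Changing `tile_h` and `tile_w` in this sketch changes only how the work is grouped, not the output, which is the property that makes workgroup tuning a pure performance choice.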

     

