Volume 44 Issue 1
Feb.  2024
TAN Pengyuan, XUE Changbin, ZHOU Li. Acceleration of Remote Sensing Image Filtering Based on Embedded CPU+GPU Heterogeneous Platform (in Chinese). Chinese Journal of Space Science, 2024, 44(1): 95-102 doi: 10.11728/cjss2024.01.2023-0033

Acceleration of Remote Sensing Image Filtering Based on Embedded CPU+GPU Heterogeneous Platform

doi: 10.11728/cjss2024.01.2023-0033 cstr: 32142.14.cjss2024.01.2023-0033
  • Received Date: 2023-03-02
  • Revision Received Date: 2023-04-26
  • Available Online: 2023-07-27
  • A method is proposed for accelerating remote sensing image filtering in real time on an embedded CPU+GPU heterogeneous platform for satellite-based image processing. The algorithm is first parallelized through data division and mapping to exploit the parallel computing capability of the GPU. GPU hardware resources such as the vector unit and cache are then used to speed up the algorithm further through vectorization, vector permutation, and workgroup tuning. The feasibility and efficiency of the accelerated design were validated on an embedded development board. Experiments show that GPU parallel processing alone yields a speedup of 4.08 to 16.92 times over the serial implementation on a single CPU, and that further optimization using GPU hardware resources raises the speedup to 15.38 to 56.41 times.
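
The data division and mapping step can be illustrated with a minimal sketch. It assumes a 3x3 mean filter with clamped borders and a tile-based split of the output image; the abstract does not name the filter or the tile size, so these choices and the function names `mean3x3_at`, `filter_serial`, and `filter_tiled` are hypothetical. Plain Python loops stand in for GPU work-groups and work-items, so the sketch shows only the index mapping and does not reproduce the reported speedups.

```python
def mean3x3_at(img, h, w, y, x):
    """Mean of the 3x3 neighbourhood at (y, x); coordinates are clamped at the border."""
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(max(y + dy, 0), h - 1)
            xx = min(max(x + dx, 0), w - 1)
            total += img[yy][xx]
    return total / 9.0


def filter_serial(img):
    """Serial baseline: visit every pixel in row-major order."""
    h, w = len(img), len(img[0])
    return [[mean3x3_at(img, h, w, y, x) for x in range(w)] for y in range(h)]


def filter_tiled(img, tile_h=4, tile_w=4):
    """Data division and mapping (assumed scheme, for illustration only).

    The output is divided into tile_h x tile_w tiles, one per hypothetical
    work-group (gy, gx). Each local index (ly, lx) inside a tile maps to
    exactly one output pixel, so every iteration is independent of the others.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    groups_y = (h + tile_h - 1) // tile_h  # ceiling division
    groups_x = (w + tile_w - 1) // tile_w
    for gy in range(groups_y):
        for gx in range(groups_x):
            for ly in range(tile_h):
                for lx in range(tile_w):
                    y = gy * tile_h + ly  # global row index
                    x = gx * tile_w + lx  # global column index
                    if y < h and x < w:   # guard for partial tiles at the edges
                        out[y][x] = mean3x3_at(img, h, w, y, x)
    return out
```

Because each output pixel depends only on the input image, the tiled traversal produces the same result as the serial one for any tile size. That independence is what allows the four inner loops to be replaced by concurrently executing work-items on a GPU.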

     

