Unified Storage and Efficient Retrieval Method for Space Weather Scientific Data of the Chinese Meridian Project
Abstract: A unified storage and retrieval method is proposed for the space weather scientific data of the Chinese Meridian Project, addressing the management of massive numbers of small files and the need for access at multiple granularities. Object storage serves as the underlying platform, and columnar containers are the basic unit of data organization. Discrete files are grouped by data product category and temporal partition, and within each container the original file contents are stored alongside the structured records parsed from those same files. An external metadata index is combined with statistics-based pruning inside containers, so that file-level trace-back retrieval and record-level conditional queries are supported by one mechanism while unnecessary data scanning is reduced. The method was validated on real observational data spanning four years, with a total volume of approximately 2.4 TB and about 15.1 billion records. The overall compression ratio reached 29.9%, and text data achieved a compression ratio of 46.2%. For parseable numerical data, about 2 million original files were consolidated into 93 containers, and the number of records held per unit of storage space increased by about 80%. Under concurrent access, both the file-retrieval and record-query services completed request processing stably. The method alleviates the pressure of object and metadata management and provides a reference for the unified organization and online serving of similar scientific observation data.
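The organization described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the `Container` class, the function names, the CSV-like file layout, the monthly partition, and the sample file names are all assumptions made for illustration, and an in-memory dictionary stands in for both the object store and the columnar container format. The sketch shows only the idea of keeping original bytes and parsed records together, resolving files through an external index, and skipping containers by their time-range statistics.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical stand-in for one columnar container (one product, one month)."""
    raw_files: dict = field(default_factory=dict)  # file name -> original bytes
    records: list = field(default_factory=list)    # (file name, timestamp, value)
    t_min: str = ""                                # time-range statistics for pruning
    t_max: str = ""


def build_containers(files):
    """Group (product, file name, raw bytes) items by product and month.

    Assumes each file is non-empty, holds "ISO timestamp,value" lines,
    and is assigned to the month of its first record.
    """
    containers = defaultdict(Container)
    file_index = {}  # external metadata index: file name -> container key
    for product, name, raw in files:
        rows = []
        for line in raw.decode().splitlines():
            ts, value = line.split(",")
            rows.append((name, ts, float(value)))
        key = (product, rows[0][1][:7])  # e.g. ("magnetometer", "2020-01")
        c = containers[key]
        c.raw_files[name] = raw          # original content kept for trace-back
        c.records.extend(rows)           # parsed records kept for queries
        stamps = [r[1] for r in rows]
        c.t_min = min(stamps) if not c.t_min else min(c.t_min, *stamps)
        c.t_max = max(c.t_max, *stamps)
        file_index[name] = key
    return dict(containers), file_index


def get_file(containers, file_index, name):
    """File-level trace-back: index lookup, then return the stored bytes."""
    return containers[file_index[name]].raw_files[name]


def query_records(containers, product, start, end):
    """Record-level query that skips containers whose time range cannot match."""
    hits, scanned = [], 0
    for (prod, _month), c in containers.items():
        if prod != product or c.t_max < start or c.t_min > end:
            continue  # pruned from statistics alone, records never read
        scanned += 1
        hits.extend(r for r in c.records if start <= r[1] <= end)
    return hits, scanned


# Made-up sample data; the product and file names are illustrative only.
files = [
    ("magnetometer", "mag_20200101.txt",
     b"2020-01-01T00:00:00,1.5\n2020-01-01T00:01:00,1.7\n"),
    ("magnetometer", "mag_20200102.txt", b"2020-01-02T00:00:00,2.0\n"),
    ("magnetometer", "mag_20200201.txt", b"2020-02-01T00:00:00,3.1\n"),
]
containers, file_index = build_containers(files)
hits, scanned = query_records(
    containers, "magnetometer", "2020-01-01T00:00:30", "2020-01-02T23:59:59"
)
```

In this toy run the three files collapse into two containers, and the query reads only the January container because the February container's time range lies entirely outside the requested window. A real deployment would presumably rely on the per-block statistics of an actual columnar format rather than a single range per container.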