Citation: WEN Min, YUE Yi, ZHANG Huaidong, WANG Xianghong, SHI Yan, LIU Rongmei, SUN Hanrui. 2024. A framework and key technologies for national geological survey management data integration and analysis for decision support. Geological Bulletin of China, 43(7): 1221-1232. doi: 10.12097/gbc.2022.11.004
Various information systems have been built for the national geological survey organization, and they have generated massive multi-source, heterogeneous management data. Effective integration and analysis of these data are urgently needed for the collaborative and intelligent management of the national geological survey. This paper proposes a framework based on big data, GIS, and data mining technologies. The related key technologies include automatic and dynamic data integration, hybrid data management based on Hadoop and a "data-lake-warehouse" architecture, and a decision support model for geological survey management. On this basis, the National Geological Survey Management Big Data System was constructed. The system has automatically and dynamically integrated data from 24 different sources, organized more than 150 million records and 200 thousand documents in a single system, and supported management decisions through data and analysis services. Its application has shown that it can solve the problem of data integration for decision-making support and that it has improved the management efficiency of the national geological survey.
Diagram of technical process
Overall technical architecture
Overall application architecture
Technology flowchart of data automation dynamic integration and governance
Universal data model of geological survey management
Technical framework of data analysis for decision support
Technology roadmap of multidimensional dynamic report
Flow chart of field work diagnostic evaluation model
Geological survey budget execution prediction and early warning
Data middle platform of geological survey management
Data center of geological survey management
Application system of geological survey management big data