Citation: ZUO Qunchao, ZHANG Zhihui, SONG Yue. 2025. Era of big data, digitalization and intelligentization: New paradigm of the whole process and total factor quality control[J]. Geology in China, 52(3): 890-914. doi: 10.12029/gc20240919002
This paper is the result of mineral exploration engineering.
Current research on big data focuses almost exclusively on how to discover or reveal its hidden value, while systematic, in-depth, and practical research or solutions for improving the quality of big data construction are severely lacking. Yet the quality of big data construction is crucial both for successfully revealing that hidden value and for making and implementing scientific and accurate decisions.
First, the existing relevant research results are analyzed and summarized, including the quality control experience gained from the big data construction of China's National Mineral Resource Potential Evaluation (2006-2013), the entity data models for ore-searching prognosis in mineralization concentrating areas, and the basic framework of quality control models in the field of earth science. Then, drawing on big data thinking and digital and intelligent technologies, a digital and intelligent system of the whole process and total factor quality control for ore-searching prognosis and big data entity construction in mineralization concentrating areas is established. Finally, a new paradigm of the whole process and total factor quality control is proposed, which can be extended to support the quality control of big data asset construction in the field of earth science and beyond.
This study proposes a quality control theory and method for data division and granularity, establishes a digital and intelligent system of the whole process and total factor quality control for ore-searching prognosis and big data entity construction in mineralization concentrating areas, develops quality control software, and formulates standards and specifications for data quality checking and evaluation. Together these have efficiently supported the quality control work for the construction of big data entities for ore-searching prognosis in mineralization concentrating areas, including self check, mutual check, special check, supervisory check, field acceptance, initial review, final review, re-examination, and confirmation of acceptance.
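The sequence of quality control activities listed above can be pictured as a staged gate. The following is a minimal sketch only, not the authors' software: it assumes the activities run strictly in the order listed and that an entity stops at the first stage reporting a problem. The names `Stage`, `run_pipeline`, and `require_fields` are hypothetical and do not come from the paper.

```python
from enum import IntEnum
from typing import Callable, Dict, List, Optional


class Stage(IntEnum):
    """Quality control activities, in the order the abstract lists them."""
    SELF_CHECK = 1
    MUTUAL_CHECK = 2
    SPECIAL_CHECK = 3
    SUPERVISORY_CHECK = 4
    FIELD_ACCEPTANCE = 5
    INITIAL_REVIEW = 6
    FINAL_REVIEW = 7
    RE_EXAMINATION = 8
    ACCEPTANCE_CONFIRMATION = 9


# A check receives a data entity (a plain dict here) and returns a list of problems.
Check = Callable[[dict], List[str]]


def run_pipeline(entity: dict, checks: Dict[Stage, Check]) -> Optional[Stage]:
    """Run the stages in order; return the first failing stage, or None if all pass."""
    for stage in Stage:  # IntEnum iterates in definition order
        check = checks.get(stage)
        if check is not None and check(entity):
            return stage
    return None


def require_fields(*names: str) -> Check:
    """Hypothetical completeness check: report every required field that is missing."""
    return lambda entity: [n for n in names if n not in entity]
```

For instance, an entity missing a field required at final review would pass the earlier stages and be stopped at `Stage.FINAL_REVIEW`. Stages with no registered check are treated as passed in this sketch.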
The digital and intelligent system of the whole process and total factor quality control for ore-searching prognosis and big data entity construction in mineralization concentrating areas is original, practical, efficient, and universal, and can be directly extended to support the quality control of big data asset construction in the field of earth science and beyond. The proposed new paradigm of the whole process and total factor quality control is also scientific, effective, and universally applicable. The proposed concept of an ontology constraint for data models has universal and practical significance: the definition of a data model should meet the requirements of conceptualization, sharing, explicitness, and formalization. The constraint applies to two-dimensional or high-dimensional map-type data entities, as well as to simple (e.g. data sheet) or complex (e.g. relational database) table-type data entities.
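One possible reading of the four ontology-constraint requirements can be sketched as a checklist over a data model definition. This is an illustrative assumption, not the paper's method: the abstract names the requirements but does not say how they are tested, so the proxy used for each one below, the `ALLOWED_TYPES` vocabulary, and the class and function names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical vocabulary of machine-readable field types.
ALLOWED_TYPES = {"int", "float", "str", "date", "geometry"}


@dataclass
class FieldDef:
    name: str
    dtype: str       # machine-readable type (proxy for formalization)
    definition: str  # human-readable meaning (proxy for explicitness)


@dataclass
class DataModel:
    concept: str     # real-world concept being modelled (proxy for conceptualization)
    vocabulary: str  # agreed standard or dictionary the terms come from (proxy for sharing)
    kind: str        # "map" or "table"
    fields: List[FieldDef] = field(default_factory=list)


def check_ontology_constraints(model: DataModel) -> Dict[str, bool]:
    """Return one pass/fail flag per requirement named in the text."""
    has_fields = bool(model.fields)
    return {
        "conceptualization": bool(model.concept.strip()),
        "sharing": bool(model.vocabulary.strip()),
        "explicitness": has_fields
        and all(f.definition.strip() for f in model.fields),
        "formalization": has_fields
        and all(f.dtype in ALLOWED_TYPES for f in model.fields),
    }
```

The same structure covers both entity kinds mentioned in the text: a map-type model would carry a `geometry` field, while a table-type model would carry only attribute fields.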
Integrated process of ore-searching prognosis and big data entity construction in mineralization concentrating areas
Digital and intelligent system of the whole process and total factor quality check and evaluation for ore-searching prognosis and big data entity construction in mineralization concentrating areas
Final review quality control activity for project database construction of ore-searching prognosis in mineralization concentrating areas
Diagram of login interface and functional modules of GeoDQCBM software
Construction method and hierarchical structure of the quality control model database for ore-searching prognosis in mineralization concentrating areas
Division and reduction scheme of big data entities for ore-searching prognosis in mineralization concentrating areas
Quality models of map-type data
Quality models of table-type data