Citation: HUANG Xusheng, ZHU Yueqin, FU Lijun, LIU Yujiang, TANG Keke, LI Jin. 2021. Research on a geological entity relation extraction model for gold mine based on BERT. Journal of Geomechanics, 27(3): 391-399. doi: 10.12090/j.issn.1006-6616.2021.27.03.035
Intelligent identification of entity relations is an important way to improve literature mining, analysis, and knowledge extraction for gold mines. This study addresses two core issues that currently limit entity relation extraction for gold mines, namely complex entity relations and scarce manual annotation, and proposes a remotely (distantly) supervised relation extraction model based on BERT (Bidirectional Encoder Representations from Transformers). The model improves the accuracy of relation extraction by optimizing the modules for geological data encoding, geological classification, and geological entity filtering. Its effectiveness is verified in an entity relation extraction experiment on 290,489 pieces of gold ore literature.
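The remote (distant) supervision idea named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it only shows the general labelling assumption that any sentence mentioning both entities of a known triple is taken to express that triple's relation. The triples, relation names, sentences, and the function `distant_label` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical knowledge-base triples: (head entity, relation, tail entity).
# The relation names are illustrative and not taken from the paper's ontology.
KB_TRIPLES = [
    ("Jiaojia gold deposit", "located_in", "Jiaodong Peninsula"),
    ("gold", "hosted_in", "quartz vein"),
]


def distant_label(sentences, triples):
    """Group sentences into bags keyed by (head, relation, tail).

    A sentence joins a bag when it contains both entity strings of a triple.
    No check is made that the sentence actually expresses the relation,
    which is the source of label noise in distant supervision.
    """
    bags = defaultdict(list)
    for sentence in sentences:
        for head, relation, tail in triples:
            if head in sentence and tail in sentence:
                bags[(head, relation, tail)].append(sentence)
    return dict(bags)


SENTENCES = [
    "The Jiaojia gold deposit lies in the northwest of the Jiaodong Peninsula.",
    "Most gold occurs in quartz vein systems.",
    "No gold was found in the quartz vein.",  # matched anyway: a noisy label
    "Exploration continued in the following year.",  # no entity pair: skipped
]

if __name__ == "__main__":
    for triple, bag in distant_label(SENTENCES, KB_TRIPLES).items():
        print(triple, len(bag))
```

The third sentence receives the `hosted_in` label although it states the opposite. Wrongly labelled sentences of this kind are the usual motivation for a filtering step in distantly supervised models.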
Framework of remote supervision
Remotely supervised relation extraction model
Ontology diagram
Categories of entity relations
Precision-recall (PR) curves of each model on the NYT dataset
Precision-recall (PR) curves of each model on the geological dataset
Extraction results of the BERT model