1/2016 - 2
Information Extraction Using Distant Supervision and Semantic Similarities
PARK, Y., KANG, S., SEO, J.
Author keywords
relation extraction, unsupervised learning, distant supervision, information extraction, natural language processing
About this article
Date of Publication: 2016-02-28
Volume 16, Issue 1, Year 2016, On page(s): 11 - 18
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2016.01002
Web of Science Accession Number: 000376995400002
SCOPUS ID: 84960113172
Abstract
Information extraction is one of the main research tasks in natural language processing and text mining: it extracts useful information from unstructured sentences. Its techniques include named entity recognition, relation extraction, and co-reference resolution. Among these, relation extraction is the task of extracting semantic relations between entities, such as personal and geographic names, in documents. It is an important research area with applications in knowledge base construction and question answering systems. This study presents relation extraction using distant supervision, a semi-supervised learning technique that has attracted attention in recent years because it reduces the manual work and cost that supervised learning requires. Specifically, the study proposes a method that improves distant supervision in two ways: it applies a clustering method to create the learning corpus, and it applies semantic analysis to relations that are difficult to identify with existing distant supervision. Comparison experiments over various semantic similarity methods identify the similarity calculation methods that are useful for relation extraction with distant supervision, and a large number of accurate relation triples can be extracted using the structural advantages of the proposed method and semantic similarity comparison.
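The general idea in the abstract, labeling sentences from a knowledge base and then using semantic similarity to sort out doubtful matches, can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say which similarity measures or clustering method were used, so the sketch substitutes a plain bag-of-words cosine similarity and omits clustering entirely. The toy knowledge base, the sentences, the seed words in `RELATION_SEEDS`, and the threshold are all hypothetical values chosen for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy knowledge base of (entity1, relation, entity2) triples.
KB = [("Barack Obama", "born_in", "Honolulu")]

# Hypothetical seed words that typically express each relation.
RELATION_SEEDS = {"born_in": ["was", "born", "in"]}

SENTENCES = [
    "Barack Obama was born in Honolulu in 1961.",
    "Barack Obama visited Honolulu last week.",
]


def distant_label(sentences, kb):
    """Distant supervision heuristic: any sentence that mentions both
    entities of a KB triple is labeled with that triple's relation."""
    labeled = []
    for e1, rel, e2 in kb:
        for s in sentences:
            if e1 in s and e2 in s:
                labeled.append((s, e1, rel, e2))
    return labeled


def context_words(sentence, e1, e2):
    """Lowercased words between the first mention of e1 and of e2."""
    start = sentence.index(e1) + len(e1)
    end = sentence.index(e2)
    return sentence[start:end].lower().split() if start <= end else []


def cosine(a, b):
    """Cosine similarity between two bag-of-words token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def filter_by_similarity(labeled, seeds, threshold=0.5):
    """Keep only labeled sentences whose context resembles the seed words
    of the assigned relation (threshold is an assumed value)."""
    kept = []
    for s, e1, rel, e2 in labeled:
        score = cosine(context_words(s, e1, e2), seeds[rel])
        if score >= threshold:
            kept.append((e1, rel, e2, s, score))
    return kept


labeled = distant_label(SENTENCES, KB)
kept = filter_by_similarity(labeled, RELATION_SEEDS)
for e1, rel, e2, sentence, score in kept:
    print(f"({e1}, {rel}, {e2})  score={score:.2f}  <- {sentence}")
```

Both sentences receive the `born_in` label from the knowledge base match alone, which shows the kind of wrong label that plain distant supervision can produce; the similarity filter then keeps only the sentence whose context actually expresses the relation. A real system would likely replace the bag-of-words comparison with a stronger semantic similarity measure.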
Faculty of Electrical Engineering and Computer Science
Stefan cel Mare University of Suceava, Romania
All rights reserved: Advances in Electrical and Computer Engineering is a registered trademark of the Stefan cel Mare University of Suceava. No part of this publication may be reproduced, stored in a retrieval system, photocopied, recorded or archived, without the written permission from the Editor. When authors submit their papers for publication, they agree that the copyright for their article be transferred to the Faculty of Electrical Engineering and Computer Science, Stefan cel Mare University of Suceava, Romania, if and only if the articles are accepted for publication. The copyright covers the exclusive rights to reproduce and distribute the article, including reprints and translations.