3/2018 - 15
Rule-Based Turkish Text Summarizer (RB-TTS)
BIRANT, C. C., AKTAS, O.
Author keywords
data processing, dictionaries, morphology, natural language processing, text processing
References keywords
turkish(10), text(6), language(5), summarization(4), information(4), extraction(4), evaluation(4)
About this article
Date of Publication: 2018-08-31
Volume 18, Issue 3, Year 2018, On page(s): 113 - 118
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2018.03015
Web of Science Accession Number: 000442420900015
SCOPUS ID: 85052145263
Abstract
The volume of data produced has increased exponentially with the digital revolution, and it continues to race toward the limits of the capacity of our computers and supercomputers. Automatic text summarization is one of the efforts to tame the bestial product of our daily data production, which has generated 90 percent of the data ever produced by humans in the last two years. To understand what a text is about, a summary is needed that is short enough not to compromise understandability, yet comprehensive enough to include the most important topics of that text. Numerous automatic text summarization tools aimed at achieving this goal use semantic relations, thesauri, and word frequency lists. This paper presents the development phases and evaluation results of a software tool called Rule Based Turkish Text Summarizer (RB-TTS). The average success rate of RB-TTS is analyzed both quantitatively, using ROUGE-N metrics, and qualitatively. In the qualitative analysis, five summaries obtained automatically from texts are evaluated by 10 Ph.D. students from the Dokuz Eylul University Department of Linguistics. The summaries generated by RB-TTS are compared with the summaries written by the authors of the corresponding texts, and were marked as close to them.
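To make the quantitative measure concrete, the following is a minimal sketch of how a ROUGE-N score can be computed between a generated summary and a reference summary. It is not the authors' evaluation code: the function names `ngrams` and `rouge_n` are hypothetical, tokenization is a plain whitespace split, and no Turkish-specific stemming or casing rules are applied, all of which are simplifying assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Illustrative ROUGE-N: clipped n-gram overlap between two texts.

    Assumption: whitespace tokenization and str.lower(), which is not
    Turkish-aware (for example, it maps "I" to "i", not to dotless "ı").
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Counter intersection keeps the minimum count of each shared n-gram,
    # so a repeated n-gram is only credited as often as the reference has it.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n(generated, reference, n=1))
    print(rouge_n(generated, reference, n=2))
```

For a morphologically rich language such as Turkish, a real evaluation would likely compare word roots rather than surface forms, since suffixation makes exact token matches rarer; this sketch leaves that step out.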
Faculty of Electrical Engineering and Computer Science
Stefan cel Mare University of Suceava, Romania
All rights reserved: Advances in Electrical and Computer Engineering is a registered trademark of the Stefan cel Mare University of Suceava. No part of this publication may be reproduced, stored in a retrieval system, photocopied, recorded or archived, without the written permission from the Editor. When authors submit their papers for publication, they agree that the copyright for their article be transferred to the Faculty of Electrical Engineering and Computer Science, Stefan cel Mare University of Suceava, Romania, if and only if the articles are accepted for publication. The copyright covers the exclusive rights to reproduce and distribute the article, including reprints and translations.