A Genetic Fuzzy Automatic Text Summarizer

Daniel Leite (UFSCar); Lucia H. M. Rino (UFSCar)

Abstract

In this paper we report on a fuzzy ranking system that selects sentences for extractive summarization. The fuzzy knowledge base was generated automatically by a genetic algorithm, which used a corpus of newswire texts and their corresponding manual summaries to produce fuzzy classification rules. The ROUGE informativeness measure was adopted as the fitness function of the genetic algorithm.
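As an illustration of how an informativeness measure can serve as a fitness function, the sketch below scores a candidate extract against a manual summary. It assumes ROUGE-1 recall (clipped unigram overlap) and whitespace tokenization; the abstract does not state which ROUGE variant was used, and the names `rouge1_recall` and `fitness` are hypothetical, not taken from the paper.

```python
from collections import Counter
from typing import Sequence


def rouge1_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams that also occur in the candidate.

    Counts are clipped, so a repeated reference word is only matched as
    many times as it appears in the candidate.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    return overlap / sum(ref.values())


def fitness(selected_sentences: Sequence[str], manual_summary: str) -> float:
    """Assumed fitness of one rule base: the score of the extract it selects."""
    return rouge1_recall(" ".join(selected_sentences), manual_summary)


# Three of the six reference tokens are recovered by the extract.
print(fitness(["the cat sat"], "the cat sat on the mat"))  # 0.5
```

Under this assumption, a genetic algorithm would evaluate each candidate rule base by running it over the training texts and averaging the fitness of the extracts it produces.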
