WANQA: An Approach for Identifying New Unanswerable Questions in Q&A Communities

  • Lucas V. Knochenhauer, Federal University of Santa Catarina (UFSC)
  • Carina F. Dorneles, Federal University of Santa Catarina (UFSC)
  • Leandro K. Wives, Federal University of Santa Catarina (UFSC)

Abstract


Large knowledge repositories are available on the web, and Question and Answer Communities (CQAs) are among the most collaborative. Every day, their users post a large volume of questions, and a large share of them receive no answers and thus become useless content. Previous work on this problem depends on characteristics specific to each community. This article proposes a classification-based approach that yields a model able to classify whether a new question is answerable or not. It uses only features available in most CQAs. Experiments with data from different CQAs show that the proposal fulfills its goals.
Keywords: CQA communities, unanswerable questions, classification
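
To make the idea of classifying a new question from community-independent features concrete, the following is a minimal sketch and not the method of the paper. The abstract does not state which features or which classifier WANQA uses, so everything here is an assumption chosen for illustration: the four features in `extract_features`, the use of logistic regression trained by gradient descent, and the toy questions, which are invented and carry no experimental meaning.

```python
import math

def extract_features(title, body, tags):
    """Hypothetical features that most CQAs expose (NOT the paper's feature set)."""
    return [
        len(title.split()) / 20.0,                     # scaled title length in words
        len(body.split()) / 200.0,                     # scaled body length in words
        len(tags) / 5.0,                               # scaled number of tags
        1.0 if title.strip().endswith("?") else 0.0,   # title phrased as a question
    ]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(weights, bias, x):
    """Estimated probability that a question is answerable."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(samples, labels, epochs=2000, lr=0.5):
    """Fit logistic regression with plain batch gradient descent."""
    n, m = len(samples[0]), len(samples)
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(samples, labels):
            err = score(weights, bias, x) - y
            for i in range(n):
                grad_w[i] += err * x[i]
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

def predict_answerable(weights, bias, x):
    return score(weights, bias, x) >= 0.5

# Invented toy questions: (title, body, tags, label); label 1 = received an answer.
TOY_QUESTIONS = [
    ("How do I reverse a list in Python?",
     "I have a list of integers and want a reversed copy without mutating it.",
     ["python", "list"], 1),
    ("Why does my SQL join return duplicates?",
     "Joining orders to customers yields repeated rows although ids are unique.",
     ["sql", "join"], 1),
    ("help", "not working", [], 0),
    ("urgent problem", "please fix", [], 0),
]

features = [extract_features(t, b, tags) for t, b, tags, _ in TOY_QUESTIONS]
labels = [y for _, _, _, y in TOY_QUESTIONS]
weights, bias = train(features, labels)

# Classify a new, unseen question before it is posted.
new_question = extract_features("broken", "fix it", [])
print("answerable:", predict_answerable(weights, bias, new_question))
```

Because the features depend only on the title, body, and tags, the same sketch could in principle be trained on data from any community that exposes those fields, which is the portability property the abstract emphasizes.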

Published
2018-08-25
KNOCHENHAUER, Lucas V.; DORNELES, Carina F.; WIVES, Leandro K. WANQA: An Approach for Identifying New Unanswerable Questions in Q&A Communities. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 33., 2018, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 1-12. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2018.22214.