Answer features extraction from StackOverflow - An Analysis of StackExchange Questions and Answers

  • Jardel Batista Gonçalves UFSM
  • Rafael Teodósio Pereira UFSM
  • Simone Regina Ceolin UFSM
  • Renato Preigschadt de Azevedo UFSM


Question and Answer (Q&A) sites (e.g., StackExchange (SE) and StackOverflow (SO)) provide a platform where users can ask questions on specialized topics and get feedback from other users who may have knowledge of the subject. With more than 2.9 billion answers submitted in 2017, finding the best answer is a challenge. In this paper we present preliminary findings based on an analysis of data from StackOverflow, a StackExchange Q&A site, concentrating on discovering characteristics present in answers to questions that have the Python tag. Identifying the characteristics of preferred answers can give us a way to classify which answer has a higher probability of being chosen as the accepted answer. We used a quantitative approach to analyze the data and check whether certain characteristics hold. We analyzed 5 features and their representativeness: scope awareness, explained code, presence of code, length, and time elapsed since the question was asked. Our findings show which features are relevant in the answers, based on the extracted patterns. The exploratory study presented in this paper could improve the understanding of the characteristics of preferred answers.
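
As an illustration only, three of the five features named above (presence of code, length, and time elapsed since the question) could be computed roughly as in the sketch below. This is not the authors' implementation: the function name `extract_features`, the use of word count as the length measure, and the assumption that answer bodies are HTML with code wrapped in `<pre><code>` blocks are all hypothetical choices made for the example. Scope awareness and explained code are left out because their definitions are not given here.

```python
import re
from datetime import datetime

# Assumption: answer bodies are HTML and code snippets appear as <pre><code>...</code></pre>.
CODE_BLOCK = re.compile(
    r"<pre[^>]*>\s*<code[^>]*>.*?</code>\s*</pre>",
    re.DOTALL | re.IGNORECASE,
)
HTML_TAG = re.compile(r"<[^>]+>")


def extract_features(answer_html, question_time, answer_time):
    """Return a dict with three illustrative answer features (hypothetical helper)."""
    code_blocks = CODE_BLOCK.findall(answer_html)
    # Drop code blocks first, then strip the remaining tags to keep only prose.
    prose = HTML_TAG.sub(" ", CODE_BLOCK.sub(" ", answer_html))
    elapsed_minutes = (answer_time - question_time).total_seconds() / 60
    return {
        "has_code": bool(code_blocks),               # presence of code
        "length_words": len(prose.split()),          # length, measured in prose words
        "minutes_after_question": elapsed_minutes,   # time elapsed since the question
    }


if __name__ == "__main__":
    asked = datetime(2017, 1, 1, 12, 0)
    answered = datetime(2017, 1, 1, 12, 30)
    body = "<p>Use a list comprehension:</p><pre><code>[x*2 for x in xs]</code></pre>"
    print(extract_features(body, asked, answered))
```

Feature dictionaries of this kind could then be compared between accepted and non-accepted answers; how the paper actually measures each feature may differ.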


GONÇALVES, Jardel Batista; PEREIRA, Rafael Teodósio; CEOLIN, Simone Regina; AZEVEDO, Renato Preigschadt de. Answer features extraction from StackOverflow - An Analysis of StackExchange Questions and Answers. In: ESCOLA REGIONAL DE REDES DE COMPUTADORES (ERRC), 20., 2023, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 91-96. DOI: