Answer features extraction from StackOverflow - An Analysis of StackExchange Questions and Answers

  • Jardel Batista Gonçalves UFSM
  • Rafael Teodósio Pereira UFSM
  • Simone Regina Ceolin UFSM
  • Renato Preigschadt de Azevedo UFSM


Question and Answer (Q&A) sites (e.g., StackExchange (SE) and StackOverflow (SO)) provide a platform where users can ask questions on specialized topics and get feedback from other users who may have knowledge of the subject. With more than 2.9 billion answers submitted in 2017, finding the best answer is a challenge. In this paper we present preliminary findings based on an analysis of data from one Q&A site, StackOverflow, part of the StackExchange network, concentrating on discovering characteristics present in the answers to questions that have the Python tag. Identifying characteristics of the preferred answers can give us a way to classify which answer has a higher probability of being chosen as the accepted answer. We used a quantitative approach to analyze the data and to test whether the hypothesized characteristics hold. We analyzed 5 features and their representativeness: scope awareness, explained code, presence of code, length, and time elapsed since the question was asked. Our findings show which features are relevant in the answers, based on the extracted patterns. The exploratory study presented in this paper could improve the understanding of the characteristics of preferred answers.
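As an illustration only, three of the five features (presence of code, length, and time elapsed since the question) could be computed along the lines of the sketch below. It is not the authors' implementation. It assumes the answer body is available as HTML with code wrapped in `<pre>` blocks, that timestamps are ISO 8601 strings, and that length is measured in words of prose outside code blocks; the function name `extract_features` and the returned keys are hypothetical. Scope awareness and explained code are left out because the abstract does not define how they were measured.

```python
import re
from datetime import datetime
from html import unescape

# Assumption: code in an answer body is wrapped in <pre>...</pre> blocks.
CODE_BLOCK = re.compile(r"<pre\b[^>]*>.*?</pre>", re.DOTALL | re.IGNORECASE)
TAG = re.compile(r"<[^>]+>")


def extract_features(answer_html, question_created, answer_created):
    """Return a dict with three illustrative answer features.

    The feature definitions are assumptions, not taken from the paper:
    - has_code: at least one <pre> block is present
    - length_words: number of words in the prose outside code blocks
    - minutes_after_question: delay between question and answer
    """
    code_blocks = CODE_BLOCK.findall(answer_html)

    # Drop code blocks, strip remaining tags, then decode HTML entities.
    prose_html = CODE_BLOCK.sub(" ", answer_html)
    prose = unescape(TAG.sub(" ", prose_html))

    asked = datetime.fromisoformat(question_created)
    answered = datetime.fromisoformat(answer_created)

    return {
        "has_code": bool(code_blocks),
        "length_words": len(prose.split()),
        "minutes_after_question": (answered - asked).total_seconds() / 60,
    }


if __name__ == "__main__":
    example = (
        "<p>Use a list comprehension:</p>"
        "<pre><code>[x * 2 for x in xs]</code></pre>"
    )
    print(extract_features(example, "2017-03-01T10:00:00", "2017-03-01T10:12:30"))
```

Comparing these values between accepted and non-accepted answers would be one way to carry out the kind of quantitative comparison the abstract describes.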


GONÇALVES, Jardel Batista; PEREIRA, Rafael Teodósio; CEOLIN, Simone Regina; AZEVEDO, Renato Preigschadt de. Answer features extraction from StackOverflow - An Analysis of StackExchange Questions and Answers. In: ESCOLA REGIONAL DE REDES DE COMPUTADORES (ERRC), 20., 2023, Porto Alegre/RS.