Reliability and Validity of Learning Performance Assessment on Machine Learning in Basic Education

  • Marcelo Fernando Rauber, Federal University of Santa Catarina / Federal Institute of Santa Catarina
  • Abisague Belém Garcia, Federal University of Santa Catarina
  • Christiane Gresse von Wangenheim, Federal University of Santa Catarina, https://orcid.org/0000-0002-6566-1606
  • Adriano F. Borgatto, Federal University of Santa Catarina
  • Ramon Mayor Martins, Federal Institute of Santa Catarina
  • Jean C. R. Hauck, Federal University of Santa Catarina, https://orcid.org/0000-0001-6550-9092

Abstract


As Machine Learning (ML) is increasingly taught in K-12, the need to assess that learning also arises. To ensure a reliable and valid assessment, we evaluate a rubric for assessing the learning of the application of ML concepts, based on the learning outcomes of 108 middle and high school students. Both the reliability analysis (Omega coefficient of 0.646) and the analysis of the convergent validity of the construct through the polychoric correlation matrix indicate the possibility of two dimensions. Although the results point to the need for a revision with a larger sample, they can already support the application of the rubric.
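
To illustrate what an Omega coefficient summarizes, the sketch below computes one common form, omega total for a single-factor model with standardized loadings and uncorrelated errors: the squared sum of the loadings divided by that quantity plus the sum of the unique variances. The function name and the loadings are hypothetical and are not taken from this study. The paper may have used a different omega variant or estimation procedure.

```python
def omega_total(loadings):
    """Omega total for a one-factor model with standardized loadings.

    Assumes uncorrelated errors, so each item's unique variance
    is 1 - loading**2.
    """
    common = sum(loadings) ** 2                   # variance due to the factor
    unique = sum(1.0 - l * l for l in loadings)   # summed error variances
    return common / (common + unique)


# Hypothetical loadings for four rubric items (illustration only).
hypothetical_loadings = [0.5, 0.5, 0.5, 0.5]
print(round(omega_total(hypothetical_loadings), 3))
```

Higher loadings shrink the unique variances and push the coefficient toward 1, which is why weakly related items, or items that split across two dimensions, tend to yield a moderate value.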

Keywords: Machine Learning, Basic Education, Teaching, Learning Performance Assessment

Published
2022-11-16
RAUBER, Marcelo Fernando; GARCIA, Abisague Belém; GRESSE VON WANGENHEIM, Christiane; BORGATTO, Adriano F.; MARTINS, Ramon Mayor; HAUCK, Jean C. R. Reliability and Validity of Learning Performance Assessment on Machine Learning in Basic Education. In: BRAZILIAN SYMPOSIUM ON COMPUTERS IN EDUCATION (SBIE), 33., 2022, Manaus. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 1255-1267. DOI: https://doi.org/10.5753/sbie.2022.224688.