Exploring Machine Learning for Early Autism Spectrum Disorder Prediction Using Umbilical Cord Blood Gene Expression

  • Laura G. Speggiorin Universidade Federal do Rio Grande do Sul (UFRGS) / Hospital de Clínicas de Porto Alegre (HCPA) https://orcid.org/0009-0005-4555-5485
  • Thayne W. Kowalski Hospital de Clínicas de Porto Alegre (HCPA) / Universidade Federal do Rio Grande do Sul (UFRGS)
  • Mariana Recamonde-Mendoza Universidade Federal do Rio Grande do Sul (UFRGS) / Hospital de Clínicas de Porto Alegre (HCPA)

Abstract


This study presents a proof-of-concept machine learning (ML) model for early Autism Spectrum Disorder (ASD) prediction using transcriptomic data from umbilical cord blood. We analyzed 224 samples (53 ASD, 80 with non-typical development [Non-TD], 91 with typical development [TD]) from high-risk cohorts, proposing a two-step classification pipeline based on eight distinct algorithms and ensemble approaches. The first model (TD vs. Non-TD/ASD) achieved an F1.5-score of 0.89 and recall of 1.0; the second model (ASD vs. Non-TD) yielded 75% accuracy and an F1.5-score of 0.54. Results suggest subtle, yet detectable, transcriptomic signals in perinatal blood that may support early ASD risk stratification, warranting further investigation in larger cohorts.
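The two quantities the abstract relies on, the F1.5-score and the two-step decision flow, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes the F1.5-score is the standard F-beta measure with beta = 1.5, which weights recall above precision, and it assumes that samples flagged by the first model are passed on to the second. The names `f_beta`, `two_step_predict`, `is_non_typical`, and `is_asd` are hypothetical, and the counts in the usage line are made up for illustration rather than taken from the study.

```python
from typing import Callable, Sequence


def f_beta(tp: int, fp: int, fn: int, beta: float = 1.5) -> float:
    """F-beta score from confusion counts.

    With beta > 1, false negatives (missed cases) are penalized more
    heavily than false positives, so recall dominates the score.
    """
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def two_step_predict(
    sample: Sequence[float],
    is_non_typical: Callable[[Sequence[float]], bool],  # step 1: TD vs. Non-TD/ASD
    is_asd: Callable[[Sequence[float]], bool],          # step 2: ASD vs. Non-TD
) -> str:
    """Assumed cascade: only samples flagged in step 1 reach step 2."""
    if not is_non_typical(sample):
        return "TD"
    return "ASD" if is_asd(sample) else "Non-TD"


# Illustrative counts only: 8 true positives, 2 false positives, 0 false negatives.
# Recall is 1.0 and precision is 0.8, so the score is 26/28.
print(round(f_beta(tp=8, fp=2, fn=0), 3))
```

Under this reading, a first-stage model with perfect recall and imperfect precision still scores highly, because the beta = 1.5 weighting favors not missing any Non-TD/ASD sample over avoiding false alarms.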

Keywords: machine learning. autism spectrum disorder. ensemble. bioinformatics. gene expression. transcriptomics. neurodevelopment

Published
29/09/2025
SPEGGIORIN, Laura G.; KOWALSKI, Thayne W.; RECAMONDE-MENDOZA, Mariana. Exploring Machine Learning for Early Autism Spectrum Disorder Prediction Using Umbilical Cord Blood Gene Expression. In: SIMPÓSIO BRASILEIRO DE BIOINFORMÁTICA (BSB), 18., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 84-95. ISSN 2316-1248. DOI: https://doi.org/10.5753/bsb.2025.15153.