Assessing the impact of missing value mechanisms on anomaly detection in healthcare wearable data
Abstract
Remote health monitoring with wearable devices has transformed healthcare by enabling continuous observation and early intervention. However, these systems frequently suffer from missing data, which can introduce bias and impair clinical decision-making, and uncertainty about why values are missing further complicates analysis. This paper investigates how different missing data mechanisms, namely missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR), affect anomaly detection on healthcare wearable data. Using heart rate and step count data from patients with respiratory illnesses, we assess the performance of an anomaly detection method under varying missingness conditions. Our findings show that the more complex mechanisms (MAR and MNAR) significantly degrade detection performance, even at low missing rates, highlighting the importance of robust imputation strategies tailored to the nature of the missingness.
Keywords:
Wearable Data, Missing Values, Healthcare
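To make the three mechanisms named in the abstract concrete, the following is a minimal sketch of how missingness could be injected into synthetic heart rate data under MCAR, MAR, and MNAR. It is not the paper's experimental setup: the synthetic signals, the rank-based probability scheme, and the helper names (`mcar_mask`, `mar_mask`, `mnar_mask`, `rank_probability`) are all assumptions made for illustration, and the scheme assumes a target missing rate of at most 0.5.

```python
import numpy as np

def mcar_mask(n, rate, rng):
    """MCAR: every sample is dropped with the same probability."""
    return rng.random(n) < rate

def rank_probability(values, rate):
    """Map values to drop probabilities that grow with their rank.

    Ranks are scaled to [0, 1] and average 0.5, so 2 * rate * rank
    averages to `rate` (illustrative scheme, assumes rate <= 0.5).
    """
    ranks = values.argsort().argsort() / (len(values) - 1)
    return np.clip(2.0 * rate * ranks, 0.0, 1.0)

def mar_mask(covariate, rate, rng):
    """MAR: missingness in the target depends on another, observed variable."""
    return rng.random(len(covariate)) < rank_probability(covariate, rate)

def mnar_mask(target, rate, rng):
    """MNAR: missingness depends on the (unobserved) target value itself."""
    return rng.random(len(target)) < rank_probability(target, rate)

# Hypothetical synthetic signals standing in for wearable data.
rng = np.random.default_rng(0)
n = 5000
steps = rng.poisson(lam=40, size=n).astype(float)
heart_rate = 65 + 0.4 * steps + rng.normal(0, 5, size=n)

rate = 0.10
masks = {
    "MCAR": mcar_mask(n, rate, rng),
    "MAR": mar_mask(steps, rate, rng),        # driven by step count
    "MNAR": mnar_mask(heart_rate, rate, rng), # driven by heart rate itself
}

for name, mask in masks.items():
    hr_with_gaps = np.where(mask, np.nan, heart_rate)
    print(name, "missing rate:", round(mask.mean(), 3),
          "observed mean HR:", round(np.nanmean(hr_with_gaps), 2))
```

In this sketch all three masks remove roughly the same fraction of values, but only MCAR leaves the observed distribution unbiased. Under MAR and MNAR the high readings are preferentially lost, which is one plausible reason a detector could degrade even at low missing rates.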
References
Canali, S., Schiaffonati, V., and Aliverti, A. (2022). Challenges and recommendations for wearable devices in digital health: Data quality, interoperability, health equity, fairness. PLOS Digital Health, 1(10):e0000104.
Emmanuel, T., Maupong, T., Mpoeleng, D., Semong, T., Mphago, B., and Tabona, O. (2021). A survey on missing data in machine learning. Journal of Big Data, 8:1–37.
Getzen, E., Ungar, L., Mowery, D., Jiang, X., and Long, Q. (2023). Mining for equitable health: Assessing the impact of missing data in electronic health records. Journal of Biomedical Informatics, 139:104269.
Isgut, M., Gloster, L., Choi, K., Venugopalan, J., and Wang, M. D. (2022). Systematic review of advanced AI methods for improving healthcare data quality in post COVID-19 era. IEEE Reviews in Biomedical Engineering, 16:53–69.
Lima, A. S. and Sousa, E. (2024). Handling missing values in data streams: An overview. In Anais do XXXIX Simpósio Brasileiro de Bancos de Dados, pages 750–756, Florianópolis, SC, Brasil. SBC.
Lin, S., Wu, X., Martinez, G., and Chawla, N. V. (2020). Filling missing values on wearable-sensory time series data. In Proceedings of the 2020 SIAM International Conference on Data Mining, pages 46–54. SIAM.
Mangussi, A. D., Santos, M. S., Lopes, F. L., Pereira, R. C., Lorena, A. C., and Abreu, P. H. (2024). mdatagen: A Python library for generating missing data.
Mishra, T., Wang, M., Metwally, A. A., Bogu, G. K., Brooks, A. W., Bahmani, A., Alavi, A., Celli, A., Higgs, E., Dagan-Rosenfeld, O., et al. (2020). Pre-symptomatic detection of COVID-19 from smartwatch data. Nature Biomedical Engineering, 4(12):1208–1220.
Psychogyios, K., Ilias, L., Ntanos, C., and Askounis, D. (2023). Missing value imputation methods for electronic health records. IEEE Access, 11:21562–21574.
Ren, L., Wang, T., Seklouli, A. S., Zhang, H., and Bouras, A. (2023). A review on missing values for main challenges and methods. Information Systems, page 102268.
Santos, M. S., Pereira, R. C., Costa, A. F., Soares, J. P., Santos, J., and Abreu, P. H. (2019). Generating synthetic missing data: A review by missing mechanism. IEEE Access, 7:11651–11667.
Published
2025-09-29
How to Cite
S. LIMA, Afonso M. Assessing the impact of missing value mechanisms on anomaly detection in healthcare wearable data. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 40., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 781-787. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2025.247702.
