Sensor Validation for Indoor Air Quality using Machine Learning

  • Vagner Seibert Universidade Federal de Pelotas
  • Ricardo Araújo Universidade Federal de Pelotas
  • Richard McElligott Universidade Federal de Pelotas


Guaranteeing high indoor air quality is an increasingly important task. Sensors measure pollutants in the air, making it possible to monitor and control air quality. However, all sensors are susceptible to failures, either permanent or transitory, that can yield incorrect readings. Automatically detecting such faulty readings is therefore crucial to guaranteeing sensor reliability. In this paper, we evaluate three Machine Learning (ML) algorithms on the task of classifying a single sensor reading as faulty or not, and compare them to standard statistical approaches. We show that all tested ML methods (Multi-layer Perceptron, K-Nearest Neighbor and Random Forest) outperform their statistical counterparts, both by allowing better separation boundaries and by making use of contextual information. We further show that this result does not depend on the amount of data, and that ML methods continue to improve as more data is made available.
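To illustrate why contextual information can matter for single-reading fault classification, the following is a minimal sketch on synthetic data. It is not the authors' experimental setup: the data generator, the offset-fault model, the global z-score baseline (standing in for a "standard statistical approach") and the hand-written K-Nearest Neighbor classifier are all assumptions made for illustration, and every numeric constant is made up.

```python
import math
import random
import statistics


def make_readings(n, rng, fault_rate=0.2):
    """Synthetic (context, value, is_faulty) triples; all constants are hypothetical."""
    rows = []
    for _ in range(n):
        context = rng.random()  # e.g. a normalized occupancy level in [0, 1)
        value = 400 + 600 * context + rng.gauss(0, 20)  # healthy CO2-like reading
        faulty = rng.random() < fault_rate
        if faulty:
            value += rng.choice((-200, 200))  # assumed offset fault
        rows.append((context, value, faulty))
    return rows


def zscore_predict(train, test, threshold=3.0):
    """Baseline: flag a reading whose global z-score exceeds the threshold."""
    values = [v for _, v, _ in train]
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [abs(v - mu) / sigma > threshold for _, v, _ in test]


def knn_predict(train, test, k=5):
    """K-Nearest Neighbor majority vote on standardized (context, value) pairs."""
    c_mu = statistics.mean(c for c, _, _ in train)
    c_sd = statistics.pstdev(c for c, _, _ in train)
    v_mu = statistics.mean(v for _, v, _ in train)
    v_sd = statistics.pstdev(v for _, v, _ in train)

    def scale(c, v):
        return ((c - c_mu) / c_sd, (v - v_mu) / v_sd)

    points = [(scale(c, v), f) for c, v, f in train]
    preds = []
    for c, v, _ in test:
        query = scale(c, v)
        nearest = sorted(points, key=lambda p: math.dist(p[0], query))[:k]
        votes = sum(f for _, f in nearest)  # number of faulty neighbors
        preds.append(2 * votes > k)
    return preds


def accuracy(preds, rows):
    return sum(p == f for p, (_, _, f) in zip(preds, rows)) / len(rows)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = make_readings(400, rng)
test = make_readings(200, rng)

z_acc = accuracy(zscore_predict(train, test), test)
knn_acc = accuracy(knn_predict(train, test), test)
print(f"global z-score accuracy: {z_acc:.3f}")
print(f"k-NN with context accuracy: {knn_acc:.3f}")
```

In this toy setting the healthy range of a reading shifts with the context, so a single global threshold cannot separate an offset fault from a legitimate high or low reading, whereas a classifier that sees the context alongside the value can. Accuracy is used here only for brevity; with imbalanced fault rates a metric such as the area under the ROC curve would be more informative.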

Keywords: Machine Learning, Data Science


How to Cite

SEIBERT, Vagner; ARAÚJO, Ricardo; MCELLIGOTT, Richard. Sensor Validation for Indoor Air Quality using Machine Learning. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 17., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 730-739. ISSN 2763-9061. DOI: