Semiautomatic Generation of Reference Values for Identifying Obstructions in Continuous Casting
Abstract
Clogging of submerged entry valves in the continuous casting process increases the frequency of operational interruptions. These interruptions raise operating costs and can cause a variety of quality problems. The absence of data sets labeled for clogging has prevented the application of machine learning methods to predict this anomaly. This work sought to develop semiautomatic techniques for labeling reference data sets. As a first step, the time series were clustered using the DBSCAN algorithm. The resulting clusters were then used as seeds for a semi-supervised label propagation process. This process produced a database that was validated by specialists, who judged 100% of the data labeled as obstructions to be correctly labeled.
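The two-stage idea described in the abstract (density-based clusters used as seeds, followed by label propagation to the remaining points) can be illustrated with a minimal sketch. It is not the authors' implementation: the use of scikit-learn, the function name `propagate_from_dbscan_seeds`, the synthetic two-dimensional features standing in for time-series features, and all parameter values (`eps`, `min_samples`, `gamma`) are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.semi_supervised import LabelPropagation


def propagate_from_dbscan_seeds(features, eps=0.4, min_samples=5, gamma=2.0):
    """Cluster with DBSCAN, then spread the cluster labels to the noise points.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    # Stage 1: DBSCAN assigns a cluster id to dense points and -1 to noise.
    seeds = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(features)

    # Stage 2: scikit-learn's LabelPropagation also treats -1 as "unlabeled",
    # so the DBSCAN output can be passed in directly as the seed labels.
    model = LabelPropagation(kernel="rbf", gamma=gamma)
    model.fit(features, seeds)

    # transduction_ holds one label per input row, including former noise.
    return seeds, model.transduction_


# Synthetic stand-in data (assumption): two dense groups plus two stragglers
# that are too isolated for DBSCAN to assign to either cluster.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.1, size=(60, 2))
group_b = rng.normal(loc=3.0, scale=0.1, size=(60, 2))
stragglers = np.array([[0.8, 0.8], [2.2, 2.2]])
features = np.vstack([group_a, group_b, stragglers])

seeds, labels = propagate_from_dbscan_seeds(features)
print("unlabeled after DBSCAN:", int((seeds == -1).sum()))
print("unlabeled after propagation:", int((labels == -1).sum()))
```

In this sketch the seed labels are kept fixed and only the points DBSCAN left as noise receive a propagated label. In practice, the propagated labels would still need expert review, as the abstract describes.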
