Food Data Analysis using Multidimensional Visualizations based on Point Placement
Abstract
Food data comprise records of the nutrients, ingredients, and amounts of vitamins and minerals found in foods. The wide variety of food products stored in large datasets makes traditional analysis tasks time-consuming, and often infeasible, when dietitians and related professionals conduct them manually. This paper describes a method for visualizing food data using point placement strategies to support specialists in identifying similar food products that can replace one another in specific diets. The proposed method generates a structured representation of food data that serves as input to state-of-the-art and recent visualization techniques, such as PCA, t-SNE, UMAP, and TriMap. Experiments were conducted to assess the quality of the visualizations, and the results showed that the nonlinear visualizations achieved satisfactory discriminability for some food categories and better preservation of the data patterns. A case study based on a visual exploration process was also conducted, in which the specialist successfully found substitute food products while planning a vegan diet.
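The idea behind the substitution task can be illustrated with a minimal sketch. It is not the method proposed in the paper: it skips the point placement step entirely and searches for the nearest neighbor directly in a standardized feature space, which is the kind of similarity that projections such as t-SNE or UMAP try to preserve in two dimensions. The food names, nutrient columns, and values below are invented for illustration, and the helper names `standardize` and `nearest` are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical toy records; the values are invented for illustration only.
# Columns: protein (g), fat (g), carbohydrate (g), calcium (mg) per 100 g.
FOODS = {
    "cow_milk": (3.3, 3.3, 4.8, 120.0),
    "soy_drink": (3.0, 1.8, 4.0, 110.0),
    "orange_juice": (0.7, 0.2, 10.4, 11.0),
    "cheddar": (25.0, 33.0, 1.3, 720.0),
    "firm_tofu": (15.0, 8.0, 2.0, 350.0),
}

# Assumed subset of products that are acceptable in a vegan diet.
VEGAN = ["soy_drink", "orange_juice", "firm_tofu"]


def standardize(table):
    """Z-score every nutrient column so that no unit dominates the distance."""
    names = list(table)
    columns = list(zip(*(table[name] for name in names)))
    # Fall back to 1.0 when a column is constant to avoid division by zero.
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    return {
        name: tuple((v - m) / s for v, (m, s) in zip(table[name], stats))
        for name in names
    }


def nearest(query, vectors, candidates):
    """Return the candidate closest to `query` by Euclidean distance."""
    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    pool = [c for c in candidates if c != query]
    return min(pool, key=lambda c: dist(vectors[query], vectors[c]))


if __name__ == "__main__":
    vectors = standardize(FOODS)
    for product in ("cow_milk", "cheddar"):
        print(product, "->", nearest(product, vectors, VEGAN))
```

In a visual exploration setting, the same standardized vectors would be passed to a projection technique, and the specialist would inspect the neighborhood of a product in the resulting scatter plot instead of reading a single nearest neighbor.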