Retail product descriptions standardization using NER

Abstract


Product descriptions from retail establishments are a valuable input to market analysis, but they are usually poorly structured, non-standardized, and vary widely for the same product. This article proposes using natural language processing, specifically Named Entity Recognition (NER), to automatically generate standardized descriptions from raw retail product descriptions. The trained model proved adequate for extracting the characteristic information of new products launched on the market and, consequently, for constructing their standardized descriptions.

Keywords: Natural Language Processing, Named Entity Recognition, Product Description
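
The idea in the abstract, recognizing entities in a raw description and reassembling them in a fixed order, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the label set (PRODUCT, BRAND, FLAVOR, SIZE), the sample descriptions, and above all the tagger, which is a toy lookup table and regular expression standing in for a trained NER model. It is not the model or the entity schema used in the paper.

```python
import re

# Hypothetical entity schema and output order; the paper's label set is not given here.
FIELD_ORDER = ["PRODUCT", "BRAND", "FLAVOR", "SIZE"]

# Toy gazetteer standing in for a trained NER model (assumption for illustration).
GAZETTEER = {
    "choc": ("PRODUCT", "Chocolate"),
    "chocolate": ("PRODUCT", "Chocolate"),
    "acme": ("BRAND", "Acme"),
    "milk": ("FLAVOR", "Milk"),
    "mlk": ("FLAVOR", "Milk"),
}

# Matches size tokens such as "90g" or "1,5l".
SIZE_PATTERN = re.compile(r"^(\d+(?:[.,]\d+)?)(g|kg|ml|l)$")


def tag_tokens(description):
    """Return (label, normalized_value) pairs for the tokens that are recognized."""
    entities = []
    for token in description.lower().split():
        size = SIZE_PATTERN.match(token)
        if size:
            number = size.group(1).replace(",", ".")
            entities.append(("SIZE", f"{number} {size.group(2)}"))
        elif token in GAZETTEER:
            entities.append(GAZETTEER[token])
    return entities


def standardize(description):
    """Assemble the recognized entities into one description with a fixed field order."""
    found = {}
    for label, value in tag_tokens(description):
        found.setdefault(label, value)  # keep the first value seen for each label
    return " ".join(found[label] for label in FIELD_ORDER if label in found)


# Two differently written descriptions of the same product.
print(standardize("CHOC ACME MLK 90G"))
print(standardize("acme chocolate milk 90g"))
```

Both calls yield the same standardized string, which is the property the abstract is after. In a real pipeline the `tag_tokens` step would be replaced by the trained model's predictions, while the assembly step could stay largely the same.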

Published
2022-09-19
LUCCHESI, Laércio; ESCOVEDO, Tatiana; KALINOWSKI, Marcos. Retail product descriptions standardization using NER. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 37., 2022, Búzios. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 445-450. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2022.224347.