Extrapolation-Based Data Augmentation for Sequential Recommendation

  • Vinicius Gabriel Machado UFPR
  • Murilo F. L. Schmitt UNICENTRO
  • Eduardo J. Spinosa UFPR

Abstract


With the advent of deep learning in the recommendation field, a lot of work has been, and still is being, done to bring deep learning-based models to their full potential. One line of work, Self-Supervised Learning (SSL), focuses on extracting as much signal as possible from datasets to improve recommendation performance while mitigating data-related problems, such as data sparsity, that are common in machine learning. One branch of SSL for recommender systems uses predictive strategies to create new labels and training examples by, for instance, adding new interactions to a user's interaction history. However, the existing models that explore this idea are somewhat limited: they focus on adding new interactions at the start of the sequences, ignoring the performance improvements that could be achieved by adding interactions in the middle and at the end of the sequences. We propose Extrapolation-based Sequence Augmentation for Sequential Recommendation (ESA4SRec), a model that uses the sequence reconstruction capabilities of BERT4Rec to generate new data at any position of a sequence by extrapolating existing knowledge to unknown, novel interactions. The resulting augmented dataset is then used as input to a model-agnostic sequential recommender system. We compare our approach with related models and demonstrate performance improvements over the original datasets, as well as the best overall performance among the compared methods. ESA4SRec's code is available at https://github.com/viniciusgm000/ESA4SRec.
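The idea of generating an interaction at any position of a sequence can be illustrated with a minimal sketch. It is not the authors' implementation: the names `MASK`, `toy_predictor`, and `augment` are hypothetical, and a simple neighbour co-occurrence counter stands in for the trained BERT4Rec model, which in the paper would be the component that reconstructs the masked slot.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence

MASK = 0  # hypothetical item id reserved for the mask token

Predictor = Callable[[Sequence[int], int], Optional[int]]


def toy_predictor(train_seqs: Sequence[Sequence[int]]) -> Predictor:
    """Build a stand-in for a bidirectional sequence model (assumption).

    It scores candidates for a masked slot by how often they appear next
    to the slot's left and right neighbours in the training sequences.
    """
    neighbours: dict = {}
    for seq in train_seqs:
        for a, b in zip(seq, seq[1:]):
            neighbours.setdefault(a, Counter())[b] += 1
            neighbours.setdefault(b, Counter())[a] += 1

    def predict(masked_seq: Sequence[int], pos: int) -> Optional[int]:
        scores: Counter = Counter()
        for j in (pos - 1, pos + 1):  # use context on both sides of the slot
            if 0 <= j < len(masked_seq):
                scores.update(neighbours.get(masked_seq[j], Counter()))
        seen = set(masked_seq)  # do not repeat items already in the sequence
        candidates = [(item, s) for item, s in scores.items() if item not in seen]
        if not candidates:
            return None
        # Highest score wins; ties go to the smallest item id.
        return max(candidates, key=lambda t: (t[1], -t[0]))[0]

    return predict


def augment(seq: Sequence[int], pos: int, predict: Predictor) -> List[int]:
    """Insert a mask at `pos` (0 = start, len(seq) = end) and fill it."""
    if not 0 <= pos <= len(seq):
        raise ValueError("pos must be between 0 and len(seq)")
    masked = list(seq[:pos]) + [MASK] + list(seq[pos:])
    item = predict(masked, pos)
    if item is None:  # nothing to add: keep the original sequence
        return list(seq)
    masked[pos] = item
    return masked


predict = toy_predictor([[1, 2, 3, 4], [1, 2, 5, 4], [2, 3, 4, 6]])
print(augment([3, 4], 0, predict))  # start  -> [2, 3, 4]
print(augment([2, 4], 1, predict))  # middle -> [2, 3, 4]
print(augment([2, 3], 2, predict))  # end    -> [2, 3, 4]
```

Because the mask can be placed at any index, the same routine covers the start, middle, and end of a sequence. The augmented sequences could then be passed to any downstream sequential recommender.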

Keywords: Recommender Systems, Self-Supervised Learning, Sequential Recommendation, Data Augmentation

Published
10/11/2025
MACHADO, Vinicius Gabriel; SCHMITT, Murilo F. L.; SPINOSA, Eduardo J. Extrapolation-Based Data Augmentation for Sequential Recommendation. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 31., 2025, Rio de Janeiro/RJ. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 238-247. DOI: https://doi.org/10.5753/webmedia.2025.15127.