ST-MTLNet: Spatio-Temporal Point-of-Interest Representations for Multi-Task Learning

Tarik S. Paiva (UFV); Vitor H. O. Silva (UFV); Germano B. dos Santos (UFV); Fabrício A. Silva (UFV)

DOI: https://doi.org/10.5753/courb.2026.22960

Abstract

This work proposes ST-MTLNet, a multi-task architecture for point-of-interest (POI) category classification and next-POI prediction, built on decoupled representations. The model combines a continuous spatial representation of geographic coordinates, a temporal representation (Time2Vec) of visitation patterns, and a hierarchical categorical representation (HGI) of the structural and regional context of POIs. Two spatial encoding architectures, SIREN and Sphere2Vec-M, originally proposed for remote sensing and ecology, are evaluated on multi-task POI problems in location-based social networks (LBSNs). Experiments on the Gowalla dataset for the states of Florida, California, and Texas show that the proposed approach outperforms the baseline in all 21 category-state combinations for classification, with average gains of 20 to 24 percentage points, and in 76% of the combinations for next-POI prediction. The comparison between the two spatial architectures also reveals complementary performance profiles associated with the geographic distribution of POIs in each territory.
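To make the idea of decoupled representations concrete, the following is a minimal sketch of two of the encoders named in the abstract: the Time2Vec formulation of Kazemi et al. (2019) and a single sine layer in the style of SIREN (Sitzmann et al., 2020). It is not the authors' implementation. The function names, the dimensions, the coordinate scaling, the random (untrained) parameters, and the fusion by plain concatenation are all assumptions made for illustration, and the HGI embedding is omitted.

```python
import math
import random


def time2vec(tau, omega, phi):
    """Time2Vec: first component is linear in tau, the rest are sinusoidal."""
    linear = [omega[0] * tau + phi[0]]
    periodic = [math.sin(w * tau + p) for w, p in zip(omega[1:], phi[1:])]
    return linear + periodic


def sine_layer(x, weights, biases, w0=30.0):
    """One SIREN-style layer: sin(w0 * (W x + b)), applied per output unit."""
    return [
        math.sin(w0 * (sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b))
        for row, b in zip(weights, biases)
    ]


def init_params(spatial_dim=4, temporal_dim=3, seed=0):
    """Hypothetical random parameters; a real model would learn these."""
    rng = random.Random(seed)
    return {
        "W": [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(spatial_dim)],
        "b": [rng.uniform(-1, 1) for _ in range(spatial_dim)],
        "omega": [rng.uniform(-1, 1) for _ in range(temporal_dim)],
        "phi": [rng.uniform(-1, 1) for _ in range(temporal_dim)],
    }


def encode_checkin(lat_deg, lon_deg, hour, params):
    """Encode one check-in as [spatial features | temporal features]."""
    # Assumption: coordinates are scaled to [-1, 1] before the sine layer.
    xy = [lat_deg / 90.0, lon_deg / 180.0]
    spatial = sine_layer(xy, params["W"], params["b"])
    temporal = time2vec(hour, params["omega"], params["phi"])
    # Assumption: the two decoupled views are fused by concatenation.
    return spatial + temporal


# Example: a check-in near Tampa, Florida, at 14:00.
vec = encode_checkin(27.95, -82.46, 14.0, init_params())
print(len(vec))  # 4 spatial + 3 temporal features
```

Keeping each encoder as a separate function mirrors the decoupling described above: the spatial encoder could be swapped (for example, for a Sphere2Vec-style encoder) without touching the temporal one.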

References

Baxter, J. (2000). A model of inductive bias learning. Journal of Artificial Intelligence Research, 12:149–198.

Belkin, M. and Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373–1396.

Caruana, R. (1997). Multitask learning. Machine Learning, 28(1):41–75.

Cho, E., Myers, S. A., and Leskovec, J. (2011). Friendship and mobility: user movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1082–1090.

Church, K. W. (2017). Word2vec. Natural Language Engineering, 23(1):155–162.

Feng, S., Cong, G., An, B., and Chee, Y. M. (2017). Poi2vec: Geographical latent representation for predicting future visitors. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 31.

Grover, A. and Leskovec, J. (2016). node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 855–864.

Halder, S., Lim, K. H., Chan, J., and Zhang, X. (2022). Poi recommendation with queuing time and user interest awareness. Data Mining and Knowledge Discovery, 36:2379–2409.

Huang, W., Zhang, D., Mai, G., Guo, X., and Cui, L. (2023). Learning urban region representations with pois and hierarchical graph infomax. ISPRS Journal of Photogrammetry and Remote Sensing, 196:134–145.

Leskovec, J. and Krevl, A. (2014). SNAP datasets: Stanford large network dataset collection. Retrieved December 2021 from [link].

Kazemi, S. M., Goel, R., Eghbali, S., Ramanan, J., Sahota, J., Thakur, S., Wu, S., Smyth, C., Poupart, P., and Brubaker, M. (2019). Time2vec: Learning a vector representation of time. arXiv preprint arXiv:1907.05321.

Liao, D., Liu, W., Zhong, Y., Li, J., and Wang, G. (2018). Predicting activity and location with multi-task context aware recurrent neural network. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pages 3435–3441. International Joint Conferences on Artificial Intelligence Organization.

Lim, N., Hooi, B., Ng, S.-K., Goh, Y. L., Weng, R., and Tan, R. (2022). Hierarchical multi-task graph recurrent network for next poi recommendation. In SIGIR ’22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1133–1143. ACM.

Liu, Y., Wei, W., Sun, A., and Miao, C. (2014). Exploiting geographical neighborhood characteristics for location recommendation. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, CIKM ’14, pages 739–748, New York, NY, USA. Association for Computing Machinery.

Mai, G., Xuan, Y., Zuo, W., He, Y., Song, J., Ermon, S., Janowicz, K., and Lao, N. (2023). Sphere2vec: A general-purpose location representation learning over a spherical surface for large-scale geospatial predictions.

Navon, A., Shamsian, A., Achituve, I., Maron, H., Kawaguchi, K., Chechik, G., and Fetaya, E. (2022). Multi-task learning as a bargaining game. arXiv preprint arXiv:2202.01017.

Perez, E., Strub, F., de Vries, H., Dumoulin, V., and Courville, A. C. (2018). FiLM: Visual reasoning with a general conditioning layer. In Proc. AAAI Conf. Artificial Intelligence, pages 3942–3951.

Rahmani, H. A., Aliannejadi, M., Mirzaei Zadeh, R., Baratchi, M., Afsharchi, M., and Crestani, F. (2019). Category-aware location embedding for point-of-interest recommendation. In Proceedings of the 2019 ACM SIGIR international conference on theory of information retrieval, pages 173–176.

Rußwurm, M., Klemmer, K., Rolf, E., Zbinden, R., and Tuia, D. (2024). Geographic location encoding with spherical harmonics and sinusoidal representation networks.

Silva, V. H. O., Almeida, I. F., Paiva, T. S., Santos, G. B., Silva, F. A., and Sousa, F. T. (2025). An investigation into multi-task learning for point-of-interest category classification and next-poi prediction. In Proceedings of the Brazilian Conference on Intelligent Systems (CBIC). Submitted.

Sitzmann, V., Martel, J., Bergman, A., Lindell, D., and Wetzstein, G. (2020). Implicit neural representations with periodic activation functions. In Advances in Neural Information Processing Systems, volume 33, pages 7462–7473.

Sun, K., Qian, T., Chen, T., Liang, Y., Nguyen, Q. V. H., and Yin, H. (2020). Where to go next: Modeling long-and short-term user preferences for point-of-interest recommendation. In Proceedings of the AAAI conference on artificial intelligence, volume 34, pages 214–221.

Sun, Y. (2024). Transtarec: Time-adaptive translating embedding model for next poi recommendation. In 2024 5th International Conference on Computer Engineering and Application (ICCEA), pages 647–651. IEEE.

Veličković, P., Fedus, W., Hamilton, W. L., Liò, P., Bengio, Y., and Hjelm, R. D. (2019). Deep graph infomax. In International Conference on Learning Representations (ICLR).

Wu, N., Cao, Q., Wang, Z., Liu, Z., Qi, Y., Zhang, J., Ni, J., Yao, X., Ma, H., Mu, L., et al. (2024). Torchspatial: A location encoding framework and benchmark for spatial representation learning. Advances in Neural Information Processing Systems, 37:81437–81460.

Xia, B., Bai, Y., Yin, J., Li, Q., and Xu, L. (2020). Mtpr: A multi-task learning based poi recommendation considering temporal check-ins and geographical locations. Applied Sciences, 10(19):6664.

Xu, R., Chen, M., Gong, Y., Liu, Y., Yu, X., and Nie, L. (2023). Tme: Tree-guided multi-task embedding learning towards semantic venue annotation. ACM Trans. Inf. Syst., 41(4):112.