Classificação de Tweets sobre Trânsito Utilizando Diferentes Técnicas de Deep Learning (Classification of Traffic Tweets Using Different Deep Learning Techniques)

doi:10.5753/sbesc_estendido.2020.13095

Estevan Teixeira UNIRIO
Pedro Moura UNIRIO
Carlos Alberto Campos UNIRIO

Abstract

In urban mobility, obtaining fast, localized information for decision making is one of the main current challenges. In this context, social networks can serve as a source of knowledge extraction for various tasks, including traffic control. However, such data must be well classified to ensure that only relevant information is used. In Portuguese-speaking countries in particular, there are few studies on this kind of classification, especially ones that explore the potential of neural networks. This work therefore proposes a model for representing and classifying microtext in Portuguese using modern deep learning techniques, with the goal of generating traffic information. To that end, the results of combining several deep learning architectures for representation and classification are analyzed, yielding accuracy and precision above 95%.

Keywords: Smart Cities, Social Networks, NLP, Deep Learning.
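
The abstract reports results in terms of accuracy and precision. As a minimal sketch of how these two metrics are computed for a binary traffic/non-traffic classifier, the Python snippet below uses a small set of made-up labels. The function name `accuracy_and_precision` and the label values are assumptions for illustration only; they are not taken from the paper and do not reproduce its data or results.

```python
def accuracy_and_precision(y_true, y_pred, positive=1):
    """Return (accuracy, precision) for two equal-length label sequences.

    accuracy  = correct predictions / all predictions
    precision = true positives / (true positives + false positives)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if p == positive and t == positive)
    false_pos = sum(1 for t, p in pairs if p == positive and t != positive)

    accuracy = correct / len(pairs)
    # If nothing was predicted positive, precision is undefined; report 0.0 here.
    predicted_pos = true_pos + false_pos
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    return accuracy, precision


# Hypothetical labels: 1 = tweet is about traffic, 0 = tweet is not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

acc, prec = accuracy_and_precision(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f}")
```

High precision matters in this setting because a false positive means an irrelevant tweet is passed on as traffic information, which is the failure the abstract says classification should prevent.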
