Extraction and Evaluation of a Portuguese-Language Crime Dataset from Twitter

  • Gabriel V. da Fonseca Miranda (UFV)
  • Vinícius Gabriel de J. Almeida (UFV)
  • Thais R. M. Braga Silva (UFV)
  • Fabrício A. Silva (UFV)

Abstract


In recent years, studies on security solutions for smart homes, transportation systems, and even cities have been developed. In this scenario, crime data have become increasingly important. Although occurrences from police databases are frequently used, the most ordinary crimes often end up not being registered in them. The goal of this work is to present a method to extract crime data from Portuguese-language Twitter posts from the city of São Paulo. Most related work performs automatic extraction for English texts. When Portuguese is considered, the accuracy is frequently not reported and the final dataset is small. In this work, the final dataset has 1,333 labeled tweets, which were compared to a police database, highlighting similarities in the information as well as opportunities for the two sources to complement each other.

Published
2023-08-06
MIRANDA, Gabriel V. da Fonseca; ALMEIDA, Vinícius Gabriel de J.; SILVA, Thais R. M. Braga; SILVA, Fabrício A. Extração e Avaliação de uma Base de Dados sobre Criminalidade em Português a partir do Twitter. In: PROCEEDINGS OF BRAZILIAN SYMPOSIUM ON UBIQUITOUS AND PERVASIVE COMPUTING (SBCUP), 15., 2023, João Pessoa/PB. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 61-70. ISSN 2595-6183. DOI: https://doi.org/10.5753/sbcup.2023.230076.