Lince: A Tool for Automatic Obfuscation of Texts in Portuguese

  • Antônio Franco UFMG
  • Leonardo Oliveira UFMG

Abstract


Several approaches currently exist for providing anonymity on the Internet. However, anonymous users can still be identified through their writing style. With advances in neural network and natural language processing research, classifiers are becoming increasingly successful at identifying the author of a text. On the other hand, new approaches that use neural networks to automatically generate obfuscated texts have also arisen to counter adversaries of anonymity. In this work, we present Lince, a tool that implements two state-of-the-art machine learning approaches to text obfuscation.

Published
2019-09-02
FRANCO, Antônio; OLIVEIRA, Leonardo. Lince: Uma Ferramenta para Ofuscação Automática de Textos em Português. In: TOOLS - BRAZILIAN SYMPOSIUM ON INFORMATION AND COMPUTATIONAL SYSTEMS SECURITY (SBSEG), 19., 2019, São Paulo. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2019. p. 19-26. DOI: https://doi.org/10.5753/sbseg_estendido.2019.14000.