Lince: A Tool for Automatic Obfuscation of Texts in Portuguese

Antônio Franco; Leonardo Oliveira

doi:10.5753/sbseg_estendido.2019.14000

Antônio Franco, UFMG
Leonardo Oliveira, UFMG

Abstract

Several approaches currently provide anonymity on the Internet. However, anonymous users can still be identified by their writing style. With advances in neural networks and natural language processing research, the likelihood that a classifier correctly identifies the author of a text has grown steadily. At the same time, new approaches that use neural networks to automatically generate obfuscated text have emerged to counter adversaries of Internet anonymity. In this work, we present Lince, a tool that implements two state-of-the-art machine-learning approaches to text obfuscation.

