Recognition of Compromised Accounts on Twitter
Abstract
In this work, we propose an approach for recognizing compromised Twitter accounts based on Authorship Verification. Our solution detects accounts that have been compromised by analysing their users' writing styles: when an account's content does not match its user's writing style, we conclude that the account has been compromised, as in Authorship Verification. Our approach follows the profile-based paradigm and uses N-grams as its kernel. A threshold is then found to represent the boundary of an account's writing style. Experiments were performed on a subsampled dataset from Twitter. The results show that the developed model is well suited to recognizing compromised Online Social Network accounts, as it recognizes user writing styles with over 95% accuracy.
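The profile-based, threshold-driven idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the N-gram type, the N-gram length, or the dissimilarity measure, so character trigrams, cosine distance, and all function names (`char_ngrams`, `build_profile`, `cosine_distance`, `is_compromised`) are assumptions chosen for illustration only.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in one text (assumed feature type)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def build_profile(tweets, n=3):
    """Profile-based paradigm: accumulate the n-gram counts of all of an
    account's tweets into a single profile instead of one vector per tweet."""
    profile = Counter()
    for tweet in tweets:
        profile.update(char_ngrams(tweet, n))
    return profile


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two n-gram count profiles.
    The distance measure is an assumption; the abstract does not name one."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty profile shares nothing with any other profile
    return 1.0 - dot / (norm_a * norm_b)


def is_compromised(profile, new_tweets, threshold, n=3):
    """Flag the account when new content falls outside the style boundary,
    i.e. its distance to the known profile exceeds the threshold."""
    return cosine_distance(profile, build_profile(new_tweets, n)) > threshold
```

In such a sketch, the threshold would be tuned per account on held-out tweets known to belong to the user, for example by taking a value slightly above the largest distance observed between the profile and the user's own later tweets. How the threshold is actually found in this work is not specified in the abstract.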