Fraud Detection in Social Commerce: combining structured attributes and images
Abstract
Social Commerce has grown and evolved in recent years, driven by changes in both e-commerce and social network applications. The number of online ads and transactions in Social Commerce has grown with it. This environment attracts both legitimate and malicious users. Malicious users harm their victims by causing them to lose money or suffer psychological damage. Since the volume of transactions is high and fraud is rare, manual detection is highly inefficient (it requires many resources for few detections) and does not scale. Existing solutions for automatic fraud detection in Social Commerce rely on structured information available in ads, such as price, product type, brand, and new/used condition, among others. However, such solutions ignore possible signs of fraud in the ad images that show the products being sold. This work therefore evaluates whether combining the structured information and the images available in ads yields more effective models than those that consider structured information alone. To this end, it proposes FDSC, a method that detects fraud in Social Commerce by combining information extracted from ad images through deep learning with the structured information available in the corresponding ads. Experimental evidence shows a potential gain of 7% in F-score from adopting FDSC.
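As a rough illustration of the idea of combining the two sources, the sketch below concatenates structured ad attributes with an image embedding and computes an F-score on toy predictions. It is not the FDSC implementation: the abstract does not state how the features are combined, so the concatenation step, the function names `fuse_features` and `f_score`, the attribute layout, and all values are assumptions made for illustration only.

```python
from typing import List, Sequence


def fuse_features(structured: Sequence[float],
                  image_embedding: Sequence[float]) -> List[float]:
    """Early fusion (an assumption): concatenate structured ad attributes
    with an embedding extracted from the ad image."""
    return list(structured) + list(image_embedding)


def f_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for the positive class (1 = fraud, 0 = legitimate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ad: [price, is_used] plus a 3-dimensional image embedding.
# An embedding from a pretrained network would be far longer.
ad_vector = fuse_features([1500.0, 1.0], [0.12, 0.80, 0.05])

# Toy labels and predictions, invented for this example.
y_true = [1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]
score = f_score(y_true, y_pred)
```

In a setup like this, the fused vector would feed any standard classifier, and comparing the F-score of a model trained on fused vectors against one trained on structured attributes alone mirrors the comparison the abstract describes.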
Keywords:
fraud detection, social commerce (s-commerce), e-commerce, machine learning, image evaluation, image classification, image
Published
June 7, 2021
How to Cite
BATISTA, Apolo Takeshi Arai; FIGUEIREDO, Karla Tereza; GOLDSCHMIDT, Ronaldo Ribeiro. Fraud Detection in Social Commerce: combining structured attributes and images. In: SIMPÓSIO BRASILEIRO DE SISTEMAS DE INFORMAÇÃO (SBSI), 17., 2021, Uberlândia. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021.