Automated assessment of visual aesthetics of Android user interfaces with deep learning


Visual aesthetics is seen as an essential success factor for mobile apps, affecting user experience and perception, which makes its evaluation a crucial part of the interface design process. Machine learning approaches have proven very promising in this context, but so far solutions exist only for assessing the aesthetics of web-based graphical user interfaces (GUIs). This article presents a deep learning approach to automatically quantify the visual aesthetics of GUIs of Android apps developed with App Inventor, based on a convolutional neural network (CNN) trained with a regression-based supervised learning approach. The performance results demonstrate that the CNN can learn to evaluate the visual aesthetics of GUIs with a mean squared error of 0.023 on the validation set and 0.017 on the test set. The model ratings are also highly correlated with human ratings (Spearman rank correlation coefficient rho = 0.86 for the validation set and rho = 0.95 for the test set). In addition, a Bland & Altman analysis shows that more than 95% of the model ratings agree with the human ratings. These promising results indicate that the model can be an effective and efficient means to automate visual aesthetics assessment during the GUI design process for mobile apps.
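The three agreement measures named in the abstract (mean squared error, Spearman rank correlation, and Bland & Altman limits of agreement) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, the ratings are made-up values on an assumed 0-to-1 scale, and the 1.96 standard deviation limits follow the usual Bland & Altman convention rather than anything stated in the abstract.

```python
from math import sqrt
from statistics import mean, stdev


def mean_squared_error(human, model):
    """Average squared difference between paired ratings."""
    return mean((h - m) ** 2 for h, m in zip(human, model))


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(human, model):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rh, rm = average_ranks(human), average_ranks(model)
    mh, mm = mean(rh), mean(rm)
    cov = sum((a - mh) * (b - mm) for a, b in zip(rh, rm))
    var_h = sum((a - mh) ** 2 for a in rh)
    var_m = sum((b - mm) ** 2 for b in rm)
    return cov / sqrt(var_h * var_m)


def bland_altman(human, model):
    """Return the bias, the 95% limits of agreement, and the share of pairs inside them."""
    diffs = [m - h for h, m in zip(human, model)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    lower, upper = bias - spread, bias + spread
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, (lower, upper), inside


# Made-up ratings for illustration only (not data from the paper).
human = [0.10, 0.35, 0.50, 0.62, 0.80, 0.95]
model = [0.15, 0.30, 0.55, 0.60, 0.85, 0.90]

if __name__ == "__main__":
    print("MSE:", mean_squared_error(human, model))
    print("Spearman rho:", spearman_rho(human, model))
    print("Bland-Altman (bias, limits, share inside):", bland_altman(human, model))
```

In this reading, a statement such as "more than 95% agree" corresponds to the share of rating pairs whose difference falls inside the limits of agreement.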

Keywords: visual aesthetics, mobile application, Android, deep learning, visual design, App Inventor


LIMA, Adriano Luiz de Souza; MARTINS, Osvaldo P. Heiderscheidt Roberge; GRESSE VON WANGENHEIM, Christiane; VON WANGENHEIM, Aldo; BORGATTO, Adriano Ferreti; HAUCK, Jean C. R. Automated assessment of visual aesthetics of Android user interfaces with deep learning. In: SIMPÓSIO BRASILEIRO SOBRE FATORES HUMANOS EM SISTEMAS COMPUTACIONAIS (IHC), 21., 2022, Diamantina. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022.