Using Visual Features and Early Views to Classify the Popularity of Facebook Videos
Keywords: Popularity, Video Analysis, Visual Features, Random Forest
Creating and sharing content online is now easy. Every day, millions of people publish content that is consumed by millions more. This flow of content and consumption has become a channel for disseminating digital advertisements, generating publicity for brands and financial returns for content creators. Identifying whether a video will become popular in the first moments after its publication is therefore of great value to advertisers. Using Random Forest, we classify Facebook videos as popular or unpopular according to their number of views, with early views and visual features extracted from the videos as predictors. Our results indicate that combining early views with visual features yields the best results, allowing popularity to be predicted as early as possible.
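The classification setup described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library and replaces a full Random Forest with a simplified stand-in, namely bagged one-split trees (stumps) that each see a random subset of features. The median cut-off for the popular/unpopular label, the feature names (early views, brightness, motion) and the toy numbers are all assumptions made for illustration.

```python
import random
from collections import Counter
from statistics import median


def label_by_median(final_views):
    """Label a video 1 (popular) if its final views exceed the median, else 0.
    The median cut-off is an assumption made for this sketch."""
    cut = median(final_views)
    return [1 if v > cut else 0 for v in final_views]


def fit_stump(rows, labels, feature_ids):
    """Return (feature, threshold, left_label, right_label) for the single
    split with the fewest training errors among the given features."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No valid split (all sampled values identical): predict the majority.
        maj = Counter(labels).most_common(1)[0][0]
        return (0, float("inf"), maj, maj)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, n_features=2, seed=0):
    """Bagged stumps: each one is trained on a bootstrap sample and a random
    feature subset. A simplified stand-in for Random Forest, not the real thing."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]        # bootstrap sample
        feats = rng.sample(range(d), min(n_features, d))  # random feature subset
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up toy rows: [early_views, mean_brightness, motion_score].
    rows = [
        [120, 0.41, 0.10], [950, 0.62, 0.35], [80, 0.55, 0.20], [1500, 0.48, 0.50],
        [300, 0.70, 0.15], [2100, 0.66, 0.45], [60, 0.38, 0.05], [1100, 0.52, 0.40],
    ]
    final_views = [900, 12000, 700, 30000, 2500, 41000, 400, 15000]  # made up
    labels = label_by_median(final_views)
    forest = fit_forest(rows, labels)
    new_video = [1000, 0.60, 0.30]
    print("predicted class:", predict(forest, new_video))
```

In practice one would more likely rely on an established Random Forest implementation than on a hand-written ensemble. The sketch only shows how early views and visual features can sit side by side in one feature vector and be voted on by many randomized trees.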
Copyright (c) 2022 The authors
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.