Partial Least Squares: A Deep Space Odyssey
Abstract. Modern visual pattern recognition models are based on deep convolutional networks. Such models are computationally expensive, which hinders their deployment on resource-constrained devices. To handle this problem, we propose three strategies. The first removes unimportant structures (neurons or layers) from convolutional networks, reducing their computational cost. The second inserts structures to design architectures automatically, enabling us to build high-performance networks. The third combines multiple layers of convolutional networks, enhancing data representation at negligible additional cost. These strategies are based on Partial Least Squares (PLS), which, despite promising results, is infeasible on large datasets due to memory constraints. To address this issue, we also propose a discriminative and low-complexity incremental PLS that learns a compact representation of the data using a single sample at a time, thus enabling its application to large datasets.
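To make the role of PLS in the first strategy concrete, the following is a minimal sketch of how a PLS projection could rank structures by importance. It is an illustration under stated assumptions, not the method proposed here: the abstract does not specify the importance criterion, so the sketch assumes the common Variable Importance in Projection (VIP) score, treats each column of a synthetic matrix `X` as the response of one structure (for example, one filter), and uses a batch NIPALS-style PLS1 rather than the incremental variant. The names `pls1_fit` and `vip_scores` are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """NIPALS-style PLS1 for a single response y.

    Returns unit-norm weights W (d x c), scores T (n x c) and
    y-loadings q (c,). Illustrative helper, not the paper's code.
    """
    X = X - X.mean(axis=0)          # center features
    y = y - y.mean()                # center response
    n, d = X.shape
    W = np.zeros((d, n_components))
    T = np.zeros((n, n_components))
    q = np.zeros(n_components)
    for k in range(n_components):
        w = X.T @ y                 # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:            # nothing left to explain: truncate
            return W[:, :k], T[:, :k], q[:k]
        w = w / norm
        t = X @ w                   # latent scores for this component
        tt = t @ t
        p = X.T @ t / tt            # X-loadings, used only for deflation
        q[k] = (y @ t) / tt         # regression of y on the scores
        X = X - np.outer(t, p)      # deflate X
        y = y - q[k] * t            # deflate y
        W[:, k], T[:, k] = w, t
    return W, T, q

def vip_scores(W, T, q):
    """VIP score per feature (assumed importance criterion)."""
    d = W.shape[0]
    ss = (q ** 2) * np.sum(T ** 2, axis=0)   # y-variance explained per component
    return np.sqrt(d * ((W ** 2) @ ss) / ss.sum())

# Synthetic example: only features 0 and 3 carry information about y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=200)

W, T, q = pls1_fit(X, y, n_components=2)
vip = vip_scores(W, T, q)
keep = np.argsort(vip)[-2:]         # indices of the two highest-scoring features
```

In a pruning setting, the low-scoring columns would be the candidates for removal. Because this batch version keeps the full matrix `X` in memory, it also illustrates the memory constraint that motivates an incremental, one-sample-at-a-time formulation.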